playwright_browser_skill

This skill enables advanced browser automation with Playwright to handle dynamic pages, mimic user interactions, and extract data efficiently.

Language: Python
GitHub Stars: 11
Bundled Files: 7
Catalog Refreshed: 2 months ago
First Indexed: 4 months ago

Installation

The command shown here uses veilstrat where the catalogue uses aiagentskills.

npx veilstrat add skill arcaneorion/alice-single --skill playwright_browser

automator.py (4.6 KB)
browser_tool.py (2.3 KB)
download_images.py (3.4 KB)
scraper.py (3.9 KB)
search.py (2.7 KB)
SKILL.md (7.2 KB)
snapshot.py (2.1 KB)

Overview

This skill provides advanced browser automation built on Playwright for reliably interacting with modern web pages. It handles dynamic JavaScript rendering, user-interaction simulation, targeted data extraction, page snapshots, and bulk image downloads. It is optimized for container environments and asynchronous workflows to support scalable scraping and automation tasks.

How this skill works

The skill launches a headless Chromium instance with container-friendly flags and executes scripted actions or command-line tasks. It can wait for elements, run chained interactions (click, fill, select, evaluate), capture screenshots or PDFs, and extract data via CSS selectors into JSON. Image extraction locates img tags and downloads originals using alt/title-based filenames and configurable limits.
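
The launch, wait, and extract flow described above can be sketched roughly as follows. This is a minimal illustration, not the bundled scraper.py: the flag list in CONTAINER_ARGS, the helper names (launch_options, extract_fields, scrape), and the shape of the fields mapping are all assumptions made for the example. Only the Playwright calls themselves (chromium.launch, goto, wait_for_selector, locator) are the library's real sync API.

```python
# Assumed container-friendly Chromium flags; the skill's actual list is not shown in this listing.
CONTAINER_ARGS = ["--no-sandbox", "--disable-dev-shm-usage", "--disable-gpu"]


def launch_options(headless=True):
    """Keyword arguments for chromium.launch() (hypothetical helper)."""
    return {"headless": headless, "args": list(CONTAINER_ARGS)}


def extract_fields(page, fields):
    """Turn {name: {"selector": css, "multiple": bool}} into extracted text.

    The field-spec shape is an assumption for this sketch.
    """
    result = {}
    for name, spec in fields.items():
        locator = page.locator(spec["selector"])
        if spec.get("multiple"):
            # One entry per matching element
            result[name] = [t.strip() for t in locator.all_text_contents()]
        else:
            # First match only; None when nothing matches or the text is empty
            text = locator.first.text_content() if locator.count() else None
            result[name] = text.strip() if text else None
    return result


def scrape(url, fields, wait_for=None, timeout_ms=15000):
    """Open url in headless Chromium, optionally wait for a selector, extract fields."""
    # Imported lazily so the helpers above can be used without Playwright installed.
    from playwright.sync_api import sync_playwright

    with sync_playwright() as p:
        browser = p.chromium.launch(**launch_options())
        try:
            page = browser.new_page()
            page.goto(url, wait_until="domcontentloaded", timeout=timeout_ms)
            if wait_for:
                page.wait_for_selector(wait_for, timeout=timeout_ms)
            return extract_fields(page, fields)
        finally:
            browser.close()
```

Under these assumptions, a call such as scrape("https://example.com", {"heading": {"selector": "h1"}, "links": {"selector": "a", "multiple": True}}, wait_for="h1") would return a dict that can be passed straight to json.dumps.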

When to use it

Scraping pages that require JavaScript rendering or client-side navigation
Automating form submissions, logins, or multi-step user flows
Capturing full-page screenshots or exporting pages to PDF for reporting
Bulk downloading images from a page with controlled concurrency
Running structured search tasks (Baidu) and returning JSON results

Best practices

Use wait or wait_for actions for dynamic content to ensure accurate extraction
Limit concurrency and set sensible timeouts when processing many pages to reduce memory use
Provide explicit CSS selectors to extract precise fields, and use the multiple option for lists
Add small random delays between interactions on sites with anti-bot defenses
Run Playwright install steps and ensure Chromium binaries are available in the environment
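
The advice on concurrency limits, timeouts, and random delays can be combined in one small asyncio helper. The sketch below uses only the standard library; run_bounded and its parameters are hypothetical names for this example, and the worker is whatever coroutine processes a single page.

```python
import asyncio
import random


async def run_bounded(items, worker, limit=3, timeout=30.0, max_jitter=0.5):
    """Run worker(item) for every item with at most `limit` running at once.

    Returns (item, result_or_exception) pairs in input order.
    """
    semaphore = asyncio.Semaphore(limit)

    async def guarded(item):
        async with semaphore:
            # Small random pause so requests are not evenly spaced
            await asyncio.sleep(random.uniform(0, max_jitter))
            try:
                return item, await asyncio.wait_for(worker(item), timeout)
            except Exception as exc:  # includes asyncio.TimeoutError
                return item, exc

    return await asyncio.gather(*(guarded(i) for i in items))
```

A failed or timed-out item comes back as an exception object rather than aborting the batch, so the caller can decide what to retry. In a real run the worker would typically open a page on a shared browser and close it when done, which keeps memory use proportional to the limit rather than to the number of URLs.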

Example use cases

Log in to a dashboard, navigate to reports, take screenshots, and export a PDF snapshot
Run a Baidu query and return top N structured results as JSON for downstream analysis
Scrape product pages by extracting title, price, and images, then download the image assets to a folder
Automate an end-to-end workflow: fill forms, handle popups, submit, and capture a confirmation screenshot
Crawl galleries and batch-download images with filename generation from alt/title
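
Filename generation from alt/title text could look something like the sketch below. The function name, the fallback order (alt, then title, then the URL's basename), the default extension, and the index prefix are all assumptions for illustration; the bundled download_images.py may do this differently.

```python
import os
import re
from urllib.parse import urlparse


def image_filename(alt, title, url, index, max_len=80):
    """Build a filesystem-safe name from alt text, then title, then the URL."""
    path = urlparse(url).path
    ext = os.path.splitext(path)[1].lower()
    if ext not in {".jpg", ".jpeg", ".png", ".gif", ".webp", ".svg"}:
        ext = ".jpg"  # assumed default when the URL has no usable extension
    label = (alt or "").strip() or (title or "").strip()
    if not label:
        label = os.path.splitext(os.path.basename(path))[0] or "image"
    # Replace runs of anything other than word characters and hyphens with "_"
    label = re.sub(r"[^\w\-]+", "_", label).strip("_")[:max_len] or "image"
    return f"{index:03d}_{label}{ext}"
```

Because \w is Unicode-aware in Python 3, non-Latin alt text is kept as-is, and the numeric prefix avoids collisions when several images share the same alt text.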

FAQ

How do I automate a login flow?

Use an automator action chain: fill username, fill password, click login, then wait_for a post-login selector before proceeding.
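
As a rough illustration of such a chain, the sketch below represents each step as a dict and dispatches it to the matching Playwright page method. The action-dict schema, the run_actions name, and the selectors and credentials in LOGIN_CHAIN are hypothetical; the format automator.py actually accepts is not shown in this listing.

```python
def run_actions(page, actions, default_timeout_ms=10000):
    """Apply a list of action dicts to a Playwright page (sync API)."""
    for step in actions:
        kind = step["action"]
        selector = step.get("selector")
        if kind == "fill":
            page.fill(selector, step["value"])
        elif kind == "click":
            page.click(selector)
        elif kind == "select":
            page.select_option(selector, step["value"])
        elif kind == "wait_for":
            # Block until the element is present before moving on
            page.wait_for_selector(selector, timeout=step.get("timeout", default_timeout_ms))
        elif kind == "wait":
            page.wait_for_timeout(step["ms"])
        else:
            raise ValueError(f"unknown action: {kind!r}")


# Placeholder selectors and credentials, for illustration only
LOGIN_CHAIN = [
    {"action": "fill", "selector": "#username", "value": "alice"},
    {"action": "fill", "selector": "#password", "value": "secret"},
    {"action": "click", "selector": "button[type=submit]"},
    {"action": "wait_for", "selector": "#dashboard"},
]
```

Ending the chain with a wait_for on an element that only exists after login gives a clear success signal: if the credentials are rejected, the wait times out instead of letting later steps run against the login page.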

How can I extract dynamically generated content?

Use the --wait parameter or include wait/wait_for actions in automator so Playwright waits for the target element to render before extracting.

Why did some images fail to download?

Failures can be caused by anti-hotlinking, lazy loading, base64 encoding, or network restrictions; add delays, use the original image URL when available, or configure a proxy if needed.
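
Two of these causes, lazy loading and base64 placeholders, can often be handled before any download is attempted. The sketch below picks a usable URL from an img tag's attributes; the attribute names checked (data-original, data-src, data-lazy-src) are common lazy-loading conventions assumed for this example, and both helper names are hypothetical.

```python
from urllib.parse import urljoin


def resolve_image_url(attrs, base_url):
    """Pick a downloadable URL from an <img> tag's attributes, or None."""
    # Lazy loaders often keep the real URL in a data-* attribute,
    # so those are checked before src.
    for key in ("data-original", "data-src", "data-lazy-src", "src"):
        value = (attrs.get(key) or "").strip()
        if not value or value.startswith("data:"):
            continue  # skip empty values and inline base64 placeholders
        return urljoin(base_url, value)
    return None


def download_headers(page_url):
    """Request headers that may satisfy Referer-based hotlink checks."""
    return {"Referer": page_url, "User-Agent": "Mozilla/5.0"}
```

Sending the page URL as the Referer sometimes gets past anti-hotlinking rules, though sites that check cookies or signed URLs will still refuse the request.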