web-fetch_skill

This skill fetches and extracts clean, readable content from web pages using Jina Reader, returning title, text, and metadata for analysis.

Language: TypeScript
GitHub Stars: 30
Bundled Files: 1
Catalog Refreshed: 2 months ago
First Indexed: 4 months ago

Installation

npx veilstrat add skill vaayne/agent-kit --skill web-fetch

SKILL.md (2.3 KB)

Overview

This skill fetches and extracts clean, readable content from any public URL using the Jina Reader API. It returns structured JSON containing the title, markdown-formatted content, metadata, and token-usage information, a shape suited to downstream LLM processing. Use it when you need reliable article text extraction or want to convert web pages into text for analysis or summarization.

How this skill works

The skill identifies and validates the URL, sends a fetch request to the Jina Reader API, and parses the returned document into a compact JSON payload. It extracts title, description, the main article body (cleaned and formatted as markdown), plus metadata such as source URL and token counts. Errors like invalid URLs, timeouts, or HTTP failures are surfaced via stderr and nonzero exit codes.
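The steps above can be sketched roughly as follows. This is a minimal illustration, not the skill's actual source: the `FetchResult` shape and the helper names (`validateUrl`, `readerEndpoint`, `toPayload`, `fetchPage`) are hypothetical, and the `https://r.jina.ai/` prefix convention and the `data`/`usage.tokens` response envelope are assumptions about the public Jina Reader API that should be checked against its documentation.

```typescript
// Hypothetical output shape; field names are assumptions, not the skill's schema.
interface FetchResult {
  title: string;
  description: string;
  url: string;
  content: string; // main body as markdown
  tokens: number | null; // token count, if the API reported one
}

// Step 1: validate the input and accept only http(s) URLs.
function validateUrl(input: string): URL {
  const url = new URL(input.trim()); // throws TypeError on malformed input
  if (url.protocol !== "http:" && url.protocol !== "https:") {
    throw new Error(`Unsupported protocol: ${url.protocol}`);
  }
  return url;
}

// Step 2: build the request URL (assumed convention: prefix the target URL).
function readerEndpoint(target: URL): string {
  return `https://r.jina.ai/${target.href}`;
}

// Step 3: map the (assumed) JSON envelope to a compact payload.
function toPayload(body: unknown): FetchResult {
  const envelope = body as { data?: Record<string, unknown> } | null | undefined;
  const data = envelope?.data ?? {};
  const usage = (data.usage ?? {}) as { tokens?: unknown };
  const str = (v: unknown): string => (typeof v === "string" ? v : "");
  return {
    title: str(data.title),
    description: str(data.description),
    url: str(data.url),
    content: str(data.content),
    tokens: typeof usage.tokens === "number" ? usage.tokens : null,
  };
}

// Putting it together; HTTP failures surface as thrown errors.
async function fetchPage(input: string): Promise<FetchResult> {
  const target = validateUrl(input);
  const res = await fetch(readerEndpoint(target), {
    headers: { Accept: "application/json" },
  });
  if (!res.ok) throw new Error(`HTTP ${res.status} ${res.statusText}`);
  return toPayload(await res.json());
}
```

A command-line wrapper around `fetchPage` could catch the thrown error, write its message to stderr, and exit with a nonzero code, which would match the error behaviour described above.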

When to use it

Reading or analyzing the main content of a webpage.
Extracting article text from blog posts, news pages, or documentation.
Preparing web content for summarization, question answering, or indexing.
Automating data collection from reference pages or public APIs.
Converting web pages to clean markdown for further NLP pipelines.

Best practices

Validate and normalize URLs before invoking the skill so that malformed input fails early.
Increase the timeout for large or slow-loading pages (the default is 30 seconds).
Handle non-200 HTTP responses and implement retries in the calling logic.
Inspect token usage metadata if you plan to chain many extractions to control costs.
Sanitize or filter extracted content if downstream systems require stricter input control.
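As an illustration of the retry advice above, here is a small sketch of retry-with-backoff in calling code. Everything in it is hypothetical: the `Fetcher` type stands in for whatever function actually invokes the skill, and the retry thresholds (retry on 429 and 5xx, three attempts, 500 ms base delay capped at 8 s) are illustrative choices rather than values the skill prescribes.

```typescript
// Hypothetical result of one call to the skill or to the underlying API.
interface Attempt {
  status: number;
  body: string;
}

// Stand-in for the function that performs one fetch; supplied by the caller.
type Fetcher = (url: string) => Promise<Attempt>;

// Retry only on rate limiting and server-side errors (illustrative policy).
function isRetryable(status: number): boolean {
  return status === 429 || (status >= 500 && status <= 599);
}

// Exponential backoff: base * 2^attempt, capped at capMs.
function backoffDelayMs(attempt: number, baseMs = 500, capMs = 8000): number {
  return Math.min(capMs, baseMs * 2 ** attempt);
}

const sleep = (ms: number): Promise<void> =>
  new Promise<void>((resolve) => setTimeout(resolve, ms));

// Try up to maxAttempts times; give up immediately on non-retryable statuses.
async function fetchWithRetry(
  url: string,
  fetcher: Fetcher,
  maxAttempts = 3,
): Promise<Attempt> {
  let last: Attempt | undefined;
  for (let attempt = 0; attempt < maxAttempts; attempt++) {
    last = await fetcher(url);
    if (last.status >= 200 && last.status < 300) return last;
    if (!isRetryable(last.status)) break;
    if (attempt < maxAttempts - 1) await sleep(backoffDelayMs(attempt));
  }
  throw new Error(`Request failed with HTTP ${last?.status ?? "unknown"}`);
}
```

Passing the fetcher in as a parameter keeps the policy separate from the transport, so the same loop works whether the skill is called as a library function or as a subprocess.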

Example use cases

Fetch a news article to generate a concise summary for a daily briefing.
Extract API docs or changelogs to feed into a developer assistant.
Pull product pages for automated competitor analysis or feature extraction.
Convert long-form blog posts into markdown to create vector embeddings for search.
Retrieve academic or technical articles for citation extraction and indexing.

FAQ

Do I need an API key?

No API key is required; the skill uses the public Jina Reader API as provided.

What happens if the page is behind authentication or blocked?

The fetch will fail with an HTTP error; the skill reports the error details on stderr and returns a nonzero exit code.

Can I adjust request timeout?

Yes. The skill accepts a timeout parameter (default 30 seconds) to accommodate slow pages.
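How the skill applies its timeout internally is not shown in the listing. One common way to enforce such a limit in TypeScript is an `AbortController` paired with a timer, sketched below. The helper names `timeoutMs` and `fetchWithTimeout` are hypothetical; only the 30-second default comes from the text above.

```typescript
// Convert a timeout in seconds to milliseconds, defaulting to 30 seconds.
function timeoutMs(seconds: number = 30): number {
  if (!Number.isFinite(seconds) || seconds <= 0) {
    throw new RangeError(`Timeout must be a positive number, got ${seconds}`);
  }
  return Math.round(seconds * 1000);
}

// Abort the request if no response arrives within the limit.
// Note: the timer is cleared once the response headers arrive, so reading
// a slow response body afterwards is not covered by this limit.
async function fetchWithTimeout(url: string, seconds?: number): Promise<Response> {
  const controller = new AbortController();
  const timer = setTimeout(() => controller.abort(), timeoutMs(seconds));
  try {
    return await fetch(url, { signal: controller.signal });
  } finally {
    clearTimeout(timer);
  }
}
```

For a slow page, a caller might use `fetchWithTimeout(url, 90)`; an aborted request rejects with an error that the caller can report like any other failure.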