Web Analysis

Provides web search via SearXNG, crawling via Creeper, and DeepSeek-powered summarization.
Language: TypeScript
GitHub stars: 0
First indexed: 3 months ago
Catalog refreshed: 3 weeks ago

Installation

Add the following to your MCP client configuration file.

Configuration

{
  "mcpServers": {
    "himly0302-web-analysis-mcp": {
      "command": "node",
      "args": [
        "/path/to/web-analysis-mcp/dist/index.js"
      ],
      "env": {
        "CREEPER_PATH": "/home/lyf/workspaces/creeper",
        "SUMMARY_API_KEY": "sk-your-key",
        "DOMAIN_BLACKLIST": "pinterest.com,facebook.com",
        "DOMAIN_WHITELIST": "github.com,stackoverflow.com",
        "FILTER_LLM_MODEL": "glm-4.5-air",
        "SEARXNG_BASE_URL": "http://127.0.0.1:8086",
        "FILTER_LLM_API_KEY": "sk-your-filter-api-key",
        "FILTER_LLM_ENABLED": "true",
        "FILTER_MAX_RESULTS": "8",
        "FILTER_LLM_BASE_URL": "https://api.deepseek.com"
      }
    }
  }
}

Web Analysis MCP combines SearXNG-powered search with Creeper crawling, then uses an LLM to summarize pages and return concise results. It helps you analyze web content without hitting token limits while filtering, classifying, and caching results for efficiency.
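The caching mentioned above can be pictured with a small time-to-live cache keyed by URL. This is only a sketch: the server's actual cache design is not described here, and `SummaryCache`, its TTL parameter, and the injectable clock are hypothetical names chosen for illustration.

```typescript
// Hypothetical TTL cache for page summaries, keyed by URL.
// Illustrative only; not taken from the project's source.
class SummaryCache {
  private entries = new Map<string, { summary: string; expiresAt: number }>();
  private ttlMs: number;
  private now: () => number;

  // `now` is injectable so expiry can be exercised without waiting.
  constructor(ttlMs: number, now: () => number = Date.now) {
    this.ttlMs = ttlMs;
    this.now = now;
  }

  // Returns the cached summary, or undefined on a miss or an expired entry.
  get(url: string): string | undefined {
    const hit = this.entries.get(url);
    if (hit === undefined) return undefined;
    if (hit.expiresAt <= this.now()) {
      this.entries.delete(url); // drop stale entries lazily
      return undefined;
    }
    return hit.summary;
  }

  // Stores a summary that stays valid for ttlMs milliseconds.
  set(url: string, summary: string): void {
    this.entries.set(url, { summary, expiresAt: this.now() + this.ttlMs });
  }
}
```

With a cache like this, a repeated search that returns the same URL can reuse the stored summary instead of crawling and summarizing the page again.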

How to use

You run this MCP on your own machine and connect to it from your MCP client. The server searches with SearXNG, crawls the resulting pages with Creeper, and then passes the content to an LLM for topic classification and summarization. Results are filtered by domain, deduplicated, and trimmed to the most relevant entries. Use the included web_search tool through your MCP client to run a search, gather pages, and receive summarized output ready for your workflow.
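To make the domain filtering and deduplication step concrete, here is a minimal sketch of how a result list could be curated. It assumes that blacklisted domains are dropped, whitelisted domains are ranked first, and duplicates are detected by exact URL; the server's real rules may differ. The names `SearchResult`, `parseDomainList`, and `curate` are hypothetical.

```typescript
interface SearchResult {
  url: string;
  title: string;
}

interface FilterOptions {
  whitelist: string[]; // e.g. parsed from DOMAIN_WHITELIST
  blacklist: string[]; // e.g. parsed from DOMAIN_BLACKLIST
  maxResults: number;  // e.g. FILTER_MAX_RESULTS
}

// Split "a.com, b.com" into ["a.com", "b.com"].
function parseDomainList(value: string | undefined): string[] {
  return (value ?? "")
    .split(",")
    .map((d) => d.trim().toLowerCase())
    .filter((d) => d.length > 0);
}

// True when host equals the domain or is a subdomain of it.
function matchesDomain(host: string, domain: string): boolean {
  return host === domain || host.endsWith("." + domain);
}

// Extract the hostname, or null when the URL cannot be parsed.
function hostOf(url: string): string | null {
  try {
    return new URL(url).hostname.toLowerCase();
  } catch {
    return null;
  }
}

// Assumed policy: drop blacklisted hosts, dedupe by URL,
// put whitelisted hosts first, then cap the list.
function curate(results: SearchResult[], opts: FilterOptions): SearchResult[] {
  const seen = new Set<string>();
  const preferred: SearchResult[] = [];
  const others: SearchResult[] = [];
  for (const r of results) {
    const host = hostOf(r.url);
    if (host === null) continue; // skip malformed URLs
    if (opts.blacklist.some((d) => matchesDomain(host, d))) continue;
    if (seen.has(r.url)) continue; // exact-URL deduplication
    seen.add(r.url);
    if (opts.whitelist.some((d) => matchesDomain(host, d))) {
      preferred.push(r);
    } else {
      others.push(r);
    }
  }
  return preferred.concat(others).slice(0, opts.maxResults);
}
```

Matching on subdomains means that a blacklist entry of pinterest.com also excludes www.pinterest.com, which is usually what you want from a short comma-separated list.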

How to install

Prerequisites: Node.js and npm. You also need Python and pip to run Creeper, Docker if you start SearXNG with the command below, and an API key for the LLM service that produces the summaries.

# 1. Install the MCP package/library
# (Clone the MCP source and install dependencies)
git clone <repository-url>
cd web-analysis-mcp
npm install

# 2. Deploy required external services
# SearXNG search engine
# Run locally on port 8086
# You can adapt as needed

docker run -d -p 8086:8080 searxng/searxng

# Creeper crawler
# Clone the Creeper project and install dependencies
git clone https://github.com/ihub-tech/Creeper.git /home/lyf/workspaces/creeper
cd /home/lyf/workspaces/creeper && pip install -r requirements.txt

# 3. Configure environment
cp .env.example .env
# Edit .env and set required values:
SEARXNG_BASE_URL=http://127.0.0.1:8086
CREEPER_PATH=/home/lyf/workspaces/creeper
SUMMARY_API_KEY=sk-your-deepseek-key-here

# Optional filtering
FILTER_MAX_RESULTS=8
DOMAIN_BLACKLIST=pinterest.com,facebook.com
DOMAIN_WHITELIST=github.com,stackoverflow.com

# Optional LLM filtering
FILTER_LLM_ENABLED=true
FILTER_LLM_API_KEY=sk-your-filter-api-key
FILTER_LLM_BASE_URL=https://api.deepseek.com
FILTER_LLM_MODEL=glm-4.5-air

# 4. Start the MCP in development or production mode
# Development
npm run dev

# Production
npm run build && npm start
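A startup check along the following lines can catch a missing setting before the first search fails. It is a sketch, not the server's actual loader: `ServerConfig` and `loadConfig` are hypothetical names, and the defaults used here (8 results, LLM filtering off) are assumptions taken from the example values above rather than documented defaults.

```typescript
type Env = Record<string, string | undefined>;

interface ServerConfig {
  searxngBaseUrl: string;
  creeperPath: string;
  summaryApiKey: string;
  filterMaxResults: number;
  filterLlmEnabled: boolean;
}

// Validate the required settings and apply assumed defaults to the optional ones.
function loadConfig(env: Env): ServerConfig {
  const required = ["SEARXNG_BASE_URL", "CREEPER_PATH", "SUMMARY_API_KEY"];
  const missing = required.filter((name) => !env[name]);
  if (missing.length > 0) {
    throw new Error("Missing required settings: " + missing.join(", "));
  }

  // Assumed default of 8, matching the example .env above.
  const maxResults = Number.parseInt(env["FILTER_MAX_RESULTS"] ?? "8", 10);
  if (!Number.isInteger(maxResults) || maxResults <= 0) {
    throw new Error("FILTER_MAX_RESULTS must be a positive integer");
  }

  return {
    // Strip trailing slashes so paths can be appended safely.
    searxngBaseUrl: (env["SEARXNG_BASE_URL"] as string).replace(/\/+$/, ""),
    creeperPath: env["CREEPER_PATH"] as string,
    summaryApiKey: env["SUMMARY_API_KEY"] as string,
    filterMaxResults: maxResults,
    // Assumed default: LLM filtering stays off unless set to "true".
    filterLlmEnabled: env["FILTER_LLM_ENABLED"] === "true",
  };
}
```

In a Node.js process you would typically call `loadConfig(process.env)` once at startup and fail fast if it throws.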

Runtime configuration and MCP endpoints

You can run the MCP as a local stdio server, or point an external client at a remote HTTP endpoint if you host the MCP elsewhere. The following configuration launches it locally over stdio:

{
  "mcpServers": {
    "web-analysis": {
      "command": "node",
      "args": ["/path/to/web-analysis-mcp/dist/index.js"],
      "env": {
        "SEARXNG_BASE_URL": "http://127.0.0.1:8086",
        "CREEPER_PATH": "/home/lyf/workspaces/creeper",
        "SUMMARY_API_KEY": "sk-your-key"
      }
    }
  }
}

Environment and tooling

Set SEARXNG_BASE_URL, CREEPER_PATH, and SUMMARY_API_KEY to enable search, crawling, and summarization. LLM-based filtering is optional: toggle it with FILTER_LLM_ENABLED, and supply FILTER_LLM_API_KEY, FILTER_LLM_BASE_URL, and FILTER_LLM_MODEL when it is on.

Troubleshooting

Common issues include failed connections to SearXNG and errors from Creeper. Start by checking that both services are reachable and that the environment variables are set correctly; debug logging can help narrow down where a request fails.
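A quick way to test the SearXNG side is to request its search endpoint directly. The sketch below assumes the server talks to SearXNG through the JSON output of `/search` (which SearXNG only serves when the json format is enabled in its settings); whether this MCP uses that exact endpoint is an assumption. `buildProbeUrl`, `describeStatus`, and `checkSearxng` are hypothetical helper names, and the fetch function is passed in so the check can be tried without a network.

```typescript
// Subset of a fetch Response that this check relies on.
interface ProbeResponse {
  status: number;
}
type FetchLike = (url: string) => Promise<ProbeResponse>;

// Build a JSON search URL from SEARXNG_BASE_URL and a test query.
function buildProbeUrl(baseUrl: string, query: string): string {
  const base = baseUrl.replace(/\/+$/, "");
  return base + "/search?q=" + encodeURIComponent(query) + "&format=json";
}

// Turn an HTTP status into a short diagnostic message.
function describeStatus(status: number): string {
  if (status >= 200 && status < 300) return "ok";
  if (status === 403) {
    return "reachable, but JSON output may be disabled (HTTP 403)";
  }
  return "reachable, but returned HTTP " + status;
}

// Probe SearXNG once and report the outcome as text instead of throwing.
async function checkSearxng(baseUrl: string, fetchFn: FetchLike): Promise<string> {
  try {
    const res = await fetchFn(buildProbeUrl(baseUrl, "test"));
    return describeStatus(res.status);
  } catch (err) {
    return "unreachable: " + String(err);
  }
}
```

On Node.js 18 or later you can pass the built-in `fetch`, for example `checkSearxng("http://127.0.0.1:8086", fetch)`, and print the returned message.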

Notes on configuration and security

  • Use domain whitelists and blacklists to control which sites are considered during searches.
  • When enabling LLM-based filtering, ensure your API keys are kept secure and not committed to version control.
  • Use the appropriate environment file (.env) to manage sensitive values.
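If you log the effective configuration while debugging, mask the secrets first. The following sketch treats any variable whose name ends in _KEY, _TOKEN, or _SECRET as sensitive; that naming rule is a heuristic assumed for illustration, and `redactEnv` is a hypothetical helper, not part of the server.

```typescript
// Heuristic: variable names with these suffixes hold secrets.
const SENSITIVE_NAME = /(_KEY|_TOKEN|_SECRET)$/i;

// Return a copy of the environment that is safe to print.
function redactEnv(env: Record<string, string | undefined>): Record<string, string> {
  const safe: Record<string, string> = {};
  for (const name of Object.keys(env)) {
    const value = env[name];
    if (value === undefined) continue; // unset variables are omitted
    safe[name] = SENSITIVE_NAME.test(name) ? "***" : value;
  }
  return safe;
}
```

With the variables used in this guide, SUMMARY_API_KEY and FILTER_LLM_API_KEY are masked, while SEARXNG_BASE_URL and CREEPER_PATH are printed as-is.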

Tools and endpoints

The server exposes a smart search tool that combines search, crawl, and summarize capabilities under the web_analysis MCP. Your MCP client can invoke the web_search tool to perform queries, fetch results, and obtain summarized outputs.

Available tools

web_search

Smart search tool that combines searching, crawling, and summarization into a single workflow
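Under the hood, an MCP client invokes a tool by sending a JSON-RPC 2.0 request with the method `tools/call`. The sketch below builds such a request for web_search. The envelope follows the MCP specification, but the argument name `query` is an assumption: this page does not list the tool's input schema, so check what your client reports from `tools/list` before relying on it. `buildWebSearchCall` is a hypothetical helper.

```typescript
interface ToolCallRequest {
  jsonrpc: "2.0";
  id: number;
  method: "tools/call";
  params: {
    name: string;
    arguments: Record<string, unknown>;
  };
}

// Build the JSON-RPC message an MCP client would send to call web_search.
function buildWebSearchCall(id: number, query: string): ToolCallRequest {
  return {
    jsonrpc: "2.0",
    id,
    method: "tools/call",
    params: {
      name: "web_search",
      // "query" is an assumed argument name, not confirmed by this page.
      arguments: { query },
    },
  };
}
```

Most MCP clients and SDKs construct this message for you; seeing the raw shape is mainly useful when you inspect traffic or debug a failing call.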
