PDF Splitter

Provides random access to PDF contents: load a PDF once, then extract pages, search text, and render pages as images.

Language: TypeScript
GitHub Stars: 3
First Indexed: 5 months ago
Catalog Refreshed: 2 months ago

Documentation & install

Readme and setup notes from the catalogue, plus a client-ready config you can copy for your MCP host.

Installation

Add the following to your MCP client configuration file.

Configuration

{
  "mcpServers": {
    "espresso3389-pdf-splitter-mcp": {
      "command": "bun",
      "args": [
        "run",
        "/full/path/to/pdf-splitter-mcp/src/index.ts"
      ]
    }
  }
}

PDF Splitter MCP Server provides random access to PDF contents, enabling selective extraction of pages, text, and images while minimizing processing and token costs. You integrate it with an MCP client to load PDFs, run page-level operations, and render pages as images for analysis, previews, or OCR workflows.

How to use

You use the PDF Splitter MCP Server through an MCP client that launches and connects to it. Load a PDF into memory, then perform targeted actions such as extracting a single page, extracting a range, searching for text, listing images, or rendering pages as images. You can load multiple PDFs and switch between them using their IDs.

Typical usage flow:

Load a PDF by path or URL to obtain a PDF ID and page count.
Use extract_page to pull content from a specific page by pdfId and pageNumber.
Use extract_range to extract content from a page range by pdfId, startPage, and endPage.
Use search_pdf to find keywords or patterns (supports plain text and regex).
Use get_pdf_info to access metadata for a loaded PDF.
Use render_page to create an image of a page at a chosen DPI and format (PNG/JPEG).
Use extract_images to pull embedded images, optionally saving them to files.
Use render_pages to generate thumbnails or multi-page image sets.
List loaded PDFs with list_loaded_pdfs to monitor your active sessions.
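
The flow above can be sketched as a sequence of MCP tools/call requests. MCP clients send tool invocations as JSON-RPC 2.0 messages; the helper below only builds those payloads and does not contact a server. The tool names and the pdfId, pageNumber, startPage, and endPage arguments come from the text above. The path, query, dpi, and format argument names, the sample file path, and the ID value "pdf-1" are assumptions for illustration and should be checked against the server's published tool schemas.

```typescript
// Shape of an MCP tool invocation (JSON-RPC 2.0, method "tools/call").
interface ToolCallRequest {
  jsonrpc: "2.0";
  id: number;
  method: "tools/call";
  params: { name: string; arguments: Record<string, unknown> };
}

let nextId = 1;

// Build a request payload; actually sending it is left to the MCP client in use.
function buildToolCall(name: string, args: Record<string, unknown>): ToolCallRequest {
  return {
    jsonrpc: "2.0",
    id: nextId++,
    method: "tools/call",
    params: { name, arguments: args },
  };
}

// Hypothetical ID: the real value is returned by load_pdf.
const pdfId = "pdf-1";

const flow: ToolCallRequest[] = [
  // "path" is an assumed argument name.
  buildToolCall("load_pdf", { path: "/path/to/report.pdf" }),
  buildToolCall("extract_page", { pdfId, pageNumber: 1 }),
  buildToolCall("extract_range", { pdfId, startPage: 2, endPage: 4 }),
  // "query" is an assumed argument name.
  buildToolCall("search_pdf", { pdfId, query: "invoice" }),
  // "dpi" and "format" are assumed argument names.
  buildToolCall("render_page", { pdfId, pageNumber: 1, dpi: 150, format: "png" }),
];

for (const request of flow) {
  console.log(JSON.stringify(request));
}
```

In practice an MCP host or SDK builds and sends these messages for you; the sketch only shows what each step of the flow amounts to on the wire.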

To configure an MCP client for this server, provide a stdio connection that launches the MCP server locally. The following configuration is an example for Gemini CLI, using Bun to run the TypeScript entry point directly.

{
  "mcpServers": {
    "pdf_splitter": {
      "command": "bun",
      "args": ["run", "/full/path/to/pdf-splitter-mcp/src/index.ts"]
    }
  }
}
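
For scripted setups, the same entry can be generated rather than edited by hand. This is a minimal sketch, assuming the client's config file follows the mcpServers shape shown above; the buildServerEntry function and its parameter names are hypothetical, and the path is the same placeholder used in the example.

```typescript
// One stdio-launched server entry, as in the JSON example above.
interface StdioServer {
  command: string;
  args: string[];
}

interface McpConfig {
  mcpServers: Record<string, StdioServer>;
}

// Hypothetical helper: builds the config for a local checkout of the server.
function buildServerEntry(serverName: string, repoPath: string): McpConfig {
  // Drop trailing slashes so the joined path has exactly one separator.
  const root = repoPath.replace(/\/+$/, "");
  return {
    mcpServers: {
      [serverName]: { command: "bun", args: ["run", `${root}/src/index.ts`] },
    },
  };
}

const config = buildServerEntry("pdf_splitter", "/full/path/to/pdf-splitter-mcp/");
console.log(JSON.stringify(config, null, 2));
```

The server name is only a label chosen by the client config, which is why the two examples on this page can use different keys for the same entry point.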

Available tools

load_pdf

Load a PDF from a file path or URL into memory and return a PDF ID with its page count.

extract_page

Extract content from a specific page using the loaded PDF ID and page number.

extract_range

Extract content from a range of pages given a PDF ID, startPage, and endPage.

search_pdf

Search for text within the PDF with optional case sensitivity and regex support.
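
To make these options concrete, here is a minimal local sketch of how a search with case-sensitivity and regex switches might behave over per-page text. It is an illustration only, not the server's implementation: the searchPages function, its option names, the 1-based page numbering, and the sample pages are all assumptions.

```typescript
interface SearchOptions {
  caseSensitive?: boolean;
  regex?: boolean;
}

interface SearchHit {
  pageNumber: number; // 1-based here by assumption
  match: string;
  index: number;
}

// Escape regex metacharacters so a plain-text query matches literally.
function escapeRegExp(text: string): string {
  return text.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

function searchPages(pages: string[], query: string, options: SearchOptions = {}): SearchHit[] {
  const source = options.regex ? query : escapeRegExp(query);
  const flags = options.caseSensitive ? "g" : "gi";
  const hits: SearchHit[] = [];
  pages.forEach((text, i) => {
    // A fresh RegExp per page avoids carrying lastIndex state across pages.
    const pattern = new RegExp(source, flags);
    let m: RegExpExecArray | null;
    while ((m = pattern.exec(text)) !== null) {
      hits.push({ pageNumber: i + 1, match: m[0], index: m.index });
      // Step past empty matches so the loop always terminates.
      if (m[0].length === 0) pattern.lastIndex++;
    }
  });
  return hits;
}

const samplePages = [
  "Invoice 2024-001 total: $40.00",
  "No charges. See invoice 2024-001.",
];

console.log(searchPages(samplePages, "invoice")); // case-insensitive plain text
console.log(searchPages(samplePages, "invoice", { caseSensitive: true }));
console.log(searchPages(samplePages, "\\d{4}-\\d{3}", { regex: true }));
```

Escaping the query in plain-text mode matters for inputs such as "$40.00", which would otherwise be interpreted as a pattern.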

get_pdf_info

Retrieve metadata and information about a loaded PDF.

list_loaded_pdfs

List all PDFs currently loaded in memory along with their IDs and page counts.

extract_outline

Extract the document outline or table of contents with page references.

list_images

List all images in the PDF with metadata such as page, index, dimensions, and format.

extract_images

Extract images as base64 data and optionally save them to files with an output path pattern.

extract_image

Extract a specific image by page and image index, returning base64 data or saving to a file.

render_page

Render a single PDF page as an image at a chosen DPI and format.

render_pages

Render multiple pages as images in batch, with options for DPI, format, and output saving.
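
When choosing a DPI for render_page or render_pages, it helps to estimate the output size first. PDF page dimensions are expressed in points, with 72 points per inch, so pixel dimensions scale as points times DPI divided by 72. The sketch below applies that arithmetic; the pixelSize helper is hypothetical, and the server's actual rounding may differ.

```typescript
interface Size {
  width: number;
  height: number;
}

// Convert a page size in PDF points (1/72 inch) to pixels at a given DPI.
function pixelSize(pagePoints: Size, dpi: number): Size {
  if (dpi <= 0) throw new RangeError("dpi must be positive");
  const scale = dpi / 72;
  return {
    width: Math.round(pagePoints.width * scale),
    height: Math.round(pagePoints.height * scale),
  };
}

// US Letter is 8.5 x 11 inches, i.e. 612 x 792 points.
const letter: Size = { width: 612, height: 792 };

console.log(pixelSize(letter, 72));  // { width: 612, height: 792 }
console.log(pixelSize(letter, 150)); // { width: 1275, height: 1650 }
console.log(pixelSize(letter, 300)); // { width: 2550, height: 3300 }
```

Pixel count grows with the square of the DPI, so a low value is usually enough for thumbnails, while OCR workflows tend to need a higher one.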