MCP Image Recognition Server

Provides image recognition capabilities using multiple model providers via an MCP server with URL or Base64 image inputs.

Language: Python
First indexed: 3 months ago
Catalog refreshed: 3 weeks ago

Documentation & install

Readme and setup notes from the catalogue, plus a client-ready config you can copy for your MCP host.

Installation

Add the following to your MCP client configuration file.

Configuration

{
  "mcpServers": {
    "glasses666-mcp-image-recognition-py": {
      "command": "uv",
      "args": [
        "run",
        "--directory",
        "/absolute/path/to/mcp-image-recognition-py",
        "server.py"
      ],
      "env": {
        "DEFAULT_MODEL": "gemini-1.5-flash",
        "GEMINI_API_KEY": "YOUR_GEMINI_API_KEY",
        "OPENAI_API_KEY": "YOUR_OPENAI_API_KEY",
        "OPENAI_BASE_URL": "https://api.openai.com/v1"
      }
    }
  }
}
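
If you would rather generate this entry than edit it by hand, the following is a minimal sketch that builds the same structure in Python. The helper name build_server_entry and the way the keys are passed in are illustrative assumptions, not part of the project; only the field names and values mirror the configuration above.

```python
import json
from pathlib import Path


def build_server_entry(repo_dir, env):
    """Return an MCP client config dict that launches server.py through uv.

    build_server_entry is a hypothetical helper; the field layout mirrors
    the JSON configuration shown above.
    """
    # The configuration asks for an absolute path, so resolve it first.
    absolute_dir = str(Path(repo_dir).expanduser().resolve())
    return {
        "mcpServers": {
            "glasses666-mcp-image-recognition-py": {
                "command": "uv",
                "args": ["run", "--directory", absolute_dir, "server.py"],
                "env": dict(env),
            }
        }
    }


if __name__ == "__main__":
    config = build_server_entry(
        "~/mcp-image-recognition-py",
        {
            "DEFAULT_MODEL": "gemini-1.5-flash",
            "GEMINI_API_KEY": "YOUR_GEMINI_API_KEY",
        },
    )
    print(json.dumps(config, indent=2))
```

Paste the printed JSON into your MCP client configuration file, replacing the placeholder key with a real one.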

This is a Python-based MCP server that adds image recognition to your MCP setup. It can describe images or answer questions about them, and it can switch between multiple model providers so you can balance speed, cost, and accuracy. Images can be passed as URLs or as Base64 data, which suits a range of upload flows.

How to use

You connect this MCP server to an MCP client and use the dedicated tool to analyze images. The core function is recognize_image, which lets you pass an image and optionally a prompt or a specific model to obtain a descriptive response about the image. You can supply the image as an HTTP/HTTPS URL or as a Base64 string.
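
As a rough illustration of preparing the two input forms, the sketch below passes HTTP/HTTPS URLs through unchanged and Base64-encodes a local file. The argument names image, prompt, and model are assumptions made for this example, as are the helper names; check the tool schema reported by your MCP client for the real parameter names, and whether the server expects raw Base64 or a data URI.

```python
import base64
from pathlib import Path


def to_image_input(source):
    """Return a string usable as the image input: a URL or Base64 data.

    HTTP/HTTPS URLs are returned unchanged; anything else is treated as a
    local file path and Base64-encoded.
    """
    if source.startswith(("http://", "https://")):
        return source
    raw = Path(source).read_bytes()
    return base64.b64encode(raw).decode("ascii")


def build_arguments(source, prompt=None, model=None):
    """Assemble tool arguments. The key names are assumptions, not confirmed."""
    arguments = {"image": to_image_input(source)}
    # Only include the optional fields when the caller supplied them.
    if prompt is not None:
        arguments["prompt"] = prompt
    if model is not None:
        arguments["model"] = model
    return arguments


if __name__ == "__main__":
    print(build_arguments("https://example.com/cat.png", prompt="What is in this picture?"))
```

The resulting dictionary is what you would hand to your MCP client when it calls recognize_image.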

How to install

Prerequisites: you need Python 3.10 or higher and an API key for at least one model provider (Google Gemini, OpenAI, Aliyun DashScope, etc.).

Option A: Using uv (recommended)

Clone the repository
Create and edit the environment file with your keys
Run the server
Optional: run via ephemeral execution

Command blocks below show the exact steps to run the server using uv or uvx, as well as a standard Python setup. Copy one complete block at a time.

MCP client configuration (uv run, Linux / macOS)

{
  "mcpServers": {
    "image_recog": {
      "command": "uv",
      "args": [
        "run",
        "--directory",
        "/absolute/path/to/mcp-image-recognition-py",
        "server.py"
      ],
      "env": {
        "GEMINI_API_KEY": "YOUR_GEMINI_API_KEY",
        "OPENAI_API_KEY": "YOUR_OPENAI_API_KEY",
        "OPENAI_BASE_URL": "https://api.openai.com/v1",
        "DEFAULT_MODEL": "gemini-1.5-flash"
      }
    }
  }
}

Option B: Using uvx (ephemeral execution)

You can run the server without cloning the repository: uvx fetches it from git and runs it. You still need to supply the API keys as environment variables.

uvx --from git+https://github.com/glasses666/mcp-image-recognition-py mcp-image-recognition

Standard Python (pip)

If you prefer a traditional Python setup, follow these steps.

Linux / macOS

# Clone the repo
git clone https://github.com/glasses666/mcp-image-recognition-py.git
cd mcp-image-recognition-py

# Create and activate virtual environment
python3 -m venv venv
source venv/bin/activate

# Install dependencies
pip install -r requirements.txt

# Configure environment
cp .env.example .env
# Edit .env and add your API keys

# Run the server
python server.py

Windows (pip)

# Clone the repo
git clone https://github.com/glasses666/mcp-image-recognition-py.git
cd mcp-image-recognition-py

# Create and activate virtual environment
python -m venv venv
.\venv\Scripts\activate

# Install dependencies
pip install -r requirements.txt

# Configure environment
copy .env.example .env
# Edit .env and add your API keys

# Run the server
python server.py

Configuration

Create a .env file in the project root based on the example and fill in your API keys for the providers you plan to use.

Usage reference for image recognition

The server exposes a function named recognize_image that accepts an image input and optional parameters to customize the response.

Security and keys

Protect your API keys. Do not commit .env files to version control. Use environment-specific configurations for production deployments.

Troubleshooting tips

If the server fails to start, check that the environment file contains valid keys for at least one provider and that Python 3.10+ is being used. Ensure network access to the provider endpoints is allowed.
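
The first two checks above can be automated before launching the server. This is a minimal sketch using only the standard library: it parses simple KEY=VALUE lines and reports missing prerequisites. The key names come from the sample configuration; other providers may use names not listed here, and the function names are hypothetical. It only checks that a key is present and is not a placeholder, not that the provider accepts it.

```python
import sys

# Key names taken from the sample configuration; extend for other providers.
PROVIDER_KEYS = ("GEMINI_API_KEY", "OPENAI_API_KEY")


def parse_env_text(text):
    """Parse simple KEY=VALUE lines, skipping blank lines and # comments."""
    values = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop surrounding quotes, if any, around the value.
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def preflight(env, version=sys.version_info):
    """Return a list of problems found; an empty list means none."""
    problems = []
    if tuple(version[:2]) < (3, 10):
        problems.append("Python 3.10 or higher is required")
    # A key counts only if it is set and is not a YOUR_... placeholder.
    usable = [
        key for key in PROVIDER_KEYS
        if env.get(key) and not env[key].startswith("YOUR_")
    ]
    if not usable:
        problems.append("no provider API key is set")
    return problems


if __name__ == "__main__":
    sample = "GEMINI_API_KEY=YOUR_GEMINI_API_KEY\n"
    for problem in preflight(parse_env_text(sample)):
        print("problem:", problem)
```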

Notes and examples

The server supports multiple providers to balance speed and cost. You can replace the default model by setting DEFAULT_MODEL in your environment file.
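
One plausible order of precedence for choosing a model is sketched below: a per-request override first, then DEFAULT_MODEL from the environment, then a hard-coded fallback. This is an assumption about how such a lookup could work, not the server's confirmed logic; resolve_model and FALLBACK_MODEL are hypothetical names, and the fallback value is simply the one used in the sample configuration.

```python
import os

# Fallback taken from the sample configuration; the server's own default may differ.
FALLBACK_MODEL = "gemini-1.5-flash"


def resolve_model(requested=None, env=None):
    """Pick a model: per-request override, then DEFAULT_MODEL, then fallback."""
    env = os.environ if env is None else env
    if requested:
        return requested
    # An unset or empty DEFAULT_MODEL falls through to the fallback.
    return env.get("DEFAULT_MODEL") or FALLBACK_MODEL


if __name__ == "__main__":
    print(resolve_model())
```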

Additional details

Tools available through the MCP endpoint include recognize_image, which processes an image from a URL or Base64 data and returns a descriptive response.

Available tools

recognize_image

Analyzes an image provided as a URL or Base64 data and returns a descriptive text or answer about the image. Optional prompt and model override allow custom instructions per request.
