Ragscore
Language: python
GitHub stars: 28
First indexed: 2 months ago
Catalog refreshed: 3 weeks ago
Documentation & install
Readme and setup notes from the catalogue, plus a client-ready config you can copy for your MCP host.
You can generate QA datasets and evaluate your RAG system locally with simple, private workflows. This MCP server produces QA pairs from your documents and tests how well your retrieval-augmented generation (RAG) setup answers them, without exposing data to external services when you run it against a local LLM. It emphasizes privacy, speed, and compatibility with both local and cloud-backed LLMs.
How to use
You interact with the RAGScore MCP server through an MCP client or the CLI, using two core actions: generating QA pairs from your document collection, and evaluating your RAG system against generated or provided gold question-answer pairs.
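As a rough illustration of what an MCP client sends for these two actions, the sketch below builds `tools/call` requests in the JSON-RPC 2.0 shape that MCP uses. The tool names come from the tool list on this page; the argument names (`docs_path`, `endpoint`) are hypothetical placeholders, so check the tool schemas your client reports before relying on them.

```python
import json
from itertools import count

_request_ids = count(1)  # each JSON-RPC request carries its own id


def tool_call(name: str, arguments: dict) -> dict:
    """Build an MCP `tools/call` request as a JSON-RPC 2.0 message."""
    return {
        "jsonrpc": "2.0",
        "id": next(_request_ids),
        "method": "tools/call",
        "params": {"name": name, "arguments": arguments},
    }


# Argument names are assumptions made for illustration only.
generate = tool_call("generate_qas", {"docs_path": "docs/"})
evaluate = tool_call("evaluate_rag", {"endpoint": "http://localhost:8000/query"})

print(json.dumps(generate, indent=2))
print(json.dumps(evaluate, indent=2))
```

In practice an MCP host builds and sends these messages for you; the sketch only shows the shape of the two calls.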
How to install
Prerequisites: you need Python and pip available on your system. You will also need an MCP client to communicate with the server, or you can use the provided CLI if supported by your setup.
Install the core package and optional features with Python’s package manager.
Configuration and usage notes
Evaluation runs against an HTTP endpoint exposed by your RAG system; the examples here assume it is available at http://localhost:8000/query. You can generate QA pairs from the documents in your docs directory and then run the evaluation against that endpoint.
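To check by hand that the query endpoint mentioned above answers questions, a small standard-library client like the following may help. The request shape (a JSON body with a `question` field) and the JSON response are assumptions made for this sketch, not a documented contract; adjust them to whatever your RAG service expects.

```python
import json
import urllib.request

RAG_ENDPOINT = "http://localhost:8000/query"  # URL taken from the text above


def build_query_request(question: str, url: str = RAG_ENDPOINT) -> urllib.request.Request:
    """Build a POST request carrying one question as JSON."""
    # Assumed payload shape: {"question": "..."}
    body = json.dumps({"question": question}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def ask(question: str, url: str = RAG_ENDPOINT, timeout: float = 10.0) -> dict:
    """Send one question and return the decoded JSON response."""
    request = build_query_request(question, url)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

Calling `ask("What is covered in the onboarding guide?")` requires the endpoint to be running; `build_query_request` alone can be inspected without any network access.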
If you plan to use the OpenAI provider, set your API key in the OPENAI_API_KEY environment variable, for example: export OPENAI_API_KEY="sk-...".
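Before starting a run that depends on the OpenAI provider, you can confirm from Python that the variable is visible to the process. This is a minimal sketch; the helper name `openai_key_configured` is made up for illustration and is not part of the package.

```python
import os


def openai_key_configured(env=os.environ) -> bool:
    """Return True if OPENAI_API_KEY is set to a non-empty value."""
    return bool(env.get("OPENAI_API_KEY", "").strip())


if not openai_key_configured():
    print("OPENAI_API_KEY is not set; requests to the OpenAI provider will likely fail.")
```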
Examples and quick start
Generate QA pairs from documents and evaluate them against your RAG endpoint.
Troubleshooting
If you encounter connection issues, verify that the MCP server is running and reachable at the configured URL. Ensure your documents are accessible and that your RAG endpoint at the query URL is responding.
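A quick way to separate "nothing is listening" from other failures is a plain TCP connection test against the configured URL. The sketch below uses only the standard library; it checks that the port accepts connections, not that the service behind it returns valid answers.

```python
import socket
from urllib.parse import urlsplit


def host_and_port(url: str) -> tuple[str, int]:
    """Extract host and port from a URL, falling back to the scheme default."""
    parts = urlsplit(url)
    default_port = 443 if parts.scheme == "https" else 80
    return parts.hostname or "localhost", parts.port or default_port


def is_reachable(url: str, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to the URL's host and port succeeds."""
    host, port = host_and_port(url)
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


print(is_reachable("http://localhost:8000/query"))
```

If this prints `False`, start or restart the service first; if it prints `True` and evaluation still fails, look at the request format and the service logs instead.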
Available tools
generate_qas
Generate question-answer (QA) pairs from your documents to build the dataset used for evaluation.
evaluate_rag
Evaluate your RAG system by comparing model answers to gold QA pairs and produce accuracy and failure details.
quick_test
A notebook-friendly Python API to audit RAG performance, visualize results, and inspect failures.
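To make the idea behind evaluate_rag concrete, here is a toy scorer that compares answers to gold QA pairs and reports accuracy plus the failing cases. It is not the server's scoring method, which this page does not describe; it substitutes a simple normalized substring match, and the gold pairs and canned answers are invented for the example.

```python
import re


def normalize(text: str) -> str:
    """Lowercase, replace punctuation with spaces, and collapse whitespace."""
    return " ".join(re.sub(r"[^\w\s]", " ", text.lower()).split())


def score_answers(gold_pairs, answer_fn):
    """Score answer_fn against gold QA pairs; return accuracy and failures."""
    failures = []
    for pair in gold_pairs:
        predicted = answer_fn(pair["question"])
        # A prediction passes if it contains the normalized gold answer.
        if normalize(pair["answer"]) not in normalize(predicted):
            failures.append({**pair, "predicted": predicted})
    total = len(gold_pairs)
    accuracy = (total - len(failures)) / total if total else 0.0
    return {"accuracy": accuracy, "failures": failures}


# Illustrative data only: two gold pairs and a stand-in for a RAG system.
gold = [
    {"question": "What does RAG stand for?", "answer": "retrieval-augmented generation"},
    {"question": "Which variable holds the OpenAI key?", "answer": "OPENAI_API_KEY"},
]
canned = {
    gold[0]["question"]: "RAG stands for Retrieval-Augmented Generation.",
    gold[1]["question"]: "I don't know.",
}

report = score_answers(gold, canned.get)
print(report["accuracy"])  # 0.5
for failure in report["failures"]:
    print(failure["question"], "->", failure["predicted"])
```

Substring matching is deliberately crude: it misses correct paraphrases and can pass answers that merely mention the gold text, which is why real evaluators tend to use semantic or model-based judging.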