MCP server by rx105204902-ctrl
RAGFlow MCP Server
RAGFlow MCP Server exposes selected RAGFlow capabilities through the Model Context Protocol (MCP). It lets MCP clients list datasets, ingest documents, and retrieve knowledge from a configured RAGFlow backend through Streamable HTTP or SSE transports.
Key Features
- Exposes RAGFlow dataset listing, document ingestion, and retrieval as MCP tools.
- Supports both Streamable HTTP (/mcp) and legacy SSE (/sse) transports.
- Supports self-host mode with one server-side API key and host mode with per-request credentials.
- Restricts document ingestion to trusted DeerFlow file capability URLs.
- Provides local, Docker, and Docker Compose startup paths.
- Keeps API keys out of the image and source-controlled files.
Table of Contents
- Tech Stack
- Project Structure
- Prerequisites
- Quick Start
- Configuration
- MCP Endpoints
- MCP Tools
- Client Examples
- Docker
- Testing
- Architecture
- Security
- Troubleshooting
Tech Stack
| Area | Technology |
| --- | --- |
| Language | Python 3.12+ |
| MCP framework | FastMCP 3.x |
| MCP SDK | mcp>=1.24.0 |
| HTTP client | HTTPX |
| ASGI server | Uvicorn |
| ASGI middleware | Starlette |
| CLI | Click |
| Validation | Pydantic |
| Environment loading | python-dotenv |
| Container runtime | Docker / Docker Compose |
Project Structure
.
|-- server/
| `-- server.py # FastMCP server, RAGFlow connector, ASGI app, CLI entrypoint
|-- client/
| |-- client.py # SSE client example
| `-- streamable_http_client.py # Streamable HTTP client example
|-- tests/
| `-- test_document_ingest_file_uri.py
|-- Dockerfile
|-- docker-compose.yml
|-- docker-entrypoint.sh
|-- requirements.txt
|-- DOCKER.md
|-- README.md
`-- README.zh.md
Prerequisites
For local development:
- Python 3.12 or newer.
- uv for the one-shot startup command, or pip with a virtual environment.
- A reachable RAGFlow backend.
- A RAGFlow API key when using self-host mode.
For Docker:
- Docker Engine or Docker Desktop.
- Docker Compose v2 when using docker compose.
Quick Start
Run with uv
Step 1: install uv if it is not already installed.
python -m pip install uv
Step 2: create the local environment and install dependencies.
uv venv .venv
uv pip install -r requirements.txt
Step 3: start the MCP server.
uv run python server/server.py \
--host=0.0.0.0 \
--port=9388 \
--base-url=http://127.0.0.1:9380 \
--mode=self-host \
--api-key=ragflow-your-api-key
After startup, the server exposes:
- Streamable HTTP: http://localhost:9388/mcp
- SSE: http://localhost:9388/sse
Run with a virtual environment
PowerShell:
python -m venv .venv
.\.venv\Scripts\Activate.ps1
python -m pip install -r requirements.txt
python server/server.py `
--host=0.0.0.0 `
--port=9388 `
--base-url=http://127.0.0.1:9380 `
--mode=self-host `
--api-key=ragflow-your-api-key
Bash:
python -m venv .venv
source .venv/bin/activate
python -m pip install -r requirements.txt
python server/server.py \
--host=0.0.0.0 \
--port=9388 \
--base-url=http://127.0.0.1:9380 \
--mode=self-host \
--api-key=ragflow-your-api-key
Configuration
The server accepts CLI options and environment variables. The application loads .env and then lets environment variables override CLI values.
| CLI option | Environment variable | Code default | Docker default | Description |
| --- | --- | --- | --- | --- |
| --host | RAGFLOW_MCP_HOST | 127.0.0.1 | 0.0.0.0 | Bind address for the MCP server. |
| --port | RAGFLOW_MCP_PORT | 9382 | 9388 | Bind port for the MCP server. |
| --base-url | RAGFLOW_MCP_BASE_URL | http://127.0.0.1:9380 | http://127.0.0.1:9380 | RAGFlow backend base URL. |
| --mode | RAGFLOW_MCP_LAUNCH_MODE | self-host | self-host | Launch mode: self-host or host. |
| --api-key | RAGFLOW_MCP_HOST_API_KEY | empty | required | RAGFlow API key for self-host mode. |
| --transport-sse-enabled / --no-transport-sse-enabled | RAGFLOW_MCP_TRANSPORT_SSE_ENABLED | true | true | Enable or disable SSE transport. |
| --transport-streamable-http-enabled / --no-transport-streamable-http-enabled | RAGFLOW_MCP_TRANSPORT_STREAMABLE_ENABLED | true | true | Enable or disable Streamable HTTP transport. |
| --json-response / --no-json-response | RAGFLOW_MCP_JSON_RESPONSE | true | true | Use JSON responses for Streamable HTTP. |
| none | RAGFLOW_MCP_FILE_URI_ALLOWED_BASE_URLS | empty | empty | Comma-separated trusted DeerFlow file capability base URLs. |
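The precedence described above (environment variables win over CLI values) can be illustrated with a minimal sketch. The helper name `resolve_setting` is hypothetical and not taken from server/server.py, and treating an empty environment value as "unset" is an assumption.

```python
import os
from typing import Mapping, Optional


def resolve_setting(
    cli_value: Optional[str],
    env_name: str,
    environ: Mapping[str, str] = os.environ,
) -> Optional[str]:
    """Return the environment value when it is set and non-empty, else the CLI value.

    Hypothetical helper: the real server may resolve settings differently.
    """
    env_value = environ.get(env_name)
    if env_value:
        return env_value
    return cli_value


# The CLI asks for port 9382, but the environment sets 9388, so 9388 is used.
port = resolve_setting("9382", "RAGFLOW_MCP_PORT", {"RAGFLOW_MCP_PORT": "9388"})
print(port)
```

Passing the environment as a mapping keeps the helper easy to test without touching the real process environment.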
Launch Modes
self-host mode:
- The server starts with one RAGFlow API key.
- Every MCP request uses that server-side key when calling RAGFlow.
- --api-key or RAGFLOW_MCP_HOST_API_KEY is required.
host mode:
- The server acts as a multi-tenant gateway.
- Each HTTP request must include Authorization: Bearer ..., api_key, or x-api-key.
- The request token is forwarded to RAGFlow.
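The two launch modes differ only in where the RAGFlow key comes from. The sketch below shows one way that choice could be made per request. `resolve_api_key` is a hypothetical helper, and the order in which the three headers are checked is an assumption; the error messages reuse the ones listed under Troubleshooting.

```python
from typing import Mapping, Optional


def resolve_api_key(
    mode: str,
    startup_key: Optional[str],
    headers: Mapping[str, str],
) -> str:
    """Pick the RAGFlow API key for one request (hypothetical helper)."""
    if mode == "self-host":
        # self-host: every request uses the single server-side key.
        if not startup_key:
            raise ValueError("--api-key is required when --mode is 'self-host'")
        return startup_key

    # host: read the key from the request headers, case-insensitively.
    lowered = {name.lower(): value for name, value in headers.items()}
    auth = lowered.get("authorization", "")
    if auth.lower().startswith("bearer "):
        token = auth[len("bearer "):].strip()
        if token:
            return token
    # Assumed fallback order: api_key first, then x-api-key.
    for name in ("api_key", "x-api-key"):
        token = lowered.get(name, "").strip()
        if token:
            return token
    raise PermissionError("Missing or invalid authorization header")


print(resolve_api_key("host", None, {"Authorization": "Bearer ragflow-your-api-key"}))
```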
Example .env
Do not commit .env files with real secrets.
RAGFLOW_MCP_HOST=0.0.0.0
RAGFLOW_MCP_PORT=9388
RAGFLOW_MCP_BASE_URL=http://127.0.0.1:9380
RAGFLOW_MCP_LAUNCH_MODE=self-host
RAGFLOW_MCP_HOST_API_KEY=ragflow-your-api-key
RAGFLOW_MCP_TRANSPORT_SSE_ENABLED=true
RAGFLOW_MCP_TRANSPORT_STREAMABLE_ENABLED=true
RAGFLOW_MCP_JSON_RESPONSE=true
MCP Endpoints
| Endpoint | Transport | Default | Purpose |
| --- | --- | --- | --- |
| /mcp | Streamable HTTP | enabled | Primary endpoint for modern MCP clients. |
| /sse | SSE | enabled | Legacy SSE transport endpoint. |
When both transports are enabled, the server combines both route sets into one Starlette application.
MCP Tools
list_datasets
Lists RAGFlow datasets accessible to the active RAGFlow API key.
| Argument | Type | Required | Default | Description |
| --- | --- | --- | --- | --- |
| page | integer | no | 1 | Dataset page number. |
| page_size | integer | no | 30 | Number of datasets to return. Maximum is 1000. |
| id | string | no | null | Optional dataset ID filter. |
| name | string | no | null | Optional dataset name filter. |
Example:
{
"page": 1,
"page_size": 30
}
document_ingest
Downloads files from trusted DeerFlow file capability URLs, uploads them to a RAGFlow dataset, and submits parsing tasks.
| Argument | Type | Required | Description |
| --- | --- | --- | --- |
| dataset_id | string | yes | Target RAGFlow dataset ID. |
| files | array | yes | File descriptors. Each item must include file_uri; filename is optional. |
Example:
{
"dataset_id": "dataset-1",
"files": [
{
"file_uri": "https://gateway.example/api/file-capabilities/token"
}
]
}
File URI validation rules:
- file_uri must use HTTP or HTTPS.
- Credentials, query strings, and fragments are rejected.
- Path traversal, local file paths, and arbitrary external URLs are rejected.
- If RAGFLOW_MCP_FILE_URI_ALLOWED_BASE_URLS is empty, the defaults are:
  - http://localhost:8001/api/file-capabilities
  - http://127.0.0.1:8001/api/file-capabilities
- Set RAGFLOW_MCP_FILE_URI_ALLOWED_BASE_URLS to allow a custom DeerFlow gateway.
Example:
RAGFLOW_MCP_FILE_URI_ALLOWED_BASE_URLS=https://gateway.example,https://files.example/api
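The validation rules above can be approximated with the standard library. This is a minimal sketch and not the server's actual implementation: `parse_allowed_bases` and `is_allowed_file_uri` are hypothetical names, and matching by "base URL plus a slash" prefix is an assumption about how the allow-list is applied.

```python
from urllib.parse import unquote, urlsplit

# Defaults listed in the rules above, used when the variable is empty.
DEFAULT_ALLOWED_BASES = (
    "http://localhost:8001/api/file-capabilities",
    "http://127.0.0.1:8001/api/file-capabilities",
)


def parse_allowed_bases(raw: str) -> tuple[str, ...]:
    """Split the comma-separated variable; fall back to the defaults if empty."""
    bases = tuple(item.strip().rstrip("/") for item in raw.split(",") if item.strip())
    return bases or DEFAULT_ALLOWED_BASES


def is_allowed_file_uri(
    file_uri: str,
    allowed_bases: tuple[str, ...] = DEFAULT_ALLOWED_BASES,
) -> bool:
    """Return True only if file_uri passes every rule (hypothetical helper)."""
    try:
        parts = urlsplit(file_uri)
    except ValueError:
        return False
    if parts.scheme not in ("http", "https"):
        return False  # rejects local paths such as file:// URIs
    if parts.username or parts.password or "@" in parts.netloc:
        return False  # embedded credentials
    if "?" in file_uri or "#" in file_uri:
        return False  # query strings and fragments
    if ".." in unquote(parts.path).split("/"):
        return False  # path traversal, including percent-encoded dots
    candidate = f"{parts.scheme}://{parts.netloc}{parts.path}"
    # Assumed matching rule: the URI must sit strictly below an allowed base.
    return any(candidate.startswith(base + "/") for base in allowed_bases)


bases = parse_allowed_bases("https://gateway.example,https://files.example/api")
print(is_allowed_file_uri("https://gateway.example/api/file-capabilities/token", bases))
```

Requiring a trailing slash after the base keeps a host such as gateway.example.evil.test from matching the base https://gateway.example.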
retrieval
Retrieves relevant chunks from RAGFlow for a question.
| Argument | Type | Required | Default | Description |
| --- | --- | --- | --- | --- |
| question | string | yes | none | Query text. |
| dataset_ids | array | no | [] | Dataset filter. If omitted, the server resolves accessible datasets. |
| document_ids | array | no | [] | Document filter. |
| page | integer | no | 1 | Result page number. |
| page_size | integer | no | 10 | Results per page. Maximum is 100. |
| similarity_threshold | float | no | 0.2 | Minimum similarity threshold. |
| vector_similarity_weight | float | no | 0.3 | Vector similarity weight. |
| keyword | boolean | no | false | Enable keyword search. |
| top_k | integer | no | 1024 | Maximum candidates before ranking. |
| rerank_id | string | no | null | Optional reranking model ID. |
| force_refresh | boolean | no | false | Force metadata cache refresh. |
Example:
{
"dataset_ids": ["dataset-1"],
"document_ids": [],
"question": "How to install neovim?",
"page": 1,
"page_size": 10
}
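A client can fill in the documented defaults before calling the tool. The sketch below copies the defaults and the page_size limit from the table above; `build_retrieval_args` is a hypothetical client-side helper, and rejecting an out-of-range page_size (rather than clamping it) is an assumption about how the limit is enforced.

```python
import copy

# Defaults copied from the argument table above.
RETRIEVAL_DEFAULTS = {
    "dataset_ids": [],
    "document_ids": [],
    "page": 1,
    "page_size": 10,
    "similarity_threshold": 0.2,
    "vector_similarity_weight": 0.3,
    "keyword": False,
    "top_k": 1024,
    "rerank_id": None,
    "force_refresh": False,
}
MAX_PAGE_SIZE = 100


def build_retrieval_args(question: str, **overrides) -> dict:
    """Merge caller overrides onto the documented defaults (hypothetical helper)."""
    if not question.strip():
        raise ValueError("question is required")
    unknown = set(overrides) - set(RETRIEVAL_DEFAULTS)
    if unknown:
        raise ValueError(f"unknown arguments: {sorted(unknown)}")
    # Deep-copy so the shared default lists are never mutated by callers.
    args = {"question": question, **copy.deepcopy(RETRIEVAL_DEFAULTS), **overrides}
    if not 1 <= args["page_size"] <= MAX_PAGE_SIZE:
        raise ValueError(f"page_size must be between 1 and {MAX_PAGE_SIZE}")
    return args


args = build_retrieval_args("How to install neovim?", dataset_ids=["dataset-1"])
print(args["dataset_ids"], args["page_size"])
```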
Client Examples
The client/ directory contains two minimal examples.
Run the Streamable HTTP client:
python client/streamable_http_client.py
Run the SSE client:
python client/client.py
For host mode, include one of these headers in the client:
headers = {"Authorization": "Bearer ragflow-your-api-key"}
headers = {"api_key": "ragflow-your-api-key"}
The example clients currently use port 9382. If you start the server with the Docker defaults in this repository, use port 9388.
Docker
More Docker details are available in DOCKER.md.
Build the image:
docker build -t ragflow-mcp-server:latest .
Run with the default container port:
docker run --rm \
--name ragflow-mcp-server \
-p 9388:9388 \
-e RAGFLOW_MCP_HOST_API_KEY=ragflow-your-api-key \
ragflow-mcp-server:latest
If the RAGFlow backend runs on the host machine, Docker Desktop users should use host.docker.internal:
docker run --rm \
--name ragflow-mcp-server \
-p 9388:9388 \
-e RAGFLOW_MCP_BASE_URL=http://host.docker.internal:9380 \
-e RAGFLOW_MCP_HOST_API_KEY=ragflow-your-api-key \
ragflow-mcp-server:latest
Run with Docker Compose:
export RAGFLOW_MCP_HOST_API_KEY=ragflow-your-api-key
export RAGFLOW_MCP_BASE_URL=http://host.docker.internal:9380
docker compose up --build
PowerShell:
$env:RAGFLOW_MCP_HOST_API_KEY = "ragflow-your-api-key"
$env:RAGFLOW_MCP_BASE_URL = "http://host.docker.internal:9380"
docker compose up --build
Stop Compose services:
docker compose down
Testing
Install runtime and test dependencies:
python -m pip install -r requirements.txt pytest pytest-asyncio
Run all tests:
pytest
Run the current focused test file:
pytest tests/test_document_ingest_file_uri.py
The tests use lightweight stubs for FastMCP internals, so the file URI validation and upload behavior can be tested without a running MCP server or RAGFlow backend.
Architecture
Startup Flow
- Click parses CLI options in server/server.py.
- python-dotenv loads .env.
- Environment variables override CLI values.
- FastMCP registers tools and lifespan context.
- create_starlette_app() creates an ASGI app for /mcp, /sse, or both.
- Uvicorn serves the ASGI app.
RAGFlow Connector
RAGFlowConnector owns communication with RAGFlow:
- It maintains one async HTTPX client.
- It forwards bearer credentials to RAGFlow.
- It maps backend failures into MCP tool errors.
- It caches dataset and document metadata for retrieval mapping.
- It uploads document bytes with multipart form data.
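The metadata caching mentioned above, combined with the force_refresh argument of the retrieval tool, could look roughly like the following. This is an illustrative sketch only: `MetadataCache` and its loader callback are hypothetical, and the real connector may add expiry or size limits that are not shown here.

```python
import asyncio
from typing import Any, Awaitable, Callable, Dict


class MetadataCache:
    """Cache metadata per dataset ID, reloading only on demand (hypothetical)."""

    def __init__(self, loader: Callable[[str], Awaitable[Any]]) -> None:
        self._loader = loader
        self._entries: Dict[str, Any] = {}

    async def get(self, dataset_id: str, force_refresh: bool = False) -> Any:
        # Call the loader on a cache miss or when a refresh is forced.
        if force_refresh or dataset_id not in self._entries:
            self._entries[dataset_id] = await self._loader(dataset_id)
        return self._entries[dataset_id]


async def demo() -> int:
    calls = 0

    async def fake_loader(dataset_id: str) -> dict:
        # Stand-in for a RAGFlow metadata request.
        nonlocal calls
        calls += 1
        return {"id": dataset_id, "documents": []}

    cache = MetadataCache(fake_loader)
    await cache.get("dataset-1")                      # miss: loads
    await cache.get("dataset-1")                      # hit: no load
    await cache.get("dataset-1", force_refresh=True)  # forced: loads again
    return calls


print(asyncio.run(demo()))
```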
Authentication Flow
self-host mode:
MCP client -> MCP server -> RAGFlow
uses startup API key
host mode:
MCP client -> request auth header -> MCP server -> RAGFlow
Document Ingestion Flow
document_ingest
-> validate dataset_id and files
-> validate file_uri against allowed DeerFlow capability bases
-> fetch file bytes from the DeerFlow gateway
-> infer or validate the filename
-> upload bytes to RAGFlow
-> submit the parse task
-> return accepted and submitted counts
Retrieval Flow
retrieval
-> validate query and filters
-> resolve accessible datasets when dataset_ids is empty
-> load document metadata when needed
-> call the RAGFlow retrieval API
-> return MCP text content
Security
- Never commit real RAGFlow API keys.
- Use .env, Docker Compose environment variables, Docker secrets, or deployment-platform secrets.
- document_ingest intentionally rejects arbitrary URLs and local paths.
- The Docker image runs as a non-root user.
- The Docker healthcheck only verifies that the TCP port accepts connections.
- If this service is exposed outside localhost, prefer host mode with per-request credentials and TLS in front of the service.
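A TCP-only healthcheck, as noted above, confirms that something is listening but says nothing about whether the MCP endpoints or the RAGFlow backend work. The sketch below shows what such a check amounts to in Python. It is not the command used in the Dockerfile, and `tcp_port_is_open` is a hypothetical name.

```python
import socket


def tcp_port_is_open(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name resolution failures.
        return False


if __name__ == "__main__":
    # 9388 is the Docker default port from the configuration table.
    print(tcp_port_is_open("127.0.0.1", 9388))
```

For a stronger signal, a deployment could additionally send a real MCP request to /mcp, at the cost of a heavier check.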
Troubleshooting
--api-key is required when --mode is 'self-host'
The server started in self-host mode without an API key.
Fix:
python server/server.py \
--mode=self-host \
--api-key=ragflow-your-api-key
or:
RAGFLOW_MCP_HOST_API_KEY=ragflow-your-api-key python server/server.py
The container cannot reach 127.0.0.1:9380
Inside a container, 127.0.0.1 points to the container, not the host machine.
Use:
-e RAGFLOW_MCP_BASE_URL=http://host.docker.internal:9380
On Linux, add:
--add-host=host.docker.internal:host-gateway
Missing or invalid authorization header
The server is likely running in host mode and the client did not send credentials.
Use one of:
Authorization: Bearer ragflow-your-api-key
api_key: ragflow-your-api-key
file_uri is not from an allowed DeerFlow file capability endpoint
The ingest URL does not match the allowed DeerFlow capability bases.
Set:
RAGFLOW_MCP_FILE_URI_ALLOWED_BASE_URLS=https://gateway.example
Docker cannot connect to the daemon
Start Docker Desktop or Docker Engine, then run:
docker build -t ragflow-mcp-server:latest .