MCP server by ferrebouwens
# Lucidient Research MCP v1.0
The Limitless Research Operating System - A universal backend where any agent, tool, or workflow can perform research at any scale without architectural changes.
## Quick Start

```bash
# Start all services
docker compose up -d

# Verify health
curl http://localhost:3001/health

# Scale workers
docker compose up -d --scale worker=5
```
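After `docker compose up -d`, a script may want to block until the router actually answers. Below is a minimal polling sketch (a hypothetical helper, not part of the repo; it only assumes the `/health` endpoint returns HTTP 200 when ready). The `fetch` callable is injectable so the loop can be exercised without a running stack:

```python
import time
import urllib.request

def wait_for_health(url="http://localhost:3001/health", timeout=60.0,
                    fetch=None):
    """Poll the health endpoint until it answers 200, or raise on timeout.

    `fetch(url) -> status_code` is injectable for testing; by default it
    performs a real HTTP GET against the MCP router.
    """
    if fetch is None:
        def fetch(u):
            with urllib.request.urlopen(u, timeout=5) as resp:
                return resp.status
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            if fetch(url) == 200:
                return True
        except OSError:
            pass  # service not up yet; keep polling
        time.sleep(1.0)
    raise TimeoutError(f"{url} not healthy after {timeout}s")
```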
## Architecture

```
Agent → FastMCP (port 3001) → Redis Queue → Worker Pool
               ↓                                ↓
           Postgres                        MinIO (S3)
          (Metadata)                       (Storage)
               ↓                                ↓
           ChromaDB                          Ollama
           (Vectors)                      (Embeddings)
```
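The job flow above can be modelled in a few lines, with in-memory stand-ins for the real services (a toy illustration only: a `deque` for the Redis queue, dicts for Postgres metadata and MinIO storage; function names are hypothetical):

```python
from collections import deque

queue = deque()   # stand-in for the Redis job queue
metadata = {}     # stand-in for Postgres job metadata
storage = {}      # stand-in for MinIO object storage

def submit_job(job_id, seeds):
    """Agent side: record metadata, enqueue, return only the job_id."""
    metadata[job_id] = {"status": "queued", "seeds": seeds}
    queue.append(job_id)
    return job_id

def run_worker_once(crawl):
    """Worker side: pop one job, run it, persist results to storage."""
    job_id = queue.popleft()
    metadata[job_id]["status"] = "running"
    storage[job_id] = crawl(metadata[job_id]["seeds"])
    metadata[job_id]["status"] = "done"

submit_job("job-1", ["https://example.com"])
run_worker_once(crawl=lambda seeds: [f"page from {s}" for s in seeds])
```

The point of the shape is that the agent never touches page content: it holds a `job_id`, and the worker pool streams data into storage independently.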
## Services

| Service | Port | Purpose |
|---------|------|---------|
| MCP Router | 3001 | FastMCP HTTP/SSE endpoint |
| Postgres | 5433 | Job metadata, namespaces |
| Redis | 6379 | Job queue, pub/sub, progress |
| MinIO | 9000/9001 | Object storage (S3-compatible) |
| Chroma | 8000 | Vector database |
| Ollama | 11434 | Local embedding inference |
## Submit a Crawl Job

```bash
curl -X POST http://localhost:3001/messages \
  -H "Content-Type: application/json" \
  -d '{
    "jsonrpc": "2.0",
    "method": "start_crawl_job",
    "params": {
      "namespace": "my_research",
      "seeds": ["https://example.com"],
      "config": {
        "max_depth": 2,
        "max_pages": 10
      }
    },
    "id": 1
  }'
```
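For programmatic use, the same envelope can be assembled in Python; `jsonrpc_request` is a hypothetical helper for illustration, not part of the server codebase:

```python
import json
from itertools import count

_ids = count(1)  # monotonically increasing request ids

def jsonrpc_request(method, params):
    """Build a JSON-RPC 2.0 envelope like the curl example above."""
    return {"jsonrpc": "2.0", "method": method, "params": params,
            "id": next(_ids)}

payload = jsonrpc_request("start_crawl_job", {
    "namespace": "my_research",
    "seeds": ["https://example.com"],
    "config": {"max_depth": 2, "max_pages": 10},
})
body = json.dumps(payload)  # POST this to http://localhost:3001/messages
```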
## Query Results

```bash
# Check job status
curl -X POST http://localhost:3001/messages \
  -H "Content-Type: application/json" \
  -d '{
    "jsonrpc": "2.0",
    "method": "job_status",
    "params": {"job_id": "YOUR_JOB_ID"},
    "id": 2
  }'

# Semantic search
curl -X POST http://localhost:3001/messages \
  -H "Content-Type: application/json" \
  -d '{
    "jsonrpc": "2.0",
    "method": "query_index",
    "params": {
      "namespace": "my_research",
      "query": "What are the key findings?",
      "retrieval": {"top_k": 10}
    },
    "id": 3
  }'
```
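A typical client polls `job_status` until the job reaches a terminal state. The sketch below assumes the status result carries a `state` field with terminal values `done`/`failed`/`cancelled` (field and value names are illustrative, not the server's documented schema); the transport is injected so the loop can be stubbed:

```python
import time

def poll_job(job_id, call, interval=0.0, max_polls=100):
    """Poll `job_status` until the job reaches a terminal state.

    `call(method, params)` is any JSON-RPC transport (an HTTP POST to
    /messages in production, a stub in tests) returning the result dict.
    """
    for _ in range(max_polls):
        status = call("job_status", {"job_id": job_id})
        if status["state"] in ("done", "failed", "cancelled"):
            return status
        time.sleep(interval)
    raise TimeoutError(f"job {job_id} still running after {max_polls} polls")
```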
## Scale Workers

```bash
# Scale to 10 workers
docker compose up -d --scale worker=10

# Or edit docker-compose.yml deploy.replicas
```
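As an alternative to the `--scale` flag, the replica count can live in `docker-compose.yml` under `deploy.replicas`. A minimal fragment (assuming the service is named `worker`, per the flag above) might look like:

```yaml
services:
  worker:
    deploy:
      replicas: 10
```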
## API Reference

All 22 tools are exposed via FastMCP with HTTP/SSE transport:
### Category A: Direct Fetch

- `fetch_page` - Single-page fetch
### Category B: Async Jobs

- `start_crawl_job` - Queue large-scale crawl
- `job_status` - Poll progress
- `job_results` - Paginated results
- `cancel_job` - Cancel running job
- `delete_job` - Delete job + storage
### Category C: Browser Automation

- `browser_session_create` - Persistent Playwright context
- `browser_navigate` - Navigate to URL
- `browser_interact` - Click/type/scroll
- `browser_extract` - Extract data
- `browser_screenshot` - Capture screenshot
- `browser_close` - Close session
### Category D: Extraction

- `extract_structured` - Structured extraction
- `extract_entities` - Entity extraction
- `convert_document` - PDF/DOCX ingestion
### Category E: RAG

- `embed_job` - Chunk + vectorize
- `query_index` - Semantic search
- `get_chunk` - Retrieve chunk
- `rag_query` - Context window assembly
### Category F: Management

- `list_jobs` - List all jobs
- `get_storage_stats` - Storage statistics
- `health_check` - Service health
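Since all 22 tools share the same JSON-RPC envelope, a thin generic client is enough to call any of them. `MCPClient` below is an illustrative sketch (not shipped with the server); the transport is injected, so it works equally over HTTP POST or a test stub:

```python
class MCPClient:
    """Minimal JSON-RPC client sketch for the tools listed above.

    `transport(payload) -> response_dict` abstracts the HTTP/SSE layer.
    """
    def __init__(self, transport):
        self.transport = transport
        self._id = 0

    def call(self, method, **params):
        self._id += 1
        resp = self.transport({"jsonrpc": "2.0", "method": method,
                               "params": params, "id": self._id})
        if "error" in resp:
            raise RuntimeError(resp["error"])
        return resp["result"]

# Stub transport that just echoes the method name back as the result.
client = MCPClient(lambda p: {"jsonrpc": "2.0", "id": p["id"],
                              "result": {"echo": p["method"]}})
```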
## Configuration

All settings via environment variables:
| Variable | Default | Description |
|----------|---------|-------------|
| POSTGRES_URL | postgresql://postgres:postgres@postgres:5432/mcp | Database connection |
| REDIS_URL | redis://redis:6379 | Redis connection |
| MINIO_ENDPOINT | minio:9000 | MinIO endpoint |
| CHROMA_HOST | chroma | ChromaDB host |
| OLLAMA_HOST | http://ollama:11434 | Ollama endpoint |
| DEFAULT_EMBED_MODEL | embed-gemma:300m | Embedding model |
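In Python, these defaults can be mirrored with a small lookup that prefers the environment; `setting` is an illustrative helper (the defaults are copied from the table above, the helper itself is not part of the codebase):

```python
import os

# Defaults mirror the configuration table above.
DEFAULTS = {
    "POSTGRES_URL": "postgresql://postgres:postgres@postgres:5432/mcp",
    "REDIS_URL": "redis://redis:6379",
    "MINIO_ENDPOINT": "minio:9000",
    "CHROMA_HOST": "chroma",
    "OLLAMA_HOST": "http://ollama:11434",
    "DEFAULT_EMBED_MODEL": "embed-gemma:300m",
}

def setting(name):
    """Resolve a setting from the environment, falling back to defaults."""
    return os.environ.get(name, DEFAULTS[name])
```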
## Development

```bash
# Run tests (individual test files to avoid event loop issues)
cd mcp
python -m pytest tests/test_manage.py::test_health_check -v
python -m pytest tests/test_crawl.py::test_start_crawl_job -v
python -m pytest tests/test_fetch.py::test_fetch_page -v

# Import verification
python -c "from server import app; print('OK')"
cd ../worker && python -c "from worker import main; print('OK')"
```
## Design Principles

- No Artificial Caps - `max_pages=0` means unlimited
- No Data Through MCP - Responses contain only `job_id` and pointers
- Async Everything - All operations are background jobs
- Namespace Isolation - Every agent/project has its own storage prefix
- Composable Primitives - Atomic operations: fetch, store, chunk, embed, query
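The namespace-isolation principle can be illustrated with a toy key scheme: every object lives under its namespace prefix, so agents and projects never collide. This is a hypothetical layout, not the server's actual storage format:

```python
def object_key(namespace, job_id, artifact):
    """Build a storage key under a namespace prefix (illustrative only).

    Rejects empty parts and embedded slashes so one namespace can never
    shadow or escape into another.
    """
    for part in (namespace, job_id, artifact):
        if not part or "/" in part:
            raise ValueError(f"invalid key part: {part!r}")
    return f"{namespace}/{job_id}/{artifact}"
```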
## License

Lucidient Systems - Proprietary