MCP server by Magic-Man-us
Pydantic Documentation MCP Server
A Model Context Protocol (MCP) server providing local-first access to Pydantic and Pydantic AI documentation with BM25-powered full-text search.
Features
- Local-first architecture with offline-only mode (configurable)
- BM25 full-text search across all documentation
- Pre-processed JSONL data included for fast setup
- Intelligent path resolution (no hardcoded paths)
- Complete coverage of Pydantic v2 and Pydantic AI documentation
- Path validation and security controls
Requirements
- Python 3.12+
- uv package manager
- ~15MB disk space (with indices)
Quick Start
# Clone repository
git clone <repository-url>
cd mcp_pydantic_docs
# Create virtual environment and install dependencies
uv sync
# The server will auto-build indices on first run
# Or manually build them:
uv run python -m mcp_pydantic_docs.indexer
# Verify installation
uv run mcp-pydantic-docs
Note: The server automatically builds search indices from included JSONL files on first startup if they don't exist. This typically takes 5-10 seconds.
Installation
Development Setup
- Create virtual environment and install dependencies:
cd mcp_pydantic_docs
uv sync
This creates .venv/ and installs all dependencies from pyproject.toml.
- Build search indices:
uv run python -m mcp_pydantic_docs.indexer
Generates:
- data/pydantic_all_bm25.pkl(~3.2MB)
- data/pydantic_all_records.pkl(~6MB)
- Test the server:
uv run mcp-pydantic-docs
Production Deployment
Option 1: Direct execution with uv
uv --directory /path/to/mcp_pydantic_docs run mcp-pydantic-docs
Option 2: Build and install wheel
# Build distribution
uv build
# Install wheel
uv pip install dist/mcp_pydantic_docs-0.1.0-py3-none-any.whl
# Run
mcp-pydantic-docs
Option 3: Install in editable mode
uv pip install -e /path/to/mcp_pydantic_docs
MCP Client Configuration
Add to your MCP settings (e.g., cline_mcp_settings.json):
{
  "mcpServers": {
    "pydantic-docs": {
      "disabled": false,
      "timeout": 60,
      "type": "stdio",
      "command": "uv",
      "args": [
        "--directory",
        "/absolute/path/to/mcp_pydantic_docs",
        "run",
        "mcp-pydantic-docs"
      ]
    }
  }
}
Replace /absolute/path/to/mcp_pydantic_docs with your installation path.
Architecture
Directory Structure
mcp_pydantic_docs/
├── pyproject.toml              # Package configuration
├── uv.lock                     # Locked dependencies
├── mcp_pydantic_docs/          # Source code
│   ├── __init__.py
│   ├── mcp.py                  # MCP server implementation
│   ├── indexer.py              # BM25 index builder
│   ├── normalize.py            # HTML to JSONL converter
│   └── setup.py                # Setup utilities
├── data/                       # Search data (in git: JSONL only)
│   ├── pydantic.jsonl          # Pydantic docs (2.9MB)
│   ├── pydantic_ai.jsonl       # Pydantic AI docs (3.3MB)
│   ├── pydantic_all_bm25.pkl   # BM25 index (generated)
│   └── pydantic_all_records.pkl # Document records (generated)
├── docs_raw/                   # Raw HTML (not in git)
│   ├── pydantic/
│   └── pydantic_ai/
└── docs_md/                    # Markdown cache (not in git)
Path Resolution
The server automatically locates data directories:
- Searches up from mcp.pyfordata/ordocs_raw/
- Falls back to relative paths from package directory
- Can be overridden with environment variables:
- PDA_DOC_ROOT- Path to Pydantic v2 HTML docs
- PDA_DOC_ROOT_AI- Path to Pydantic AI HTML docs
- PDA_DATA_DIR- Path to data directory
 
Available Tools
Health Checks
health.ping
Returns: "pong"
health.validate
Returns: {
  "valid": bool,
  "message": str,
  "bm25_present": bool,
  "records_present": bool,
  "bm25_size_mb": float,
  "records_size_mb": float
}
Documentation Access
pydantic.search
Parameters:
  - query: str (search query)
  - k: int = 10 (number of results)
Returns: SearchResponse {
  "results": [
    {
      "title": str,
      "url": str,
      "anchor": str | null,
      "snippet": str
    }
  ]
}
pydantic.get
Parameters:
  - path_or_url: str (relative path or full URL)
Returns: GetResponse {
  "url": str,
  "path": str,
  "text": str,
  "html": str
}
pydantic.section
Parameters:
  - path_or_url: str
  - anchor: str (section ID)
Returns: SectionResponse {
  "url": str,
  "path": str,
  "anchor": str,
  "section": str,
  "truncated": bool
}
pydantic.api
Parameters:
  - symbol: str (e.g., "BaseModel", "TypeAdapter")
  - anchor: str | null (optional section)
Returns: dict {
  "symbol": str,
  "url": str,
  "section": str | "text": str
}
Administration
pydantic.mode
Returns: {
  "offline_only": bool,
  "doc_root": str,
  "doc_root_ai": str,
  "data_dir": str,
  "bm25_present": bool,
  "counts": {
    "pydantic_html": int,
    "pydantic_ai_html": int
  },
  "display_bases": dict
}
admin.cache_status
Returns: {
  "paths": dict,
  "documentation": dict,
  "jsonl_data": dict,t
  "search_indices": dict,
  "offline_mode": bool
}
admin.rebuild_indices
Returns: {
  "success": bool,
  "message": str,
  "bm25_size_mb": float,
  "records_size_mb": float
}
Updating Documentation
Rebuild Indices
uv run python -m mcp_pydantic_docs.indexer
Download Latest Documentation
# Check current status
uv run python -m mcp_pydantic_docs.setup --status
# Download and build indices
uv run python -m mcp_pydantic_docs.setup --download --build-index
# Force re-download
uv run python -m mcp_pydantic_docs.setup --download --force
# Clean cache
uv run python -m mcp_pydantic_docs.setup --clean
Security
Offline Mode (Default)
- OFFLINE_ONLY = Truein- mcp.py
- Blocks all HTTP/HTTPS requests except known base URLs as identifiers
- File path validation prevents directory traversal
- All content served from local cache
Enabling Online Fallback
Edit mcp_pydantic_docs/mcp.py:
OFFLINE_ONLY = False  # Allow remote fetching
Note: Online mode is not recommended for production use.
Development
Running Tests
uv run pytest
Code Quality
# Format code
uv run black mcp_pydantic_docs/
# Lint
uv run ruff check mcp_pydantic_docs/
# Type check
uv run mypy mcp_pydantic_docs/
Building Package
# Build wheel and sdist
uv build
# Output: dist/mcp_pydantic_docs-0.1.0-py3-none-any.whl
#         dist/mcp_pydantic_docs-0.1.0.tar.gz
Git Strategy
Included in Repository
- Source code
- uv.lock(reproducible builds)
- data/*.jsonl(~6MB, pre-processed data)
- Documentation and configuration
Excluded from Repository
- .venv/(virtual environment)
- data/*.pkl(binary indices, rebuilt from JSONL)
- docs_raw/(45MB HTML, downloadable)
- docs_md/(derived data)
Troubleshooting
Search indices not found
uv run python -m mcp_pydantic_docs.indexer
Wrong Python version
Ensure Python 3.12+ is active:
uv python list
uv python install 3.12
Path resolution fails
Set explicit paths:
export PDA_DATA_DIR=/path/to/mcp_pydantic_docs/data
export PDA_DOC_ROOT=/path/to/mcp_pydantic_docs/docs_raw/pydantic
export PDA_DOC_ROOT_AI=/path/to/mcp_pydantic_docs/docs_raw/pydantic_ai
MCP connection issues
Verify server runs standalone:
uv run mcp-pydantic-docs
# Should start and listen on stdio
License
This project is licensed under the MIT License - see the LICENSE file for details.
Contributing
Contributions are welcome! Please see CONTRIBUTING.md for guidelines on:
- Development setup
- Code style and standards
- Testing requirements
- Pull request process
- Commit message conventions
For bugs and feature requests, please open an issue on GitHub.