MCP server by sinryo
daizo-mcp
An MCP (Model Context Protocol) server that gives AI assistants direct access to Buddhist text databases, including CBETA, the Pāli Tipitaka, and SAT. Built in Rust for high-performance text search and retrieval.
What You Can Do
Ask your AI assistant to:
- Search by title: "Find the Lotus Sutra in CBETA"
- Search by content: "Search for texts mentioning '阿弥陀' across all CBETA texts"
- Retrieve specific texts: "Show me chapter 1 of DN 1 from the Pāli Canon"
- Explore by topic: "What does the Majjhima Nikaya say about meditation?"
- Pattern search: "Find all occurrences of 'nibbana' or 'vipassana' in Tipitaka texts"
- Search & Focus: "Find where 'Dhammacakkappavattana' appears, then show me the 10 lines before and 200 lines after"
The AI can search across thousands of Buddhist texts in real time and provide accurate citations.
See also: Japanese README | Traditional Chinese README
Prerequisites
Git is required for downloading Buddhist text repositories.
Install Git: https://git-scm.com/book/en/v2/Getting-Started-Installing-Git
Quick Install
curl -fsSL https://raw.githubusercontent.com/sinryo/daizo-mcp/main/scripts/bootstrap.sh | bash -s -- --yes --write-path
This automatically:
- Builds the binaries
- Downloads CBETA and Tipitaka text repositories (~2-3GB)
- Builds search indexes
- Registers with Claude Code and Codex if available
Manual Setup
- Build:
cargo build --release
- Install:
scripts/install.sh --prefix "$HOME/.daizo" --write-path
Add to MCP Clients
Claude Code CLI:
claude mcp add daizo /path/to/DAIZO_DIR/bin/daizo-mcp
Codex CLI (add to ~/.codex/config.toml):
[mcp_servers.daizo]
command = "/path/to/DAIZO_DIR/bin/daizo-mcp"
Data Sources
- CBETA (Chinese Buddhist texts): https://github.com/cbeta-org/xml-p5
- Pāli Tipitaka (romanized): https://github.com/VipassanaTech/tipitaka-xml
- SAT (online database): Additional search capability
CLI Usage
Search Commands
# Title-based search
daizo-cli cbeta-title-search --query "楞伽經" --json
daizo-cli tipitaka-title-search --query "dn 1" --json
# Fast content search (with line numbers)
daizo-cli cbeta-search --query "阿弥陀" --max-results 10
daizo-cli tipitaka-search --query "nibbana|vipassana" --max-results 15
Fetch Commands
# Retrieve specific texts
daizo-cli cbeta-fetch --id T0858 --part 1 --max-chars 4000 --json
daizo-cli tipitaka-fetch --id e0101n.mul --max-chars 2000 --json
# Line-based context retrieval (after search)
daizo-cli cbeta-fetch --id T0858 --line-number 342 --context-before 10 --context-after 200
daizo-cli tipitaka-fetch --id s0305m.mul --line-number 158 --context-before 5 --context-after 100
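The search and fetch commands above can also be driven from a script. The following is a minimal sketch, not part of daizo-mcp: the helper names `search_argv` and `fetch_argv` are hypothetical, and it uses only the subcommands and flags shown in the examples above. It assumes `daizo-cli` is on your PATH.

```python
def search_argv(corpus: str, query: str, max_results: int = 10) -> list[str]:
    """Build the argument list for `daizo-cli <corpus>-search`."""
    if corpus not in ("cbeta", "tipitaka"):
        raise ValueError(f"unsupported corpus: {corpus!r}")
    return [
        "daizo-cli", f"{corpus}-search",
        "--query", query,
        "--max-results", str(max_results),
    ]


def fetch_argv(corpus: str, text_id: str, line_number: int,
               before: int = 10, after: int = 200) -> list[str]:
    """Build the argument list for a line-based `daizo-cli <corpus>-fetch`."""
    if corpus not in ("cbeta", "tipitaka"):
        raise ValueError(f"unsupported corpus: {corpus!r}")
    return [
        "daizo-cli", f"{corpus}-fetch",
        "--id", text_id,
        "--line-number", str(line_number),
        "--context-before", str(before),
        "--context-after", str(after),
    ]
```

Each list can be passed to `subprocess.run(argv, check=True)`. Passing arguments as a list avoids shell quoting problems with queries such as `nibbana|vipassana`, where the shell would otherwise treat `|` as a pipe.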
Management
daizo-cli doctor --verbose # Check installation
daizo-cli index-rebuild --source all # Rebuild indexes
daizo-cli version # Show version
MCP Tools
The MCP server provides these tools for AI assistants:
Search Tools
- cbeta_title_search: Title-based search in CBETA corpus
- cbeta_search: Fast regex content search across CBETA texts (returns line numbers)
- tipitaka_title_search: Title-based search in Tipitaka corpus
- tipitaka_search: Fast regex content search across Tipitaka texts (returns line numbers)
- sat_search: Additional online database search
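To illustrate what "regex content search that returns line numbers" means, here is a small, self-contained sketch. It is not the server's implementation (which is written in Rust and searches in parallel); the function name `search_lines`, the 1-based numbering, and the sample text are assumptions made for illustration.

```python
import re


def search_lines(text: str, pattern: str, max_results: int = 10) -> list[tuple[int, str]]:
    """Return (line_number, line) pairs for lines matching the regex.

    Line numbers are 1-based here; this is an assumption, not a documented
    property of the daizo-mcp tools.
    """
    rx = re.compile(pattern)
    hits = []
    for number, line in enumerate(text.splitlines(), start=1):
        if rx.search(line):
            hits.append((number, line))
            if len(hits) >= max_results:
                break
    return hits


# Made-up sample text, for demonstration only.
sample = "first line\nnibbana is mentioned here\nunrelated line\nvipassana appears here"
hits = search_lines(sample, "nibbana|vipassana")
```

The line numbers in `hits` are what a later fetch step would use to request the surrounding context.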
Fetch Tools
- cbeta_fetch: Retrieve CBETA text by ID with options for specific parts/sections
  - Line-based retrieval: lineNumber, contextBefore, contextAfter parameters
- tipitaka_fetch: Retrieve Tipitaka text by ID with section support
  - Line-based retrieval: lineNumber, contextBefore, contextAfter parameters
- sat_fetch, sat_pipeline: Additional database retrieval tools
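MCP clients invoke tools through JSON-RPC 2.0 `tools/call` requests. The sketch below shows roughly what a line-based `cbeta_fetch` call could look like on the wire. The parameter names `lineNumber`, `contextBefore`, and `contextAfter` come from the list above; the `id` argument key is an assumption based on the CLI's `--id` flag, and the helper `tools_call` is hypothetical.

```python
import json


def tools_call(request_id: int, tool: str, arguments: dict) -> str:
    """Serialize an MCP `tools/call` request as a JSON-RPC 2.0 message."""
    message = {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": tool, "arguments": arguments},
    }
    return json.dumps(message, ensure_ascii=False)


request = tools_call(
    1,
    "cbeta_fetch",
    {
        "id": "T0858",        # assumed argument key, mirrors the CLI's --id
        "lineNumber": 342,
        "contextBefore": 10,
        "contextAfter": 200,
    },
)
```

In practice the MCP client (Claude Code, Codex, and so on) builds these messages for you; the sketch is only meant to show how the documented parameters fit together.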
Search & Focus Workflow
- Use *_search to find content and get line numbers
- Use *_fetch with lineNumber to get focused context around matches
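The "focus" step amounts to cutting a window of lines around a match. The following sketch shows one way such a window could be computed. The function `context_window`, the 1-based line numbering, and the clamping at the start and end of the text are assumptions for illustration; the server's actual boundary behavior is not documented here.

```python
def context_window(lines: list[str], line_number: int,
                   before: int = 10, after: int = 200) -> tuple[int, list[str]]:
    """Return (first_line_number, lines) for a window around a match.

    `line_number` is 1-based. The window is clamped so it never runs past
    the start or end of the text.
    """
    if not 1 <= line_number <= len(lines):
        raise ValueError(f"line_number {line_number} is out of range")
    start = max(1, line_number - before)
    end = min(len(lines), line_number + after)
    return start, lines[start - 1:end]


# A 20-line dummy text; the match is on line 5.
text = [f"line {i}" for i in range(1, 21)]
first, window = context_window(text, 5, before=10, after=3)
```

Here the requested 10 lines before the match are not all available, so the window starts at line 1 and ends three lines after the match.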
Utility Tools
- index_rebuild: Rebuild search indexes (auto-downloads data if needed)
Features
- Fast Search: Parallel regex search across entire text corpora with line number tracking
- Smart Retrieval: Context-aware text extraction with fetch hints and flexible line-based context
- Search & Focus: Find content, then retrieve customizable context (e.g., 10 lines before, 200 after)
- Multiple Formats: Support for TEI P5 XML, plain text, and structured data
- Automatic Data Management: Downloads and updates text repositories automatically
- Caching: Intelligent caching for online queries
Environment
- DAIZO_DIR: Base directory (default: ~/.daizo)
- Data: xml-p5/, tipitaka-xml/romn/
- Cache: cache/
- Binaries: bin/
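As a rough illustration of the layout above, the sketch below resolves the base directory and derives the other locations from it. It assumes that the data, cache, and binary directories all sit directly under DAIZO_DIR, which the list suggests but does not state outright; the helper `daizo_paths` is hypothetical and not part of daizo-mcp.

```python
import os
from pathlib import Path


def daizo_paths(env=None) -> dict:
    """Resolve DAIZO_DIR (default: ~/.daizo) and the directories beneath it."""
    env = os.environ if env is None else env
    base = Path(env.get("DAIZO_DIR") or Path.home() / ".daizo")
    return {
        "base": base,
        "cbeta": base / "xml-p5",                    # CBETA TEI P5 XML
        "tipitaka": base / "tipitaka-xml" / "romn",  # romanized Tipitaka
        "cache": base / "cache",
        "bin": base / "bin",
    }


paths = daizo_paths({"DAIZO_DIR": "/tmp/daizo"})
```

Passing a plain dictionary instead of `os.environ` keeps the helper easy to test; calling `daizo_paths()` with no argument reads the real environment.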
License
MIT OR Apache-2.0 © 2025 Shinryo Taniguchi
Contributing
Issues and PRs welcome. Please include daizo-cli doctor --verbose output with bug reports.