nuntius-mcp

MCP server that delegates agentic tasks to Ollama models — giving them access to all Claude Code tools

nuntius (Latin) — messenger, herald; one who carries word between those who cannot speak directly.
nuntius-mcp is an MCP server for Claude Code that delegates agentic tasks to local or cloud Ollama models. Claude invokes a single run_agent tool; the Ollama model runs autonomously in a tool-use loop with access to all MCP tools configured in Claude Code — web search, vector memory, browser automation, file I/O, bash, and more. When it's done, the result comes back to Claude as a string.
Claude offloads the heavy lifting. The local model does the work.
How It Works
Claude Code ──run_agent──▶ nuntius-mcp (MCP server)
│
├─ reads ~/.claude.json + plugin caches
├─ discovers all configured MCP tools
└─▶ Ollama model (agentic loop)
│
┌────────┴────────┐
│ tool call? │
│ yes → execute │
│ no → return ◀──┘
│
Claude Code ◀──── final answer ────────────┘
On each run_agent call, nuntius-mcp reads your Claude Code MCP configuration, proxies every enabled tool to the Ollama model under a server__tool namespace (e.g. searxng__search, playwright__browser_screenshot), then runs the model in a tool-use loop until it produces a final answer or hits the iteration limit.
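The tool-use loop described above can be sketched as follows. This is a minimal, self-contained illustration, not the project's actual implementation (which lives in `src/agent.py`); the helper names `run_loop`, `chat`, and `execute_tool` are hypothetical, with `chat` standing in for an Ollama chat call that returns an assistant message in the usual Ollama response shape:

```python
def run_loop(chat, prompt, tools, execute_tool, max_iterations=20):
    """Run an agentic tool-use loop until the model produces a final answer.

    chat(messages, tools) -> assistant message dict (Ollama-style),
    execute_tool(name, arguments) -> tool result (stringified for the model).
    """
    messages = [{"role": "user", "content": prompt}]
    for _ in range(max_iterations):
        msg = chat(messages, tools)
        messages.append(msg)
        calls = msg.get("tool_calls") or []
        if not calls:
            # No tool call requested: this is the final answer that
            # goes back to Claude as a plain string.
            return msg["content"]
        for call in calls:
            fn = call["function"]
            result = execute_tool(fn["name"], fn.get("arguments", {}))
            # Feed the tool result back so the model can continue reasoning.
            messages.append({"role": "tool", "content": str(result)})
    return "[max iterations reached]"
```

The key design point is that the loop is model-agnostic: the model decides when to stop by simply not emitting a tool call.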
Features
- Full MCP proxy — automatically surfaces every MCP tool Claude Code has configured; no manual wiring
- Server namespacing — tools are named `server__tool`, so there are never collisions
- Opt-in or opt-out — include/exclude specific MCP servers per call or globally in config
- Model aliases — short names (`gpt`, `fast`, etc.) resolve to full Ollama model strings
- Built-in fallbacks — file I/O and bash tools when no MCP server covers the capability
- Run logging — every completed run is appended to `logs/runs.jsonl`
- Dual-mode — same package runs as an MCP stdio server or a web UI config manager
- Optional web UI — dashboard, model aliases, tool defaults, and endpoint settings
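The run-logging format is simple enough to sketch: one JSON object appended per line (the exact field names below are illustrative, not the project's actual schema):

```python
import json
import time

def log_run(path, model, prompt, status, duration_s, iterations):
    # Append one JSON object per line (JSONL); an append-only file keeps
    # logging cheap and lets the dashboard tail recent runs.
    entry = {
        "ts": time.time(),
        "model": model,
        "prompt": prompt[:200],  # snippet only, as shown on the dashboard
        "status": status,
        "duration_s": duration_s,
        "iterations": iterations,
    }
    with open(path, "a", encoding="utf-8") as f:
        f.write(json.dumps(entry) + "\n")
```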
Requirements
- Python 3.11+
- Ollama running and reachable
- Claude Code (the MCP server is invoked as a subprocess by Claude Code)
- Optional: SearXNG for web search, Qdrant for vector memory
Installation
# Development install (recommended)
uv pip install -e ".[dev]"
# Or with pip
pip install -e ".[dev]"
Configuration
Edit config/config.toml. The file is hot-reloaded on each run_agent call — no restart needed after UI changes.
[server]
log_level = "info"
max_iterations = 20
[ollama]
base_url = "http://your-ollama-host:11434"
[searxng]
base_url = "http://your-searxng-host:8888"
[ui]
host = "127.0.0.1" # set "0.0.0.0" to expose on LAN (no auth implemented)
port = 7860
[models]
default = "gpt" # alias used when model="" in run_agent
[models.aliases]
gpt = "gpt-oss:120b"
glm = "glm-5:cloud"
kimi = "kimi-k2.5:cloud"
max = "minimax-m2.7:cloud"
qwen = "qwen3-coder:480b-cloud"
[tools]
enable_all = false # set true to enable all tools for every call
allowed = [] # pre-enable specific tools globally, e.g. ["searxng__search"]
[mcp]
# exclude = ["some-server"] # opt-out specific servers from being proxied
# include = ["searxng"] # uncomment to switch to opt-in mode (only named servers)
Claude Code Integration
Add to ~/.claude/settings.json under mcpServers:
{
"mcpServers": {
"nuntius-mcp": {
"command": "python",
"args": ["-m", "src"],
"cwd": "/path/to/nuntius-mcp",
"env": { "NUNTIUS_MODE": "mcp" }
}
}
}
Or with uv:
{
"mcpServers": {
"nuntius-mcp": {
"command": "uv",
"args": ["run", "python", "-m", "src"],
"cwd": "/path/to/nuntius-mcp"
}
}
}
After saving, restart Claude Code. The run_agent tool will appear in Claude's tool list.
run_agent Parameters
| Parameter | Type | Default | Description |
|-----------|------|---------|-------------|
| prompt | str | required | The task or question for the agent |
| model | str | "" | Model alias or raw Ollama model name. Empty = config default |
| enable_all | bool | false | Enable every discovered tool |
| tools | list[str] | [] | Tool names or server prefixes to enable. "searxng" enables all searxng__* tools |
| max_iterations | int | 0 | Agentic loop iteration limit. 0 = use config value |
| system_prompt | str | "" | Optional system prompt override |
Example — web research with a local model:
run_agent(
prompt="Summarise the three most recent papers on speculative decoding",
    model="gpt",
tools=["searxng"]
)
Example — file task with bash:
run_agent(
prompt="Find all TODO comments in src/ and write a summary to docs/todos.md",
enable_all=true
)
Example — Ollama cloud model:
Ollama supports cloud-hosted models via the `:cloud` tag. These work identically to local models — just pass the full model name or a configured alias:
run_agent(
prompt="Review this codebase for security issues and write a findings report",
model="kimi-k2.5:cloud",
tools=["searxng", "qdrant"],
system_prompt="You are a security-focused code reviewer. Be thorough and cite line numbers."
)
MCP Tool Proxy
At startup, nuntius-mcp reads MCP server configurations from:
- `~/.claude.json` — global and project-level servers synced by Claude Code
- `~/.claude/plugins/cache/**/.mcp.json` — plugin-installed servers
Each server is connected via stdio and its tools are listed. Tools are registered as server__tool_name (double underscore). The ollama-mcp and nuntius-mcp servers are always excluded to prevent recursion.
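The registration step above amounts to building a lookup table from namespaced names back to their source. A minimal sketch (the function name `register_tools` is hypothetical; the real logic lives in `src/tools/mcp_proxy.py`):

```python
# These servers are never proxied, to prevent the agent recursively
# invoking itself.
EXCLUDED = {"ollama-mcp", "nuntius-mcp"}

def register_tools(servers: dict[str, list[str]]) -> dict[str, tuple[str, str]]:
    """Map namespaced names like 'searxng__search' to (server, tool) pairs."""
    registry = {}
    for server, tool_names in servers.items():
        if server in EXCLUDED:
            continue
        for tool in tool_names:
            # Double-underscore namespacing guarantees two servers exposing
            # a tool with the same name never collide.
            registry[f"{server}__{tool}"] = (server, tool)
    return registry
```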
Scoping tools per call:
# Enable all tools from the searxng server
tools=["searxng"]
# Enable specific tools by full name
tools=["searxng__search", "qdrant__search"]
# Enable everything
enable_all=true
Note: Large tool lists (40–60 tools across many servers) can degrade smaller Ollama models. Use `tools=["server_name"]` to scope what the model sees.
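The scoping rules above reduce to a small predicate. This is a sketch under the naming convention described earlier; `is_enabled` is a hypothetical helper, not the project's actual API:

```python
def is_enabled(tool_name: str, tools: list[str], enable_all: bool) -> bool:
    # enable_all trumps everything.
    if enable_all:
        return True
    # A bare server name ("searxng") enables every tool under that prefix;
    # a full namespaced name ("searxng__search") enables just that tool.
    server = tool_name.split("__", 1)[0]
    return tool_name in tools or server in tools
```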
Optional Web UI
Start the UI server alongside the MCP server:
NUNTIUS_MODE=ui python -m src
# Serves on http://127.0.0.1:7860
The MCP server itself has zero dependency on the UI — it runs fine without it.
| Page | Description |
|------|-------------|
| Dashboard | Recent agent runs — model, prompt snippet, status, duration, iterations |
| Models | Add/edit/delete model aliases; set the default model |
| Tools | Toggle per-tool global defaults |
| Settings | Ollama endpoint, max iterations, log level |
Config changes via the UI write directly to config/config.toml and take effect on the next run_agent call.
Docker
# MCP server only
docker compose up nuntius-mcp -d
# MCP server + web UI
docker compose --profile ui up -d
Important: Claude Code invokes the MCP server as a local subprocess via stdio. The Docker container cannot receive that stdio connection and is not used for normal operation. It is provided for integration testing and self-hosted UI deployments.
network_mode: "host" is used so the container can reach LAN services (Ollama, SearXNG, etc.). This requires Linux — it does not work on Docker Desktop (Mac/Windows).
Copy .env.example to .env and set APP_UID / APP_GID to match your user before starting.
Development
# Run all tests
pytest tests/ -v
# Run a single test
pytest tests/test_agent.py::test_agent_no_tools_returns_content -v
# Inspect the MCP tool list interactively
mcp dev src/server.py
Project structure:
nuntius-mcp/
├── config/
│ └── config.toml # runtime config (edit or use UI)
├── logs/
│ └── runs.jsonl # one JSON line per agent run
├── src/
│ ├── __main__.py # entrypoint — routes to mcp or ui mode
│ ├── server.py # FastMCP stdio server + run_agent tool
│ ├── agent.py # Ollama agentic loop
│ ├── config.py # config loading + model alias resolution
│ └── tools/
│ ├── registry.py # tool registry + enable/disable logic
│ ├── mcp_proxy.py # MCP server discovery + tool forwarding
│ ├── file_ops.py # read_file, write_file, edit_file, glob_files, grep_content
│ └── bash.py # bash (off by default)
├── tests/
├── docker-compose.yml
├── Dockerfile
└── pyproject.toml