# MCP Ollama Studio

A containerized agentic system that supports MCP for local usage: a local-first MCP client with a LangGraph agent, an OpenAI-compatible `/v1` chat completion interface, and a Dockerized ROCm Ollama inference stack.
## Current Stack

This repo currently runs three containers:

- `ollama` (ROCm image + mounted host models)
- `backend` (FastAPI + LangGraph + MCP adapters)
- `frontend` (React app served by Nginx)
## What Is Implemented

- LangGraph ReAct agent orchestration with MCP tools
- OpenAI-compatible completion endpoint (`/api/v1/chat/completions`)
- Streaming SSE with token chunks + `trace` events + `[DONE]`
- Stream safety guard: if no assistant tokens are produced, the backend emits a fallback summary from tool findings (or `event: error` when no findings exist)
- MCP registry loaded from separate per-server JSON schema files
- Stdio MCP runtime safety: `python` commands are resolved to the backend interpreter (`sys.executable`)
- Tool trace cleanup: noisy Fetch wrappers are normalized to readable snippets
- Default no-auth MCP set:
  - DeepWiki (streamable HTTP)
  - Fetch (stdio)
  - Time (stdio)
- Frontend Studio with:
  - full-height left tools rail
  - full-height center chat workspace
  - full-height right collapsible reasoning rail
  - markdown answer rendering
  - collapsible model thinking blocks
  - themed custom scrollbars
  - streaming auto-scroll
  - persistent reasoning trace history across turns
- OpenAPI docs, Swagger, ReDoc
## Repository Structure

```
.
├── docker-compose.yml
├── .env.template
├── ollama/
│   ├── Dockerfile
│   └── entrypoint.sh
├── backend/
│   ├── Dockerfile
│   ├── pyproject.toml
│   ├── src/
│   │   ├── main.py
│   │   ├── core/
│   │   │   ├── settings.py
│   │   │   └── mcp_servers/
│   │   │       ├── 01-deepwiki.json
│   │   │       ├── 02-fetch.json
│   │   │       └── 03-time.json
│   │   ├── routes/
│   │   │   ├── health.py
│   │   │   ├── mcp.py
│   │   │   └── chat.py
│   │   ├── services/
│   │   │   ├── mcp_registry_service.py
│   │   │   ├── agent_service.py
│   │   │   └── chat_service.py
│   │   ├── models/
│   │   │   ├── mcp.py
│   │   │   └── chat.py
│   │   ├── tools/
│   │   │   ├── llm_client_factory.py
│   │   │   └── prompt_loader.py
│   │   └── prompts/
│   │       └── system_prompt.md
│   └── tests/
│       ├── test_chat_service.py
│       ├── test_message_utils.py
│       └── test_mcp_registry_service.py
└── frontend/
    ├── Dockerfile
    ├── src/
    │   ├── App.tsx
    │   ├── components/
    │   │   ├── layout/
    │   │   ├── features/
    │   │   │   ├── ChatStudio.tsx
    │   │   │   └── studio/
    │   │   │       ├── StudioToolsSidebar.tsx
    │   │   │       ├── StudioChatPanel.tsx
    │   │   │       └── StudioReasoningSidebar.tsx
    │   │   └── ui/
    │   ├── services/
    │   ├── hooks/
    │   ├── lib/
    │   ├── stores/
    │   └── types/
    └── nginx.conf
```
## Inference + Agent Flow

```
Frontend -> FastAPI /api/v1/chat/completions -> ChatService -> AgentService
  -> LangGraph create_react_agent -> MCP tools + ChatOpenAI-compatible model
  -> SSE back to frontend
```

The LLM adapter is provider-agnostic as long as the endpoint is OpenAI-compatible (`/v1` semantics).
## API Endpoints

- `GET /health`
- `GET /api/v1/mcp/servers`
- `POST /api/v1/chat/completions`
  - `stream=false`: JSON response with `choices` + `reasoning_trace`
  - `stream=true`: `text/event-stream`
    - `event: trace` + JSON trace payload
    - `event: error` when the run completes without an assistant text answer
    - `data: {chat.completion.chunk}` token chunks
    - `data: [DONE]`
Docs:

- Swagger: http://localhost:8000/docs
- ReDoc: http://localhost:8000/redoc
- OpenAPI JSON: http://localhost:8000/openapi.json
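As a sketch, a non-streaming call to the completion endpoint might look like this. The payload follows the OpenAI `/v1/chat/completions` convention and the `reasoning_trace` field name comes from this README; the exact fields the backend accepts are assumptions:

```python
import json
import urllib.request


def build_chat_request(messages: list[dict], stream: bool = False) -> dict:
    """Assemble an OpenAI-style chat completion payload."""
    return {"messages": messages, "stream": stream}


def post_chat(base_url: str, payload: dict) -> dict:
    """POST the payload to the backend and return the parsed JSON body."""
    req = urllib.request.Request(
        f"{base_url}/api/v1/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())


payload = build_chat_request(
    [{"role": "user", "content": "Use Time MCP and tell me the current time in Tokyo."}]
)

# With the compose stack running:
# result = post_chat("http://localhost:8000", payload)
# print(result["choices"][0]["message"]["content"])
# print(result.get("reasoning_trace"))
```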
## MCP Schema Catalog

Each MCP server config is defined in its own schema file under `backend/src/core/mcp_servers/`.

Each schema includes:

- identity (`name`, `label`)
- transport (`streamable_http` or `stdio`)
- runtime config (`url` or `command`/`args`/`env`)
- human-facing `instructions` used by the agent
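As an illustration, a stdio schema file might look like the following hypothetical sketch. The key names mirror the bullets above, but the exact spelling in this repo's schemas and the `mcp_server_fetch` module name are assumptions:

```json
{
  "name": "fetch",
  "label": "Fetch",
  "transport": "stdio",
  "command": "python",
  "args": ["-m", "mcp_server_fetch"],
  "env": {},
  "instructions": "Use this server to fetch and summarize web pages."
}
```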
## Environment

Copy and edit:

```bash
cp .env.template .env
```

Required values in this project:

- `OLLAMA_MODELS_DIR` (host path mount, example: `/home/hector/models/ollama`)
- `OLLAMA_MODEL`
- `LLM_BASE_URL` (default: `http://ollama:11434/v1`)
- `LLM_API_KEY` (default for local compose: `ollama`)
- `LLM_MODEL`
- `VITE_API_BASE_URL` (default: `http://localhost:8000`)
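A filled-in `.env` might look like the following sketch. The models path is the example given above and the defaults come from this README; the `qwen2.5:7b` model name is a placeholder, not a project default:

```env
OLLAMA_MODELS_DIR=/home/hector/models/ollama
OLLAMA_MODEL=qwen2.5:7b
LLM_BASE_URL=http://ollama:11434/v1
LLM_API_KEY=ollama
LLM_MODEL=qwen2.5:7b
VITE_API_BASE_URL=http://localhost:8000
```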
## Run (Docker Compose)

```bash
docker compose up --build
```

Service URLs:

- Frontend: http://localhost:5173
- Backend: http://localhost:8000
- Ollama API: http://localhost:11434
## Dev Commands

Backend:

```bash
cd backend
UV_CACHE_DIR=/tmp/uv-cache uv sync --frozen
UV_CACHE_DIR=/tmp/uv-cache uv run uvicorn src.main:app --host 0.0.0.0 --port 8000 --reload
```

Frontend:

```bash
cd frontend
npm install
npm run dev
```
## Tests and Quality

Backend:

```bash
cd backend
UV_CACHE_DIR=/tmp/uv-cache uv run ruff check src tests
UV_CACHE_DIR=/tmp/uv-cache uv run pytest -q
```

Frontend:

```bash
cd frontend
npm run lint
npm run test
npm run build
```
## Demo Prompts

- Use Time MCP and tell me current time in Tokyo and New York.
- Use Fetch MCP and summarize https://modelcontextprotocol.io in 4 bullets.
- Use DeepWiki MCP and explain this repo: langchain-ai/langgraph.
## UI Validation Checklist

- Open http://localhost:5173
- Confirm the left tools rail is full-height and can collapse/expand
- Confirm the right reasoning rail is full-height and can collapse/expand
- Send a prompt and verify streaming tokens + trace events
- Confirm the composer stays inside the center chat panel at all times
- Verify footer links to `/docs`, `/redoc`, and `/openapi.json`