MCP server by paresh-panda
MCP NL Ops — Proper MCP Architecture
Natural language questions about your Kind cluster, backed by the actual Model Context Protocol — not a keyword router or direct function calls.
How it actually works (MCP)
Browser → /api/ask → MCP Client (app.py)
|
1. list_tools() ←──SSE/JSON-RPC── MCP Server (mcp_server.py)
2. Ollama decides which tools to call
3. call_tool(name, args) ──────────► MCP Server executes
4. tool result fed back to Ollama
5. Ollama streams final answer
The LLM (Ollama) owns tool selection — it sees the tool schemas and decides what to call based on the question. No keyword matching.
vs. the wrong way
| Dimension | Keyword router (wrong) | MCP (this repo) |
|---|---|---|
| Tool selection | if "pod" in question | Ollama reads schemas, reasons |
| Tool execution | Direct Python call | JSON-RPC to MCP server |
| Tool discovery | Hardcoded in app | list_tools() at runtime |
| Extensible? | Edit app.py | Add @mcp.tool() to server only |
| Claude Desktop? | No | Yes — point it at port 8001 |
Kind cluster setup (recommended)
Everything runs inside a local Kind cluster —
no port-forwards needed, host ports are mapped in kind-config.yaml.
Prerequisites
# macOS
brew install kind kubectl docker
# Linux (Fedora/RHEL)
sudo dnf install -y kind kubectl
# Docker: https://docs.docker.com/engine/install/
One-command setup
./setup.sh # creates cluster, builds image, deploys everything
After ~3–5 minutes (model pull on first run):
| Service | URL | |------------|------------------------------| | UI / API | http://localhost:8765 | | MCP server | http://localhost:8001/sse | | Prometheus | http://localhost:9090 |
Day-2 commands
./setup.sh status # kubectl get pods -n nlops
./setup.sh build # rebuild image and reload into Kind after code changes
./setup.sh deploy # re-apply manifests only (no image rebuild)
./setup.sh down # delete the entire cluster
Cluster layout
kind-config.yaml ← 1 control-plane + 2 workers, host port mappings
k8s/
namespace.yaml ← nlops namespace
mcp-server.yaml ← Deployment + NodePort + RBAC (read-only K8s access)
app.yaml ← Deployment + NodePort (uvicorn, port 8765)
prometheus.yaml ← Deployment + NodePort + ConfigMap
ollama.yaml ← Deployment + initContainer that pulls llama3.1
Dockerfile ← single image used by mcp-server and nlops-app
The MCP server pod uses a ServiceAccount with a ClusterRole that grants
read-only access to pods, deployments, events, and services — nothing else.
Local dev (without Kind)
Requirements
- Kind cluster running with your MLOps drift demo
- Prometheus accessible (port-forward or NodePort)
- Ollama with a tool-calling model:
llama3.1,mistral, orqwen2.5ollama pull llama3.1 # recommended
Run
# Terminal 1 — MCP Server
python mcp_server.py
# → MCP server on http://localhost:8001
# Terminal 2 — App + UI
uvicorn app:app --reload --port 8765
# → open http://localhost:8765
Env vars
export PROMETHEUS_URL=http://localhost:9090 # port-forwarded
export OLLAMA_MODEL=llama3.1 # must support tool calling
export K8S_NAMESPACE=default
Claude Desktop integration (bonus)
The MCP server is a standard FastMCP SSE server.
Add it to ~/.config/Claude/claude_desktop_config.json:
{
"mcpServers": {
"k8s-mlops": {
"url": "http://localhost:8001/sse"
}
}
}
Now Claude Desktop can also query your cluster using the same tools.
Tools registered on the MCP server
| Tool | What Ollama can call it for |
|---|---|
| get_cluster_summary | General health, pod counts, troubled pods |
| list_pods | Pod-level detail, restarts, readiness |
| list_deployments | Replica mismatches, rollout status |
| get_recent_events | Warnings, OOM kills, scheduling failures |
| get_pod_logs | Logs from a specific pod |
| get_model_metrics | Drift score, accuracy, prediction rate |
| get_resource_metrics | CPU, memory, restart totals |
| get_redis_info | Stream depth, consumer lag |
| get_anomalous_metrics | N-sigma scan across all metrics |