MCP server for the llamator framework
MCP server for llamator: automate LLM red teaming workflows
Contents
- What this server does
- Architecture
- Quick start (Docker Compose)
- Configuration (env)
- HTTP API
- MCP API (Streamable HTTP)
- Request model
- Artifact storage
- Repository tests
- License 📜
What this server does
This service exposes LLAMATOR through two programmatic interfaces:
- HTTP API: queue a test run, poll status, browse/download artifacts.
- MCP server (Streamable HTTP): expose LLAMATOR as MCP tools for agent workflows.
It uses Redis to store job state and ARQ workers to execute LLAMATOR runs.
Main usage flow (a code sketch follows this list):
- Submit a run request (tested model + test plan).
- Worker runs LLAMATOR and aggregates results.
- Poll job status and optionally download artifacts (logs/reports/etc.).
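As a sketch, this flow over the HTTP API could look as follows in Python (the base URL, API key, and payload values are placeholders; the endpoints are documented in the HTTP API section below, and `requests` is an assumed third-party dependency):

```python
import time

import requests

BASE = "http://localhost:8000"
HEADERS = {"X-API-Key": "change-me"}  # only needed if LLAMATOR_MCP_API_KEY is set

payload = {
    "tested_model": {
        "kind": "openai",
        "base_url": "http://host.docker.internal:1234/v1",
        "model": "model-identifier",
        "api_key": "lm-studio",
    },
    "plan": {"preset_name": "owasp:llm07", "num_threads": 1},
}

# 1. Submit a run request (tested model + test plan).
job = requests.post(f"{BASE}/v1/tests/runs", json=payload, headers=HEADERS).json()
job_id = job["job_id"]

# 2. Poll job status until the worker finishes.
while True:
    info = requests.get(f"{BASE}/v1/tests/runs/{job_id}", headers=HEADERS).json()
    if info["status"] in ("succeeded", "failed"):
        break
    time.sleep(5)

# 3. List artifacts (logs/reports/etc.).
files = requests.get(f"{BASE}/v1/tests/runs/{job_id}/artifacts", headers=HEADERS).json()
print(info["status"], files["files"])
```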
Architecture
Components:
- API container (FastAPI + MCP ASGI app):
  - Validates input (`validate_test_specs`).
  - Enqueues the ARQ job (`run_llamator_job`).
  - Serves HTTP endpoints under `/v1/...`.
  - Mounts the MCP app under `LLAMATOR_MCP_MCP_MOUNT_PATH` (default: `/mcp`).
- Worker container (ARQ):
  - Resolves the test plan (presets + explicit test specs).
  - Runs LLAMATOR (`llamator.start_testing`) in a thread.
  - Persists `queued → running → succeeded/failed` state into Redis.
- Redis:
  - Stores job metadata, the redacted request snapshot, results, and errors (TTL-based).
Security:
- Optional API key for both HTTP and MCP routes via the `X-API-Key` header.
- If `LLAMATOR_MCP_API_KEY` is empty, authentication is disabled.
Quick start (Docker Compose)
Minimal run:
```
docker compose up --build
```
The compose stack includes:
- `redis` (ports: `6379:6379`)
- `api` (ports: `${LLAMATOR_MCP_HTTP_PORT:-8000}:${LLAMATOR_MCP_HTTP_PORT:-8000}`)
- `worker`
Expected external dependency:
- An OpenAI-compatible endpoint for the attack/judge/tested models (e.g. LM Studio or vLLM), configured via env.
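For example, pointing the attack and judge models at a local LM Studio instance could look like this in a `.env` file (values are illustrative; the variable names are documented in the Configuration section below):

```
LLAMATOR_MCP_ATTACK_OPENAI_BASE_URL=http://host.docker.internal:1234/v1
LLAMATOR_MCP_ATTACK_OPENAI_MODEL=model-identifier
LLAMATOR_MCP_ATTACK_OPENAI_API_KEY=lm-studio
LLAMATOR_MCP_JUDGE_OPENAI_BASE_URL=http://host.docker.internal:1234/v1
LLAMATOR_MCP_JUDGE_OPENAI_MODEL=model-identifier
LLAMATOR_MCP_JUDGE_OPENAI_API_KEY=lm-studio
```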
Configuration (env)
All service settings are read from environment variables prefixed with `LLAMATOR_MCP_`.
Redis
- `LLAMATOR_MCP_REDIS_DSN` (default: `redis://redis:6379/0`)
  Redis DSN used by the API and worker.
Artifacts storage
- `LLAMATOR_MCP_ARTIFACTS_ROOT` (default: `/data/artifacts`)
  Root directory where job artifacts are stored (one subdirectory per `job_id`).
API security
- `LLAMATOR_MCP_API_KEY` (default: empty)
  If set, requests must include the header `X-API-Key: <value>`.
Attack model (OpenAI-compatible)
- `LLAMATOR_MCP_ATTACK_OPENAI_BASE_URL` (default: `http://localhost:1234/v1`)
- `LLAMATOR_MCP_ATTACK_OPENAI_MODEL` (default: `model-identifier`)
- `LLAMATOR_MCP_ATTACK_OPENAI_API_KEY` (default: `lm-studio`)
- `LLAMATOR_MCP_ATTACK_OPENAI_TEMPERATURE` (default: `0.5`)
- `LLAMATOR_MCP_ATTACK_OPENAI_SYSTEM_PROMPTS` (default: built-in prompt)
`*_SYSTEM_PROMPTS` accepts either (see the example below):
- a JSON array (preferred), e.g. `["Prompt 1", "Prompt 2"]`
- or a newline-separated string
Judge model (OpenAI-compatible)
- `LLAMATOR_MCP_JUDGE_OPENAI_BASE_URL` (default: `http://localhost:1234/v1`)
- `LLAMATOR_MCP_JUDGE_OPENAI_MODEL` (default: `model-identifier`)
- `LLAMATOR_MCP_JUDGE_OPENAI_API_KEY` (default: `lm-studio`)
- `LLAMATOR_MCP_JUDGE_OPENAI_TEMPERATURE` (default: `0.1`)
- `LLAMATOR_MCP_JUDGE_OPENAI_SYSTEM_PROMPTS` (default: built-in prompt)
Job execution
- `LLAMATOR_MCP_JOB_TTL_SECONDS` (default: `604800`)
  TTL for job keys in Redis.
- `LLAMATOR_MCP_RUN_TIMEOUT_SECONDS` (default: `3600`)
  ARQ per-job timeout (worker side).
- `LLAMATOR_MCP_REPORT_LANGUAGE` (default: `en`, allowed: `en|ru`)
  Default report language used in the merged run config.
Logging
- `LLAMATOR_MCP_LOG_LEVEL` (default: `INFO`)
  Python logging level for the app and worker.
- `LLAMATOR_MCP_UVICORN_LOG_LEVEL` (default: `info`)
  Uvicorn log level for the HTTP entrypoint.
HTTP server
- `LLAMATOR_MCP_HTTP_HOST` (default: `0.0.0.0`)
- `LLAMATOR_MCP_HTTP_PORT` (default: `8000`)
MCP mounting
- `LLAMATOR_MCP_MCP_MOUNT_PATH` (default: `/mcp`)
  Path where the MCP ASGI app is mounted in FastAPI.
- `LLAMATOR_MCP_MCP_STREAMABLE_HTTP_PATH` (default: `/`)
  Streamable HTTP path exposed by the MCP app (inside the mount).
Effective MCP endpoint URL:
`http://<host>:<port><LLAMATOR_MCP_MCP_MOUNT_PATH><LLAMATOR_MCP_MCP_STREAMABLE_HTTP_PATH>`
With defaults: `http://localhost:8000/mcp/`.
HTTP API
All routes are under `/v1`. If `LLAMATOR_MCP_API_KEY` is set, add the header `X-API-Key: <key>`.
Health
`GET /v1/health` → `{"status":"ok"}`
Create a test run
`POST /v1/tests/runs` → `LlamatorTestRunResponse`
Request body: `LlamatorTestRunRequest` (see Request model).
Response (200):
```json
{
  "job_id": "<32-hex>",
  "status": "queued",
  "created_at": "2025-01-01T00:00:00Z"
}
```
Errors:
- `400` on validation failure (e.g. duplicate parameter names).
- `401` if an API key is required and it is invalid or missing.
Get run status
`GET /v1/tests/runs/{job_id}` → `LlamatorJobInfo`
Contains:
- `status`: `queued | running | succeeded | failed`
- timestamps
- `request`: redacted request snapshot (no API keys; only `api_key_present: true|false`)
- optional `result` or `error`
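For orientation, a hypothetical status payload; only the fields listed above are guaranteed, and the exact timestamp keys may differ:

```json
{
  "job_id": "<32-hex>",
  "status": "succeeded",
  "created_at": "2025-01-01T00:00:00Z",
  "request": {
    "tested_model": {
      "kind": "openai",
      "base_url": "http://host.docker.internal:1234/v1",
      "model": "model-identifier",
      "api_key_present": true
    },
    "plan": {"preset_name": "owasp:llm07", "num_threads": 1}
  },
  "result": {"<attack_code_name>": {"<metric_name>": 123}}
}
```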
Errors:
- `404` if the job does not exist.
List artifacts
`GET /v1/tests/runs/{job_id}/artifacts`
Response:
```json
{
  "job_id": "<job_id>",
  "files": [
    {"path": "logs/run.log", "size_bytes": 1234, "mtime": 1735689600.0}
  ]
}
```
If the directory does not exist yet: `files: []`.
Download artifact
`GET /v1/tests/runs/{job_id}/artifacts/{path}` → file response
The server enforces a safe join to prevent path traversal (`..` escapes are rejected).
Errors:
- `400` for an invalid path
- `404` for a missing job or missing file
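A minimal download sketch combining both artifact endpoints (the base URL and `job_id` are placeholders; `requests` is an assumed dependency):

```python
import os

import requests

BASE = "http://localhost:8000"
job_id = "<32-hex>"

# List artifact metadata, then fetch each file by its relative path.
listing = requests.get(f"{BASE}/v1/tests/runs/{job_id}/artifacts").json()
for f in listing["files"]:
    resp = requests.get(f"{BASE}/v1/tests/runs/{job_id}/artifacts/{f['path']}", stream=True)
    resp.raise_for_status()
    local = f["path"].replace("/", "_")  # flatten relative paths for local storage
    with open(local, "wb") as out:
        for chunk in resp.iter_content(chunk_size=8192):
            out.write(chunk)
    print(local, os.path.getsize(local))
```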
MCP API (Streamable HTTP)
The MCP server is mounted under `LLAMATOR_MCP_MCP_MOUNT_PATH` and uses the Streamable HTTP transport.
Tools exposed:
- `create_llamator_run`
  - Input: `LlamatorTestRunRequest`
  - Behavior: submits a job and awaits completion (within `LLAMATOR_MCP_RUN_TIMEOUT_SECONDS`)
  - Output: aggregated result dict (see below)
- `get_llamator_run`
  - Input: `job_id: str`
  - Behavior: fetches an existing job and returns the aggregated result only if it has finished
  - Output: aggregated result dict
Aggregated result schema (from `llamator.start_testing`):
```json
{
  "<attack_code_name>": {
    "<metric_name>": 123
  }
}
```
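A minimal client sketch using the official `mcp` Python SDK over Streamable HTTP (the endpoint URL, API key, and `job_id` are placeholders; omit the `X-API-Key` header when auth is disabled):

```python
import asyncio

from mcp import ClientSession
from mcp.client.streamable_http import streamablehttp_client

async def main() -> None:
    headers = {"X-API-Key": "change-me"}
    url = "http://localhost:8000/mcp/"
    async with streamablehttp_client(url, headers=headers) as (read, write, _):
        async with ClientSession(read, write) as session:
            await session.initialize()
            tools = await session.list_tools()
            print([t.name for t in tools.tools])  # expect create_llamator_run, get_llamator_run
            # Fetch an existing job by id; returns the aggregated result if finished.
            result = await session.call_tool("get_llamator_run", {"job_id": "<32-hex>"})
            print(result.structuredContent or result.content)

asyncio.run(main())
```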
Headers commonly used by clients in this repo (integration tests):
- `Accept: application/json, text/event-stream`
- `MCP-Protocol-Version: <version>`
- `Origin: <scheme>://<host>`
- optionally `Mcp-Session-Id: <id>` after initialization
- optionally `X-API-Key: <key>` if auth is enabled
Transport detail:
- Some MCP handlers may respond with `text/event-stream` for POST. The server includes an ASGI wrapper that converts single-message SSE responses (`event: message` + `data: <json>`) into `application/json` for POST requests, to support clients that expect raw JSON responses.
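Conceptually, the conversion does something like this (a standalone sketch, not the server's actual wrapper code):

```python
import json

def sse_single_message_to_json(body: bytes) -> bytes | None:
    """Extract the JSON payload from a single-message SSE body.

    Returns raw JSON bytes if the body carries exactly one `data:` line
    with valid JSON, else None (the response is then left untouched).
    """
    data_lines = [
        line[len("data:"):].strip()
        for line in body.decode("utf-8").splitlines()
        if line.startswith("data:")
    ]
    if len(data_lines) != 1:
        return None
    try:
        json.loads(data_lines[0])  # validate before re-labeling as application/json
    except ValueError:
        return None
    return data_lines[0].encode("utf-8")
```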
Request model
tested_model (required)
OpenAI-compatible client config:
kind:"openai"(required)base_url: OpenAI-compatible base URL (required), e.g.http://host:port/v1model: model identifier (required)api_key: optionaltemperature: optional,[0.0, 2.0]system_prompts: optional (tuple/list in JSON)model_description: optional
plan (required)
The test plan can be defined via presets and/or explicit test lists (example after this list):
- `preset_name`: optional preset name supported by LLAMATOR (e.g. `all`, `rus`, `owasp:llm01`)
- `num_threads`: optional, `>= 1`
- `basic_tests`: optional list of:
  - `code_name: str`
  - `params`: optional list of `{name, value}` (parameter names must be unique per test)
- `custom_tests`: optional list of:
  - `import_path`: fully-qualified class (import policy: must start with `llamator.` or `llamator_mcp_server.`)
  - `params`: optional list of `{name, value}` (names unique per test)
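For example, a plan combining a preset with one explicit basic test (the `code_name` and parameter entries are placeholders, not real LLAMATOR test names):

```json
{
  "preset_name": "owasp:llm01",
  "num_threads": 2,
  "basic_tests": [
    {
      "code_name": "<llamator_test_code_name>",
      "params": [{"name": "<param_name>", "value": 3}]
    }
  ]
}
```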
run_config (optional)
User overrides for LLAMATOR run config:
- `enable_logging: bool`
- `enable_reports: bool`
- `artifacts_path`: safe relative path inside the job artifacts directory
- `debug_level`: int in `{0, 1, 2}`
- `report_language`: `"en" | "ru"`
The server merges the user config with defaults and stores the resolved `artifacts_path` as an absolute path under:
`<LLAMATOR_MCP_ARTIFACTS_ROOT>/<job_id>/...`
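For example, a run config that enables reports and writes into a subdirectory of the job root (values are illustrative):

```json
{
  "enable_logging": true,
  "enable_reports": true,
  "artifacts_path": "reports",
  "report_language": "ru"
}
```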
Minimal example request (HTTP)
```json
{
  "tested_model": {
    "kind": "openai",
    "base_url": "http://host.docker.internal:1234/v1",
    "model": "model-identifier",
    "api_key": "lm-studio"
  },
  "plan": {
    "preset_name": "owasp:llm07",
    "num_threads": 1
  }
}
```
Artifact storage
By default, each job writes artifacts into:
`<LLAMATOR_MCP_ARTIFACTS_ROOT>/<job_id>/`
If `run_config.artifacts_path` is provided, it must be a safe relative path and is resolved inside the job root.
The HTTP API exposes:
- listing metadata for all files under job root
- downloading a single file by relative path (safe-joined server-side)
Repository tests
The repository ships integration tests that exercise the running server (no unit tests in this repo).
Location:
- `tests/integration/test_http_api.py`
- `tests/integration/test_mcp_api.py`
What they verify:
HTTP API tests
- `GET /v1/health` returns `{"status":"ok"}`
- `POST /v1/tests/runs` creates a job and returns `job_id` + initial status
- `GET /v1/tests/runs/{job_id}` returns `LlamatorJobInfo` and redacts secrets (`api_key_present`)
- `GET /v1/tests/runs/does-not-exist` returns `404`
- `GET /v1/tests/runs/{job_id}/artifacts` returns a schema-compatible listing
- `GET /v1/tests/runs/{job_id}/artifacts/../secrets.txt` is rejected (`400`)
- duplicate parameter names in test params cause a `400` validation error
MCP API tests
- MCP `tools/list` contains `create_llamator_run` and `get_llamator_run`
- `create_llamator_run` returns a non-empty aggregated result dict
- the result can be returned either in `structuredContent` or as JSON in `content[].text`
Test configuration:
- `tests/.env.test` defines the integration defaults:
  - timeouts and polling intervals
  - the MCP protocol version header value
  - minimal request payload defaults (preset name, num threads, tested model `base_url`/`model`/`api_key`)
License 📜
This project is licensed under the terms of the Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International license. See the LICENSE file for details.
