# tts-mcp-server
Local text-to-speech for Claude Code via MCP. Runs Kokoro-82M on Apple Silicon through MLX-audio, giving Claude the ability to speak task notifications and arbitrary text aloud.
Requirements: macOS on Apple Silicon (M1+), Python 3.12+.
## Setup

```sh
cd ~/projects/tts-mcp-server
uv venv && uv pip install -e .

# Pre-download the TTS model (~200 MB, one-time):
uv run tts-mcp-init
```
Register with Claude Code (user-wide, available in all projects):
```sh
claude mcp add --transport stdio --scope user tts -- \
  /path/to/tts-mcp-server/.venv/bin/tts-mcp-server
```
Verify:

```sh
claude mcp list   # should show tts: ✓ Connected
```
## Tools

### notify(message)
Quick task-completion alert. Speaks at 1.2x speed with the default voice (`af_heart`). Designed for short messages like "Build finished" or "Tests passed".
### speak(text, voice?, speed?)
Full TTS with voice and speed control.
| Parameter | Default | Range / Options |
| --------- | ------------ | --------------------- |
| text | (required) | Any string |
| voice | af_heart | See voices |
| speed     | 1.2          | 0.5 – 2.0             |
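A minimal sketch of how the two tools can relate (illustrative only, not the actual server code): `notify()` is `speak()` with the defaults baked in, and `speed` is clamped to the documented range before synthesis. The dict return value here stands in for the real synthesis step.

```python
# Illustrative sketch, not the real implementation.
DEFAULT_VOICE = "af_heart"
SPEED_MIN, SPEED_MAX = 0.5, 2.0

def clamp_speed(speed: float) -> float:
    """Keep speed inside the documented 0.5 - 2.0 range."""
    return max(SPEED_MIN, min(SPEED_MAX, speed))

def speak(text: str, voice: str = DEFAULT_VOICE, speed: float = 1.2) -> dict:
    # The real server hands these to MLX-audio; here we just resolve them.
    return {"text": text, "voice": voice, "speed": clamp_speed(speed)}

def notify(message: str) -> dict:
    # notify() is speak() at the fixed 1.2x speed with the default voice.
    return speak(message)
```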
## Voices
Kokoro ships 54 presets. A useful subset:
| ID | Description |
| ----------- | ------------------------- |
| af_heart | American female (default) |
| af_bella | American female |
| af_nova | American female |
| am_adam | American male |
| am_echo | American male |
| bf_emma | British female |
| bm_george | British male |
Full list: prefixes `af_` / `am_` (American), `bf_` / `bm_` (British), `jf_` / `jm_` (Japanese), `zf_` / `zm_` (Chinese).
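The prefix scheme is mechanical, so a small helper can decode any voice ID into a human-readable description. This is a hypothetical utility (not part of the server), following only the naming convention above:

```python
# Hypothetical helper: decode a Kokoro voice ID prefix into accent + gender.
ACCENTS = {"a": "American", "b": "British", "j": "Japanese", "z": "Chinese"}
GENDERS = {"f": "female", "m": "male"}

def describe_voice(voice_id: str) -> str:
    prefix, _, name = voice_id.partition("_")
    if len(prefix) != 2 or prefix[0] not in ACCENTS \
            or prefix[1] not in GENDERS or not name:
        raise ValueError(f"unrecognized voice ID: {voice_id!r}")
    return f"{ACCENTS[prefix[0]]} {GENDERS[prefix[1]]}"
```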
## Architecture

```
Claude Code ──stdio──> FastMCP server ──> MLX-audio/Kokoro ──> afplay
```
- Lazy loading: the model, spaCy G2P pipeline, and Metal shaders all initialize on the first tool call (~6 s). This keeps the MCP handshake instant so health checks pass. Subsequent calls run in ~0.1 s.
- No persistent daemon: Claude Code spawns the server on session start and kills it on exit. No LaunchAgent needed.
- Temp files: audio is written to a temporary WAV, played with `afplay`, then deleted. No disk accumulation.
## Performance (M2 Max)

| Metric                 | First call             | Subsequent |
| ---------------------- | ---------------------- | ---------- |
| Latency (short phrase) | ~6 s                   | ~0.1 s     |
| Memory                 | ~420 MB                | ~420 MB    |
| CPU                    | < 5% (GPU-accelerated) | < 5%       |
## Troubleshooting

**Server shows `✗ Failed to connect`:** the model is loading during the health check. This was fixed by deferring the model load to the first tool call. If you still see it, ensure `main()` calls `mcp.run()` immediately, before any model loading.
**First call is slow (~6 s):** expected. The spaCy G2P pipeline and Metal shader compilation happen once per server lifetime. After that, calls are sub-200 ms.
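The once-per-lifetime behavior comes from the lazy-loading pattern described under Architecture. A minimal sketch of that pattern, with `load_model()` as a stand-in for the real MLX-audio/Kokoro initialization:

```python
import functools

def load_model():
    # Stand-in for the expensive init (~6 s): model weights, spaCy G2P,
    # Metal shader compilation. Tracks call count for demonstration.
    load_model.calls = getattr(load_model, "calls", 0) + 1
    return object()

@functools.cache
def get_model():
    # functools.cache ensures the heavy load runs exactly once per server
    # lifetime; the MCP handshake never touches it.
    return load_model()
```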
**First call hangs indefinitely:** the spaCy `en_core_web_sm` model is missing. Misaki's G2P calls `spacy.cli.download()` at runtime, which shells out to pip, but uv-managed venvs don't have pip, so it hangs forever. The model is declared as a dependency in `pyproject.toml`, so `uv pip install -e .` should handle it. If you somehow end up without it, reinstall.
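Because `en_core_web_sm` installs as an ordinary Python package, its presence can be verified cheaply without importing spaCy at all. A preflight check along these lines (an illustration, not something the server currently ships) would surface the problem before the hang:

```python
import importlib.util

def g2p_model_installed() -> bool:
    # en_core_web_sm is a regular installable package, so find_spec()
    # detects it without triggering Misaki's runtime download path.
    return importlib.util.find_spec("en_core_web_sm") is not None
```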
**`afplay` not found:** you're not on macOS. Replace the `afplay` call in `_play()` with your platform's audio player (e.g., `paplay` on Linux, `sox` cross-platform).
**Model download fails:** pre-download with the init script, or manually via `huggingface-cli`:

```sh
uv run tts-mcp-init
# or:
.venv/bin/huggingface-cli download mlx-community/Kokoro-82M-bf16
```
## License
MIT