# voice-mcp: Voice Mode for Claude Code (and other MCP Hosts)
A Model Context Protocol (MCP) server that enables voice interactions with Claude and other LLMs. Requires only an OpenAI API key and microphone/speakers.
## Compatibility

Runs on: Linux • macOS • Windows (WSL) | Python: 3.10+ | Tested: Ubuntu 24.04 LTS, Fedora 42
## Features

- Voice conversations with Claude - ask questions and hear responses
- Multiple transports - local microphone or LiveKit room-based communication
- OpenAI-compatible - works with any STT/TTS service (local or cloud)
- Real-time - low-latency voice interactions with automatic transport selection
- MCP integration - works seamlessly with Claude Desktop and other MCP clients
## Simple Requirements

All you need to get started:

- An OpenAI API key (or compatible service) for speech-to-text and text-to-speech
- A computer with a microphone and speakers, OR a LiveKit server (LiveKit Cloud or self-hosted)
## Quick Start

Setup for Claude Code:

```bash
export OPENAI_API_KEY=your-openai-key
claude mcp add voice-mcp uvx voice-mcp
claude
```

Try: "Let's have a voice conversation"
## Example Usage

Once configured, try these prompts with Claude:

- "Let's have a voice conversation"
- "Ask me about my day using voice"
- "Tell me a joke" (Claude will speak and wait for your response)
- "Say goodbye" (Claude will speak without waiting)

The new `converse` function makes voice interactions more natural - it automatically waits for your response by default.
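Under MCP, a host invokes tools like `converse` with JSON-RPC `tools/call` requests. The sketch below illustrates the general shape of such a request using the `converse` parameters documented in the Tools table; the exact envelope the host produces is an assumption, not captured server output:

```python
import json

# Illustrative MCP tools/call request for the converse tool.
# Argument names follow the Tools table; values are example choices.
request = {
    "jsonrpc": "2.0",
    "id": 1,
    "method": "tools/call",
    "params": {
        "name": "converse",
        "arguments": {
            "message": "How was your day?",
            "wait_for_response": True,   # speak, then listen (the default)
            "listen_duration": 10,       # seconds to wait for a reply
        },
    },
}
print(json.dumps(request, indent=2))
```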
## Claude Desktop Setup

Add to your Claude Desktop configuration file:

- macOS: `~/Library/Application Support/Claude/claude_desktop_config.json`
- Windows: `%APPDATA%\Claude\claude_desktop_config.json`

### Using uvx (recommended)
```json
{
  "mcpServers": {
    "voice-mcp": {
      "command": "uvx",
      "args": ["voice-mcp"],
      "env": {
        "OPENAI_API_KEY": "your-openai-key"
      }
    }
  }
}
```
### Using Docker/Podman
```json
{
  "mcpServers": {
    "voice-mcp": {
      "command": "docker",
      "args": [
        "run", "--rm", "-i",
        "--device", "/dev/snd",
        "-e", "PULSE_RUNTIME_PATH=/run/user/1000/pulse",
        "-v", "/run/user/1000/pulse:/run/user/1000/pulse",
        "ghcr.io/mbailey/voice-mcp:latest"
      ],
      "env": {
        "OPENAI_API_KEY": "your-openai-key"
      }
    }
  }
}
```
### Using pip install
```json
{
  "mcpServers": {
    "voice-mcp": {
      "command": "voice-mcp",
      "env": {
        "OPENAI_API_KEY": "your-openai-key"
      }
    }
  }
}
```
## Tools

| Tool | Description | Key Parameters |
|------|-------------|----------------|
| `converse` | Have a voice conversation - speak and optionally listen | `message`, `wait_for_response` (default: true), `listen_duration` (default: 10s), `transport` (auto/local/livekit) |
| `listen_for_speech` | Listen for speech and convert to text | `duration` (default: 5s) |
| `check_room_status` | Check LiveKit room status and participants | None |
| `check_audio_devices` | List available audio input/output devices | None |

Note: The `converse` tool is the primary interface for voice interactions, combining speaking and listening in a natural flow.
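The speak-then-listen flow that `converse` combines can be pictured as a simple sketch. This is an illustration of the behavior described above, not the server's actual implementation; `speak` and `listen` stand in for the real TTS/STT calls:

```python
# Illustrative flow of the converse tool: speak the message,
# then optionally listen for a spoken reply and transcribe it.
def converse(message, wait_for_response=True, listen_duration=10,
             speak=print, listen=lambda duration: "(transcribed speech)"):
    """speak() and listen() are stand-ins for the real TTS/STT calls."""
    speak(message)                      # text-to-speech playback
    if wait_for_response:
        return listen(listen_duration)  # speech-to-text, up to listen_duration s
    return None

reply = converse("Tell me a joke")                        # speaks, then listens
farewell = converse("Goodbye!", wait_for_response=False)  # speaks only
```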
## Configuration

See docs/configuration.md for complete setup instructions for all MCP hosts, and ready-to-use config files in config-examples/.

### Quick Setup

The only required configuration is your OpenAI API key:

```bash
export OPENAI_API_KEY="your-key"
```
### Optional Settings

```bash
# Custom STT/TTS services (OpenAI-compatible)
export STT_BASE_URL="http://localhost:2022/v1"  # Local Whisper
export TTS_BASE_URL="http://localhost:8880/v1"  # Local TTS
export TTS_VOICE="nova"                         # Voice selection

# LiveKit (for room-based communication)
# See docs/livekit/ for setup guide
export LIVEKIT_URL="wss://your-app.livekit.cloud"
export LIVEKIT_API_KEY="your-api-key"
export LIVEKIT_API_SECRET="your-api-secret"

# Debug mode
export VOICE_MCP_DEBUG="true"
```
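A minimal sketch of how these variables could be resolved, assuming the server falls back to the OpenAI API when no custom endpoint is set (the fallback URL and default voice here are assumptions, not taken from the voice-mcp source):

```python
import os

# Sketch: resolve OpenAI-compatible endpoints from the environment.
# Defaults (OpenAI API URL, "alloy" voice) are assumed, not documented.
def resolve_endpoints(env=os.environ):
    return {
        "stt_base_url": env.get("STT_BASE_URL", "https://api.openai.com/v1"),
        "tts_base_url": env.get("TTS_BASE_URL", "https://api.openai.com/v1"),
        "tts_voice": env.get("TTS_VOICE", "alloy"),
    }

# With STT_BASE_URL set, speech-to-text requests would go to the
# local Whisper server instead of the OpenAI API.
local = resolve_endpoints({"STT_BASE_URL": "http://localhost:2022/v1"})
```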
## Architecture

```
┌─────────────────────┐     ┌──────────────────┐     ┌─────────────────────┐
│     Claude/LLM      │     │  LiveKit Server  │     │   Voice Frontend    │
│    (MCP Client)     │────▶│    (Optional)    │────▶│     (Optional)      │
└─────────────────────┘     └──────────────────┘     └─────────────────────┘
          │                          │
          │                          │
          ▼                          ▼
┌───────────────────────┐   ┌──────────────────┐
│  Voice MCP Server     │   │  Audio Services  │
│  • converse           │   │  • OpenAI APIs   │
│  • listen_for_speech  │──▶│  • Local Whisper │
│  • check_room_status  │   │  • Local TTS     │
│  • check_audio_devices│   └──────────────────┘
└───────────────────────┘
```
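The "automatic transport selection" shown above can be sketched roughly as follows. This is a hypothetical illustration of the auto mode, assuming LiveKit is chosen only when all three LiveKit variables are set; the real selection logic may differ:

```python
import os

# Hypothetical sketch of "auto" transport selection: prefer a fully
# configured LiveKit room, otherwise fall back to local mic/speakers.
def select_transport(requested="auto", env=os.environ):
    if requested in ("local", "livekit"):
        return requested  # explicit choice wins
    livekit_ready = all(env.get(k) for k in
                        ("LIVEKIT_URL", "LIVEKIT_API_KEY", "LIVEKIT_API_SECRET"))
    return "livekit" if livekit_ready else "local"

transport = select_transport("auto", env={})  # no LiveKit vars configured
```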
## Troubleshooting

### Common Issues

- No microphone access: check system permissions for your terminal/application
- UV not found: install with `curl -LsSf https://astral.sh/uv/install.sh | sh`
- OpenAI API error: verify that `OPENAI_API_KEY` is set correctly
- No audio output: check system audio settings and available devices
- Container audio: you may need to adjust device paths for your system
### Debug Mode

Enable detailed logging and audio file saving:

```bash
export VOICE_MCP_DEBUG=true
```

Debug audio files are saved to `~/voice-mcp_recordings/`.
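For reference, writing a debug recording as a WAV file takes only the standard-library `wave` module. The snippet below is a sketch of that mechanism; the timestamped filename and audio format (16 kHz mono 16-bit PCM) are assumptions, and only the `~/voice-mcp_recordings/` location comes from the docs above:

```python
import os
import tempfile
import time
import wave

# Sketch: persist raw PCM audio as a timestamped WAV debug file.
def save_debug_recording(pcm_bytes, directory, rate=16000):
    os.makedirs(directory, exist_ok=True)
    path = os.path.join(directory, time.strftime("%Y%m%d-%H%M%S") + ".wav")
    with wave.open(path, "wb") as wav:
        wav.setnchannels(1)   # mono
        wav.setsampwidth(2)   # 16-bit samples
        wav.setframerate(rate)
        wav.writeframes(pcm_bytes)
    return path

# Example: one second of silence, written to a temporary directory.
out = save_debug_recording(b"\x00\x00" * 16000, tempfile.mkdtemp())
```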
## License

MIT