Gemini Multimodal MCP

by @marcius-llmus

MCP server by marcius-llmus

Created 3/6/2026

README

multimodal-reader-mcp

MCP server for reading local audio and video files with Google Gen AI and returning structured observations, timelines, and transcripts.

It analyzes a local media file and returns:

a short summary
a timeline of key moments
transcript snippets for spoken or visible text
key observations and notable signals
relevant clues tailored to the user's question
open questions plus a confidence level
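The result described above could be modelled roughly as follows. This is only a sketch: the class names, field names, and sample values are hypothetical and mirror the list above, not the server's actual response schema.

```python
from dataclasses import dataclass, field


@dataclass
class TimelineEntry:
    # Hypothetical shape: a timestamp string plus a description of the moment.
    timestamp: str
    description: str


@dataclass
class MediaReading:
    # Field names are illustrative; they follow the README's list of returned
    # items and are not taken from the server's real schema.
    summary: str
    timeline: list[TimelineEntry] = field(default_factory=list)
    transcript_snippets: list[str] = field(default_factory=list)
    observations: list[str] = field(default_factory=list)
    relevant_clues: list[str] = field(default_factory=list)
    open_questions: list[str] = field(default_factory=list)
    confidence: str = "unknown"


# Made-up sample values, for illustration only.
example = MediaReading(
    summary="Two speakers discuss a release plan.",
    timeline=[TimelineEntry("00:00:05", "First speaker introduces the agenda.")],
    confidence="medium",
)
```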

Requirements

uv
Python 3.14
GOOGLE_API_KEY

Model configuration

The default model is gemini-2.5-flash.

You can override the default model for all requests by setting the MULTIMODAL_READER_MODEL environment variable.
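A minimal sketch of how such an override is typically resolved, assuming the server reads the variable from its process environment. The helper name resolve_model is hypothetical, and treating a blank value as unset is an assumption rather than documented behavior.

```python
import os

DEFAULT_MODEL = "gemini-2.5-flash"  # default stated in this README
MODEL_ENV_VAR = "MULTIMODAL_READER_MODEL"


def resolve_model(env=None):
    """Return the model override from the environment, else the default.

    Assumption: a missing or blank value falls back to the default.
    """
    env = os.environ if env is None else env
    override = env.get(MODEL_ENV_VAR, "").strip()
    return override or DEFAULT_MODEL
```

Calling resolve_model() with no arguments reads the real environment; passing a dict makes the lookup easy to test.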

MCP client configuration

Example Cursor MCP config:

{
  "mcpServers": {
    "multimodal-reader": {
      "command": "uvx",
      "args": ["multimodal-reader-mcp"],
      "env": {
        "GOOGLE_API_KEY": "${env:GOOGLE_API_KEY}",
        "MULTIMODAL_READER_MODEL": "gemini-2.5-flash"
      }
    }
  }
}

Tool

The package exposes one MCP tool:

read_media(file_path, question=None)

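file_path must be an absolute path to a local audio or video file. question is optional (it defaults to None); when given, the relevant clues in the result are tailored to it.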
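Because the tool requires an absolute path, a caller may want to check it before sending the request. The sketch below builds the arguments for a read_media call; the helper build_read_media_args is hypothetical and not part of the package.

```python
from pathlib import Path


def build_read_media_args(file_path, question=None):
    """Build the arguments dict for a read_media call, rejecting relative paths."""
    path = Path(file_path)
    if not path.is_absolute():
        raise ValueError(f"file_path must be absolute, got: {file_path!r}")
    args = {"file_path": str(path)}
    # question is optional, so it is only included when the caller provides one.
    if question is not None:
        args["question"] = question
    return args
```

The returned dict can then be passed as the tool arguments by whichever MCP client you use.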

Quick Setup

Installation guide for this server

Install Package (if required)

uvx gemini-multimodal-mcp

Cursor configuration (mcp.json)

{
  "mcpServers": {
    "marcius-llmus-gemini-multimodal-mcp": {
      "command": "uvx",
      "args": ["gemini-multimodal-mcp"]
    }
  }
}