GLM Image MCP Server

🚀 Enhanced Model Context Protocol (MCP) server for focused image analysis using OpenRouter and Google Gemini vision models

✨ Features

🎯 Multi-Provider Support

OpenRouter: Access to multiple vision models including x-ai/grok-4-fast:free, Claude, and more
Google Gemini: Direct access to Gemini 2.5 Pro and Flash models
Flexible Switching: Choose provider per request or set environment defaults
Auto-Detection: Automatically detects available API keys and selects the best provider

🔧 Advanced Image Analysis

Basic Analysis: Analyze entire images with customizable prompts
Focused Analysis: Analyze specific aspects (text, faces, objects, colors, layout)
Smart Validation: Robust parameter and image validation with security checks
Error Handling: Comprehensive error reporting and graceful fallbacks

🚀 Performance & Security

Fast Processing: Optimized for quick image analysis
Memory Efficient: Handles large images without memory leaks
Security First: Input validation and sanitization against malicious content
Cross-Platform: Works on Windows, macOS, and Linux

📁 Project Structure

glm-image-mcp/
├── glm-image-mcp.js                    # Main MCP server ⭐
├── package.json                        # Package configuration
├── README.md                          # This file
├── LICENSE                            # MIT License
├── .gitignore                         # Git ignore rules
├── .github/
│   └── workflows/
│       └── test.yml                   # GitHub Actions workflow
├── utils/
│   └── validation.js                  # Input validation utilities
└── examples/
    ├── basic-analysis.js              # Basic usage examples
    └── multi-provider-config.js       # Multi-provider configuration

🚀 Quick Start

Option 1: Install Directly from GitHub (Recommended)

# Run directly with npx (no npm publish needed)
npx github:QuickkApps/GLM-Image-MCP

# Or install globally using git
npm install -g git+https://github.com/QuickkApps/GLM-Image-MCP.git

# Or install locally
npm install git+https://github.com/QuickkApps/GLM-Image-MCP.git

Option 2: Clone from GitHub

git clone https://github.com/QuickkApps/GLM-Image-MCP.git
cd GLM-Image-MCP
npm install

Option 3: Use directly from GitHub (npx)

npx github:QuickkApps/GLM-Image-MCP

🔧 Configuration

1. Set API Keys

Choose one or both providers:

# For OpenRouter (recommended for model variety)
export OPENROUTER_API_KEY="your-openrouter-api-key"
export OPENROUTER_MODEL="x-ai/grok-4-fast:free"

# For Google Gemini (fast and reliable)
export GEMINI_API_KEY="your-gemini-api-key"
export GEMINI_MODEL="gemini-2.5-pro"

2. MCP Client Configuration

Configure your MCP client (like Claude Desktop, GLM, or any MCP-compatible IDE):

{
  "mcpServers": {
    "glm-image-mcp": {
      "command": "npx",
      "args": ["github:QuickkApps/GLM-Image-MCP"],
      "env": {
        "OPENROUTER_API_KEY": "your-openrouter-key",
        "OPENROUTER_MODEL": "x-ai/grok-4-fast:free",
        "GEMINI_API_KEY": "your-gemini-key",
        "GEMINI_MODEL": "gemini-2.5-pro"
      }
    }
  }
}

3. Model Configuration

You can set a custom model through environment variables. Each variable holds a single value, so export only one model per provider; if the same variable is exported several times, the last value wins.

# For OpenRouter, pick one:
export OPENROUTER_MODEL="anthropic/claude-3-sonnet"
# export OPENROUTER_MODEL="openai/gpt-4-vision-preview"
# export OPENROUTER_MODEL="x-ai/grok-4-fast:free"

# For Google Gemini, pick one:
export GEMINI_MODEL="gemini-2.5-pro"
# export GEMINI_MODEL="gemini-1.5-flash"
# export GEMINI_MODEL="gemini-1.5-pro"

# Or set the model for a single npx run
OPENROUTER_MODEL="anthropic/claude-3-sonnet" npx github:QuickkApps/GLM-Image-MCP
GEMINI_MODEL="gemini-1.5-flash" npx github:QuickkApps/GLM-Image-MCP

4. Local Development Configuration

For local development:

{
  "mcpServers": {
    "glm-image-mcp": {
      "command": "node",
      "args": ["glm-image-mcp.js"],
      "cwd": "/path/to/glm-image-mcp",
      "env": {
        "OPENROUTER_API_KEY": "your-openrouter-key",
        "GEMINI_API_KEY": "your-gemini-key"
      }
    }
  }
}

🛠️ Available Tools

`analyze_image` - Comprehensive Image Analysis

Analyze images with provider and model selection.

Parameters:

image_path (string, required): Path to image file
prompt (string, required): Analysis prompt
provider (string, optional): "openrouter" or "gemini" (auto-detects if not specified)
model (string, optional): Specific model to use (overrides environment default)
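
For readers wiring this up by hand, the sketch below shows the JSON-RPC 2.0 message an MCP client sends to invoke a tool such as `analyze_image`. The `tools/call` method and the `name`/`arguments` fields come from the MCP specification; `buildToolCall` is a hypothetical helper introduced only for illustration and is not part of this server.

```javascript
// Build the JSON-RPC 2.0 message that an MCP client sends to call a tool.
// buildToolCall is a hypothetical helper, not an API of this server.
function buildToolCall(id, toolName, args) {
  return {
    jsonrpc: "2.0",
    id,
    method: "tools/call",
    params: { name: toolName, arguments: args },
  };
}

const request = buildToolCall(1, "analyze_image", {
  image_path: "/path/to/image.jpg",
  prompt: "Describe what you see in this image",
  provider: "gemini", // optional
  model: "gemini-2.5-pro", // optional
});

// Over the stdio transport, each message is written as a single line of JSON.
console.log(JSON.stringify(request));
```

In practice an MCP client library builds this message for you; the sketch only makes the parameter mapping visible.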

`describe_image` - Quick Image Description

Describe an image in detail with a default descriptive prompt.

Parameters:

image_path (string, required): Path to image file
prompt (string, optional): Custom prompt (uses default if not provided)
provider (string, optional): "openrouter" or "gemini"
model (string, optional): Specific model to use

`focused_analyze_image` - Focused Analysis

Analyze specific aspects of an image with focused prompts.

Parameters:

image_path (string, required): Path to image file
focus_area (string, optional): Specific area ("text", "faces", "objects", "colors", "layout")
prompt (string, optional): Custom focused analysis prompt
provider (string, optional): "openrouter" or "gemini"
model (string, optional): Specific model to use
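
One way to picture how `focus_area` and `prompt` could interact is the minimal sketch below. The prompt wording, the `buildFocusedPrompt` name, the error text, and the rule that a custom prompt takes precedence are all assumptions made for illustration; only the five focus-area values come from the parameter list above.

```javascript
// Hypothetical prompt templates; the server's actual wording may differ.
const FOCUS_PROMPTS = {
  text: "Transcribe all visible text in this image.",
  faces: "Describe any faces in this image, including count and expression.",
  objects: "List the objects visible in this image.",
  colors: "Describe the dominant colors and palette of this image.",
  layout: "Describe the layout and composition of this image.",
};

// Assumption: a custom prompt, when given, overrides the focus-area template.
function buildFocusedPrompt({ focus_area, prompt } = {}) {
  if (prompt) return prompt;
  if (focus_area === undefined) return "Describe this image in detail.";
  if (!Object.prototype.hasOwnProperty.call(FOCUS_PROMPTS, focus_area)) {
    const allowed = Object.keys(FOCUS_PROMPTS).join(", ");
    throw new Error(`Invalid focus_area: ${focus_area}. Use one of: ${allowed}`);
  }
  return FOCUS_PROMPTS[focus_area];
}

console.log(buildFocusedPrompt({ focus_area: "text" }));
```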

📊 Usage Examples

Basic Analysis with Auto-Detection

With no provider specified, the server picks one based on the available API keys:

{
  "image_path": "/path/to/image.jpg",
  "prompt": "Describe what you see in this image"
}

OpenRouter with Specific Model

{
  "image_path": "/path/to/image.jpg",
  "prompt": "Analyze this image in detail",
  "provider": "openrouter",
  "model": "anthropic/claude-3-sonnet"
}

Gemini for Fast Analysis

{
  "image_path": "/path/to/image.jpg",
  "prompt": "What objects do you see in this image?",
  "provider": "gemini",
  "model": "gemini-1.5-flash"
}

Focused Analysis

{
  "image_path": "/path/to/document.jpg",
  "focus_area": "text",
  "provider": "gemini"
}

Custom Focused Analysis

{
  "image_path": "/path/to/chart.jpg",
  "prompt": "Extract all data points and trends from this chart",
  "provider": "openrouter",
  "model": "x-ai/grok-4-fast:free"
}

🎯 Provider Comparison

| Feature | OpenRouter | Google Gemini |
|---------|-------------|---------------|
| Model Variety | 50+ vision models | Gemini 2.5 Pro/Flash |
| Speed | Fast | Very Fast |
| Cost | Variable (per model) | Competitive |
| Accuracy | High | Excellent |
| Best For | Model flexibility | Speed & consistency |
| Free Models | Yes (grok-4-fast) | Limited quota |
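
The two providers also differ in the shape of the request they accept. The sketch below builds a request body for each from the same prompt and image bytes, assuming the image is sent inline as base64. The body formats follow the public OpenRouter (OpenAI-compatible chat completions) and Gemini `generateContent` REST APIs; the function names are hypothetical, and whether this server builds its requests exactly this way is an assumption.

```javascript
// OpenRouter: POST https://openrouter.ai/api/v1/chat/completions
// (Authorization: Bearer <OPENROUTER_API_KEY>). The image travels as a data URL.
function buildOpenRouterBody(model, prompt, imageBuffer, mimeType) {
  const dataUrl = `data:${mimeType};base64,${imageBuffer.toString("base64")}`;
  return {
    model,
    messages: [
      {
        role: "user",
        content: [
          { type: "text", text: prompt },
          { type: "image_url", image_url: { url: dataUrl } },
        ],
      },
    ],
  };
}

// Gemini: POST https://generativelanguage.googleapis.com/v1beta/models/<model>:generateContent
// (x-goog-api-key: <GEMINI_API_KEY>). The model name goes in the URL, not the body.
function buildGeminiBody(prompt, imageBuffer, mimeType) {
  return {
    contents: [
      {
        parts: [
          { text: prompt },
          { inline_data: { mime_type: mimeType, data: imageBuffer.toString("base64") } },
        ],
      },
    ],
  };
}

// Three bytes standing in for a real JPEG file.
const sample = Buffer.from([0xff, 0xd8, 0xff]);
const openRouterBody = buildOpenRouterBody("x-ai/grok-4-fast:free", "Describe this image", sample, "image/jpeg");
const geminiBody = buildGeminiBody("Describe this image", sample, "image/jpeg");
console.log(JSON.stringify(openRouterBody, null, 2));
console.log(JSON.stringify(geminiBody, null, 2));
```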

🔧 API Key Setup

OpenRouter API Key

Visit OpenRouter.ai
Sign up and get your API key
Set environment variable: export OPENROUTER_API_KEY="your-key"

Google Gemini API Key

Visit Google AI Studio
Create a new API key
Set environment variable: export GEMINI_API_KEY="your-key"

🧪 Testing

Quick Test

# Test installation
npx glm-image-mcp --help

# Test with sample image (if you have one)
node examples/basic-analysis.js

Integration Test

# Clone and test locally
git clone https://github.com/QuickkApps/GLM-Image-MCP.git
cd GLM-Image-MCP
npm install
npm test

🔄 Model Selection Priority

Request model parameter: Overrides any model set in the environment
Request provider only: Uses that provider's default model
No parameters: Auto-detects the provider from the available API keys
Environment variables: Supply the defaults whenever a request omits the provider or model
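
The priority rules above can be sketched as a single resolution function. This is a minimal illustration, not the server's actual code: `resolveProviderAndModel`, the fallback model names (taken from the examples in this README), and the choice to try Gemini first when both keys are present are all assumptions. The two error messages match those listed under Troubleshooting.

```javascript
const KEY_VARS = { openrouter: "OPENROUTER_API_KEY", gemini: "GEMINI_API_KEY" };
const MODEL_VARS = { openrouter: "OPENROUTER_MODEL", gemini: "GEMINI_MODEL" };
// Assumed fallbacks, used only when neither the request nor the environment names a model.
const DEFAULT_MODELS = { openrouter: "x-ai/grok-4-fast:free", gemini: "gemini-2.5-pro" };

function resolveProviderAndModel(request = {}, env = process.env) {
  let provider = request.provider;
  if (provider !== undefined && !Object.prototype.hasOwnProperty.call(KEY_VARS, provider)) {
    throw new Error(`Invalid provider: ${provider}`);
  }
  if (provider === undefined) {
    // Assumption: Gemini is preferred when both API keys are available.
    provider = ["gemini", "openrouter"].find((name) => env[KEY_VARS[name]]);
    if (!provider) {
      throw new Error("No API keys found. Please set either GEMINI_API_KEY or OPENROUTER_API_KEY");
    }
  }
  // Request model first, then the environment default, then the assumed fallback.
  const model = request.model || env[MODEL_VARS[provider]] || DEFAULT_MODELS[provider];
  return { provider, model };
}

console.log(resolveProviderAndModel({}, { OPENROUTER_API_KEY: "key" }));
```

For brevity, the sketch does not check that the key for an explicitly requested provider is set; a real implementation would likely report that case as well.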

🚨 Troubleshooting

Common Issues

Server Won't Start

# Check Node.js version
node --version  # Should be >= 14.0.0 (note: the @modelcontextprotocol/sdk dependency itself requires Node.js 18 or newer)

# Check dependencies
npm install

# Test syntax
node -c glm-image-mcp.js

API Key Issues

Error: No API keys found. Please set either GEMINI_API_KEY or OPENROUTER_API_KEY

Solution: Set the correct environment variables

Invalid Provider

Error: Invalid provider: invalid_provider

Solution: Use "openrouter" or "gemini"

Image File Issues

Error: Image file not found: /path/to/image.jpg

Solution: Verify the file path and that the file exists

Unsupported Format

Error: Unsupported image format: .gif. Supported formats: .jpg, .jpeg, .png, .webp, .bmp, .tiff

Solution: Convert image to supported format
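
The file checks behind the last two errors can be approximated as follows. This is a sketch under assumptions: `checkImagePath` is a hypothetical name, and the order of the checks (extension first, then existence) is a guess. The supported extensions and error messages are taken from the examples above.

```javascript
const fs = require("fs");
const path = require("path");

const SUPPORTED_EXTENSIONS = [".jpg", ".jpeg", ".png", ".webp", ".bmp", ".tiff"];

// Returns the absolute path if the file passes both checks, otherwise throws.
function checkImagePath(imagePath) {
  const ext = path.extname(imagePath).toLowerCase();
  if (!SUPPORTED_EXTENSIONS.includes(ext)) {
    throw new Error(
      `Unsupported image format: ${ext}. Supported formats: ${SUPPORTED_EXTENSIONS.join(", ")}`
    );
  }
  if (!fs.existsSync(imagePath)) {
    throw new Error(`Image file not found: ${imagePath}`);
  }
  return path.resolve(imagePath);
}

try {
  checkImagePath("/path/to/animation.gif");
} catch (err) {
  console.error(err.message);
}
```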

🔒 Security Features

✅ API keys are never logged or exposed
✅ Input validation prevents malicious content
✅ Image buffers are validated for format and size
✅ File size limits (50MB max)
✅ Path traversal protection
✅ Comprehensive error handling
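
To illustrate what validating an image buffer for format and size can look like, the sketch below checks the leading "magic bytes" of a buffer against the signatures of the supported formats and enforces the 50MB limit. The function names and error messages are hypothetical, the limit is assumed to mean 50 * 1024 * 1024 bytes, and the server may instead rely on `sharp` for this step.

```javascript
const MAX_FILE_SIZE = 50 * 1024 * 1024; // assumption: "50MB" means 50 MiB

// Identify the image format from its leading bytes; returns null if unknown.
function sniffImageFormat(buf) {
  const startsWith = (bytes, offset = 0) =>
    buf.length >= offset + bytes.length &&
    bytes.every((byte, i) => buf[offset + i] === byte);
  const ascii = (text) => Array.from(text, (ch) => ch.charCodeAt(0));

  if (startsWith([0xff, 0xd8, 0xff])) return "jpeg";
  if (startsWith([0x89, 0x50, 0x4e, 0x47, 0x0d, 0x0a, 0x1a, 0x0a])) return "png";
  if (startsWith(ascii("RIFF")) && startsWith(ascii("WEBP"), 8)) return "webp";
  if (startsWith(ascii("BM"))) return "bmp";
  if (startsWith([0x49, 0x49, 0x2a, 0x00]) || startsWith([0x4d, 0x4d, 0x00, 0x2a])) return "tiff";
  return null;
}

// Hypothetical validator combining the size limit and the format check.
function validateImageBuffer(buf) {
  if (buf.length > MAX_FILE_SIZE) {
    throw new Error(`Image exceeds the ${MAX_FILE_SIZE}-byte limit`);
  }
  const format = sniffImageFormat(buf);
  if (format === null) {
    throw new Error("Unrecognized image data");
  }
  return format;
}

const pngHeader = Buffer.from([0x89, 0x50, 0x4e, 0x47, 0x0d, 0x0a, 0x1a, 0x0a]);
console.log(validateImageBuffer(pngHeader));
```

Checking content rather than only the file extension means a renamed file (for example, a GIF saved as `.png`) is still rejected.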

📈 Performance

| Metric | Value |
|--------|-------|
| Startup Time | < 2 seconds |
| Analysis Time | 3-10 seconds (depends on image size and model) |
| Memory Usage | ~50MB base + image size |
| Supported Formats | JPEG, PNG, WebP, BMP, TIFF |
| Max File Size | 50MB |

🤝 Integration with MCP Clients

This MCP server works seamlessly with any MCP-compatible client:

Claude Desktop

{
  "mcpServers": {
    "glm-image-mcp": {
      "command": "npx",
      "args": ["glm-image-mcp"]
    }
  }
}

GLM 4.6

Configure MCP settings in your GLM 4.6 interface
Select provider per request or set defaults
Choose model based on your needs
Receive text responses optimized for GLM 4.6 processing

Other MCP Clients

Any MCP-compatible client can use this server with the standard configuration format.

📦 Dependencies

@modelcontextprotocol/sdk (^1.19.1) - MCP framework
node-fetch (^2.6.7) - HTTP requests
sharp (^0.33.0) - Image processing (optional, for enhanced validation)

🚀 Deployment Options

1. npm Package (Recommended)

npm install -g glm-image-mcp
glm-image-mcp

2. Direct from GitHub

npx github:QuickkApps/GLM-Image-MCP

# With custom model
OPENROUTER_MODEL="anthropic/claude-3-sonnet" npx github:QuickkApps/GLM-Image-MCP
GEMINI_MODEL="gemini-1.5-flash" npx github:QuickkApps/GLM-Image-MCP

3. Docker (Coming Soon)

docker run -e OPENROUTER_API_KEY=your-key glm-image-mcp

4. GitHub Actions

Use in CI/CD pipelines with GitHub Actions workflow included.

🤝 Contributing

We welcome contributions! Please see our Contributing Guidelines for details.

Development Setup

git clone https://github.com/your-username/glm-image-mcp.git
cd glm-image-mcp
npm install
npm test

Pull Request Process

Fork the repository
Create a feature branch
Make your changes
Add tests if applicable
Submit a pull request

📄 License

This project is licensed under the MIT License - see the LICENSE file for details.

🙏 Acknowledgments

Model Context Protocol for the MCP framework
OpenRouter for providing access to multiple AI models
Google Gemini for the powerful vision API
The MCP community for feedback and contributions

📞 Support

💬 Discord: arshagor190
🐛 Issues: GitHub Issues
💬 Discussions: GitHub Discussions

🚀 Simple, reliable, and powerful image analysis for the MCP ecosystem

Made with ❤️ by QuicKK Apps
