MCP server by AnkTechSol
SarvaData MCP Server
Agent-native data infrastructure from India
A production-ready Model Context Protocol (MCP) server that exposes SarvaData tools for data ingestion, cleaning, embeddings, search, and reporting. Built for LLM agents, Comet Browser, and Hugging Face Spaces.
🚀 Quick Start
Prerequisites
- Python 3.9+
- pip
Installation
- Clone the repository
git clone <repository-url>
cd sarvadata-mcp
- Install dependencies
pip install -r requirements.txt
- Run the server
python server.py
# or
uvicorn server:app --host 0.0.0.0 --port 5000
The server will start at http://localhost:5000.
🛠️ Available Tools
The server exposes the following tools via MCP:
| Tool | Description | Use Case |
|------|-------------|----------|
| ingest_dataset | Import data from CSV (file or URL) | Data lake population |
| clean_dataset | Remove nulls, normalize schema, deduplicate | Data quality |
| create_embeddings | Generate vector embeddings (Mock/Stub) | Semantic search |
| semantic_search | Query by meaning (Mock/Stub) | Knowledge retrieval |
| generate_report | Create summary/quality/insights reports | Data documentation |
| schema_validator | Validate JSON/CSV structure | Data governance |
| format_converter | Convert between CSV/JSON/XML | Data transformation |
| password_generator | Generate secure passwords | Security |
🤖 Using with AI Agents
Tool Discovery
Agents can query /mcp/tools to discover capabilities:
curl http://localhost:5000/mcp/tools
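The same discovery request can be made from Python. The following is a minimal sketch using only the standard library; the helper names `tools_endpoint` and `discover_tools` are hypothetical, and because the response format is not documented here, the parsed JSON is returned as-is rather than interpreted.

```python
import json
from urllib.request import urlopen


def tools_endpoint(base_url: str = "http://localhost:5000") -> str:
    """Build the discovery URL, tolerating a trailing slash on base_url."""
    return base_url.rstrip("/") + "/mcp/tools"


def discover_tools(base_url: str = "http://localhost:5000", timeout: float = 10.0):
    """GET /mcp/tools and return the decoded JSON body.

    The shape of the body is an unknown here, so no fields are assumed.
    """
    with urlopen(tools_endpoint(base_url), timeout=timeout) as resp:
        return json.load(resp)


# Usage (requires a running server):
#     print(discover_tools())
```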
Tool Invocation
Execute a tool by sending a POST request to /mcp/call with the tool name and its arguments:
curl -X POST http://localhost:5000/mcp/call \
  -H "Content-Type: application/json" \
  -d '{
    "tool": "ingest_dataset",
    "arguments": {
      "source_type": "url",
      "source_path": "https://raw.githubusercontent.com/datasciencedojo/datasets/master/titanic.csv"
    }
  }'
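For agents written in Python, the curl call above might be wrapped as follows. This is a sketch under stated assumptions: `build_call_request` and `call_tool` are hypothetical helper names, the payload mirrors the `tool` and `arguments` fields shown in the curl example, and the response is assumed to be JSON.

```python
import json
from typing import Any, Dict
from urllib.request import Request, urlopen


def build_call_request(
    tool: str,
    arguments: Dict[str, Any],
    base_url: str = "http://localhost:5000",
) -> Request:
    """Build the POST /mcp/call request without sending it."""
    body = json.dumps({"tool": tool, "arguments": arguments}).encode("utf-8")
    return Request(
        base_url.rstrip("/") + "/mcp/call",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def call_tool(
    tool: str,
    arguments: Dict[str, Any],
    base_url: str = "http://localhost:5000",
    timeout: float = 30.0,
) -> Any:
    """Send the request and decode the body (assumed to be JSON)."""
    request = build_call_request(tool, arguments, base_url)
    with urlopen(request, timeout=timeout) as resp:
        return json.load(resp)


# Usage (requires a running server):
#     call_tool("ingest_dataset", {"source_type": "url", "source_path": "<csv-url>"})
```

Keeping request construction separate from sending makes the payload easy to inspect or log before any network traffic occurs.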
Example Agent Workflow
- INGEST: Load raw data -> Returns dataset_id
- CLEAN: Clean the dataset using dataset_id
- REPORT: Generate insights report
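The three steps can be chained in a small driver. The sketch below is illustrative only: `run_pipeline` is a hypothetical name, the `dataset_id` response field and argument key are inferred from the workflow description, and the `report_type` argument is an assumption, so check the schemas in `schemas/` for the real parameter names. The function takes any callable that invokes a tool, which lets it be exercised without a running server.

```python
from typing import Any, Callable, Dict

# Any function that takes (tool_name, arguments) and returns the decoded response.
ToolCaller = Callable[[str, Dict[str, Any]], Dict[str, Any]]


def run_pipeline(call_tool: ToolCaller, csv_url: str) -> Dict[str, Any]:
    """Ingest a CSV from a URL, clean it, then request an insights report."""
    # INGEST: arguments match the curl example in this README.
    ingested = call_tool(
        "ingest_dataset", {"source_type": "url", "source_path": csv_url}
    )
    dataset_id = ingested["dataset_id"]  # assumed response field

    # CLEAN: assumed to operate on the same dataset_id.
    call_tool("clean_dataset", {"dataset_id": dataset_id})

    # REPORT: "report_type" is a hypothetical argument name.
    return call_tool(
        "generate_report", {"dataset_id": dataset_id, "report_type": "insights"}
    )
```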
☁️ Deployment
Hugging Face Spaces
This repository is ready for deployment to Hugging Face Spaces (Docker).
- Create a new Space on Hugging Face.
- Select Docker as the SDK.
- Push this repository to the Space (or connect via GitHub).
- The server starts automatically on port 7860, the default port for Docker Spaces.
Docker
docker build -t sarvadata-mcp .
docker run -p 5000:5000 sarvadata-mcp
🏗️ Architecture & Development
Project Structure
├── server.py # FastAPI application and MCP endpoints
├── mcp_registry.py # Tool registry and invocation routing
├── tools/ # Tool implementations (Pandas-based)
├── etl/ # Core data processing modules
├── schemas/ # JSON schemas for tools
└── tests/ # Test suite
Contributing
See CONTRIBUTING.md for details on how to add new tools and contribute to the project.
📄 License
MIT License - see LICENSE for details.
About SarvaData Platform
SarvaData is a comprehensive data tools platform featuring 50+ micro-tools plus a complete visual ETL pipeline builder. This MCP server exposes core SarvaData capabilities to AI agents.
Company Information:
- AnkTechSol
- Udyam Registration: UDYAM-MH-26-0439977
- GST: 27MLOPK7764C1ZF
- Website: https://anktechsol.com