1000 Genomes Project dataset MCP Server

Created 12/20/2025

Natural-language access to the 1000 Genomes Project dataset, hosted online in the Dnaerys variant store.

The dataset was sequenced and aligned to GRCh38 by the New York Genome Center.

  • 2504 unrelated samples from the phase three panel
  • additional 698 samples from 602 family trios
    • 3202 samples total (1598 males, 1604 females)
  • dataset details
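
The sample counts above can be cross-checked against the variant and genotype totals quoted under Key Features. A minimal sketch, assuming one genotype call per sample per unique variant (missing calls are ignored):

```python
# Figures taken from this README.
unrelated_samples = 2504
additional_trio_samples = 698
unique_variants = 138_044_724

samples = unrelated_samples + additional_trio_samples

# Assumption: every sample has exactly one genotype at every variant site.
genotypes = unique_variants * samples

print(f"samples:   {samples}")
print(f"genotypes: {genotypes:,} (~{round(genotypes / 1e9)} billion)")
```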

Key Features

  • real-time access to 138 044 724 unique variants and about 442 billion individual genotypes in 3202 samples

  • variant, sample, and genotype selection based on coordinates, annotations, zygosity

  • filtering by VEP, ClinVar, gnomAD AF and AlphaMissense annotations

  • filtering by inheritance model (de novo, heterozygous dominant, homozygous recessive)
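
To make the inheritance models concrete, here is a minimal sketch of how they are commonly defined on trio genotypes. It is not the server's implementation: the function name, the genotype encoding (count of alternate alleles) and the simplified rules are assumptions made for illustration only.

```python
from typing import Optional


def classify_trio(child: int, mother: int, father: int) -> Optional[str]:
    """Classify one autosomal variant in a trio by a simple inheritance model.

    Genotypes are alternate-allele counts: 0 = hom-ref, 1 = het, 2 = hom-alt.
    Hypothetical helper with textbook rules; real pipelines also consider
    affected status, sex chromosomes and genotype quality.
    """
    if child == 0:
        return None  # child does not carry the variant
    if mother == 0 and father == 0:
        return "de novo"  # allele is absent from both parents
    if child == 2 and mother == 1 and father == 1:
        return "homozygous recessive"  # one copy from each carrier parent
    if child == 1 and (mother >= 1 or father >= 1):
        return "heterozygous dominant"  # single copy inherited from a parent
    return None


print(classify_trio(child=1, mother=0, father=0))
print(classify_trio(child=2, mother=1, father=1))
print(classify_trio(child=1, mother=1, father=0))
```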

Deployments

Remote MCP service is available online via Streamable HTTP:

  • http://db.dnaerys.org:80/mcp
  • https://db.dnaerys.org:443/mcp
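
As a rough illustration of what a client sends to these endpoints, the sketch below builds the first JSON-RPC message of an MCP session (initialize) as an HTTP POST, following the Streamable HTTP transport of the MCP specification. The protocol version string and the client name are assumptions, and the helper functions are hypothetical; a real MCP client library handles all of this for you.

```python
import json
import urllib.request

ENDPOINT = "https://db.dnaerys.org:443/mcp"  # taken from the list above


def build_initialize_request(endpoint: str = ENDPOINT) -> urllib.request.Request:
    """Build (but do not send) an MCP initialize request."""
    payload = {
        "jsonrpc": "2.0",
        "id": 1,
        "method": "initialize",
        "params": {
            # Assumption: the server accepts this protocol revision.
            "protocolVersion": "2025-03-26",
            "capabilities": {},
            # Hypothetical client identity, for illustration only.
            "clientInfo": {"name": "readme-example", "version": "0.0.1"},
        },
    }
    return urllib.request.Request(
        endpoint,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            # Streamable HTTP clients must accept both response types.
            "Accept": "application/json, text/event-stream",
        },
        method="POST",
    )


def send(request: urllib.request.Request, timeout: float = 30.0) -> str:
    """Send the request and return the raw response body (JSON or SSE text)."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.read().decode("utf-8")
```

Calling send(build_initialize_request()) performs the actual network round trip; the reply may arrive either as plain JSON or as a server-sent event, depending on how the server chooses to respond.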

For a local build with stdio transport, see Installation below.

Architecture

The MCP server is implemented as a Java EE service that accesses the 1KGP dataset via gRPC calls to the public Dnaerys variant store service.

  • service implementation is based on Quarkus MCP Server
  • provides MCP over Streamable HTTP, HTTP/SSE and STDIO transports

Examples

Many of the questions below were flagged by Opus 4.5's safety filters and left unanswered, so Sonnet 4.5 was used for most of them unless specified otherwise. Some answers come from a multi-agent research system, some from extended thinking mode, and some from a single-agent system in normal mode.

Incomplete Penetrance & Genetic Resilience

Identify potential modifier variants for well-known pathogenic alleles in TTN - variants that consistently co-occur in the same haplotype block with pathogenic alleles and may alter severity or penetrance. Conduct research for pathogenic alleles documented in the literature. Use KGP dataset of healthy individuals to find potential modifier variants. Start with 100kb for "the same haplotype block" definition, then extend if required. Evaluate statistical significance for the best modifier candidates found. No initial constraints for modifier types.

Identify samples in the KGP dataset that are homozygous for variants classified as 'Pathogenic' in ClinVar for severe autosomal recessive metabolic disorders. For these specific samples, scan their exomes for enrichment of variants in known suppressor genes or alternative metabolic pathways that might compensate for the primary defect. Propose a mechanism of compensation based on pathway analysis.

Select samples carrying known dominant-negative variants in KRT5 or KRT14 genes (Epidermolysis Bullosa) in the KGP. Search for potential cis- or trans-acting rescue modifiers. Specifically, check if these samples carry variants that promote the upregulation of the homologous KRT6 or KRT16 genes (paralog compensation). Can you detect a statistically significant enrichment of 'paralog-boosting' promoter variants in these resilient carriers ?

Structural Intolerance

Which regions in XXXX gene are most likely disease-critical, with strong purifying selection, based on available variation patterns across functional domains in KGP ? Do statistical evaluation.

In what cardiac related genes, e.g. ion channels, variants in KGP dataset near catalytic residues or ligand-binding pockets show strong depletion compared to flanking residues (±20 amino acids) ?

  • results might be some

Reclassification & AlphaMissense Integration

Retrieve all variants in KGP dataset in the voltage-gated sodium channel gene family (SCN1A, SCN2A, SCN5A) currently classified as 'VUS' in ClinVar. Correlate their 'Likely Pathogenic' AlphaMissense classification with their frequency in this healthy cohort. Synthesize a reasoned argument to reclassify a subset of these as 'Likely Benign' based on the logic that pathogenic predictions by AlphaMissense are incompatible with the observed allele frequency in this healthy population.

Oligogenic Burden

Calculate the 'Ciliary Mutational Load' for every individual in the KGP dataset. Aggregate all rare, non-synonymous variants across the entire Bardet-Biedl Syndrome (BBS) gene panel (BBS1 through BBS21). Is there a clear 'cliff' or maximum mutational burden observed in healthy individuals ? Determine if the healthy cohort contains any 'triallelic' carriers (homozygous at one locus, heterozygous at another) and model why they do not display the BBS phenotype.

Protein-Protein Interactions

Analyze samples in the KGP dataset with missense variants located at the 'hinge' or 'head' domains in Cohesin complex genes (SMC1A, SMC3, RAD21). Perform a 'co-evolution' analysis - do samples with a destabilizing mutation in the SMC1A head domain tend to carry a complementary variant in the SMC3 head domain that restores electrostatic compatibility (e.g., a charge swap from Glu->Lys in one and Lys->Glu in the other) ?

  • results might be some

More examples here

Available Tools

Descriptions of the 30 tools and their parameters can be found here
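
The tool list can also be requested from a running server with the standard MCP tools/list method. The sketch below only builds the request and parses a reply; it assumes the reply arrives either as plain JSON or as server-sent events, and the helper names and the placeholder tool names in the usage lines are hypothetical.

```python
import json


def tools_list_request(request_id: int = 2) -> dict:
    """JSON-RPC message asking an MCP server to enumerate its tools."""
    return {"jsonrpc": "2.0", "id": request_id, "method": "tools/list"}


def parse_response(body: str, content_type: str) -> list[dict]:
    """Return the JSON-RPC messages contained in a JSON or SSE response body."""
    if not content_type.startswith("text/event-stream"):
        return [json.loads(body)]
    messages, data_lines = [], []
    # A trailing "" makes sure the last event is flushed.
    for line in body.splitlines() + [""]:
        if line.startswith("data:"):
            data_lines.append(line[5:].removeprefix(" "))
        elif line == "" and data_lines:
            messages.append(json.loads("\n".join(data_lines)))
            data_lines = []
    return messages


def tool_names(message: dict) -> list[str]:
    """Extract tool names from a tools/list result message."""
    return [tool["name"] for tool in message["result"]["tools"]]


# Usage with a made-up reply; the tool names are placeholders, not real tools.
sample = (
    "event: message\n"
    'data: {"jsonrpc":"2.0","id":2,"result":{"tools":'
    '[{"name":"placeholder_a"},{"name":"placeholder_b"}]}}\n\n'
)
print(tool_names(parse_response(sample, "text/event-stream")[0]))
```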

Installation

The project can be run locally with MCP over stdio and/or HTTP transports.

Option A - build & run locally

  • build the project and package it as a single über-jar:
    • jar is located in target/onekgpd-mcp-runner.jar and includes all dependencies
./mvnw package -DskipTests -Dquarkus.package.jar.type=uber-jar
  • run it locally with dev profile
    • both stdio and http transports are enabled
    • http transport is on quarkus.http.port
    • project expects JRE 21 to be available at runtime
java -Dquarkus.profile=dev -jar <full path>/onekgpd-mcp-runner.jar

Option B - build & run in docker

  • to run in Docker, the stdio transport must be disabled; otherwise the application stops itself because stdio is closed in containers

    • it's already configured in prod profile
    • it's the default configuration overall
  • build with prod profile

docker build -f Dockerfile -t onekgpd-mcp .
  • run as you prefer, e.g. docker run -p 9000:9000 --name onekgpd-mcp --rm onekgpd-mcp

Connecting with MCP clients

  • to connect via http transport, remote or local, simply direct the client to an appropriate destination, e.g. http://localhost:9000/mcp or https://db.dnaerys.org:443/mcp

  • to connect via stdio transport, MCP client should start application with dev profile and with a full path to the jar file

    • e.g. for Claude Desktop and stdio transport add to claude_desktop_config.json:
{
  "mcpServers": {
    "OneKGPd": {
      "command": "java",
      "args": ["-Dquarkus.profile=dev", "-jar", "/full/path/onekgpd-mcp-runner.jar"]
    }
  }
}
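
Because the client needs the full path to the jar, it can be convenient to generate this entry rather than type it by hand. A minimal sketch, assuming the jar was built into target/ as described under Installation; the helper name is hypothetical, and the output still has to be merged into claude_desktop_config.json manually.

```python
import json
from pathlib import Path


def stdio_server_entry(jar_path: str) -> dict:
    """Build the mcpServers entry shown above with an absolute jar path."""
    jar = Path(jar_path).expanduser().resolve()
    return {
        "mcpServers": {
            "OneKGPd": {
                "command": "java",
                "args": ["-Dquarkus.profile=dev", "-jar", str(jar)],
            }
        }
    }


# Path relative to the project root; resolved against the current directory.
print(json.dumps(stdio_server_entry("target/onekgpd-mcp-runner.jar"), indent=2))
```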

Verification

Once connected, ask the client a simple question to confirm that the tools respond, e.g. "How many variants exist in the 1000 Genomes Project?"

License

This project is licensed under the Apache License 2.0 - see the LICENSE file for details.

Quick Setup
Installation guide for this server

Installation Command (package not published)

git clone https://github.com/dnaerys/OneKGPd-MCP
Manual Installation: Please check the README for detailed setup instructions and any additional dependencies required.

Cursor configuration (mcp.json)

{
  "mcpServers": {
    "dnaerys-onekgpd-mcp": {
      "command": "java",
      "args": ["-Dquarkus.profile=dev", "-jar", "/full/path/onekgpd-mcp-runner.jar"]
    }
  }
}