Python MCP server for browser-fingerprint web fetch, text extraction, JSON APIs, and link extraction via curl_cffi.
stealth_fetch_mcp
stealth_fetch_mcp is a Python MCP server that fetches and parses web content with
browser-grade TLS fingerprint impersonation using curl_cffi.
It is designed for MCP clients (Claude Code/Desktop, Codex, and similar) that need a more resilient fetch tool when default Python HTTP signatures are blocked.
Why This Server
- Uses curl_cffi impersonation profiles (default: chrome) for browser-like TLS/HTTP behavior.
- Provides focused fetch tools for HTML, readable text, JSON APIs, and link extraction.
- Adds practical safeguards: truncation, actionable error messages, and strict input validation.
Features
- stealth_fetch_page: fetch raw HTML with browser impersonation.
- stealth_fetch_text: fetch and return cleaned readability-style text.
- stealth_fetch_json: fetch JSON APIs (GET/POST) and pretty-print JSON.
- stealth_extract_links: extract links with CSS selector and regex filtering.
All tools are:
- read-only and idempotent
- annotated as open-world MCP tools
- protected by output truncation ([truncated at {n} chars])
- implemented with centralized, actionable error handling
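The truncation safeguard can be pictured with a minimal sketch. The function name truncate_output and the choice to put the marker on its own line are assumptions made for illustration; only the marker text itself comes from this README.

```python
def truncate_output(text: str, max_chars: int) -> str:
    """Cap text at max_chars and append a marker when anything was cut.

    Hypothetical helper: the real server may place the marker differently.
    """
    if max_chars < 0:
        raise ValueError("max_chars must be non-negative")
    if len(text) <= max_chars:
        # Short enough: return unchanged, no marker.
        return text
    # Keep the first max_chars characters, then flag the cut for the caller.
    return f"{text[:max_chars]}\n[truncated at {max_chars} chars]"
```

A client that sees the marker can retry with a larger max_chars or a narrower selector.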
Requirements
- Python 3.12+
- uv/uvx
Project Layout
stealth-fetch-mcp/
├── pyproject.toml
├── README.md
├── CONTRIBUTING.md
├── LICENSE
├── src/
│   └── stealth_fetch_mcp/
│       ├── __init__.py
│       ├── client.py
│       ├── parser.py
│       └── server.py
└── tests/
Local Setup
cd /Users/miller/projects/curl_mcp/stealth-fetch-mcp
uv sync
Run the MCP Server
uv run stealth-fetch-mcp
You can also run directly:
uv run python -m stealth_fetch_mcp.server
Tool Reference
stealth_fetch_page
- url (required)
- impersonate (default: "chrome")
- headers (optional object)
- timeout (default: 30)
- follow_redirects (default: true)
- session_options (optional object; curl_cffi.AsyncSession config)
- request_options (optional object; per-request curl_cffi config)
- max_chars (default: 100000)
- returns: raw HTML string
stealth_fetch_text
- url (required)
- impersonate (default: "chrome")
- selector (optional CSS selector)
- session_options (optional object; curl_cffi.AsyncSession config)
- request_options (optional object; per-request curl_cffi config)
- max_chars (default: 50000)
- returns: cleaned markdown-ish text content
stealth_fetch_json
- url (required)
- impersonate (default: "chrome")
- headers (optional object)
- method ("GET" or "POST", default: "GET")
- body (optional JSON string for POST)
- session_options (optional object; curl_cffi.AsyncSession config)
- request_options (optional object; per-request curl_cffi config)
- max_chars (default: 100000)
- returns: pretty-printed JSON string
stealth_extract_links
- url (required)
- impersonate (default: "chrome")
- selector (default: "a[href]")
- pattern (optional regex on href)
- max_results (default: 100)
- session_options (optional object; curl_cffi.AsyncSession config)
- request_options (optional object; per-request curl_cffi config)
- max_chars (default: 100000)
- returns: JSON list of {text, href, absolute_url}
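To make the stealth_extract_links output shape concrete, here is a rough standard-library sketch of the default a[href] case. It is not the project's parser.py: the class and function names are hypothetical, and CSS selector support is left out because the standard library has no selector engine.

```python
import re
from html.parser import HTMLParser
from urllib.parse import urljoin


class AnchorCollector(HTMLParser):
    """Collect text and href for every <a> element that has an href."""

    def __init__(self) -> None:
        super().__init__()
        self.links: list[dict[str, str]] = []
        self._current: dict[str, str] | None = None

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                self._current = {"text": "", "href": href}

    def handle_data(self, data):
        if self._current is not None:
            self._current["text"] += data

    def handle_endtag(self, tag):
        if tag == "a" and self._current is not None:
            self._current["text"] = self._current["text"].strip()
            self.links.append(self._current)
            self._current = None


def extract_links(html, base_url, pattern=None, max_results=100):
    """Return [{text, href, absolute_url}], filtered by a regex on href."""
    collector = AnchorCollector()
    collector.feed(html)
    collector.close()
    regex = re.compile(pattern) if pattern else None
    results = []
    for link in collector.links:
        if len(results) >= max_results:
            break
        if regex and not regex.search(link["href"]):
            continue
        # Resolve relative hrefs against the page URL.
        absolute = urljoin(base_url, link["href"])
        results.append({**link, "absolute_url": absolute})
    return results
```

The filter runs on the raw href rather than the resolved URL, matching the "regex on href" wording above.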
curl_cffi Option Coverage
This MCP server exposes the practical curl_cffi configuration surface through two optional objects:
- session_options: session-level defaults used to create an ephemeral AsyncSession for that tool call.
- request_options: per-request overrides passed into AsyncSession.request(...).
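As an illustration of how the two objects might be combined in one call, the sketch below builds an MCP tools/call request for stealth_fetch_json. The URL and option values are made up, and the exact argument nesting depends on the server's input schema, so treat this as a shape to adapt rather than a verified payload.

```python
import json

# Hypothetical arguments; field names are taken from the lists in this README.
arguments = {
    "url": "https://example.com/api/items",
    "impersonate": "chrome",
    # Session-level defaults for the session created for this call.
    "session_options": {"timeout": 20, "max_redirects": 5, "http_version": "v2"},
    # Per-request overrides.
    "request_options": {
        "params": {"page": "1"},
        "referer": "https://example.com/",
        "headers": {"Accept": "application/json"},
    },
}

# JSON-RPC envelope used by MCP clients to invoke a tool.
request = {
    "jsonrpc": "2.0",
    "id": 1,
    "method": "tools/call",
    "params": {"name": "stealth_fetch_json", "arguments": arguments},
}

print(json.dumps(request, indent=2))
```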
session_options fields
- headers, cookies, auth
- proxies, proxy, proxy_auth
- base_url, params
- verify (bool or CA bundle path str)
- timeout (float or (connect, read) tuple)
- trust_env, allow_redirects, max_redirects
- impersonate, ja3, akamai, extra_fp
- default_headers, default_encoding
- http_version (v1|v2|v2tls|v2_prior_knowledge|v3|v3only)
- debug, interface, cert
- discard_cookies, raise_for_status
- max_clients
- curl_options (low-level CurlOpt overrides)
request_options fields
- params, data, json
- headers, cookies, auth
- timeout, allow_redirects, max_redirects
- proxies, proxy, proxy_auth
- verify, referer, accept_encoding
- impersonate, ja3, akamai, extra_fp
- default_headers, default_encoding
- quote, http_version, interface, cert
- max_recv_speed, discard_cookies
- curl_options (low-level CurlOpt overrides)
curl_options format
curl_options accepts a list of {option, value} entries:
- option can be:
  - CurlOpt name: "TIMEOUT_MS"
  - fully qualified name: "CurlOpt.TIMEOUT_MS"
  - numeric option id (integer or numeric string)
- value supports primitive JSON values (string, number, boolean)
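The three accepted spellings of option could be normalized along these lines. This is a sketch, not the server's code: CurlOptStandIn is a two-member stand-in for curl_cffi's CurlOpt enum (ids follow libcurl), resolve_option is a hypothetical helper, and rejecting ids outside the enum is an assumption.

```python
from enum import IntEnum


class CurlOptStandIn(IntEnum):
    """Tiny stand-in for curl_cffi's CurlOpt so the sketch runs without it."""

    TIMEOUT_MS = 155
    CONNECTTIMEOUT_MS = 156


def resolve_option(option, enum_cls=CurlOptStandIn):
    """Map a name, qualified name, or numeric id to an enum member."""
    if isinstance(option, bool):
        # bool is a subclass of int; reject it explicitly.
        raise ValueError("option must be a name or numeric id, not a boolean")
    if isinstance(option, int):
        return enum_cls(option)  # raises ValueError for unknown ids
    text = str(option).strip()
    if text.isdigit():
        return enum_cls(int(text))
    name = text.removeprefix("CurlOpt.")
    try:
        return enum_cls[name]
    except KeyError:
        raise ValueError(f"unknown curl option: {option!r}") from None


def resolve_curl_options(entries):
    """Turn a list of {option, value} entries into {member: value}."""
    return {resolve_option(e["option"]): e["value"] for e in entries}
```

For example, resolve_curl_options([{"option": "CurlOpt.TIMEOUT_MS", "value": 5000}]) yields a one-entry mapping keyed by the TIMEOUT_MS member.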
Intentional constraints
- request_options.stream=true is rejected because tool outputs return buffered text/JSON, not streaming frames.
- Multipart upload and callback-centric request modes are intentionally not exposed in MCP schemas for safety and determinism.
Research Notes
Configuration coverage was built from curl_cffi primary sources:
- API signatures and session/request option docs: API reference
- usage patterns and impersonation behavior: Quick Start
- supported impersonate targets: Impersonate targets
- exact runtime kwargs handling:
  - curl_cffi/requests/session.py (AsyncSession.__init__, AsyncSession.request)
  - curl_cffi/requests/utils.py (set_curl_options)
Architecture Notes
- client.py: shared AsyncSession factory, fetch wrapper, and centralized error mapping.
- parser.py: HTML cleaning/readability extraction, URL resolution, link extraction.
- server.py: FastMCP server, lifespan-managed shared session, tool registration, Pydantic models.
MCP Configuration (uvx)
Use uvx --from <local-path> for local development without publishing a package.
Claude Desktop (claude_desktop_config.json)
{
  "mcpServers": {
    "stealth-fetch": {
      "command": "uvx",
      "args": ["--from", "/path/to/stealth-fetch-mcp", "stealth-fetch-mcp"]
    }
  }
}
Claude Code
claude mcp add stealth-fetch -- uvx --from /path/to/stealth-fetch-mcp stealth-fetch-mcp
Codex CLI / App
codex mcp add stealth-fetch -- uvx --from /path/to/stealth-fetch-mcp stealth-fetch-mcp
Development and Testing (TDD)
uv run pytest -q
uv run pytest --cov=stealth_fetch_mcp --cov-report=term-missing
uv run ruff check .
uv run mypy src
Verification Commands
uv run python -c "from stealth_fetch_mcp.server import mcp; print('OK')"
uv run python -m stealth_fetch_mcp.server
uv build
claude mcp add --help
codex mcp add --help
Limitations and Safety
- This server improves transport-level compatibility, but it is not a CAPTCHA solver.
- Always respect site terms of service, robots rules, and rate limits.
- Keep request scopes targeted; avoid scraping sensitive or restricted content.
License
MIT. See LICENSE.
Contributing
See CONTRIBUTING.md.