MCP server giving Claude Desktop / Cursor / Cline ad-free web scraping + search via three tools: fetch_page, fetch_pages_batch, search_web.
ai-first-scraper-mcp
Plug Claude Desktop, Cursor, or Cline straight into an ad-free web scraper + search engine. Three tools, one line of config.
What it does
Adds three tools to any MCP-compatible agent:
| Tool | What it does |
|------|--------------|
| fetch_page | Fetch one URL → return clean Markdown (HTML or PDF). |
| fetch_pages_batch | Fetch up to 25 URLs in parallel → return Markdown for each. |
| search_web | Run a web search and return the top-k result pages already converted to Markdown. |
No more "the model called curl and then tried to parse 80kB of ad HTML." Your agent receives clean Markdown ready to reason about.
Backed by the ai-first-scraper and ai-first-search APIs.
Install
Fastest — uvx (no install, runs from PyPI on demand)
// claude_desktop_config.json / cline_mcp_settings.json / ~/.cursor/mcp.json
{
"mcpServers": {
"ai-first-scraper": {
"command": "uvx",
"args": ["ai-first-scraper-mcp"]
}
}
}
Restart your client (Claude Desktop / Cursor / Cline). The three tools above will appear automatically.
Alternative — pip install
pip install ai-first-scraper-mcp
{
"mcpServers": {
"ai-first-scraper": {
"command": "ai-first-scraper-mcp"
}
}
}
Where the config file lives
| Client | Config path |
|--------|-------------|
| Claude Desktop (macOS) | ~/Library/Application Support/Claude/claude_desktop_config.json |
| Claude Desktop (Windows) | %APPDATA%\Claude\claude_desktop_config.json |
| Cursor | ~/.cursor/mcp.json |
| Cline (VS Code) | ~/Library/Application Support/Code/User/globalStorage/saoudrizwan.claude-dev/settings/cline_mcp_settings.json |
Point at your own backend (optional)
By default this server calls the public ai-first-scraper.onrender.com and
ai-first-search.onrender.com instances. If you want to self-host, set env
vars in your MCP config:
{
"mcpServers": {
"ai-first-scraper": {
"command": "uvx",
"args": ["ai-first-scraper-mcp"],
"env": {
"SCRAPER_URL": "https://your-scraper.example.com",
"SEARCH_URL": "https://your-search.example.com",
"AFS_TIMEOUT": "60"
}
}
}
}
Verify it works
Open your MCP client and ask the agent:
"Use the search_web tool to find the top 3 recent articles about MCP and summarize them in 5 bullets each."
You should see the agent call search_web, get back Markdown for each result,
and produce the summary without ever touching raw HTML.
Companion projects
- ai-first-scraper — the per-URL Markdown cleaner this MCP server fans out to.
- ai-first-search — search → scrape → markdown pipeline.
- mcp-rec — record & replay any MCP server's traffic for tests and bug reports.
- llm-cache-proxy — local cache for OpenAI/Anthropic API calls.
- promptlocker — lockfile for prompts.
- context-diff — see what blew up your Claude Code context window.
- agentwatch — overlay for browser AI agents.
Develop locally
git clone https://github.com/yubinkim444/ai-first-scraper-mcp.git
cd ai-first-scraper-mcp
uv sync # or: pip install -e .
ai-first-scraper-mcp # speaks MCP over stdio
To test against a local client, point its MCP config at the same command.
License
MIT © yubinkim444