MCP Servers

A collection of Model Context Protocol servers, templates, tools and more.

MCP Web Scraper

A package for scraping websites for use in AI tools. Supports LangChainGo and MCP.

Created 7/14/2025
Updated 7 days ago

mcp-web-scraper

This package uses Google Chrome's headless APIs to scrape web pages for AI/LLM agents.

Because it uses Chrome as its default user agent, sites that require JavaScript (for example, single-page applications) should also be parsable with this tool.

It can be called either from Go via LangChainGo or as an MCP server.

MCP Server

First, compile the code using Go:

go build .

Claude Desktop

{
  "mcpServers": {
    "mcp-web-scraper": {
      "command": "/path/to/mcp-web-scraper",
      "args": []
    }
  }
}

Visual Studio Code

{
  "mcp": {
    "servers": {
      "mcp-web-scraper": {
        "command": "/path/to/mcp-web-scraper",
        "args": []
      }
    }
  }
}

LangChainGo tool

Integration into LangChainGo is straightforward:

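import "github.com/lmorg/mcp-web-scraper/langchain"

func example() {
    // Create the scraper tool.
    scraper := langchain.NewScraper()
    _ = scraper // placeholder so the snippet compiles; pass the tool to your LangChainGo agent instead
}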

Please consult the langchaingo docs for how to use tools with their libraries.
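
As a rough illustration of what "using a tool" means here, the sketch below redeclares the three-method shape of LangChainGo's tools.Tool interface (Name, Description, Call) locally and calls a stub through it. It assumes the value returned by langchain.NewScraper() satisfies that interface, which this README does not state explicitly. The names stubScraper, runTool and demo are hypothetical and exist only for this example.

```go
package main

import (
	"context"
	"errors"
	"fmt"
)

// Tool mirrors the shape of LangChainGo's tools.Tool interface.
// It is redeclared here so the sketch compiles without external dependencies.
type Tool interface {
	Name() string
	Description() string
	Call(ctx context.Context, input string) (string, error)
}

// stubScraper is a hypothetical stand-in for the scraper tool.
// A real scraper would fetch the page; this one only echoes the URL.
type stubScraper struct{}

func (stubScraper) Name() string        { return "web-scraper" }
func (stubScraper) Description() string { return "Fetches a URL and returns its content." }

func (stubScraper) Call(_ context.Context, input string) (string, error) {
	if input == "" {
		return "", errors.New("empty URL")
	}
	return "content of " + input, nil
}

// runTool shows how an agent loop invokes any Tool: a string in, a string out.
func runTool(ctx context.Context, t Tool, input string) (string, error) {
	return t.Call(ctx, input)
}

// demo wires the stub into runTool with a background context.
func demo(url string) (string, error) {
	return runTool(context.Background(), stubScraper{}, url)
}

func main() {
	out, err := demo("https://example.com")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(out)
}
```

In a real program the stub would be replaced by the value from langchain.NewScraper() and handed to whichever agent or executor you build with LangChainGo.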

Fallback Modes

If Google Chrome is not installed

If you do not have Google Chrome installed, mcp-web-scraper will fall back to using Go's HTTP user agent.

This will work in the majority of cases. However, you might not get any content from sites that require JavaScript to render.
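
The fallback decision can be pictured with a small sketch like the one below. It is not the package's actual detection logic: the executable names in chromeNames and the helpers pickBackend, neverFound and alwaysFound are assumptions made for illustration. The only real API used is exec.LookPath from the Go standard library.

```go
package main

import (
	"errors"
	"fmt"
	"os/exec"
)

// chromeNames lists executable names to probe on PATH.
// The exact names are an assumption and vary by platform.
var chromeNames = []string{"google-chrome", "google-chrome-stable", "chrome"}

// pickBackend returns "chrome" if lookPath finds any candidate, otherwise "http".
// lookPath is injected so the decision can be exercised without a real Chrome install.
func pickBackend(lookPath func(string) (string, error)) string {
	for _, name := range chromeNames {
		if _, err := lookPath(name); err == nil {
			return "chrome"
		}
	}
	return "http"
}

// neverFound simulates a machine without Chrome.
func neverFound(string) (string, error) { return "", errors.New("not found") }

// alwaysFound simulates a machine where every lookup succeeds.
func alwaysFound(name string) (string, error) { return "/usr/bin/" + name, nil }

func main() {
	fmt.Println("this machine:", pickBackend(exec.LookPath))
	fmt.Println("without Chrome:", pickBackend(neverFound))
	fmt.Println("with Chrome:", pickBackend(alwaysFound))
}
```

Injecting the lookup function keeps the choice between the two backends testable on machines with or without a browser.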

Markdown Support

By default, this module looks for an <article> element and converts it to Markdown.

If the page doesn't present itself as an article of some kind (for example, it isn't a blog post or technical documentation), this module falls back to returning HTML.
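
The article-or-HTML decision described above could be sketched as follows. This is a simplified illustration using a regular expression from the standard library; the real module may well use a proper HTML parser, and the Markdown conversion step itself is omitted. The names articleRe and selectContent are hypothetical.

```go
package main

import (
	"fmt"
	"regexp"
)

// articleRe captures the inner HTML of the first <article> element.
// (?is) makes the match case-insensitive and lets "." span newlines.
var articleRe = regexp.MustCompile(`(?is)<article\b[^>]*>(.*?)</article>`)

// selectContent returns the article body with mode "markdown" when an
// <article> element exists, or the whole page with mode "html" otherwise.
// The mode only names the intended output; no conversion happens here.
func selectContent(page string) (content, mode string) {
	if m := articleRe.FindStringSubmatch(page); m != nil {
		return m[1], "markdown"
	}
	return page, "html"
}

func main() {
	blog := `<html><body><article class="post"><p>Hi</p></article></body></html>`
	app := `<html><body><div id="root">x</div></body></html>`

	for _, page := range []string{blog, app} {
		content, mode := selectContent(page)
		fmt.Printf("%s: %s\n", mode, content)
	}
}
```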

Any HTML document returned will have specific HTML tags removed (such as <script>, <svg>, and HTML comments) to reduce the number of tokens the LLM has to parse.
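
To give a feel for the kind of clean-up involved, here is a minimal sketch that strips <script> elements, <svg> elements, and HTML comments with regular expressions. It is an approximation under stated assumptions, not the module's implementation: regular expressions are fragile on malformed markup, and stripNoise and the pattern variables are names invented for this example.

```go
package main

import (
	"fmt"
	"regexp"
)

var (
	// Each pattern removes one kind of content that costs tokens
	// without helping the model understand the page.
	commentRe = regexp.MustCompile(`(?s)<!--.*?-->`)
	scriptRe  = regexp.MustCompile(`(?is)<script\b[^>]*>.*?</script\s*>`)
	svgRe     = regexp.MustCompile(`(?is)<svg\b[^>]*>.*?</svg\s*>`)
)

// stripNoise deletes comments first, then <script> and <svg> elements.
func stripNoise(page string) string {
	for _, re := range []*regexp.Regexp{commentRe, scriptRe, svgRe} {
		page = re.ReplaceAllString(page, "")
	}
	return page
}

func main() {
	page := `<p>a</p><script>var x = 1;</script><!-- note --><svg width="1"><path/></svg><p>b</p>`
	fmt.Println(stripNoise(page))
}
```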

Quick Setup
Installation guide for this server

Installation Command (package not published)

git clone https://github.com/lmorg/mcp-web-scraper

Manual Installation: Please check the README for detailed setup instructions and any additional dependencies required.

Cursor configuration (mcp.json)

{
  "mcpServers": {
    "lmorg-mcp-web-scraper": {
      "command": "/path/to/mcp-web-scraper",
      "args": []
    }
  }
}