---
title: Open-WebSearch MCP
emoji: 🔎
colorFrom: blue
colorTo: indigo
sdk: docker
app_port: 3000
pinned: false
---

# Open-WebSearch MCP Server

[![ModelScope](https://img.shields.io/endpoint?url=https://gist.githubusercontent.com/Aas-ee/3af09e0f4c7821fb2e9acb96483a5ff0/raw/badge.json&color=%23de5a16)](https://www.modelscope.cn/mcp/servers/Aasee1/open-webSearch)
[![Trust Score](https://archestra.ai/mcp-catalog/api/badge/quality/Aas-ee/open-webSearch)](https://archestra.ai/mcp-catalog/aas-ee__open-websearch)
[![smithery badge](https://smithery.ai/badge/@Aas-ee/open-websearch)](https://smithery.ai/server/@Aas-ee/open-websearch)
![Version](https://img.shields.io/github/v/release/Aas-ee/open-websearch)
![License](https://img.shields.io/github/license/Aas-ee/open-websearch)
![Issues](https://img.shields.io/github/issues/Aas-ee/open-websearch)

**[🇨🇳 中文](./README-zh.md) | 🇺🇸 English**

A Model Context Protocol (MCP) server that aggregates results from multiple search engines and provides free web search without API keys.

## Features

- Web search using multi-engine results
    - bing
    - baidu
    - ~~linux.do~~ temporarily unsupported
    - csdn
    - duckduckgo
    - exa
    - brave
    - juejin
- HTTP proxy configuration support for accessing restricted resources
- No API keys or authentication required
- Returns structured results with titles, URLs, and descriptions
- Configurable number of results per search
- Customizable default search engine
- Support for fetching individual article content
    - csdn
    - juejin
    - github (README files)
    - generic HTTP(S) page / Markdown content

## TODO

- Support for ~~Bing~~ (already supported), ~~DuckDuckGo~~ (already supported), ~~Exa~~ (already supported), ~~Brave~~ (already supported), Google and other search engines
- Support for more blogs, forums, and social platforms
- Optimize article content extraction, add support for more sites
- ~~Support for GitHub README fetching~~ (already supported)

## Deploy to Hugging Face Spaces

This repository can be deployed directly as a **Docker Space**.

### 1. Create a Docker Space

Create a new Space on Hugging Face and choose **Docker** as the SDK, or push this repository to an existing Docker Space.

### 2. Required runtime variables

In **Space Settings → Variables and secrets**, configure:

| Variable | Recommended Value | Description |
|----------|-------------------|-------------|
| `MODE` | `http` | Run the HTTP server only in Spaces |
| `PORT` | `3000` | Must match `app_port` in the YAML header |
| `ENABLE_CORS` | `true` | Allows browser-based access if needed |
| `CORS_ORIGIN` | `*` | Relaxed CORS for public/demo use |
| `DEFAULT_SEARCH_ENGINE` | `duckduckgo` | A reliable default for public Spaces |

Optional:

| Variable | Example Value | Description |
|----------|---------------|-------------|
| `ALLOWED_SEARCH_ENGINES` | `duckduckgo,bing,baidu,csdn,juejin` | Restrict allowed engines |
| `USE_PROXY` | `false` | Enable HTTP proxy support |
| `PROXY_URL` | `http://127.0.0.1:7890` | Proxy URL when proxying is enabled |

### 3. Public endpoints after deployment

- Home page: `/`
- Health check: `/healthz`
- MCP endpoint: `/mcp`
- Legacy SSE endpoint: `/sse`

After the Space is live, your MCP URL will look like:

```text
https://<your-space-subdomain>.hf.space/mcp
```

### 4. Notes

- The Space UI loads `/`, so this project serves a lightweight landing page there.
- If you change the runtime port, update both `PORT` and `app_port` to the same value.

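As an illustration of the endpoint layout listed in step 3, the following TypeScript sketch derives the four public URLs from a Space base URL. The helper name `spaceEndpoints` and the base URL `https://example-space.hf.space` are assumptions made for this example and are not part of the project.

```typescript
// Hypothetical helper: derives the public endpoint URLs of a deployed Space.
// The paths come from the endpoint list in step 3; the function name and the
// example base URL are illustrative assumptions.
interface SpaceEndpoints {
  home: string;
  health: string;
  mcp: string;
  sse: string;
}

function spaceEndpoints(baseUrl: string): SpaceEndpoints {
  // Drop trailing slashes so that joining never produces "//".
  const root = baseUrl.replace(/\/+$/, "");
  return {
    home: `${root}/`,
    health: `${root}/healthz`,
    mcp: `${root}/mcp`,
    sse: `${root}/sse`,
  };
}

const endpoints = spaceEndpoints("https://example-space.hf.space/");
console.log(endpoints.mcp); // https://example-space.hf.space/mcp
```

A client could request the `health` URL first and only register the `mcp` URL once the health check succeeds.
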
## Installation Guide

### NPX Quick Start (Recommended)

The fastest way to get started:

```bash
# Basic usage
npx open-websearch@latest

# With environment variables (Linux/macOS)
DEFAULT_SEARCH_ENGINE=duckduckgo ENABLE_CORS=true npx open-websearch@latest

# Windows PowerShell
$env:DEFAULT_SEARCH_ENGINE="duckduckgo"; $env:ENABLE_CORS="true"; npx open-websearch@latest

# Windows CMD (quote each assignment so the space before && is not stored in the value)
set "MODE=stdio" && set "DEFAULT_SEARCH_ENGINE=duckduckgo" && npx open-websearch@latest

# Cross-platform (requires cross-env; used for local development)
npm install -g open-websearch
npx cross-env DEFAULT_SEARCH_ENGINE=duckduckgo ENABLE_CORS=true open-websearch
```

**Environment Variables:**

| Variable | Default | Options | Description |
|----------|---------|---------|-------------|
| `ENABLE_CORS` | `false` | `true`, `false` | Enable CORS |
| `CORS_ORIGIN` | `*` | Any valid origin | CORS origin configuration |
| `DEFAULT_SEARCH_ENGINE` | `bing` | `bing`, `duckduckgo`, `exa`, `brave`, `baidu`, `csdn`, `juejin` | Default search engine |
| `USE_PROXY` | `false` | `true`, `false` | Enable HTTP proxy |
| `PROXY_URL` | `http://127.0.0.1:7890` | Any valid URL | Proxy server URL |
| `MODE` | `both` | `both`, `http`, `stdio` | Server mode: both HTTP+STDIO, HTTP only, or STDIO only |
| `PORT` | `3000` | 1-65535 | Server port |
| `ALLOWED_SEARCH_ENGINES` | empty (all available) | Comma-separated engine names | Limit which search engines can be used; if the default engine is not in this list, the first allowed engine becomes the default |
| `MCP_TOOL_SEARCH_NAME` | `search` | Valid MCP tool name | Custom name for the search tool |
| `MCP_TOOL_FETCH_LINUXDO_NAME` | `fetchLinuxDoArticle` | Valid MCP tool name | Custom name for the Linux.do article fetch tool |
| `MCP_TOOL_FETCH_CSDN_NAME` | `fetchCsdnArticle` | Valid MCP tool name | Custom name for the CSDN article fetch tool |
| `MCP_TOOL_FETCH_GITHUB_NAME` | `fetchGithubReadme` | Valid MCP tool name | Custom name for the GitHub README fetch tool |
| `MCP_TOOL_FETCH_JUEJIN_NAME` | `fetchJuejinArticle` | Valid MCP tool name | Custom name for the Juejin article fetch tool |
| `MCP_TOOL_FETCH_WEB_NAME` | `fetchWebContent` | Valid MCP tool name | Custom name for the generic web/Markdown fetch tool |

**Common configurations:**

```bash
# Enable proxy for restricted regions
USE_PROXY=true PROXY_URL=http://127.0.0.1:7890 npx open-websearch@latest

# Full configuration
DEFAULT_SEARCH_ENGINE=duckduckgo ENABLE_CORS=true USE_PROXY=true PROXY_URL=http://127.0.0.1:7890 PORT=8080 npx open-websearch@latest
```

### Local Installation

1. Clone or download this repository
2. Install dependencies:
```bash
npm install
```
3. Build the server:
```bash
npm run build
```
4. Add the server to your MCP configuration:

**Cherry Studio:**
```json
{
  "mcpServers": {
    "web-search": {
      "name": "Web Search MCP",
      "type": "streamableHttp",
      "description": "Multi-engine web search with article fetching",
      "isActive": true,
      "baseUrl": "http://localhost:3000/mcp"
    }
  }
}
```

**VSCode (Claude Dev Extension):**
```json
{
  "mcpServers": {
    "web-search": {
      "transport": {
        "type": "streamableHttp",
        "url": "http://localhost:3000/mcp"
      }
    },
    "web-search-sse": {
      "transport": {
        "type": "sse",
        "url": "http://localhost:3000/sse"
      }
    }
  }
}
```

**Claude Desktop:**
```json
{
  "mcpServers": {
    "web-search": {
      "type": "http",
      "url": "http://localhost:3000/mcp"
    },
    "web-search-sse": {
      "type": "sse",
      "url": "http://localhost:3000/sse"
    }
  }
}
```

**NPX Command Line Configuration:**
```json
{
  "mcpServers": {
    "web-search": {
      "args": [
        "open-websearch@latest"
      ],
      "command": "npx",
      "env": {
        "MODE": "stdio",
        "DEFAULT_SEARCH_ENGINE": "duckduckgo",
        "ALLOWED_SEARCH_ENGINES": "duckduckgo,bing,exa"
      }
    }
  }
}
```

**Local STDIO Configuration for Cherry Studio (Windows):**
```json
{
  "mcpServers": {
    "open-websearch-local": {
      "command": "node",
      "args": ["C:/path/to/your/project/build/index.js"],
      "env": {
        "MODE": "stdio",
        "DEFAULT_SEARCH_ENGINE": "duckduckgo",
        "ALLOWED_SEARCH_ENGINES": "duckduckgo,bing,exa"
      }
    }
  }
}
```

### Docker Deployment

Quick deployment using Docker Compose:

```bash
docker-compose up -d
```

Or use Docker directly:

```bash
docker run -d --name web-search -p 3000:3000 -e ENABLE_CORS=true -e CORS_ORIGIN="*" ghcr.io/aas-ee/open-web-search:latest
```

Environment variable configuration:

| Variable | Default | Options | Description |
|----------|---------|---------|-------------|
| `ENABLE_CORS` | `false` | `true`, `false` | Enable CORS |
| `CORS_ORIGIN` | `*` | Any valid origin | CORS origin configuration |
| `DEFAULT_SEARCH_ENGINE` | `bing` | `bing`, `duckduckgo`, `exa`, `brave`, `baidu`, `csdn`, `juejin` | Default search engine |
| `USE_PROXY` | `false` | `true`, `false` | Enable HTTP proxy |
| `PROXY_URL` | `http://127.0.0.1:7890` | Any valid URL | Proxy server URL |
| `PORT` | `3000` | 1-65535 | Server port |

Then configure in your MCP client:

```json
{
  "mcpServers": {
    "web-search": {
      "name": "Web Search MCP",
      "type": "streamableHttp",
      "description": "Multi-engine web search with article fetching",
      "isActive": true,
      "baseUrl": "http://localhost:3000/mcp"
    },
    "web-search-sse": {
      "transport": {
        "name": "Web Search MCP",
        "type": "sse",
        "description": "Multi-engine web search with article fetching",
        "isActive": true,
        "url": "http://localhost:3000/sse"
      }
    }
  }
}
```

## Usage Guide

The server provides six tools: `search`, `fetchLinuxDoArticle`, `fetchCsdnArticle`, `fetchGithubReadme`, `fetchJuejinArticle`, and `fetchWebContent`.

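The six names above are the defaults; the `MCP_TOOL_*_NAME` variables from the Installation Guide can rename them. The sketch below shows one way a client-side script might work out the effective tool names from an environment map. The helper `resolveToolNames` and the override value `web_search` are assumptions for illustration, and the logic is not taken from the server's source.

```typescript
// Default tool names and the environment variables that override them,
// both taken from the tables in this README. The helper itself is a
// hypothetical client-side sketch, not the server's implementation.
const TOOL_NAME_ENV: Record<string, string> = {
  search: "MCP_TOOL_SEARCH_NAME",
  fetchLinuxDoArticle: "MCP_TOOL_FETCH_LINUXDO_NAME",
  fetchCsdnArticle: "MCP_TOOL_FETCH_CSDN_NAME",
  fetchGithubReadme: "MCP_TOOL_FETCH_GITHUB_NAME",
  fetchJuejinArticle: "MCP_TOOL_FETCH_JUEJIN_NAME",
  fetchWebContent: "MCP_TOOL_FETCH_WEB_NAME",
};

function resolveToolNames(
  env: Record<string, string | undefined>,
): Record<string, string> {
  const resolved: Record<string, string> = {};
  for (const [defaultName, envVar] of Object.entries(TOOL_NAME_ENV)) {
    // Treat unset or blank overrides as "use the default name".
    const override = env[envVar]?.trim();
    resolved[defaultName] = override ? override : defaultName;
  }
  return resolved;
}

const names = resolveToolNames({ MCP_TOOL_SEARCH_NAME: "web_search" });
console.log(names.search, names.fetchWebContent); // web_search fetchWebContent
```
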
### search Tool Usage

```typescript
{
  "query": string,    // Search query
  "limit": number,    // Optional: Number of results to return (default: 10)
  "engines": string[] // Optional: Engines to use (bing,baidu,linuxdo,csdn,duckduckgo,exa,brave,juejin) default bing
}
```

Usage example:
```typescript
use_mcp_tool({
  server_name: "web-search",
  tool_name: "search",
  arguments: {
    query: "search content",
    limit: 3, // Optional parameter
    engines: ["bing", "csdn", "duckduckgo", "exa", "brave", "juejin"] // Optional parameter, supports multi-engine combined search
  }
})
```

Response example:
```json
[
  {
    "title": "Example Search Result",
    "url": "https://example.com",
    "description": "Description text of the search result...",
    "source": "Source",
    "engine": "Engine used"
  }
]
```

### fetchCsdnArticle Tool Usage

Used to fetch complete content of CSDN blog articles.

```typescript
{
  "url": string // URL from CSDN search results using the search tool
}
```

Usage example:
```typescript
use_mcp_tool({
  server_name: "web-search",
  tool_name: "fetchCsdnArticle",
  arguments: {
    url: "https://blog.csdn.net/xxx/article/details/xxx"
  }
})
```

Response example:
```json
[
  {
    "content": "Example search result"
  }
]
```

### fetchLinuxDoArticle Tool Usage

Used to fetch complete content of Linux.do forum articles.

```typescript
{
  "url": string // URL from linuxdo search results using the search tool
}
```

Usage example:
```typescript
use_mcp_tool({
  server_name: "web-search",
  tool_name: "fetchLinuxDoArticle",
  arguments: {
    url: "https://xxxx.json"
  }
})
```

Response example:
```json
[
  {
    "content": "Example search result"
  }
]
```

### fetchGithubReadme Tool Usage

Used to fetch README content from GitHub repositories.

```typescript
{
  "url": string // GitHub repository URL (supports HTTPS, SSH formats)
}
```

Usage example:
```typescript
use_mcp_tool({
  server_name: "web-search",
  tool_name: "fetchGithubReadme",
  arguments: {
    url: "https://github.com/Aas-ee/open-webSearch"
  }
})
```

Supported URL formats:
- HTTPS: `https://github.com/owner/repo`
- HTTPS with .git: `https://github.com/owner/repo.git`
- SSH: `git@github.com:owner/repo.git`
- URLs with parameters: `https://github.com/owner/repo?tab=readme`

Response example:
```json
[
  {
    "content": "\n\n# Open-WebSearch MCP Server..."
  }
]
```

### fetchWebContent Tool Usage

Fetches content directly from public HTTP(S) links, including Markdown files (`.md`) and ordinary web pages.

```typescript
{
  "url": string,     // Public HTTP(S) URL
  "maxChars": number // Optional: max returned content length (1000-200000, default 30000)
}
```

Usage example:
```typescript
use_mcp_tool({
  server_name: "web-search",
  tool_name: "fetchWebContent",
  arguments: {
    url: "https://raw.githubusercontent.com/Aas-ee/open-webSearch/main/README.md",
    maxChars: 12000
  }
})
```

Response example:
```json
{
  "url": "https://raw.githubusercontent.com/Aas-ee/open-webSearch/main/README.md",
  "finalUrl": "https://raw.githubusercontent.com/Aas-ee/open-webSearch/main/README.md",
  "contentType": "text/plain; charset=utf-8",
  "title": "",
  "truncated": false,
  "content": "# Open-WebSearch MCP Server ..."
}
```

### fetchJuejinArticle Tool Usage

Used to fetch complete content of Juejin articles.

```typescript
{
  "url": string // Juejin article URL from search results
}
```

Usage example:
```typescript
use_mcp_tool({
  server_name: "web-search",
  tool_name: "fetchJuejinArticle",
  arguments: {
    url: "https://juejin.cn/post/7520959840199360563"
  }
})
```

Supported URL format:
- `https://juejin.cn/post/{article_id}`

Response example:
```json
[
  {
    "content": "🚀 开源 AI 联网搜索工具：Open-WebSearch MCP 全新升级，支持多引擎 + 流式响应..."
  }
]
```

## Usage Limitations

Since this tool works by scraping multi-engine search results, please note the following important limitations:

1. **Rate Limiting**:
    - Too many searches in a short time may cause the engines in use to temporarily block requests
    - Recommendations:
        - Maintain a reasonable search frequency
        - Use the `limit` parameter judiciously
        - Add delays between searches when necessary

2. **Result Accuracy**:
    - Results depend on the HTML structure of each engine and may fail when an engine updates its pages
    - Some results may lack metadata such as descriptions
    - Complex search operators may not work as expected

3. **Legal Terms**:
    - This tool is for personal use only
    - Please comply with the terms of service of the corresponding engines
    - Implement appropriate rate limiting based on your actual use case

4. **Search Engine Configuration**:
    - The default search engine can be set via the `DEFAULT_SEARCH_ENGINE` environment variable
    - Supported engines: bing, duckduckgo, exa, brave, baidu, csdn, juejin
    - The default engine is used when searching specific websites

5. **Proxy Configuration**:
    - An HTTP proxy can be configured when certain search engines are unavailable in specific regions
    - Enable the proxy with the environment variable `USE_PROXY=true`
    - Configure the proxy server address with `PROXY_URL`

## Contributing

Issue reports and feature improvement suggestions are welcome!

### Contributor Guide

If you want to fork this repository and publish your own Docker image, you need the following configuration:

#### GitHub Secrets Configuration

To enable automatic Docker image building and publishing, add the following secrets in your GitHub repository settings (Settings → Secrets and variables → Actions):

**Required Secrets:**
- `GITHUB_TOKEN`: Automatically provided by GitHub (no setup needed)

**Optional Secrets (for Alibaba Cloud ACR):**
- `ACR_REGISTRY`: Your Alibaba Cloud Container Registry URL (e.g., `registry.cn-hangzhou.aliyuncs.com`)
- `ACR_USERNAME`: Your Alibaba Cloud ACR username
- `ACR_PASSWORD`: Your Alibaba Cloud ACR password
- `ACR_IMAGE_NAME`: Your image name in ACR (e.g., `your-namespace/open-web-search`)

#### CI/CD Workflow

The repository includes a GitHub Actions workflow (`.github/workflows/docker.yml`) that builds and publishes the image automatically:

1. **Trigger Conditions**:
    - Push to `main` branch
    - Push version tags (`v*`)
    - Manual workflow trigger

2. **Build and Push to**:
    - GitHub Container Registry (ghcr.io) - always enabled
    - Alibaba Cloud Container Registry - only enabled when ACR secrets are configured

3. **Image Tags**:
    - `ghcr.io/your-username/open-web-search:latest`
    - `your-acr-address/your-image-name:latest` (if ACR is configured)

#### Fork and Publish Steps:

1. **Fork the repository** to your GitHub account
2. **Configure secrets** (if you need ACR publishing):
    - Go to Settings → Secrets and variables → Actions in your forked repository
    - Add the ACR-related secrets listed above
3. **Push changes** to the `main` branch or create version tags
4. **GitHub Actions will automatically build and push** your Docker image
5. **Use your image** by updating the Docker command:
   ```bash
   docker run -d --name web-search -p 3000:3000 -e ENABLE_CORS=true -e CORS_ORIGIN="*" ghcr.io/your-username/open-web-search:latest
   ```

#### Notes:
- If you don't configure ACR secrets, the workflow will only publish to GitHub Container Registry
- Make sure your GitHub repository has Actions enabled
- The workflow will use your GitHub username (converted to lowercase) as the GHCR image name

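The lowercase rule in the last note matters because container registries only accept lowercase repository names. A minimal sketch of the resulting image reference, assuming the image is always named `open-web-search` as in the examples above (the helper `ghcrImage` is hypothetical and not part of the workflow):

```typescript
// Hypothetical helper mirroring the naming rule in the notes above:
// the GitHub username is lowercased to form the GHCR image reference.
function ghcrImage(username: string, tag: string = "latest"): string {
  return `ghcr.io/${username.toLowerCase()}/open-web-search:${tag}`;
}

console.log(ghcrImage("Aas-ee")); // ghcr.io/aas-ee/open-web-search:latest
```
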
## Star History

If you find this project helpful, please consider giving it a ⭐ Star!

[![Star History Chart](https://api.star-history.com/svg?repos=Aas-ee/open-webSearch&type=Date)](https://www.star-history.com/#Aas-ee/open-webSearch&Date)