---
title: HuggingFlow
emoji: 🦌
colorFrom: green
colorTo: blue
sdk: docker
app_port: 7860
pinned: false
license: mit
secrets:
  - name: LLM_MODEL
    description: "Model in provider/model-name format, e.g. openai/gpt-4o, anthropic/claude-sonnet-4-5, google/gemini-2.5-flash"
  - name: LLM_API_KEY
    description: API key for the chosen LLM provider.
  - name: HF_TOKEN
    description: "Hugging Face token (write access). Enables thread backup/restore to a private HF Dataset."
  - name: SERPER_API_KEY
    description: "Serper API key for real Google Search results (recommended). Free tier: 2,500 queries/month."
  - name: AUTH_JWT_SECRET
    description: "JWT signing secret; keeps sessions alive across restarts. Generate: openssl rand -base64 32"
  - name: CLOUDFLARE_WORKERS_TOKEN
    description: "Cloudflare API token. Auto-creates an outbound proxy Worker and a keep-awake cron Worker."
---
<div align="center">
# 🦌 HuggingFlow
**[DeerFlow](https://github.com/bytedance/deer-flow) research agent, one-click deploy on Hugging Face Spaces**
[Hugging Face Space](https://huggingface.co/spaces/somratpro/HuggingFlow)
[GitHub](https://github.com/somratpro/HuggingFlow)
[License](LICENSE)
[Dockerfile](Dockerfile)
*Self-hosted deep-research AI · multi-provider LLM · streaming SSE · dataset backup*
</div>
---
## Table of Contents
- [What is HuggingFlow?](#what-is-huggingflow)
- [Features](#features)
- [Quick Start](#quick-start)
- [Configuration](#configuration)
  - [Required Secrets](#required-secrets)
  - [Optional Variables](#optional-variables)
- [LLM Providers](#llm-providers)
- [Search Tools](#search-tools)
- [Cloudflare Proxy](#cloudflare-proxy)
- [Data Backup](#data-backup)
- [Stay Alive (Keep-Awake)](#stay-alive-keep-awake)
- [Architecture](#architecture)
- [Local Development](#local-development)
- [Troubleshooting](#troubleshooting)
- [More Projects](#more-projects)
- [Contributing](#contributing)
- [License](#license)
---
## What is HuggingFlow?
HuggingFlow wraps [DeerFlow](https://github.com/bytedance/deer-flow) (ByteDance's open-source deep-research agent) into a single Docker container that runs natively on [Hugging Face Spaces](https://huggingface.co/spaces).
**Zero infra.** Duplicate the Space, add your API keys, and your own private research agent is live.
DeerFlow conducts multi-step research: it queries search engines, fetches web pages, synthesises findings across sources, and produces structured reports, all driven by the LLM you choose.
---
## Features
- **One-click deploy**: duplicate the HF Space, add secrets, done
- **Multi-provider LLM**: OpenAI, Anthropic, Google Gemini, DeepSeek, Groq, Mistral, xAI, OpenRouter, Qwen, Moonshot, or any OpenAI-compatible endpoint
- **Pluggable search**: Serper (Google), Tavily, or DuckDuckGo (no key needed)
- **Dataset backup**: threads auto-sync to a private HF Dataset and are restored on restart
- **Cloudflare outbound proxy**: route backend traffic through a Cloudflare Worker (works around HF Spaces IP blocks on some APIs)
- **Keep-Awake cron**: a Cloudflare Worker pings `/health` on a schedule to prevent cold starts
- **Live dashboard**: status page at `/` with service health, model, search, backup and keep-awake tiles
- **Auth built-in**: DeerFlow v2 JWT auth; create the admin account at `/setup` on first boot
- **Pre-built images**: no source compilation; pulls official GHCR images for sub-5-minute builds
- **Streaming SSE**: real-time agent output streamed to the browser
---
## Quick Start
### Step 1: Duplicate this Space
[Duplicate this Space](https://huggingface.co/spaces/somratpro/HuggingFlow?duplicate=true)
### Step 2: Add required secrets
In your new Space, open **Settings → Variables and Secrets** and add at minimum:
| Secret | Description |
|--------|-------------|
| `LLM_MODEL` | Model in `provider/model-name` format, e.g. `openai/gpt-4o` |
| `LLM_API_KEY` | API key for the chosen provider |
> [!TIP]
> Add `HF_TOKEN` (a token with write access to your account) to enable thread backup persistence. Without it, all research threads are lost on restart.
### Step 3: Wait for the build
The first build pulls pre-built GHCR images and takes about 5 minutes. Subsequent restarts skip the rebuild.
### Step 4: Create your admin account
Visit `https://<your-space>.hf.space/setup` and create a username and password.
### Step 5: Start researching
Open `/workspace`: you're live.
---
## Configuration
### Required Secrets
| Secret | Description |
|--------|-------------|
| `LLM_MODEL` | Model in `provider/model-name` format; see [LLM Providers](#llm-providers) |
| `LLM_API_KEY` | API key for the chosen provider |
### Optional Variables
| Variable | Default | Description |
|----------|---------|-------------|
| `SERPER_API_KEY` | none | Google Search via Serper; strongly recommended over DuckDuckGo |
| `TAVILY_API_KEY` | none | Alternative web search (used if Serper is not set) |
| `JINA_API_KEY` | none | Better web page fetching via Jina AI |
| `AUTH_JWT_SECRET` | auto-generated | JWT signing secret; set this to keep sessions alive across restarts |
| `HF_TOKEN` | none | Your HF token (write access); enables dataset backup/restore |
| `BACKUP_DATASET_NAME` | `huggingflow-backup` | HF dataset repo name for backups (created automatically) |
| `CUSTOM_BASE_URL` | none | OpenAI-compatible API base URL for any custom/self-hosted provider |
| `SYNC_INTERVAL` | `600` | Seconds between HF Dataset backup syncs |
| `BACKEND_READY_TIMEOUT` | `120` | Seconds to wait for backend startup |
| `FRONTEND_READY_TIMEOUT` | `120` | Seconds to wait for frontend startup |
| `CLOUDFLARE_WORKERS_TOKEN` | none | Cloudflare API token; enables outbound proxy + keep-awake cron |
| `CLOUDFLARE_PROXY_URL` | none | Existing Cloudflare Worker URL (skip auto-setup) |
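
As a rough illustration of how the two tables above fit together, the sketch below reads the required secrets and a few optional variables with their documented defaults. The `Settings` class and `load_settings` function are hypothetical names chosen for this example; they are not taken from the HuggingFlow source, which may parse its environment differently.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class Settings:
    """Hypothetical container for a subset of the documented variables."""
    llm_model: str
    llm_api_key: str
    hf_token: Optional[str]
    backup_dataset_name: str
    sync_interval: int
    backend_ready_timeout: int
    frontend_ready_timeout: int


def load_settings(env: Mapping[str, str] = os.environ) -> Settings:
    """Fail fast on missing required secrets, then apply documented defaults."""
    missing = [name for name in ("LLM_MODEL", "LLM_API_KEY") if not env.get(name)]
    if missing:
        raise ValueError("missing required secrets: " + ", ".join(missing))
    return Settings(
        llm_model=env["LLM_MODEL"],
        llm_api_key=env["LLM_API_KEY"],
        # An empty HF_TOKEN is treated the same as an unset one.
        hf_token=env.get("HF_TOKEN") or None,
        backup_dataset_name=env.get("BACKUP_DATASET_NAME", "huggingflow-backup"),
        sync_interval=int(env.get("SYNC_INTERVAL", "600")),
        backend_ready_timeout=int(env.get("BACKEND_READY_TIMEOUT", "120")),
        frontend_ready_timeout=int(env.get("FRONTEND_READY_TIMEOUT", "120")),
    )
```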
---
## LLM Providers
Set `LLM_MODEL` to `provider/model-name`:
| Provider | Example `LLM_MODEL` | Notes |
|----------|---------------------|-------|
| **OpenAI** | `openai/gpt-4o` | Default provider |
| **Anthropic** | `anthropic/claude-sonnet-4-5` | Extended thinking supported |
| **Google Gemini** | `google/gemini-2.5-flash` | Extended thinking supported |
| **DeepSeek** | `deepseek/deepseek-chat` | Extended thinking supported |
| **Groq** | `groq/llama-3.3-70b-versatile` | Fast inference |
| **Mistral** | `mistral/mistral-large-latest` | |
| **xAI / Grok** | `xai/grok-3-beta` | |
| **OpenRouter** | `openrouter/anthropic/claude-3-5-sonnet` | Access 200+ models |
| **Qwen / Alibaba** | `qwen/qwen-max` | DashScope compatible |
| **Moonshot / Kimi** | `moonshot/moonshot-v1-128k` | |
| **Custom OpenAI-compat** | `openai/your-model` + `CUSTOM_BASE_URL` | Any self-hosted endpoint |
> **Tip:** Models with extended thinking (Anthropic, Gemini, DeepSeek) produce higher-quality research plans but use more tokens.
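
The `provider/model-name` convention can be illustrated with a small parser. This is a sketch under one assumption: only the first `/` separates the provider from the model, which is what the OpenRouter example in the table suggests. `split_llm_model` is a hypothetical helper, not a function from HuggingFlow or DeerFlow.

```python
from typing import Tuple


def split_llm_model(value: str) -> Tuple[str, str]:
    """Split 'provider/model-name' on the first slash only.

    Everything after the first slash is kept as the model name, so
    nested identifiers such as OpenRouter's keep their inner slash.
    """
    provider, sep, model = value.strip().partition("/")
    if not sep or not provider or not model:
        raise ValueError(f"expected provider/model-name, got {value!r}")
    return provider, model


if __name__ == "__main__":
    print(split_llm_model("openai/gpt-4o"))
    print(split_llm_model("openrouter/anthropic/claude-3-5-sonnet"))
```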
---
## Search Tools
DeerFlow uses web search as its primary information source. Configure in priority order:
| Tool | Key | Quality | Cost |
|------|-----|---------|------|
| **Serper** | `SERPER_API_KEY` | ⭐⭐⭐ (real Google) | ~$0.001/query |
| **Tavily** | `TAVILY_API_KEY` | ⭐⭐ | free tier available |
| **DuckDuckGo** | none needed | ⭐ | free, rate-limited |
Serper is strongly recommended for research quality. Sign up at [serper.dev](https://serper.dev) for 2,500 free queries/month.
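
The priority order in the table (Serper first, Tavily if Serper is not set, DuckDuckGo as the keyless fallback) can be sketched as a simple selection function. The function name and the returned labels are hypothetical; the actual selection logic lives in the project's startup scripts and may differ.

```python
import os
from typing import Mapping


def pick_search_tool(env: Mapping[str, str] = os.environ) -> str:
    """Return the highest-priority search tool whose key is configured."""
    if env.get("SERPER_API_KEY"):
        return "serper"
    if env.get("TAVILY_API_KEY"):
        return "tavily"
    # DuckDuckGo needs no key, so it is always available as a fallback.
    return "duckduckgo"
```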
---
## Cloudflare Proxy
HF Spaces shares IPs that some APIs block. The Cloudflare outbound proxy routes backend HTTP requests through a Cloudflare Worker, giving you a clean egress IP.
**Setup:**
1. Get a Cloudflare API token with **Workers Edit** permission
2. Set `CLOUDFLARE_WORKERS_TOKEN` in your Space secrets
3. On next start, `cloudflare-proxy-setup.py` auto-creates the Worker and sets `CLOUDFLARE_PROXY_URL`
Or manually provide `CLOUDFLARE_PROXY_URL` if you have an existing Worker.
---
## Data Backup
By default, threads are stored in SQLite inside the container and are **lost on restart**.
Enable persistent backup with HF Datasets:
1. Set `HF_TOKEN` to a token with **Write** access to your profile
2. Optionally set `BACKUP_DATASET_NAME` (default: `huggingflow-backup`)
3. The dataset is created automatically (private) on first sync
**What's backed up:** SQLite database (threads, messages, uploads index), workspace files.
**Sync schedule:** every `SYNC_INTERVAL` seconds (default `600`, i.e. 10 minutes), on graceful shutdown, and on startup (restore).
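
The periodic part of that schedule can be sketched with the standard library alone. This is an illustration, not the project's backup code: `run_backup_loop` is a hypothetical name, and `sync_once` stands in for whatever function actually uploads the SQLite database and workspace files to the dataset. The startup restore step is not shown.

```python
import threading
from typing import Callable


def run_backup_loop(
    sync_once: Callable[[], None],
    interval_seconds: float,
    stop: threading.Event,
) -> int:
    """Call sync_once every interval until stop is set, then once more.

    Event.wait returns False on timeout (time for a periodic sync) and
    True once stop is set (graceful shutdown). Returns the number of syncs.
    """
    runs = 0
    while not stop.wait(interval_seconds):
        sync_once()
        runs += 1
    # Final sync so that changes made since the last interval are not lost.
    sync_once()
    return runs + 1
```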
---
## Stay Alive (Keep-Awake)
Free HF Spaces pause after ~15 minutes of inactivity. Fix it with a Cloudflare Worker cron:
1. Set `CLOUDFLARE_WORKERS_TOKEN` (same token as proxy setup)
2. `cloudflare-keepalive-setup.py` creates a Worker that pings `/health` every 10 minutes
3. Status shown in the dashboard **Keep Awake** tile
Check `KEEPALIVE_STATUS_FILE` (`/tmp/huggingflow-cloudflare-keepalive-status.json`) for current state.
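
A small helper for inspecting that file might look like the sketch below. It assumes only that the file contains JSON, as the `.json` extension suggests; the fields inside are not documented here, so the function returns whatever it finds. `read_keepalive_status` is a hypothetical name.

```python
import json
from pathlib import Path
from typing import Any, Optional

# Path taken from the text above.
DEFAULT_STATUS_FILE = "/tmp/huggingflow-cloudflare-keepalive-status.json"


def read_keepalive_status(path: str = DEFAULT_STATUS_FILE) -> Optional[Any]:
    """Return the parsed status file, or None if it is missing or not valid JSON."""
    try:
        return json.loads(Path(path).read_text(encoding="utf-8"))
    except (FileNotFoundError, json.JSONDecodeError):
        return None


if __name__ == "__main__":
    print(read_keepalive_status())
```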
---
## Architecture
```
Browser
   │
   ▼  :7860
health-server.js ──┬── /        → status dashboard (HTML)
   │               ├── /health  → JSON health check
   │               ├── /status  → JSON full status
   │               └── /*       → proxy to nginx
   │
   ▼  :7861
nginx
   │  /api/langgraph/*  → rewrite → /api/*  → backend :8001
   │  /api/*            → backend :8001
   │  /health           → backend :8001/health
   │  /docs /redoc      → backend :8001
   │  /*                → frontend :3000
   │
   ├─▶ :8001  FastAPI (uvicorn): DeerFlow gateway, agents, auth, SQLite
   └─▶ :3000  Next.js: DeerFlow UI (server-side rendered)
```
**Port map:**
| Port | Service | Exposed |
|------|---------|---------|
| 7860 | health-server.js | ✅ public (HF Spaces) |
| 7861 | nginx | internal only |
| 8001 | FastAPI backend | internal only |
| 3000 | Next.js frontend | internal only |
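
The nginx routing rules in the diagram can be modelled as a small function, which may help when reasoning about where a given request ends up. This is a simplified sketch: the upstream labels are made up for the example, and real nginx `location` matching has more nuance than plain prefix checks.

```python
from typing import Tuple

BACKEND = "backend:8001"    # hypothetical label for the FastAPI upstream
FRONTEND = "frontend:3000"  # hypothetical label for the Next.js upstream

LANGGRAPH_PREFIX = "/api/langgraph/"


def route(path: str) -> Tuple[str, str]:
    """Return (upstream, path as seen by the upstream) for a request path."""
    if path.startswith(LANGGRAPH_PREFIX):
        # /api/langgraph/* is rewritten to /api/* before reaching the backend.
        return BACKEND, "/api/" + path[len(LANGGRAPH_PREFIX):]
    if path.startswith("/api/") or path == "/health":
        return BACKEND, path
    if path.startswith("/docs") or path.startswith("/redoc"):
        return BACKEND, path
    # Everything else is served by the frontend.
    return FRONTEND, path
```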
**Images used:**
- `ghcr.io/bytedance/deer-flow-backend:latest`: pre-built Python backend + `.venv`
- `ghcr.io/bytedance/deer-flow-frontend:latest`: pre-built Next.js + `node_modules`
- No source compilation: build time is ~5 min instead of 30+ min
---
## Local Development
```bash
git clone https://github.com/somratpro/HuggingFlow
cd HuggingFlow
# Build
docker build -t huggingflow .
# Run (set your own keys)
docker run -p 7860:7860 \
  -e LLM_MODEL=openai/gpt-4o \
  -e LLM_API_KEY=sk-... \
  -e SERPER_API_KEY=... \
  huggingflow
```
Open `http://localhost:7860` for the dashboard, `http://localhost:7860/setup` to create your admin account, then `http://localhost:7860/workspace`.
**Useful routes:**
| Route | Description |
|-------|-------------|
| `/` | Status dashboard |
| `/workspace` | DeerFlow research UI |
| `/setup` | Admin account creation (first boot only) |
| `/api/health` | Backend health (JSON); requires auth, use `/health` for the unauthenticated check |
| `/docs` | Swagger API reference |
| `/redoc` | ReDoc API reference |
---
## Troubleshooting
**"Application error" on `/workspace` or `/setup`**
> The pre-built frontend requires `DEER_FLOW_TRUSTED_ORIGINS` to be set explicitly. `start.sh` handles this automatically. If you see this error in a custom setup, ensure the env var is set before starting Next.js.
**Build takes 30+ minutes / OOMKilled**
> Ensure Docker has ≥4 GB RAM. HuggingFlow uses pre-built images specifically to avoid compilation. If you're rebuilding from source, add `NODE_OPTIONS=--max-old-space-size=3072`.
**DuckDuckGo returning no results**
> DuckDuckGo rate-limits aggressively from shared IPs. Set `SERPER_API_KEY` or `TAVILY_API_KEY`.
**Threads lost after restart**
> Set `HF_TOKEN` (and, optionally, `BACKUP_DATASET_NAME`) to enable dataset sync. Without a token, storage is ephemeral.
**Space goes to sleep**
> Set `CLOUDFLARE_WORKERS_TOKEN` to enable the keep-awake cron. Alternatively, upgrade to a paid HF Space tier.
**Backend health shows `not_authenticated`**
> Normal. DeerFlow v2 protects all `/api/*` routes. The public health endpoint is `/health` (no auth). nginx routes `/health` → `backend:8001/health`.
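
To probe the public endpoint from a script, something like the sketch below could be used. It relies only on `urllib.request.urlopen` from the standard library; `check_health` is a hypothetical helper, and the shape of the JSON payload is not documented here, so the function returns it unparsed if decoding fails. The `opener` parameter exists so the function can be exercised without a network.

```python
import json
import urllib.request
from typing import Any, Callable, Dict


def check_health(
    base_url: str,
    opener: Callable[..., Any] = urllib.request.urlopen,
    timeout: float = 10.0,
) -> Dict[str, Any]:
    """GET <base_url>/health and return the URL, HTTP status, and JSON payload."""
    url = base_url.rstrip("/") + "/health"
    # urlopen raises urllib.error.HTTPError for non-2xx responses.
    with opener(url, timeout=timeout) as response:
        status = response.status
        body = response.read().decode("utf-8")
    try:
        payload = json.loads(body)
    except json.JSONDecodeError:
        payload = None
    return {"url": url, "status": status, "payload": payload}
```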
---
## More Projects
Similar projects by [@somratpro](https://github.com/somratpro), all free with one-click deploy on HF Spaces:
| Project | What it runs | HF Space | GitHub |
|---------|-------------|----------|--------|
| **HuggingClip** | Paperclip: AI agent orchestration | [Space](https://huggingface.co/spaces/somratpro/HuggingClip) | [Repo](https://github.com/somratpro/HuggingClip) |
| **HuggingClaw** | OpenClaw: Claude Code in the browser | [Space](https://huggingface.co/spaces/somratpro/HuggingClaw) | [Repo](https://github.com/somratpro/HuggingClaw) |
| **HuggingMes** | Hermes: self-hosted agent gateway | [Space](https://huggingface.co/spaces/somratpro/HuggingMes) | [Repo](https://github.com/somratpro/HuggingMes) |
| **Hugging8n** | n8n: workflow & automation platform | [Space](https://huggingface.co/spaces/somratpro/Hugging8n) | [Repo](https://github.com/somratpro/Hugging8n) |
| **HuggingPost** | Postiz: social media scheduler | [Space](https://huggingface.co/spaces/somratpro/HuggingPost) | [Repo](https://github.com/somratpro/HuggingPost) |
---
## ❤️ Support
If HuggingFlow saves you time, consider buying me a coffee to keep the projects alive!
**USDT (TRC-20 / TRON network only)**
```
TELx8TJz1W1h7n6SgpgGNNGZXpJCEUZrdB
```
> [!WARNING]
> Send **USDT on TRC-20 network only**. Sending other tokens or using a different network will result in permanent loss.
---
## Contributing
Contributions welcome! See [CONTRIBUTING.md](CONTRIBUTING.md) for guidelines.
```
Fork → branch → commit → PR
```
---
## License
MIT; see [LICENSE](LICENSE).
DeerFlow is © ByteDance, licensed under MIT.
---
<div align="center">
<sub>Built with ❤️ by <a href="https://github.com/somratpro">somratpro</a> · Powered by <a href="https://github.com/bytedance/deer-flow">DeerFlow</a></sub>
</div>