| # Copy to .env (gitignored). The Space supports THREE model backends; you | |
| # only need credentials for whichever one(s) you want to use. On the | |
| # deployed HuggingFace Space, leave .env empty and set these as Space | |
| # secrets in the Settings panel instead. | |
| # | |
| # ============================================================ | |
| # PROVIDER SELECTION | |
| # ============================================================ | |
| # Optional. If unset, the app auto-detects based on which credentials | |
| # are present and whether we are running on a HuggingFace Space (see | |
| # app.py::_detect_provider). Valid values: | |
| # anthropic - Claude via the Anthropic SDK (best writeup quality) | |
| # huggingface - Open models (Gemma 2 / Phi-4 / Llama-3.3 / Qwen) via | |
| # HF Inference Providers API. Free on HF Spaces via | |
| # the Space's monthly credits; HF_TOKEN locally. | |
| # zerogpu - Open model (Phi-4-mini-instruct by default) loaded | |
| # locally in the Space and run on free on-demand GPU | |
| # via the HuggingFace Pro plan's ZeroGPU allocation. | |
| # No API round-trip; no inference credits burned. | |
| # Auto-detect precedence: Pro Space -> zerogpu, else Anthropic key -> | |
| # anthropic, else HF_TOKEN or any Space -> huggingface, else anthropic. | |
| # MODEL_PROVIDER= | |
| # ============================================================ | |
| # ANTHROPIC BACKEND | |
| # ============================================================ | |
| # Required for the anthropic backend. Get one at console.anthropic.com. | |
| ANTHROPIC_API_KEY=your-anthropic-api-key-here | |
| # Optional. claude-opus-4-7 is the default; it produces materially better | |
| # diagnostic writeups. claude-sonnet-4-6 is a cost-optimized fallback; | |
| # benchmark before switching (research.md R15). | |
| MODEL_ID=claude-opus-4-7 | |
| # ============================================================ | |
| # HUGGINGFACE BACKEND | |
| # ============================================================ | |
| # Optional locally; get one at huggingface.co/settings/tokens. NOT | |
| # required on a deployed HuggingFace Space (the Space identity is used | |
| # automatically and includes free monthly inference credits). | |
| # HF_TOKEN=your-hf-token-here | |
| # Optional. Default google/gemma-2-9b-it works well and is widely | |
| # available on HF Inference Providers. Other tested choices: | |
| # microsoft/Phi-4-mini-instruct - smaller, faster, decent JSON | |
| # meta-llama/Llama-3.3-70B-Instruct - slower, very high quality | |
| # Qwen/Qwen2.5-72B-Instruct - strong on structured output | |
| # Smaller open models can be looser than Claude on schema adherence; | |
| # the parser raises MalformedResponseError on bad output and the UI | |
| # shows a "try again" message rather than crashing. | |
| # HF_MODEL_ID=google/gemma-2-9b-it | |
| # ============================================================ | |
| # ZEROGPU BACKEND (HuggingFace Pro plan) | |
| # ============================================================ | |
| # No credentials required: the @spaces.GPU decorator handles allocation | |
| # automatically when the Space has a Pro owner. Locally, the decorator | |
| # is a no-op and the model runs on CPU (slow, smoke-test only). | |
| # | |
| # Optional. Default microsoft/Phi-4-mini-instruct fits on the standard | |
| # A100 allocation with fast cold start. Other tested choices: | |
| # google/gemma-2-9b-it - larger, slower load, more capable | |
| # meta-llama/Llama-3.3-8B-Instruct - Llama 3.3 8B, good JSON adherence | |
| # microsoft/phi-4 - full 14B Phi-4, slower | |
| # Gated models on HuggingFace (Llama, etc.) need HF_TOKEN to download. | |
| # ZEROGPU_MODEL_ID=microsoft/Phi-4-mini-instruct | |
| # Optional. Maximum GPU allocation per request, in seconds. The Pro | |
| # plan allows up to 120s per request; raise/lower to balance cold-start | |
| # coverage vs. quota use. | |
| # ZEROGPU_DURATION_SECONDS=120 | |
| # ============================================================ | |
| # VALIDATION | |
| # ============================================================ | |
| # Word-count cap on the description Textbox. The Gradio validator | |
| # rejects submissions outside the range 200 to MAX_DESCRIPTION_WORDS words. | |
| MAX_DESCRIPTION_WORDS=5000 | |
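
The auto-detect precedence and the word-count range described in the comments above can be sketched in Python as follows. This is a minimal illustration, not the contents of app.py::_detect_provider, which is not shown here. The function names `detect_provider` and `description_in_range` and the `on_space` / `pro_space` flags are hypothetical, and two behaviours are assumptions: that an explicit MODEL_PROVIDER always wins over auto-detection, and that both ends of the word-count range are inclusive.

```python
VALID_PROVIDERS = ("anthropic", "huggingface", "zerogpu")
MIN_DESCRIPTION_WORDS = 200  # lower bound quoted in the VALIDATION comments


def detect_provider(env, on_space=False, pro_space=False):
    """Return a backend name following the documented precedence.

    env is a mapping such as os.environ. on_space and pro_space are
    hypothetical flags standing in for however the app detects that it
    runs on a Space, and on a Space with a Pro owner.
    """
    explicit = env.get("MODEL_PROVIDER", "").strip().lower()
    if explicit:
        # Assumption: an explicit setting overrides auto-detection.
        if explicit not in VALID_PROVIDERS:
            raise ValueError(f"unknown MODEL_PROVIDER: {explicit!r}")
        return explicit
    if pro_space:
        return "zerogpu"
    if env.get("ANTHROPIC_API_KEY"):
        return "anthropic"
    if env.get("HF_TOKEN") or on_space:
        return "huggingface"
    return "anthropic"


def description_in_range(text, env):
    """Check the whitespace-separated word count against the configured cap."""
    max_words = int(env.get("MAX_DESCRIPTION_WORDS", "5000"))
    word_count = len(text.split())
    # Assumption: both bounds are inclusive.
    return MIN_DESCRIPTION_WORDS <= word_count <= max_words
```

Passing `os.environ` as `env` would apply the same logic to a loaded .env file. Note that this sketch treats any non-empty ANTHROPIC_API_KEY as present, including the placeholder value shipped in this template.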