Spaces:
Running on Zero
Running on Zero
File size: 4,967 Bytes
5dcfc5c | 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 121 122 123 124 125 126 127 128 129 130 131 132 133 134 135 136 137 138 139 140 141 142 143 144 145 146 147 148 | # Figment Prerequisites
This page captures the setup contract for building and demoing Figment v1.
## Eligibility And Repos
Required for the Build Small Hackathon:
* Hugging Face account registered for the hackathon.
* Membership in the `build-small-hackathon` Hugging Face org.
* Gradio Space hosted under that org:
`https://huggingface.co/spaces/build-small-hackathon/figment`
* Public repo for code and documentation.
* Final submission assets: Space link, demo video, and social post.
* Model total parameters at or below 32B.
## Accounts And Tokens
Required:
* Hugging Face token with write access for repo/Space pushes.
* NVIDIA API Catalog key for hosted Nemotron 3 Nano Omni live mode.
* Hugging Face token or endpoint access only if using a dedicated HF endpoint or Space push flow.
* Modal account with credits for optional future fine-tuning and batch eval.
Build-time optional, depending on the synthetic-data path:
* Mistral API access for teacher generation or critique.
* MiniMax API access for teacher generation or critique.
## Local Machine
Reference local demo machine:
* macOS dev machine with 48 GB unified memory.
* Enough disk/RAM headroom for the local 4B text model, optional quantized weights, and Parakeet ASR dependencies.
* Internet access for initial model/tool downloads.
Local/offline proof target:
* `nvidia/NVIDIA-Nemotron-3-Nano-4B-BF16` for local text navigation and first fine-tune target.
* `nvidia/parakeet-rnnt-1.1b` for offline ASR after the local ASR gate passes.
* Local OpenAI-compatible server on `http://127.0.0.1:8001`.
* 16k context by default, 8k fallback.
## CLI Tools
Install or verify:
```bash
git --version
python3 --version
uv --version
hf auth whoami
modal --version
docker --version
llama-server --help
```
Recommended install commands on macOS:
```bash
brew install llama.cpp
python3 -m pip install --upgrade huggingface_hub modal
```
`uvx --from huggingface_hub hf ...` is also acceptable when the `hf` executable is not installed globally.
## Python Dependencies
Runtime dependencies live in `requirements.txt`.
Development, testing, and training dependencies live in `requirements-dev.txt`.
Install:
```bash
python3 -m venv .venv
source .venv/bin/activate
python -m pip install --upgrade pip
python -m pip install -r requirements.txt -r requirements-dev.txt
```
## Environment Variables
Copy `.env.example` to `.env` locally and fill secrets there. Do not commit `.env`.
Required or expected variables:
* `FIGMENT_MODE` β `hosted`, `local`, or `canned`.
* `MODEL_STACK` β `omni_native` for hosted demo mode or `local_4b_parakeet` for the gated local/offline path.
* `MODEL_BACKEND` β `hosted_omni`, `llama_cpp`, or `canned`.
* `AUDIO_BACKEND` β `omni_native`, `parakeet_nemo`, `canned`, or `none`.
* `ALLOW_LOCAL_ASR` β set `true` only after Parakeet local ASR is proven and gated.
* `HF_MODEL_ID` β defaults to `nvidia/Nemotron-3-Nano-Omni-30B-A3B-Reasoning-BF16`.
* `NVIDIA_API_KEY` β NVIDIA API Catalog key for hosted Omni mode.
* `NVIDIA_BASE_URL` β defaults to `https://integrate.api.nvidia.com/v1`.
* `NVIDIA_MODEL_ID` β defaults to `nvidia/nemotron-3-nano-omni-30b-a3b-reasoning`.
* `LOCAL_MODEL_ID` β local OpenAI-compatible model id or alias; default target is `nvidia/NVIDIA-Nemotron-3-Nano-4B-BF16`.
* `HF_TOKEN` β Hugging Face token for Space pushes or optional HF endpoint access.
* `HF_ENDPOINT_URL` β optional dedicated HF Inference Endpoint URL.
* `LLAMA_BASE_URL` β local OpenAI-compatible endpoint.
* `FIGMENT_TRACE_DIR` β trace export directory.
* `MODAL_PROFILE` β optional Modal profile name.
* `MISTRAL_API_KEY` / `MINIMAX_API_KEY` β optional teacher-model keys.
## Runtime Modes
Hosted live demo:
* Gradio Space under `build-small-hackathon/figment`.
* Hosted NVIDIA API Catalog / NIM-compatible Nemotron Omni powers live navigator output.
* Rules, retrieval, validation, and trace rendering run in the Space.
Local/offline proof:
* Local Gradio app.
* Local protocol cards and SQLite retrieval.
* Local deterministic rules and validators.
* Local OpenAI-compatible server with Nemotron 3 Nano 4B.
* Optional Parakeet ASR only after `ALLOW_LOCAL_ASR=true` and the local gate passes.
Fallback only:
* Canned traces if hosted model, quota, or Space cold-start reliability fails.
* Canned navigator output if the live model returns invalid JSON or violates validation.
## Verification Checklist
Before implementation starts:
```bash
hf auth whoami
hf repos list --namespace build-small-hackathon --type space --search figment --limit 10
modal token info || modal setup
llama-server --help
python -m pip install -r requirements.txt -r requirements-dev.txt
```
Before submission:
```text
Space boots cold under build-small-hackathon/figment.
Hosted live mode returns validated NVIDIA-hosted Nemotron output.
Local 4B mode runs the same demo case without internet.
No patient PHI is used, logged, or committed.
```
|