# Figment Prerequisites
This page captures the setup contract for building and demoing Figment v1.
## Eligibility And Repos
Required for the Build Small Hackathon:
* Hugging Face account registered for the hackathon.
* Membership in the `build-small-hackathon` Hugging Face org.
* Gradio Space hosted under that org:
`https://huggingface.co/spaces/build-small-hackathon/figment`
* Public repo for code and documentation.
* Final submission assets: Space link, demo video, and social post.
* Model total parameters at or below 32B.
## Accounts And Tokens
Required:
* Hugging Face token with write access for repo/Space pushes.
* NVIDIA API Catalog key for hosted Nemotron 3 Nano Omni live mode.
* Hugging Face Inference Endpoint access, needed only if a dedicated HF endpoint is used.
* Modal account with credits for optional future fine-tuning and batch eval.
Build-time optional, depending on the synthetic-data path:
* Mistral API access for teacher generation or critique.
* MiniMax API access for teacher generation or critique.
## Local Machine
Reference local demo machine:
* macOS dev machine with 48 GB unified memory.
* Enough disk/RAM headroom for the local 4B text model, optional quantized weights, and Parakeet ASR dependencies.
* Internet access for initial model/tool downloads.
Local/offline proof target:
* `nvidia/NVIDIA-Nemotron-3-Nano-4B-BF16` for local text navigation and first fine-tune target.
* `nvidia/parakeet-rnnt-1.1b` for offline ASR after the local ASR gate passes.
* Local OpenAI-compatible server on `http://127.0.0.1:8001`.
* 16k context by default, 8k fallback.
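A quick way to confirm the local server is reachable is to ask it which models it serves. The sketch below is illustrative only: it assumes the server exposes the usual OpenAI-compatible `GET /v1/models` route, and the helper names `models_url` and `list_local_models` are hypothetical, not part of Figment.

```python
import json
import urllib.error
import urllib.request


def models_url(base_url):
    """Build the model-listing URL, whether or not base_url already ends in /v1."""
    base = base_url.rstrip("/")
    return f"{base}/models" if base.endswith("/v1") else f"{base}/v1/models"


def list_local_models(base_url="http://127.0.0.1:8001", timeout=2.0):
    """Return the model ids reported by the server, or None if it is unreachable.

    Assumption: the response follows the OpenAI list shape {"data": [{"id": ...}]}.
    """
    try:
        with urllib.request.urlopen(models_url(base_url), timeout=timeout) as resp:
            body = json.load(resp)
    except (urllib.error.URLError, OSError, ValueError):
        # Connection refused, timeout, or a non-JSON body all count as "not ready".
        return None
    if not isinstance(body, dict):
        return None
    return [m.get("id") for m in body.get("data", []) if isinstance(m, dict)]


if __name__ == "__main__":
    models = list_local_models()
    print("server not reachable" if models is None else f"models: {models}")
```

Returning `None` rather than raising keeps the check usable as a simple readiness gate before launching the local demo.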
## CLI Tools
Install or verify:
```bash
git --version
python3 --version
uv --version
hf auth whoami
modal --version
docker --version
llama-server --help
```
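The same check can be scripted so that missing tools are reported in one pass. This is a minimal sketch that only tests whether each executable is on `PATH`; it does not verify versions or login state, and the `missing_tools` helper is a hypothetical name.

```python
import shutil

# Executables named in the verification commands above.
REQUIRED_TOOLS = ["git", "python3", "uv", "hf", "modal", "docker", "llama-server"]


def missing_tools(tools=REQUIRED_TOOLS, which=shutil.which):
    """Return the subset of tools that cannot be found on PATH."""
    return [tool for tool in tools if which(tool) is None]


if __name__ == "__main__":
    missing = missing_tools()
    if missing:
        print("Missing tools: " + ", ".join(missing))
    else:
        print("All required tools found on PATH.")
```

The `which` parameter exists only so the lookup can be replaced in tests.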
Recommended install commands on macOS:
```bash
brew install llama.cpp
python3 -m pip install --upgrade huggingface_hub modal
```
`uvx --from huggingface_hub hf ...` is also acceptable when the `hf` executable is not installed globally.
## Python Dependencies
Runtime dependencies live in `requirements.txt`.
Development, testing, and training dependencies live in `requirements-dev.txt`.
Install:
```bash
python3 -m venv .venv
source .venv/bin/activate
python -m pip install --upgrade pip
python -m pip install -r requirements.txt -r requirements-dev.txt
```
## Environment Variables
Copy `.env.example` to `.env` locally and fill secrets there. Do not commit `.env`.
Required or expected variables:
* `FIGMENT_MODE`: `hosted`, `local`, or `canned`.
* `MODEL_STACK`: `omni_native` for hosted demo mode or `local_4b_parakeet` for the gated local/offline path.
* `MODEL_BACKEND`: `hosted_omni`, `llama_cpp`, or `canned`.
* `AUDIO_BACKEND`: `omni_native`, `parakeet_nemo`, `canned`, or `none`.
* `ALLOW_LOCAL_ASR`: set `true` only after Parakeet local ASR is proven and gated.
* `HF_MODEL_ID`: defaults to `nvidia/Nemotron-3-Nano-Omni-30B-A3B-Reasoning-BF16`.
* `NVIDIA_API_KEY`: NVIDIA API Catalog key for hosted Omni mode.
* `NVIDIA_BASE_URL`: defaults to `https://integrate.api.nvidia.com/v1`.
* `NVIDIA_MODEL_ID`: defaults to `nvidia/nemotron-3-nano-omni-30b-a3b-reasoning`.
* `LOCAL_MODEL_ID`: local OpenAI-compatible model id or alias; default target is `nvidia/NVIDIA-Nemotron-3-Nano-4B-BF16`.
* `HF_TOKEN`: Hugging Face token for Space pushes or optional HF endpoint access.
* `HF_ENDPOINT_URL`: optional dedicated HF Inference Endpoint URL.
* `LLAMA_BASE_URL`: local OpenAI-compatible endpoint.
* `FIGMENT_TRACE_DIR`: trace export directory.
* `MODAL_PROFILE`: optional Modal profile name.
* `MISTRAL_API_KEY` / `MINIMAX_API_KEY`: optional teacher-model keys.
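One way to catch misconfiguration early is to validate these variables at startup. The sketch below is not Figment's actual loader: `load_settings` is a hypothetical helper, and it assumes the four mode/backend variables must always be set and that hosted mode requires `NVIDIA_API_KEY`. The allowed values and defaults are the ones listed above.

```python
import os

# Allowed values for the enumerated variables, as listed above.
ALLOWED = {
    "FIGMENT_MODE": {"hosted", "local", "canned"},
    "MODEL_STACK": {"omni_native", "local_4b_parakeet"},
    "MODEL_BACKEND": {"hosted_omni", "llama_cpp", "canned"},
    "AUDIO_BACKEND": {"omni_native", "parakeet_nemo", "canned", "none"},
}

# Documented defaults.
DEFAULTS = {
    "HF_MODEL_ID": "nvidia/Nemotron-3-Nano-Omni-30B-A3B-Reasoning-BF16",
    "NVIDIA_BASE_URL": "https://integrate.api.nvidia.com/v1",
    "NVIDIA_MODEL_ID": "nvidia/nemotron-3-nano-omni-30b-a3b-reasoning",
}


def load_settings(env=None):
    """Validate and return settings from a mapping (os.environ by default)."""
    env = os.environ if env is None else env
    settings = {}
    for name, allowed in ALLOWED.items():
        value = env.get(name)
        if value not in allowed:
            # Assumption: these four variables are mandatory.
            raise ValueError(f"{name}={value!r}; expected one of {sorted(allowed)}")
        settings[name] = value
    for name, default in DEFAULTS.items():
        settings[name] = env.get(name, default)
    # Treat anything other than the literal string "true" as disabled.
    settings["ALLOW_LOCAL_ASR"] = env.get("ALLOW_LOCAL_ASR", "false").strip().lower() == "true"
    # Assumption: hosted mode cannot run without the NVIDIA key.
    if settings["FIGMENT_MODE"] == "hosted" and not env.get("NVIDIA_API_KEY"):
        raise ValueError("NVIDIA_API_KEY is required when FIGMENT_MODE=hosted")
    return settings
```

Passing a plain dict instead of `os.environ` makes the validation easy to exercise without touching the real environment.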
## Runtime Modes
Hosted live demo:
* Gradio Space under `build-small-hackathon/figment`.
* Hosted NVIDIA API Catalog / NIM-compatible Nemotron Omni powers live navigator output.
* Rules, retrieval, validation, and trace rendering run in the Space.
Local/offline proof:
* Local Gradio app.
* Local protocol cards and SQLite retrieval.
* Local deterministic rules and validators.
* Local OpenAI-compatible server with Nemotron 3 Nano 4B.
* Optional Parakeet ASR only after `ALLOW_LOCAL_ASR=true` and the local gate passes.
Fallback only:
* Canned traces if the hosted model is unavailable, quota runs out, or the Space cold start proves unreliable.
* Canned navigator output if the live model returns invalid JSON or violates validation.
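The canned-output fallback for invalid model responses could look roughly like the sketch below. It is a sketch under stated assumptions: the output schema, the `validate` check, and the `CANNED_OUTPUT` payload are all hypothetical placeholders, since the real validators are not described on this page.

```python
import json

# Hypothetical canned payload; the real canned output is project-specific.
CANNED_OUTPUT = {"source": "canned", "steps": []}


def validate(payload):
    """Hypothetical validator: require a dict with a list under 'steps'."""
    return isinstance(payload, dict) and isinstance(payload.get("steps"), list)


def navigator_output(raw_text, canned=CANNED_OUTPUT):
    """Return the parsed model output, or the canned output if it is unusable."""
    try:
        payload = json.loads(raw_text)
    except (json.JSONDecodeError, TypeError):
        # Not valid JSON (or not text at all): fall back.
        return canned
    # Valid JSON that fails validation also falls back.
    return payload if validate(payload) else canned
```

For example, `navigator_output("not json")` and `navigator_output('{"steps": "oops"}')` both return the canned payload, while a well-formed response passes through unchanged.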
## Verification Checklist
Before implementation starts:
```bash
hf auth whoami
hf repos list --namespace build-small-hackathon --type space --search figment --limit 10
modal token info || modal setup
llama-server --help
python -m pip install -r requirements.txt -r requirements-dev.txt
```
Before submission:
```text
Space boots cold under build-small-hackathon/figment.
Hosted live mode returns validated NVIDIA-hosted Nemotron output.
Local 4B mode runs the same demo case without internet.
No patient PHI is used, logged, or committed.
```