figment / docs /prerequisites.md
ThomsenDrake's picture
Publish Figment Gradio Space app files
5dcfc5c verified
|
Raw
History Blame Contribute Delete
4.97 kB

A newer version of the Gradio SDK is available: 6.19.0

Upgrade

Figment Prerequisites

This page captures the setup contract for building and demoing Figment v1.

Eligibility And Repos

Required for the Build Small Hackathon:

  • Hugging Face account registered for the hackathon.
  • Membership in the build-small-hackathon Hugging Face org.
  • Gradio Space hosted under that org: https://huggingface.co/spaces/build-small-hackathon/figment
  • Public repo for code and documentation.
  • Final submission assets: Space link, demo video, and social post.
  • Model total parameters at or below 32B.

Accounts And Tokens

Required:

  • Hugging Face token with write access for repo/Space pushes.
  • NVIDIA API Catalog key for hosted Nemotron 3 Nano Omni live mode.
  • Hugging Face token or endpoint access only if using a dedicated HF endpoint or Space push flow.
  • Modal account with credits for optional future fine-tuning and batch eval.

Build-time optional, depending on the synthetic-data path:

  • Mistral API access for teacher generation or critique.
  • MiniMax API access for teacher generation or critique.

Local Machine

Reference local demo machine:

  • macOS dev machine with 48 GB unified memory.
  • Enough disk/RAM headroom for the local 4B text model, optional quantized weights, and Parakeet ASR dependencies.
  • Internet access for initial model/tool downloads.

Local/offline proof target:

  • nvidia/NVIDIA-Nemotron-3-Nano-4B-BF16 for local text navigation and first fine-tune target.
  • nvidia/parakeet-rnnt-1.1b for offline ASR after the local ASR gate passes.
  • Local OpenAI-compatible server on http://127.0.0.1:8001.
  • 16k context by default, 8k fallback.

CLI Tools

Install or verify:

git --version
python3 --version
uv --version
hf auth whoami
modal --version
docker --version
llama-server --help

Recommended install commands on macOS:

brew install llama.cpp
python3 -m pip install --upgrade huggingface_hub modal

uvx --from huggingface_hub hf ... is also acceptable when the hf executable is not installed globally.

Python Dependencies

Runtime dependencies live in requirements.txt.

Development, testing, and training dependencies live in requirements-dev.txt.

Install:

python3 -m venv .venv
source .venv/bin/activate
python -m pip install --upgrade pip
python -m pip install -r requirements.txt -r requirements-dev.txt

Environment Variables

Copy .env.example to .env locally and fill secrets there. Do not commit .env.

Required or expected variables:

  • FIGMENT_MODE β€” hosted, local, or canned.
  • MODEL_STACK β€” omni_native for hosted demo mode or local_4b_parakeet for the gated local/offline path.
  • MODEL_BACKEND β€” hosted_omni, llama_cpp, or canned.
  • AUDIO_BACKEND β€” omni_native, parakeet_nemo, canned, or none.
  • ALLOW_LOCAL_ASR β€” set true only after Parakeet local ASR is proven and gated.
  • HF_MODEL_ID β€” defaults to nvidia/Nemotron-3-Nano-Omni-30B-A3B-Reasoning-BF16.
  • NVIDIA_API_KEY β€” NVIDIA API Catalog key for hosted Omni mode.
  • NVIDIA_BASE_URL β€” defaults to https://integrate.api.nvidia.com/v1.
  • NVIDIA_MODEL_ID β€” defaults to nvidia/nemotron-3-nano-omni-30b-a3b-reasoning.
  • LOCAL_MODEL_ID β€” local OpenAI-compatible model id or alias; default target is nvidia/NVIDIA-Nemotron-3-Nano-4B-BF16.
  • HF_TOKEN β€” Hugging Face token for Space pushes or optional HF endpoint access.
  • HF_ENDPOINT_URL β€” optional dedicated HF Inference Endpoint URL.
  • LLAMA_BASE_URL β€” local OpenAI-compatible endpoint.
  • FIGMENT_TRACE_DIR β€” trace export directory.
  • MODAL_PROFILE β€” optional Modal profile name.
  • MISTRAL_API_KEY / MINIMAX_API_KEY β€” optional teacher-model keys.

Runtime Modes

Hosted live demo:

  • Gradio Space under build-small-hackathon/figment.
  • Hosted NVIDIA API Catalog / NIM-compatible Nemotron Omni powers live navigator output.
  • Rules, retrieval, validation, and trace rendering run in the Space.

Local/offline proof:

  • Local Gradio app.
  • Local protocol cards and SQLite retrieval.
  • Local deterministic rules and validators.
  • Local OpenAI-compatible server with Nemotron 3 Nano 4B.
  • Optional Parakeet ASR only after ALLOW_LOCAL_ASR=true and the local gate passes.

Fallback only:

  • Canned traces if hosted model, quota, or Space cold-start reliability fails.
  • Canned navigator output if the live model returns invalid JSON or violates validation.

Verification Checklist

Before implementation starts:

hf auth whoami
hf repos list --namespace build-small-hackathon --type space --search figment --limit 10
modal token info || modal setup
llama-server --help
python -m pip install -r requirements.txt -r requirements-dev.txt

Before submission:

Space boots cold under build-small-hackathon/figment.
Hosted live mode returns validated NVIDIA-hosted Nemotron output.
Local 4B mode runs the same demo case without internet.
No patient PHI is used, logged, or committed.