title: Tiny Aya Compare
emoji: π
colorFrom: blue
colorTo: green
sdk: gradio
sdk_version: 6.18.0
app_file: app.py
python_version: '3.12'
startup_duration_timeout: 30m
short_description: Compare Tiny Aya Global vs Earth side by side
Tiny Aya Compare
Side-by-side comparison of CohereLabs/tiny-aya-global and CohereLabs/tiny-aya-earth on the same prompt.
Live Space: https://huggingface.co/spaces/build-small-hackathon/tiny-aya-compare
This Space runs on ZeroGPU using the PyTorch/safetensors checkpoints (not the GGUF builds).
Deploy to ZeroGPU (build-small-hackathon)
Prerequisites
- Hugging Face account with access to the build-small-hackathon org
- PRO, Team, or Enterprise plan (required to attach ZeroGPU to a Space you create)
- hf CLI installed and authenticated: hf auth login
- Access to the gated CohereLabs Tiny Aya models
1. Create the Space
hf repos create build-small-hackathon/tiny-aya-compare \
--type space \
--space-sdk gradio \
--flavor zero-a10g \
--public \
--exist-ok
--flavor zero-a10g selects ZeroGPU. Hardware is not set via README frontmatter.
If the Space already exists on cpu-basic, switch hardware:
hf spaces settings build-small-hackathon/tiny-aya-compare --hardware zero-a10g
2. Accept gated model access
Both checkpoints are gated (auto). The account whose token you store as HF_TOKEN must accept access first:
- https://huggingface.co/CohereLabs/tiny-aya-global → click Agree and access
- https://huggingface.co/CohereLabs/tiny-aya-earth → click Agree and access
Without this, startup fails with 403 Client Error / Cannot access gated repo.
3. Add the gated-model token
Add a Space secret so the runtime can download the weights:
hf spaces secrets add build-small-hackathon/tiny-aya-compare -s HF_TOKEN
This uses your logged-in hf token. Re-run after accepting model access if the Space already failed once.
Use a token with read scope and access to both CohereLabs repos.
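Space secrets are exposed to the running app as environment variables, so the app can fail early with a readable message when HF_TOKEN is missing. The following is a minimal sketch of such a check; the helper name `resolve_hf_token` is hypothetical and is not taken from this repo's app.py.

```python
import os
from typing import Mapping, Optional


def resolve_hf_token(env: Optional[Mapping[str, str]] = None) -> str:
    """Return the HF_TOKEN secret, or raise a readable error if it is missing.

    Hypothetical helper: app.py in this repo may read the token differently.
    """
    env = os.environ if env is None else env
    token = env.get("HF_TOKEN", "").strip()
    if not token:
        raise RuntimeError(
            "HF_TOKEN is not set. Add it as a Space secret and accept access "
            "to both gated CohereLabs repos before restarting."
        )
    return token
```

Passing a plain dict instead of `os.environ` makes the check easy to exercise locally without touching real credentials.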
4. Push the app
From this repo:
git init
git add app.py README.md requirements.txt
git commit -m "Add Tiny Aya Global vs Earth ZeroGPU comparison app"
git remote add space https://huggingface.co/spaces/build-small-hackathon/tiny-aya-compare
git push space HEAD:main
Or upload without git:
hf upload build-small-hackathon/tiny-aya-compare . \
--repo-type space \
--include "app.py" \
--include "README.md" \
--include "requirements.txt"
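The same upload can be sketched from Python with `huggingface_hub`. This is an illustration, not part of the repo: `select_files` and `upload_space` are hypothetical names, and the `fnmatch` preview only approximates how the Hub applies allow patterns. The import is done lazily so the preview helper works without the library installed.

```python
from fnmatch import fnmatch
from typing import Iterable, List

SPACE_ID = "build-small-hackathon/tiny-aya-compare"
ALLOW_PATTERNS = ["app.py", "README.md", "requirements.txt"]


def select_files(paths: Iterable[str], patterns: Iterable[str] = ALLOW_PATTERNS) -> List[str]:
    """Preview which relative paths the allow-list would keep (approximation)."""
    patterns = list(patterns)
    return [p for p in paths if any(fnmatch(p, pat) for pat in patterns)]


def upload_space(folder: str = ".") -> None:
    """Upload the allow-listed files to the Space.

    Needs network access and a logged-in token; not called automatically.
    """
    from huggingface_hub import HfApi  # lazy import: only needed for the real upload

    HfApi().upload_folder(
        folder_path=folder,
        repo_id=SPACE_ID,
        repo_type="space",  # without this the target would be a model repo
        allow_patterns=ALLOW_PATTERNS,
    )
```

Calling `select_files` on a directory listing first is a cheap way to confirm that local artifacts such as a virtual environment will not be pushed.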
5. Verify the deployment
# Runtime stage and hardware
hf spaces info build-small-hackathon/tiny-aya-compare --expand runtime
# Build + startup logs
hf spaces logs build-small-hackathon/tiny-aya-compare --tail 200
# Smoke test (after RUNNING)
gradio predict build-small-hackathon/tiny-aya-compare /compare \
'{"prompt": "Explain photosynthesis simply.", "max_new_tokens": 128, "temperature": 0.7, "top_p": 0.9}'
Expect requested_hardware: zero-a10g and both models loading at startup (~6 GB each in bf16).
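The two expectations above (stage RUNNING, requested hardware zero-a10g) can also be checked from Python. The sketch below assumes `HfApi.get_space_runtime` from `huggingface_hub`, which returns an object with `stage` and `requested_hardware` fields; the helper names are hypothetical.

```python
from typing import List, Optional

SPACE_ID = "build-small-hackathon/tiny-aya-compare"
EXPECTED_HARDWARE = "zero-a10g"


def deployment_problems(stage: Optional[str], requested_hardware: Optional[str]) -> List[str]:
    """Return human-readable problems; an empty list means the Space looks healthy."""
    problems = []
    if stage != "RUNNING":
        problems.append(f"stage is {stage!r}, expected 'RUNNING'")
    if requested_hardware != EXPECTED_HARDWARE:
        problems.append(
            f"requested hardware is {requested_hardware!r}, expected {EXPECTED_HARDWARE!r}"
        )
    return problems


def check_space(space_id: str = SPACE_ID) -> List[str]:
    """Fetch the live runtime and validate it (needs network access)."""
    from huggingface_hub import HfApi  # lazy import: only needed for the live check

    runtime = HfApi().get_space_runtime(space_id)
    return deployment_problems(runtime.stage, runtime.requested_hardware)
```

An empty list from `check_space()` corresponds to the state in which the smoke test is worth running.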
Troubleshooting
| Symptom | Fix |
|---|---|
| Cannot access gated repo / 403 on startup | Accept access on both model pages (step 2), then Factory reboot the Space |
| ValueError: Invalid file descriptor: -1 after startup | Benign: Gradio SSR asyncio cleanup noise; ignore if stage is RUNNING |
| RUNTIME_ERROR after changing secrets | Factory reboot: Space Settings → Factory rebuild |
| ZeroGPU quota exceeded (visitors) | Visitors need GPU quota; creators don't pay per visit |
| Hardware shows cpu-basic | hf spaces settings build-small-hackathon/tiny-aya-compare --hardware zero-a10g |
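When the logs are long, the symptom table can be turned into a small lookup. This is a rough sketch: the substrings are copied from the symptoms listed in this README, real log wording may differ, and `suggest_fixes` is a hypothetical helper.

```python
from typing import List

# (substring to look for, suggested fix); wording follows the table in this README.
TROUBLESHOOTING = [
    ("Cannot access gated repo", "Accept access on both model pages, then factory-reboot the Space"),
    ("403 Client Error", "Accept access on both model pages, then factory-reboot the Space"),
    ("Invalid file descriptor: -1", "Benign Gradio SSR cleanup noise; ignore if stage is RUNNING"),
    ("RUNTIME_ERROR", "Factory rebuild from Space Settings"),
    ("quota exceeded", "Visitors need their own ZeroGPU quota"),
    ("cpu-basic", "Switch hardware to zero-a10g"),
]


def suggest_fixes(log_text: str) -> List[str]:
    """Return the distinct fixes whose symptom substring appears in the log text."""
    lowered = log_text.lower()
    fixes: List[str] = []
    for needle, fix in TROUBLESHOOTING:
        if needle.lower() in lowered and fix not in fixes:
            fixes.append(fix)
    return fixes
```

The output of `hf spaces logs` can be piped into a script that calls `suggest_fixes` on the captured text.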
Restart after fixing access:
# Re-upload or reboot from the Space Settings UI → "Factory rebuild"
hf spaces info build-small-hackathon/tiny-aya-compare --expand runtime
6. Update after changes
| Change | Action |
|---|---|
| app.py, README.md | git push space main (hot-reload) |
| requirements.txt | push → full Space rebuild |
| Hardware / secrets | hf spaces settings / hf spaces secrets add |
Local development (uv)
ZeroGPU Spaces require Python 3.12 (not 3.14):
uv python pin 3.12
uv sync --group dev
uv run python app.py
Refresh requirements.txt after dependency changes (keep it minimal; Spaces preinstall torch, gradio, spaces, huggingface_hub):
# Edit requirements.txt manually, e.g.:
# accelerate>=1.4.0
# transformers>=4.50.0
Why not GGUF here?
ZeroGPU is built around PyTorch and @spaces.GPU. GGUF targets llama.cpp (Ollama, llama-server, local inference) and does not integrate with ZeroGPU's scheduler.
Use the Transformers checkpoints in this Space. The GGUF repos remain useful for local llama.cpp deployment.