mlandia
Add Tiny Aya Global vs Earth ZeroGPU comparison app
3a5665b
metadata
title: Tiny Aya Compare
emoji: 🌍
colorFrom: blue
colorTo: green
sdk: gradio
sdk_version: 6.18.0
app_file: app.py
python_version: '3.12'
startup_duration_timeout: 30m
short_description: Compare Tiny Aya Global vs Earth side by side

Tiny Aya Compare

Side-by-side comparison of CohereLabs/tiny-aya-global and CohereLabs/tiny-aya-earth on the same prompt.

Live Space: https://huggingface.co/spaces/build-small-hackathon/tiny-aya-compare

This Space runs on ZeroGPU using the PyTorch/safetensors checkpoints (not the GGUF builds).

Deploy to ZeroGPU (build-small-hackathon)

Prerequisites

  • Hugging Face account with access to the build-small-hackathon org
  • PRO, Team, or Enterprise plan (required to attach ZeroGPU to a Space you create)
  • hf CLI installed and authenticated: hf auth login
  • Access to the gated CohereLabs Tiny Aya models

1. Create the Space

hf repos create build-small-hackathon/tiny-aya-compare \
  --type space \
  --space-sdk gradio \
  --flavor zero-a10g \
  --public \
  --exist-ok

--flavor zero-a10g selects ZeroGPU. Hardware is not set via README frontmatter.

If the Space already exists on cpu-basic, switch hardware:

hf spaces settings build-small-hackathon/tiny-aya-compare --hardware zero-a10g
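
The create-then-switch logic above can also be scripted through the `huggingface_hub` Python API. The sketch below is one possible shape, not part of this repo: `ensure_zero_gpu_space` is a hypothetical helper, and `api` is assumed to be a `huggingface_hub.HfApi` instance (or any object exposing `create_repo` and `request_space_hardware`).

```python
SPACE_ID = "build-small-hackathon/tiny-aya-compare"
ZERO_GPU = "zero-a10g"


def ensure_zero_gpu_space(api, space_id=SPACE_ID):
    """Create the Space if needed, then request ZeroGPU hardware.

    `api` is assumed to behave like huggingface_hub.HfApi; this helper
    itself is hypothetical and only mirrors the two CLI commands.
    """
    api.create_repo(
        repo_id=space_id,
        repo_type="space",
        space_sdk="gradio",
        space_hardware=ZERO_GPU,
        private=False,
        exist_ok=True,
    )
    # Assumption: with exist_ok=True an already existing Space keeps its
    # current hardware, so ZeroGPU is requested explicitly as well.
    api.request_space_hardware(repo_id=space_id, hardware=ZERO_GPU)
```

With a real client this would be called as `ensure_zero_gpu_space(HfApi())` after `from huggingface_hub import HfApi`, using the token stored by `hf auth login`.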

2. Accept gated model access

Both checkpoints are gated with automatic approval, so access is granted as soon as the terms are accepted. The account whose token you store as HF_TOKEN must accept them first:

  1. https://huggingface.co/CohereLabs/tiny-aya-global - click Agree and access
  2. https://huggingface.co/CohereLabs/tiny-aya-earth - click Agree and access

Without this, startup fails with 403 Client Error / Cannot access gated repo.
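
To catch a missing approval before the Space fails at startup, a small preflight check can probe both repos with the token you intend to store. This is a sketch under assumptions: `inaccessible_models` and `_hub_check` are hypothetical names, and the default check relies on `HfApi.auth_check`, which is only available in reasonably recent `huggingface_hub` releases.

```python
MODEL_IDS = ("CohereLabs/tiny-aya-global", "CohereLabs/tiny-aya-earth")


def _hub_check(repo_id, token):
    """Default probe: raises if the token cannot access the repo."""
    # Imported lazily so this module loads even without huggingface_hub.
    from huggingface_hub import HfApi

    HfApi().auth_check(repo_id, token=token)


def inaccessible_models(token, check=_hub_check, model_ids=MODEL_IDS):
    """Return {repo_id: error text} for every model the token cannot read."""
    failures = {}
    for repo_id in model_ids:
        try:
            check(repo_id, token)
        except Exception as err:  # e.g. a gated-repo or not-found error
            failures[repo_id] = f"{type(err).__name__}: {err}"
    return failures
```

An empty result means both checkpoints are readable; any entry names a model page where access still has to be accepted.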

3. Add the gated-model token

Add a Space secret so the runtime can download the weights:

hf spaces secrets add build-small-hackathon/tiny-aya-compare -s HF_TOKEN

This uses your logged-in hf token. Re-run after accepting model access if the Space already failed once.

Use a token with read scope and access to both CohereLabs repos.
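
If you prefer to set the secret from a script, the `huggingface_hub` API exposes `add_space_secret`. The helper below is a hypothetical sketch: it assumes the token is available in an `HF_TOKEN` environment variable and that `api` behaves like `huggingface_hub.HfApi`.

```python
import os

SPACE_ID = "build-small-hackathon/tiny-aya-compare"


def store_hf_token(api, space_id=SPACE_ID, env=None):
    """Copy HF_TOKEN from the environment into the Space's secrets.

    `api` is assumed to behave like huggingface_hub.HfApi; reading the
    token from the environment is an assumption of this sketch.
    """
    env = os.environ if env is None else env
    token = env.get("HF_TOKEN", "").strip()
    if not token:
        raise ValueError("HF_TOKEN is not set in the environment")
    api.add_space_secret(repo_id=space_id, key="HF_TOKEN", value=token)
```

Failing early on an empty token avoids storing a blank secret, which would only surface later as a gated-repo error at startup.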

4. Push the app

From this repo:

git init
git add app.py README.md requirements.txt
git commit -m "Add Tiny Aya Global vs Earth ZeroGPU comparison app"
git remote add space https://huggingface.co/spaces/build-small-hackathon/tiny-aya-compare
git push space HEAD:main

Or upload without git:

hf upload build-small-hackathon/tiny-aya-compare . \
  --include "app.py" \
  --include "README.md" \
  --include "requirements.txt"
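
The same upload can be expressed with `HfApi.upload_folder`, which may be convenient inside a release script. This is a sketch, not the repo's tooling: `upload_app` is a hypothetical helper and `api` is assumed to behave like `huggingface_hub.HfApi`.

```python
SPACE_ID = "build-small-hackathon/tiny-aya-compare"
APP_FILES = ["app.py", "README.md", "requirements.txt"]


def upload_app(api, folder=".", space_id=SPACE_ID):
    """Upload only the three app files from `folder` to the Space."""
    return api.upload_folder(
        repo_id=space_id,
        repo_type="space",
        folder_path=folder,
        # Mirrors the --include filters: everything else is left out.
        allow_patterns=APP_FILES,
        commit_message="Add Tiny Aya Global vs Earth ZeroGPU comparison app",
    )
```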

5. Verify the deployment

# Runtime stage and hardware
hf spaces info build-small-hackathon/tiny-aya-compare --expand runtime

# Build + startup logs
hf spaces logs build-small-hackathon/tiny-aya-compare --tail 200

# Smoke test (after RUNNING)
gradio predict build-small-hackathon/tiny-aya-compare /compare \
  '{"prompt": "Explain photosynthesis simply.", "max_new_tokens": 128, "temperature": 0.7, "top_p": 0.9}'

Expect requested_hardware: zero-a10g and both models loading at startup (~6 GB each in bf16).
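
The ~6 GB figure follows from bf16 storing each weight in 2 bytes. As a rough sanity check, the arithmetic can be written out as below. The parameter count used here is an assumption inferred from the ~6 GB figure, not a published number, and the estimate covers weights only (no activations or KV cache).

```python
BYTES_PER_BF16_PARAM = 2


def weight_memory_gb(num_params, bytes_per_param=BYTES_PER_BF16_PARAM):
    """Approximate weight memory in GB (1 GB = 1e9 bytes)."""
    return num_params * bytes_per_param / 1e9


# Assumption: roughly 3 billion parameters per model, which is simply
# what ~6 GB at 2 bytes per parameter implies.
ASSUMED_PARAMS = 3_000_000_000

per_model_gb = weight_memory_gb(ASSUMED_PARAMS)
both_models_gb = 2 * per_model_gb
```

Under that assumption, loading both checkpoints at startup needs about twice the single-model figure in GPU memory for weights alone.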

Troubleshooting

| Symptom | Fix |
| --- | --- |
| Cannot access gated repo / 403 on startup | Accept access on both model pages (step 2), then Factory reboot the Space |
| ValueError: Invalid file descriptor: -1 after startup | Benign: Gradio SSR asyncio cleanup noise; ignore if stage is RUNNING |
| RUNTIME_ERROR after changing secrets | Factory reboot: Space Settings → Factory rebuild |
| ZeroGPU quota exceeded (visitors) | Visitors need GPU quota; creators don't pay per visit |
| Hardware shows cpu-basic | hf spaces settings build-small-hackathon/tiny-aya-compare --hardware zero-a10g |

Restart after fixing access:

# Re-upload or reboot from the Space Settings UI → "Factory rebuild"
hf spaces info build-small-hackathon/tiny-aya-compare --expand runtime
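
The reboot-then-check loop can also be automated. The sketch below assumes `api` behaves like `huggingface_hub.HfApi` (its `restart_space` and `get_space_runtime` methods); `reboot_and_wait` and the set of failure stages are choices made for this example. The 1800-second default mirrors the `startup_duration_timeout: 30m` setting in the frontmatter.

```python
import time

# Stages treated as terminal failures in this sketch.
FAILED_STAGES = {"BUILD_ERROR", "CONFIG_ERROR", "RUNTIME_ERROR"}


def reboot_and_wait(api, space_id, timeout_s=1800, poll_s=15, sleep=time.sleep):
    """Factory-reboot a Space and poll until it reports RUNNING.

    Note: immediately after the reboot request the first poll may still
    report the previous stage; a stricter version could wait for BUILDING
    before accepting RUNNING.
    """
    api.restart_space(repo_id=space_id, factory_reboot=True)
    waited = 0
    while True:
        runtime = api.get_space_runtime(repo_id=space_id)
        if runtime.stage == "RUNNING":
            return runtime
        if runtime.stage in FAILED_STAGES:
            raise RuntimeError(f"{space_id} ended in stage {runtime.stage}")
        if waited >= timeout_s:
            raise TimeoutError(f"{space_id} not RUNNING after {timeout_s}s")
        sleep(poll_s)
        waited += poll_s
```

The `sleep` parameter exists only so the loop can be exercised without real delays.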

6. Update after changes

| Change | Action |
| --- | --- |
| app.py, README.md | git push space main (hot-reload) |
| requirements.txt | push → full Space rebuild |
| Hardware / secrets | hf spaces settings / hf spaces secrets add |

Local development (uv)

ZeroGPU Spaces require Python 3.12 (not 3.14):

uv python pin 3.12
uv sync --group dev
uv run python app.py
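
A mismatch between the local interpreter and the Space's `python_version` is easy to miss. One optional guard, shown here as a hypothetical helper rather than something app.py is known to contain, compares the running interpreter against the pinned 3.12:

```python
import sys

REQUIRED_PYTHON = (3, 12)  # matches python_version in the frontmatter


def python_matches(version_info=None, required=REQUIRED_PYTHON):
    """Return True when the major.minor version equals the pinned one."""
    info = sys.version_info if version_info is None else version_info
    return tuple(info[:2]) == required
```

Calling `python_matches()` at the top of a local run script gives an early, explicit signal instead of a dependency-resolution error later.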

Refresh requirements.txt after dependency changes. Keep it minimal, because Spaces preinstall torch, gradio, spaces, and huggingface_hub:

# Edit requirements.txt manually, e.g.:
# accelerate>=1.4.0
# transformers>=4.50.0
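
To check that requirements.txt stays minimal, a short filter can drop lines for packages the Space already provides. This is a sketch: the function names are hypothetical, the preinstalled set is taken from the note above, and the name parsing is deliberately simple (it does not handle URLs or `-r` includes).

```python
import re

PREINSTALLED = {"torch", "gradio", "spaces", "huggingface_hub"}


def normalize(name):
    """Lowercase and unify -, _ and . so name variants compare equal."""
    return re.sub(r"[-_.]+", "-", name).lower()


def package_name(requirement):
    """Extract the normalized package name from one requirement line."""
    match = re.match(r"\s*([A-Za-z0-9][A-Za-z0-9._-]*)", requirement)
    return normalize(match.group(1)) if match else None


def minimal_requirements(lines, preinstalled=PREINSTALLED):
    """Keep requirement lines whose package is not preinstalled."""
    skip = {normalize(name) for name in preinstalled}
    kept = []
    for line in lines:
        stripped = line.strip()
        if not stripped or stripped.startswith("#"):
            continue  # ignore blanks and comments
        if package_name(stripped) not in skip:
            kept.append(stripped)
    return kept
```

For example, passing the lines of a generated requirements file through `minimal_requirements` leaves only entries such as accelerate and transformers.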

Why not GGUF here?

ZeroGPU is built around PyTorch and @spaces.GPU. GGUF targets llama.cpp (Ollama, llama-server, local inference) and does not integrate with ZeroGPU's scheduler.

Use the Transformers checkpoints in this Space. The GGUF repos remain useful for local llama.cpp deployment.
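
For orientation, the `@spaces.GPU` pattern typically looks like the sketch below: state is set up at module level and the function that needs the GPU is decorated. This is not the actual app.py. `GENERATORS` and `compare` are hypothetical stand-ins, and the fallback decorator is an assumption that lets the same file run locally when the `spaces` package is missing.

```python
try:
    import spaces  # available on ZeroGPU Spaces

    gpu = spaces.GPU
except ImportError:

    def gpu(fn):
        """Local fallback: run the function unchanged without `spaces`."""
        return fn


# Hypothetical registry. A real app would load the two Transformers
# checkpoints at import time and register a generate callable for each.
GENERATORS = {}


@gpu
def compare(prompt):
    """Run the same prompt through every registered generator."""
    return {name: generate(prompt) for name, generate in GENERATORS.items()}
```

A GGUF model served by llama.cpp has no equivalent hook into this decorator, which is the practical reason the Space sticks to the PyTorch checkpoints.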