tiny-aya-compare / README.md
mlandia
Add Tiny Aya Global vs Earth ZeroGPU comparison app
3a5665b
|
Raw
History Blame Contribute Delete
4.97 kB
---
title: Tiny Aya Compare
emoji: 🌍
colorFrom: blue
colorTo: green
sdk: gradio
sdk_version: 6.18.0
app_file: app.py
python_version: "3.12"
startup_duration_timeout: 30m
short_description: Compare Tiny Aya Global vs Earth side by side
---
# Tiny Aya Compare
Side-by-side comparison of [CohereLabs/tiny-aya-global](https://huggingface.co/CohereLabs/tiny-aya-global) and
[CohereLabs/tiny-aya-earth](https://huggingface.co/CohereLabs/tiny-aya-earth) on the same prompt.
**Live Space:** https://huggingface.co/spaces/build-small-hackathon/tiny-aya-compare
This Space runs on **ZeroGPU** using the PyTorch/safetensors checkpoints (not the GGUF builds).
## Deploy to ZeroGPU (build-small-hackathon)
### Prerequisites
- Hugging Face account with access to the `build-small-hackathon` org
- **PRO, Team, or Enterprise** plan (required to attach ZeroGPU to a Space you create)
- `hf` CLI installed and authenticated: `hf auth login`
- Access to the gated CohereLabs Tiny Aya models
### 1. Create the Space
```bash
hf repos create build-small-hackathon/tiny-aya-compare \
--type space \
--space-sdk gradio \
--flavor zero-a10g \
--public \
--exist-ok
```
`--flavor zero-a10g` selects ZeroGPU. Hardware is **not** set via README frontmatter.
If the Space already exists on `cpu-basic`, switch hardware:
```bash
hf spaces settings build-small-hackathon/tiny-aya-compare --hardware zero-a10g
```
### 2. Accept gated model access
Both checkpoints are **gated** (`auto`). The account whose token you store as `HF_TOKEN` must accept access first:
1. https://huggingface.co/CohereLabs/tiny-aya-global β€” click **Agree and access**
2. https://huggingface.co/CohereLabs/tiny-aya-earth β€” click **Agree and access**
Without this, startup fails with `403 Client Error` / `Cannot access gated repo`.
### 3. Add the gated-model token
Add a Space secret so the runtime can download the weights:
```bash
hf spaces secrets add build-small-hackathon/tiny-aya-compare -s HF_TOKEN
```
This uses your logged-in `hf` token. Re-run after accepting model access if the Space already failed once.
Use a token with `read` scope and access to both CohereLabs repos.
### 4. Push the app
From this repo:
```bash
git init
git add app.py README.md requirements.txt
git commit -m "Add Tiny Aya Global vs Earth ZeroGPU comparison app"
git remote add space https://huggingface.co/spaces/build-small-hackathon/tiny-aya-compare
git push space HEAD:main
```
Or upload without git:
```bash
hf upload build-small-hackathon/tiny-aya-compare . \
--include "app.py" \
--include "README.md" \
--include "requirements.txt"
```
### 5. Verify the deployment
```bash
# Runtime stage and hardware
hf spaces info build-small-hackathon/tiny-aya-compare --expand runtime
# Build + startup logs
hf spaces logs build-small-hackathon/tiny-aya-compare --tail 200
# Smoke test (after RUNNING)
gradio predict build-small-hackathon/tiny-aya-compare /compare \
'{"prompt": "Explain photosynthesis simply.", "max_new_tokens": 128, "temperature": 0.7, "top_p": 0.9}'
```
Expect `requested_hardware: zero-a10g` and both models loading at startup (~6 GB each in bf16).
### Troubleshooting
| Symptom | Fix |
|---------|-----|
| `Cannot access gated repo` / `403` on startup | Accept access on both model pages (step 2), then **Factory reboot** the Space |
| `ValueError: Invalid file descriptor: -1` after startup | **Benign** β€” Gradio SSR asyncio cleanup noise; ignore if stage is `RUNNING` |
| `RUNTIME_ERROR` after changing secrets | Factory reboot: Space Settings β†’ Factory rebuild |
| `ZeroGPU quota exceeded` (visitors) | Visitors need GPU quota; creators don't pay per visit |
| Hardware shows `cpu-basic` | `hf spaces settings build-small-hackathon/tiny-aya-compare --hardware zero-a10g` |
Restart after fixing access:
```bash
# Re-upload or reboot from the Space Settings UI β†’ "Factory rebuild"
hf spaces info build-small-hackathon/tiny-aya-compare --expand runtime
```
### 6. Update after changes
| Change | Action |
|--------|--------|
| `app.py`, `README.md` | `git push space main` (hot-reload) |
| `requirements.txt` | push β†’ full Space rebuild |
| Hardware / secrets | `hf spaces settings` / `hf spaces secrets add` |
## Local development (uv)
ZeroGPU Spaces require **Python 3.12** (not 3.14):
```bash
uv python pin 3.12
uv sync --group dev
uv run python app.py
```
Refresh `requirements.txt` after dependency changes (keep it minimal β€” Spaces preinstall `torch`, `gradio`, `spaces`, `huggingface_hub`):
```bash
# Edit requirements.txt manually, e.g.:
# accelerate>=1.4.0
# transformers>=4.50.0
```
## Why not GGUF here?
ZeroGPU is built around PyTorch and `@spaces.GPU`. GGUF targets llama.cpp (Ollama, `llama-server`, local inference) and does not integrate with ZeroGPU's scheduler.
Use the Transformers checkpoints in this Space. The [GGUF repos](https://huggingface.co/CohereLabs/tiny-aya-global-GGUF) remain useful for local llama.cpp deployment.