mlandia

Add Tiny Aya Global vs Earth ZeroGPU comparison app

3a5665b 21 days ago


	---
	title: Tiny Aya Compare
	emoji: 🌍
	colorFrom: blue
	colorTo: green
	sdk: gradio
	sdk_version: 6.18.0
	app_file: app.py
	python_version: "3.12"
	startup_duration_timeout: 30m
	short_description: Compare Tiny Aya Global vs Earth side by side
	---

	# Tiny Aya Compare

	Side-by-side comparison of [CohereLabs/tiny-aya-global](https://huggingface.co/CohereLabs/tiny-aya-global) and
	[CohereLabs/tiny-aya-earth](https://huggingface.co/CohereLabs/tiny-aya-earth) on the same prompt.

	Live Space: https://huggingface.co/spaces/build-small-hackathon/tiny-aya-compare

	This Space runs on ZeroGPU using the PyTorch/safetensors checkpoints (not the GGUF builds).

	## Deploy to ZeroGPU (build-small-hackathon)

	### Prerequisites

	- Hugging Face account with access to the `build-small-hackathon` org
	- PRO, Team, or Enterprise plan (required to attach ZeroGPU to a Space you create)
	- `hf` CLI installed and authenticated: `hf auth login`
	- Access to the gated CohereLabs Tiny Aya models

	### 1. Create the Space

	```bash
	hf repos create build-small-hackathon/tiny-aya-compare \
	--type space \
	--space-sdk gradio \
	--flavor zero-a10g \
	--public \
	--exist-ok
	```

	`--flavor zero-a10g` selects ZeroGPU. Hardware is not set via README frontmatter.

	If the Space already exists on `cpu-basic`, switch hardware:

	```bash
	hf spaces settings build-small-hackathon/tiny-aya-compare --hardware zero-a10g
	```

	### 2. Accept gated model access

	Both checkpoints are gated with automatic approval (`auto`). The account whose token you store as `HF_TOKEN` must accept the terms on each model page first:

	1. https://huggingface.co/CohereLabs/tiny-aya-global — click Agree and access
	2. https://huggingface.co/CohereLabs/tiny-aya-earth — click Agree and access

	Without this, startup fails with `403 Client Error` / `Cannot access gated repo`.

	### 3. Add the gated-model token

	Add a Space secret so the runtime can download the weights:

	```bash
	hf spaces secrets add build-small-hackathon/tiny-aya-compare -s HF_TOKEN
	```

	This uses your logged-in `hf` token. Re-run after accepting model access if the Space already failed once.

	Use a token with `read` scope and access to both CohereLabs repos.

	### 4. Push the app

	From this repo:

	```bash
	git init
	git add app.py README.md requirements.txt
	git commit -m "Add Tiny Aya Global vs Earth ZeroGPU comparison app"
	git remote add space https://huggingface.co/spaces/build-small-hackathon/tiny-aya-compare
	git push space HEAD:main
	```

	Or upload without git:

	```bash
	hf upload build-small-hackathon/tiny-aya-compare . \
	  --repo-type space \
	  --include "app.py" \
	  --include "README.md" \
	  --include "requirements.txt"
	```

	### 5. Verify the deployment

	```bash
	# Runtime stage and hardware
	hf spaces info build-small-hackathon/tiny-aya-compare --expand runtime

	# Build + startup logs
	hf spaces logs build-small-hackathon/tiny-aya-compare --tail 200

	# Smoke test (after RUNNING)
	gradio predict build-small-hackathon/tiny-aya-compare /compare \
	'{"prompt": "Explain photosynthesis simply.", "max_new_tokens": 128, "temperature": 0.7, "top_p": 0.9}'
	```

	Expect `requested_hardware: zero-a10g` and both models loading at startup (~6 GB each in bf16).
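
The "~6 GB each in bf16" figure follows from bf16 using two bytes per weight. A rough check is sketched below; the parameter count is an assumption used for illustration only (confirm it against the model cards), and the estimate covers weights alone, not activations or the KV cache.

```python
BYTES_PER_BF16 = 2  # bfloat16 stores each weight in 16 bits

# Assumed parameter count per checkpoint, for illustration only.
ASSUMED_PARAMS = 3_350_000_000


def bf16_weight_gib(num_params: int) -> float:
    """Approximate size of the weights alone, in GiB, when held in bf16."""
    return num_params * BYTES_PER_BF16 / 2**30


if __name__ == "__main__":
    per_model = bf16_weight_gib(ASSUMED_PARAMS)
    print(f"per model: {per_model:.1f} GiB, both models: {2 * per_model:.1f} GiB")
```

Because both models load at startup, the combined figure is the one to compare against available GPU memory.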

	### Troubleshooting

	| Symptom | Fix |
	|---------|-----|
	| `Cannot access gated repo` / `403` on startup | Accept access on both model pages (step 2), then factory-rebuild the Space |
	| `ValueError: Invalid file descriptor: -1` after startup | Benign: Gradio SSR asyncio cleanup noise. Ignore it if the stage is `RUNNING` |
	| `RUNTIME_ERROR` after changing secrets | Space Settings → Factory rebuild |
	| `ZeroGPU quota exceeded` (visitors) | Each visitor spends their own ZeroGPU quota (signed-in accounts get more); the Space owner is not charged per visit |
	| Hardware shows `cpu-basic` | `hf spaces settings build-small-hackathon/tiny-aya-compare --hardware zero-a10g` |
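
When scanning a long log by hand is tedious, the log-based rows of the table can be turned into a small lookup. This is a minimal sketch, not part of the app: `KNOWN_SYMPTOMS` and `diagnose` are hypothetical names, and the match strings are the symptom texts quoted in this README rather than an exhaustive list of what the runtime may print.

```python
# Substrings from the Symptom column, paired with a short form of the Fix column.
GATED_FIX = "Accept access on both model pages (step 2), then rebuild the Space"
KNOWN_SYMPTOMS = [
    ("Cannot access gated repo", GATED_FIX),
    ("403 Client Error", GATED_FIX),
    ("Invalid file descriptor: -1", "Benign if the stage is RUNNING; ignore"),
    ("ZeroGPU quota exceeded", "Visitor quota is exhausted; not a Space fault"),
]


def diagnose(log_text: str) -> list[str]:
    """Return the distinct fixes whose symptom text appears in the log."""
    fixes: list[str] = []
    for needle, fix in KNOWN_SYMPTOMS:
        if needle in log_text and fix not in fixes:
            fixes.append(fix)
    return fixes


if __name__ == "__main__":
    sample = "403 Client Error ... Cannot access gated repo for url ..."
    for fix in diagnose(sample):
        print(fix)
```

The input could be the text saved from the `hf spaces logs` command in step 5.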

	Restart after fixing access:

	```bash
	# Re-upload or reboot from the Space Settings UI → "Factory rebuild"
	hf spaces info build-small-hackathon/tiny-aya-compare --expand runtime
	```

	### 6. Update after changes

	| Change | Action |
	|--------|--------|
	| `app.py`, `README.md` | `git push space HEAD:main` (hot-reload) |
	| `requirements.txt` | push → full Space rebuild |
	| Hardware / secrets | `hf spaces settings` / `hf spaces secrets add` |
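
The file-based rows of this table amount to a simple rule: a change to `requirements.txt` dominates, because it forces a rebuild. A minimal sketch of that rule, assuming a hypothetical helper `update_action` that receives the changed file paths (for example from `git diff --name-only`):

```python
from pathlib import PurePosixPath

HOT_RELOAD_FILES = {"app.py", "README.md"}


def update_action(changed_files: list[str]) -> str:
    """Map changed paths to the action described in the table above."""
    names = {PurePosixPath(path).name for path in changed_files}
    if "requirements.txt" in names:
        # Dependency changes win: the Space is rebuilt from scratch.
        return "push (full Space rebuild)"
    if names & HOT_RELOAD_FILES:
        return "push (hot-reload)"
    return "nothing to push"


if __name__ == "__main__":
    print(update_action(["app.py"]))
    print(update_action(["app.py", "requirements.txt"]))
```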

	## Local development (uv)

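	ZeroGPU Spaces require Python 3.12 (not 3.14), which is also the version pinned by `python_version` in the frontmatter. Pin the same version locally: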

	```bash
	uv python pin 3.12
	uv sync --group dev
	uv run python app.py
	```
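
To catch a mismatched interpreter before a long model download, a guard like the following could run at the top of a local launch script. It is a sketch under the assumption that only the major and minor version matter; `REQUIRED` and `is_supported` are hypothetical names, not part of `app.py`.

```python
import sys

REQUIRED = (3, 12)  # mirrors python_version in the README frontmatter


def is_supported(version_info=sys.version_info) -> bool:
    """True when the interpreter's major.minor matches the pinned version."""
    return tuple(version_info[:2]) == REQUIRED


if __name__ == "__main__":
    if not is_supported():
        found = ".".join(str(part) for part in sys.version_info[:3])
        sys.exit(f"Python 3.12 expected, found {found}; run `uv python pin 3.12`")
    print("Python version OK")
```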

	Refresh `requirements.txt` after dependency changes. Keep it minimal, since Spaces preinstall `torch`, `gradio`, `spaces`, and `huggingface_hub`:

	```bash
	# Edit requirements.txt manually, e.g.:
	# accelerate>=1.4.0
	# transformers>=4.50.0
	```
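
If the dependency list comes from an export that includes everything, the preinstalled packages can be filtered out mechanically. A minimal sketch, assuming plain `name` plus version-specifier lines (no URLs or `-r` includes); `minimal_requirements` is a hypothetical helper, and the preinstalled set is the one named in this README.

```python
import re

# Packages this README says Spaces already provide.
PREINSTALLED = {"torch", "gradio", "spaces", "huggingface_hub"}


def minimal_requirements(lines):
    """Drop blanks, comments, and packages that Spaces preinstall."""
    kept = []
    for line in lines:
        spec = line.strip()
        if not spec or spec.startswith("#"):
            continue
        # The package name ends at the first specifier, extra, marker, or space.
        name = re.split(r"[<>=!~\[; ]", spec, maxsplit=1)[0]
        if name.lower().replace("-", "_") not in PREINSTALLED:
            kept.append(spec)
    return kept


if __name__ == "__main__":
    exported = ["torch>=2.0", "accelerate>=1.4.0", "gradio", "transformers>=4.50.0"]
    print("\n".join(minimal_requirements(exported)))
```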

	## Why not GGUF here?

	ZeroGPU is built around PyTorch and `@spaces.GPU`. GGUF targets llama.cpp (Ollama, `llama-server`, local inference) and does not integrate with ZeroGPU's scheduler.

	Use the Transformers checkpoints in this Space. The [GGUF repos](https://huggingface.co/CohereLabs/tiny-aya-global-GGUF) remain useful for local llama.cpp deployment.