# HuggingFace Spaces Deployment Guide

## Quick Start

### 1. Create Space on HuggingFace

1. Go to [huggingface.co/spaces](https://huggingface.co/spaces)
2. Click "Create new Space"
3. Select:
   - **Space name**: `tiny-scribe` (or your preferred name)
   - **SDK**: Docker
   - **Space hardware**: CPU (Free Tier - 2 vCPUs)
4. Click "Create Space"

### 2. Upload Files

Upload these files to your Space:

- `app.py` - Main Gradio application
- `Dockerfile` - Container configuration
- `requirements.txt` - Python dependencies
- `README.md` - Space documentation
- `transcripts/` - Example files (optional)

Using Git:

```bash
git clone https://huggingface.co/spaces/your-username/tiny-scribe
cd tiny-scribe
# Copy files from this repo
git add .
git commit -m "Initial HF Spaces deployment"
git push
```

**IMPORTANT:** Always use `git push` - never edit files via the HuggingFace web UI. Web edits create generic commit messages like "Upload app.py with huggingface_hub".

### 3. Wait for Build

The Space will automatically:

1. Build the Docker container (~2-5 minutes)
2. Install dependencies (the llama-cpp-python wheel is prebuilt)
3. Start the Gradio app

### 4. Access Your App

Once built, visit: `https://your-username-tiny-scribe.hf.space`
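The public URL follows the `<owner>-<space>.hf.space` pattern shown above. A small helper (hypothetical, not part of `app.py`) can build it from the two names:

```python
def space_url(owner: str, space: str) -> str:
    """Build the public URL for a HuggingFace Space.

    HF serves Spaces at <owner>-<space>.hf.space, lowercased.
    """
    return f"https://{owner.lower()}-{space.lower()}.hf.space"

print(space_url("your-username", "tiny-scribe"))
# https://your-username-tiny-scribe.hf.space
```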
## Configuration

### Model Selection

The default model (`unsloth/Qwen3-0.6B-GGUF`, Q4_K_M quantization) is optimized for CPU:

- Small: 0.6B parameters
- Fast: ~2-5 seconds for short texts
- Efficient: uses ~400MB RAM

To change models, edit `app.py`:

```python
DEFAULT_MODEL = "unsloth/Qwen3-1.7B-GGUF"  # Larger model
DEFAULT_FILENAME = "*Q2_K_L.gguf"          # Lower-bit quantization for speed
```

### Performance Tuning

For the Free Tier (2 vCPUs):

- Keep `n_ctx=4096` (context window)
- Use `max_tokens=512` (output length)
- Set `temperature=0.6` (balances creativity and coherence)

### Environment Variables

Optional settings in Space Settings:

```
MODEL_REPO=unsloth/Qwen3-0.6B-GGUF
MODEL_FILENAME=*Q4_K_M.gguf
MAX_TOKENS=512
TEMPERATURE=0.6
```
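Code reading these variables would typically fall back to the documented defaults when they are unset. A minimal sketch of that pattern (the variable names match the list above, but the exact code in `app.py` may differ):

```python
import os

# Fall back to the documented defaults when a variable is unset.
MODEL_REPO = os.environ.get("MODEL_REPO", "unsloth/Qwen3-0.6B-GGUF")
MODEL_FILENAME = os.environ.get("MODEL_FILENAME", "*Q4_K_M.gguf")
MAX_TOKENS = int(os.environ.get("MAX_TOKENS", "512"))
TEMPERATURE = float(os.environ.get("TEMPERATURE", "0.6"))
```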
## Features

1. **File Upload**: Drag & drop `.txt` files
2. **Live Streaming**: Real-time token output
3. **Traditional Chinese**: Auto-conversion to zh-TW
4. **Progressive Loading**: Model downloads on first use (~30-60s)
5. **Responsive UI**: Works on mobile and desktop

## Troubleshooting

### Build Fails

- Check Docker Hub status
- Verify `requirements.txt` syntax
- Ensure no large files are committed to the repo

### Out of Memory

- Reduce `n_ctx` (context window)
- Use a smaller model (Q2_K quantization)
- Limit input file size
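One way to limit input size is to truncate uploaded text to a character budget before it reaches the model. A minimal sketch (the limit and function name are illustrative, not taken from `app.py`; a character cap is only a rough guard, not an exact token count):

```python
MAX_INPUT_CHARS = 8000  # illustrative budget, not an exact token count

def clamp_input(text: str, limit: int = MAX_INPUT_CHARS) -> str:
    """Truncate oversized uploads so the prompt stays near the context window."""
    if len(text) <= limit:
        return text
    return text[:limit]
```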
### Slow Inference

- Normal for the CPU-only Free Tier
- The first request downloads the model (~400MB)
- Subsequent requests are faster

## Architecture

```
User Upload → Gradio Interface → app.py → llama-cpp-python → Qwen Model
                                                                 ↓
                                                          OpenCC (s2twp)
                                                                 ↓
                                                    Streaming Output → User
```
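Qwen3 models emit their reasoning wrapped in `<think>...</think>` tags before the visible answer, which is why the pipeline separates thinking blocks from the streamed output. A sketch of that extraction (the regex and function name are illustrative; `app.py`'s actual pattern may differ):

```python
import re

# Qwen3 wraps its reasoning in <think>...</think> before the answer.
THINK_RE = re.compile(r"<think>(.*?)</think>", re.DOTALL)

def split_thinking(raw: str) -> tuple[str, str]:
    """Separate <think> blocks from the visible answer text."""
    thinking = "\n".join(m.strip() for m in THINK_RE.findall(raw))
    answer = THINK_RE.sub("", raw).strip()
    return thinking, answer

thinking, answer = split_thinking("<think>plan the summary</think>Summary: hello")
# thinking == "plan the summary", answer == "Summary: hello"
```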
## Deployment Workflow

### Recommended: Use the Deployment Script

The `deploy.sh` script ensures meaningful commit messages:

```bash
# Make your changes
vim app.py

# Test locally
python app.py

# Deploy with a meaningful message
./deploy.sh "Fix: Improve thinking block extraction"
```

The script will:

1. Check for uncommitted changes
2. Prompt for a commit message if one is not provided
3. Warn about generic/short messages
4. Show the commits to be pushed
5. Confirm before pushing
6. Verify the commit message was preserved on the remote
### Manual Deployment

If deploying manually:

```bash
# 1. Make changes
vim app.py

# 2. Test locally
python app.py

# 3. Commit with a detailed message
git add app.py
git commit -m "Fix: Improve streaming output formatting

- Extract thinking blocks more reliably
- Show full response in thinking field
- Update regex pattern for better parsing"

# 4. Push to HuggingFace Spaces
git push origin main

# 5. Verify deployment
# Visit: https://huggingface.co/spaces/Luigi/tiny-scribe
```
### Avoiding Generic Commit Messages

**❌ DON'T:**

- Edit files directly on huggingface.co
- Use the "Upload files" button in the HF web UI
- Use single-word commit messages ("fix", "update")

**✅ DO:**

- Always use `git push` from the command line
- Write descriptive commit messages
- Test locally before pushing

### Git Hook

A pre-push hook installed at `.git/hooks/pre-push`:

- Validates commit messages before pushing
- Warns about very short messages
- Ensures you're not accidentally pushing generic commits
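The hook's message check can be sketched as a small predicate (hypothetical logic; the installed hook's exact rules and threshold may differ):

```python
# Messages the hook should reject outright (hypothetical deny-list).
GENERIC_MESSAGES = {"fix", "update", "wip", "changes",
                    "upload app.py with huggingface_hub"}

def is_acceptable(message: str, min_length: int = 10) -> bool:
    """Reject very short or generic commit messages, as the pre-push hook does."""
    subject = message.strip().lower()
    return len(subject) >= min_length and subject not in GENERIC_MESSAGES
```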
## Local Testing

Before deploying to HF Spaces:

```bash
pip install -r requirements.txt
python app.py
```

Then open: http://localhost:7860

## License

MIT - See LICENSE file for details.