# Deployment Guide for Hugging Face Spaces

## Storage Limit Error Fix

If you see "Workload evicted, storage limit exceeded (50G)", here's how to fix it:

### Quick Fix (Recommended)

The pipeline now caches to `/tmp` (ephemeral storage), which is cleared on each container restart, so cached files no longer accumulate in persistent storage.

**To apply the fix:**

1. Push the updated code to your Space
2. The Space will rebuild automatically
3. The model will cache to `/tmp` instead of persistent storage
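A minimal sketch of how the cache redirect works (`HF_HOME` and `TRANSFORMERS_CACHE` are the standard Hugging Face cache environment variables; they must be set before `transformers` is imported, and the exact mechanism in this codebase may differ):

```python
import os

# Redirect Hugging Face caches to ephemeral /tmp storage.
# Set these BEFORE importing `transformers` so the cache paths take effect.
os.environ["HF_HOME"] = "/tmp/huggingface_cache"
os.environ["TRANSFORMERS_CACHE"] = "/tmp/huggingface_cache"

# Ensure the directory exists after a fresh container start.
os.makedirs("/tmp/huggingface_cache", exist_ok=True)
```

Because `/tmp` is wiped on restart, the worst case is a one-time model re-download rather than a storage eviction.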
### Manual Cleanup

If your Space is still stuck, clean up old cached files:

1. **Go to your Space settings** on Hugging Face
2. **Factory Reboot** your Space:
   - Settings → Factory reboot
   - This clears all persistent storage and restarts fresh
### Alternative: Upgrade Space Storage

If you need more persistent storage:

1. Go to Settings → Hardware
2. Upgrade to a tier with more storage (costs $$$)
## Storage Optimization Applied

The following changes reduce storage usage:

### 1. Cache to /tmp (ephemeral)

```python
# In sorghum_pipeline/segmentation/manager.py
cache_dir = "/tmp/huggingface_cache"  # Cleared on restart
```
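As a sketch, this setting can be wrapped in a small helper that also guarantees the directory exists (`get_cache_dir` is a hypothetical name for illustration; `cache_dir` itself is a real `from_pretrained` parameter):

```python
import os

def get_cache_dir() -> str:
    """Return an ephemeral cache directory under /tmp, creating it if needed.

    Mirrors the cache_dir setting in sorghum_pipeline/segmentation/manager.py:
    /tmp is cleared whenever the Space container restarts, so nothing
    accumulates in persistent storage.
    """
    cache_dir = "/tmp/huggingface_cache"
    os.makedirs(cache_dir, exist_ok=True)
    return cache_dir
```

The returned path can then be passed as `cache_dir=get_cache_dir()` wherever the model is loaded.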
### 2. Low memory mode

```python
low_cpu_mem_usage=True  # Reduces peak memory during model load
```
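For context, a hedged sketch of how the model load might look with this flag (the model id follows the BRIA RMBG-2.0 model card; it assumes `transformers` provides `AutoModelForImageSegmentation`, and the exact call in this repo may differ):

```python
def load_segmentation_model(cache_dir: str = "/tmp/huggingface_cache"):
    """Sketch: load BRIA RMBG-2.0 with reduced peak memory.

    Assumes the `transformers` package is installed; imported lazily so the
    function can be defined without it.
    """
    from transformers import AutoModelForImageSegmentation

    return AutoModelForImageSegmentation.from_pretrained(
        "briaai/RMBG-2.0",
        trust_remote_code=True,   # RMBG-2.0 ships custom model code
        cache_dir=cache_dir,      # ephemeral cache, cleared on restart
        low_cpu_mem_usage=True,   # avoid materializing a full extra copy in RAM
    )
```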
### 3. Ignore files

- `.dockerignore`: keeps caches and model weights out of the Docker build context
- `.gitignore`: keeps large files out of git commits
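Example entries such a `.dockerignore` might contain (illustrative patterns, not the exact contents of this repo's file):

```
# .dockerignore — keep caches and model weights out of the build context
__pycache__/
*.pt
*.pth
*.safetensors
huggingface_cache/
models/
```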
### 4. Smaller model resolution

- Inference runs at 512x512 instead of 1024x1024, processing 4x fewer pixels for faster, lighter runs
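The arithmetic behind that "4x": halving each dimension quarters the pixel count the model must process.

```python
# Halving width and height quarters the number of pixels.
full = 1024 * 1024     # 1,048,576 pixels
reduced = 512 * 512    # 262,144 pixels
print(full // reduced)  # → 4
```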
## Monitoring Storage

To check storage usage in your Space:

1. Open the Space logs
2. Look for disk usage warnings
3. If usage approaches the 50GB limit, do a factory reboot
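You can also log disk usage from the application itself with the standard library (`report_disk_usage` is a hypothetical helper name; `shutil.disk_usage` is stdlib):

```python
import shutil

def report_disk_usage(path: str = "/") -> float:
    """Print and return used space in GB for the filesystem holding `path`."""
    usage = shutil.disk_usage(path)
    used_gb = usage.used / 1e9
    print(f"{path}: {used_gb:.1f} GB used of {usage.total / 1e9:.1f} GB")
    return used_gb
```

Calling this at startup makes it easy to spot usage creeping toward the 50GB limit in the Space logs.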
## Expected Storage Usage

- **BRIA RMBG-2.0 model**: ~350MB (cached to /tmp)
- **PyTorch/Transformers libs**: ~2-3GB
- **Application code**: <50MB
- **Temporary files**: <1GB (cleared after each run)

**Total**: ~3-4GB (well under the 50GB limit)
## Troubleshooting

### "No space left on device"

- Factory reboot the Space
- Check whether any large files were committed to git

### "Model download failed"

- Check that HF_TOKEN is set in the Space secrets
- Verify the Space has internet connectivity
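A quick sanity check for the token (Spaces expose secrets as environment variables; `check_hf_token` is a hypothetical helper, and `HUGGING_FACE_HUB_TOKEN` is an older variable name some setups still use):

```python
import os

def check_hf_token() -> bool:
    """Return True if a Hugging Face token is visible in the environment."""
    token = os.environ.get("HF_TOKEN") or os.environ.get("HUGGING_FACE_HUB_TOKEN")
    if not token:
        print("No Hugging Face token found; gated/private model downloads will fail.")
    return bool(token)
```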
### Slow startup

- The first startup downloads the model (~350MB)
- Subsequent startups load it from /tmp (fast)
- After a container restart, the model re-downloads to /tmp
## Best Practices

1. ✅ Use `/tmp` for all caches
2. ✅ Enable `low_cpu_mem_usage=True`
3. ✅ Keep `.dockerignore` and `.gitignore` updated
4. ❌ Don't commit model weights to git
5. ❌ Don't use persistent cache directories