# Deployment Instructions

## Deploying to Hugging Face Spaces
### Prerequisites
- A Hugging Face account (free)
- Git installed locally
### Steps

**Create a new Space on Hugging Face:**
- Go to https://huggingface.co/spaces
- Click "Create new Space"
- Choose a name (e.g., "ai-text-assistant")
- Select "Gradio" as the SDK
- Choose visibility (Public or Private)
- Click "Create Space"
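Spaces reads its configuration from a YAML block at the top of README.md; the "Create new Space" form generates this for you, but if you edit it by hand it looks roughly like the sketch below (the title, emoji, colors, and `sdk_version` are illustrative placeholders — keep whatever the form generated):

```yaml
---
title: AI Text Assistant
emoji: 🤖
colorFrom: blue
colorTo: purple
sdk: gradio
sdk_version: 4.44.0
app_file: app.py
pinned: false
---
```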
**Clone your Space repository:**

```bash
git clone https://huggingface.co/spaces/YOUR_USERNAME/YOUR_SPACE_NAME
cd YOUR_SPACE_NAME
```

**Copy the application files** from this project to your Space repository:

- app.py
- requirements.txt
- README.md
- .gitignore (optional)
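The authoritative dependency list is this project's own requirements.txt; as a rough sketch, an app that serves Qwen and BART through Gradio needs at least something like the following (package names only — pin versions to whatever the project actually uses):

```text
gradio
transformers
torch
```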
**Commit and push:**

```bash
git add .
git commit -m "Initial commit: AI Text Assistant"
git push
```

**Wait for deployment:**
- Hugging Face Spaces will automatically detect the changes
- The build process will install dependencies and start the app
- This may take 5-10 minutes for the first deployment
- You can watch the build logs in the Space's "Logs" tab
**Access your app:**
- Once deployed, your app will be available at:
https://huggingface.co/spaces/YOUR_USERNAME/YOUR_SPACE_NAME
## Local Testing
To test locally before deploying:

```bash
# Install dependencies
pip install -r requirements.txt

# Run the app
python app.py
```
The app will be available at http://127.0.0.1:7860
## Configuration Options
### Hardware
For better performance, you can upgrade your Space's hardware:
- Go to Space Settings → Hardware
- Options include CPU (free), GPU T4 (small fee), GPU A10G, etc.
- The app works on CPU but will be faster with GPU
### Environment Variables
You can set these in Space Settings → Variables:
- `TRANSFORMERS_CACHE`: custom cache directory for models
- `HF_HOME`: Hugging Face home directory
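Outside of Spaces (for example, during local testing) you can set the same variables from Python, as long as that happens before `transformers` or `huggingface_hub` is imported, since both read these variables at import time. The `/data` paths below are illustrative placeholders:

```python
import os

# Point model downloads at a persistent cache directory.
# Must run before importing transformers/huggingface_hub.
# The /data paths are placeholders - use any writable location.
os.environ.setdefault("HF_HOME", "/data/.huggingface")
os.environ.setdefault("TRANSFORMERS_CACHE", "/data/.huggingface/models")
```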
## Troubleshooting
**Build fails with memory errors:**

- The models are relatively small, but if you encounter issues:
  - Upgrade to a better hardware tier
  - Consider using the Hugging Face Inference API instead
**App starts slowly:**
- The first run downloads models (~1GB for Qwen, ~1.6GB for BART)
- Subsequent runs will use cached models
- Model loading takes 30-60 seconds on CPU
**Token alternatives not showing:**
- Make sure you hover over the generated words
- The tooltip appears on hover with a slight delay
- Try different browsers if issues persist
## Performance Notes
- **First Load:** slow due to model downloads
- **Model Loading:** 30-60 seconds on CPU, 5-10 seconds on GPU
- **Generation Speed:**
  - Qwen (0.5B): ~10-20 tokens/sec on CPU, ~100+ tokens/sec on GPU
  - BART-large: ~5-10 tokens/sec on CPU, ~50+ tokens/sec on GPU
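The throughput figures above give a quick back-of-envelope for response latency (the helper function and the 150-token reply length here are illustrative, not part of the app):

```python
def estimated_seconds(num_tokens: int, tokens_per_sec: float) -> float:
    """Rough wall-clock estimate: tokens to generate / generation rate."""
    return num_tokens / tokens_per_sec

# A 150-token reply from Qwen-0.5B at ~15 tokens/sec (CPU) vs ~100 (GPU):
cpu_estimate = estimated_seconds(150, 15)    # 10.0 seconds
gpu_estimate = estimated_seconds(150, 100)   # 1.5 seconds
```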
## Support
For issues or questions:
- Check Hugging Face Spaces documentation: https://huggingface.co/docs/hub/spaces
- Open an issue on the repository
- Contact: Your email/contact info