Deploy to HuggingFace Spaces - Quick Start
Issue Fixed
The quote_extractor import error has been fixed! The app will now work even if the file is missing.
Option 1: Automated Preparation (Recommended)
Run this script to prepare a clean deployment package:
python prepare_for_spaces.py
This will:
- Create a spaces_deployment/ directory
- Copy only the required files
- Remove any .env or test files
- Show you a summary of what's included
Then upload everything from spaces_deployment/ to your Space.
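If you don't have the script yet, here is a minimal sketch of what a preparation script like prepare_for_spaces.py could do. The file names come from the "Required Files" list below; the real script may differ:

```python
import shutil
from pathlib import Path

# Files the Space needs (from the "Required Files" list)
REQUIRED = [
    "app.py", "llm.py", "extractors.py", "tagging.py", "chunking.py",
    "validation.py", "reporting.py", "dashboard.py",
    "production_logger.py", "quote_extractor.py", "requirements.txt",
]

def prepare(dest: str = "spaces_deployment"):
    """Copy only the required files into a clean deployment directory.

    Secrets (.env) and test files are excluded simply by never being
    on the REQUIRED list.
    """
    out = Path(dest)
    out.mkdir(exist_ok=True)
    copied = []
    for name in REQUIRED:
        if Path(name).exists():
            shutil.copy2(name, out / name)
            copied.append(name)
    return copied

if __name__ == "__main__":
    print(f"Copied {len(prepare())} files into spaces_deployment/")
```

Running it from the project root copies whatever required files are present and reports the count, so you can spot a missing module before uploading.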
Option 2: Manual Upload
Upload these files to your HuggingFace Space:
Required Files (Must have)
app.py
llm.py
extractors.py
tagging.py
chunking.py
validation.py
reporting.py
dashboard.py
production_logger.py
quote_extractor.py
requirements.txt
Optional Files
README.md
HUGGINGFACE_SPACES_SETUP.md
DO NOT upload:
- .env file
- test_*.py files
- logs/ directory
- outputs/ directory
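If you connect a Git repository instead of uploading by hand, the same exclusions can be expressed in a .gitignore (a sketch; adjust paths to your repo layout):

```
.env
test_*.py
logs/
outputs/
```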
Space Configuration
1. Create Space
- Go to https://huggingface.co/new-space
- Name: transcriptor-ai (or your choice)
- SDK: Gradio
- Hardware: GPU (T4 or better) - Important!
2. Upload Files
- Drag and drop all files from the list above
- OR connect a Git repository
3. Configure (Optional)
Go to Settings → Variables and add:
| Variable | Value | When to Use |
|---|---|---|
| DEBUG_MODE | True | To see detailed logs |
| LOCAL_MODEL | TinyLlama/TinyLlama-1.1B-Chat-v1.0 | For faster (but lower-quality) processing |
| LLM_TEMPERATURE | 0.5 | For more deterministic outputs |
Note: All settings have defaults - you don't need to configure anything!
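These Space Variables reach the app as ordinary environment variables. A minimal sketch of how an app might read them with safe fallbacks (variable names from the table above; the defaults shown here are illustrative, and the real app.py may differ):

```python
import os

def load_config():
    """Read optional Space variables, falling back to defaults.

    DEBUG_MODE arrives as the string "True"/"False", so it is
    converted to a bool explicitly.
    """
    return {
        "debug": os.getenv("DEBUG_MODE", "False").lower() == "true",
        # Model used when LOCAL_MODEL is not set (per the deployment logs)
        "model": os.getenv("LOCAL_MODEL", "microsoft/Phi-3-mini-4k-instruct"),
        # Illustrative default; check llm.py for the real one
        "temperature": float(os.getenv("LLM_TEMPERATURE", "0.7")),
    }
```

Because every key has a fallback, the Space boots with zero configured Variables, which is why the note above says nothing is required.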
First Deployment
What to Expect
- Build time: 2-5 minutes (installing dependencies)
- Model download: 2-5 minutes (first time only - downloads Phi-3-mini)
- Subsequent starts: 30-60 seconds
Watch the Logs
Click the Logs tab to see:

```
Configuration loaded for HuggingFace Spaces
TranscriptorAI Enterprise - LLM Backend: local
[Local Model] Loading microsoft/Phi-3-mini-4k-instruct...
Downloading (…)lve/main/config.json: 100%
[Local Model] Model loaded on cuda:0
Running on local URL: http://0.0.0.0:7860
```
Test Your Space
- Wait for "Running on local URL" message
- Upload a sample transcript (DOCX or PDF)
- Select "HCP" as interviewee type
- Click "Analyze Transcripts"
Expected:
- Processing time: 5-10 minutes (depending on transcript length)
- Quality score: 0.7-1.0
- CSV and PDF downloads available
Troubleshooting
Error: ModuleNotFoundError: No module named 'quote_extractor'
Status: FIXED - This is now optional
Error: ModuleNotFoundError: No module named 'xyz'
Solution: Upload the missing xyz.py file
Error: CUDA out of memory
Solution:
- Change model: add Variable LOCAL_MODEL=TinyLlama/TinyLlama-1.1B-Chat-v1.0
- OR upgrade to a larger GPU
Error: Very slow processing
Check:
- Is GPU hardware selected? (Not CPU)
- Look for "Model loaded on cuda:0" in logs
- If you see "cpu", upgrade to GPU tier
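You can run the same check programmatically. A small sketch using PyTorch (which the app already depends on for local models):

```python
import torch

def pick_device():
    """Return the device string the model should load on.

    Mirrors the "Model loaded on cuda:0" / "cpu" lines seen in the logs.
    """
    return "cuda:0" if torch.cuda.is_available() else "cpu"

print(f"Model will load on {pick_device()}")
```

If this prints "cpu" inside the Space, the GPU hardware tier is not active and processing will be very slow.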
Quality Score still 0.00
Debug:
- Set DEBUG_MODE=True in Variables
- Check logs for "[Local Model] Generated X characters"
- Look for "[LLM Debug] Successfully extracted JSON"
- If you see [Error] messages, share them
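The "Successfully extracted JSON" step refers to pulling a JSON object out of raw model text. A hedged sketch of one common approach (the actual logic in llm.py may differ):

```python
import json
import re

def extract_json(text):
    """Find the first {...} block in model output and parse it.

    Returns None when no JSON object is found or parsing fails,
    which is the kind of failure a 0.00 quality score points to.
    """
    match = re.search(r"\{.*\}", text, re.DOTALL)
    if not match:
        return None
    try:
        return json.loads(match.group(0))
    except json.JSONDecodeError:
        return None
```

If the model wraps its answer in commentary or markdown fences, a regex like this still recovers the object; if it returns no braces at all, extraction fails and the score stays at zero.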
Tips
Reduce Costs
- Space sleeps after 48h inactivity (free)
- You only pay for GPU time while the Space is active
- ~$0.60/hour for T4 GPU
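For budgeting, a quick back-of-envelope calculation using the approximate $0.60/hour T4 rate above:

```python
T4_RATE = 0.60  # USD per active GPU hour (approximate)

def monthly_cost(active_hours_per_day, days=30):
    """Estimated GPU spend for a given daily usage pattern."""
    return round(T4_RATE * active_hours_per_day * days, 2)

print(monthly_cost(2))  # 2 active hours/day -> 36.0
```

Since a sleeping Space costs nothing, the bill scales only with hours of actual use.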
Improve Speed
- Use smaller model (TinyLlama)
- Reduce max tokens (edit llm.py line 410)
- Process fewer chunks
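To illustrate the "process fewer chunks" idea, here is a generic word-based chunker with a cap. The function name and parameters are hypothetical; the real chunking.py may work differently:

```python
def chunk_text(text, words_per_chunk=400, max_chunks=None):
    """Split text into fixed-size word chunks, optionally capping the count.

    Capping max_chunks trades coverage of the transcript for speed:
    each chunk typically costs one LLM call.
    """
    words = text.split()
    chunks = [
        " ".join(words[i:i + words_per_chunk])
        for i in range(0, len(words), words_per_chunk)
    ]
    return chunks[:max_chunks] if max_chunks is not None else chunks
```

A 1,000-word transcript yields three 400-word chunks; capping at two skips the tail and roughly cuts one LLM call.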
Improve Quality
- Use larger model (Mistral-7B)
- Increase temperature for creative outputs
- Keep default Phi-3-mini for best balance
Need Help?
- Check logs first - Most issues show clear error messages
- Read HUGGINGFACE_SPACES_SETUP.md - Detailed troubleshooting
- Test locally first - Run python test_local_model.py
You're Ready!
Run the preparation script:
python prepare_for_spaces.py
Then upload to HuggingFace Spaces and you're done!
Last Updated: October 2025