
# Deploy to HuggingFace Spaces - Quick Start

## ✅ Issue Fixed

The `quote_extractor` import error has been fixed: the app now starts even when that file is missing.
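
Under the hood this is the standard optional-import pattern. A minimal sketch of what the fix likely looks like in app.py (the function name `extract_quotes` is a hypothetical stand-in, not confirmed from the source):

```python
# Hedged sketch of the optional-import pattern; the real app.py may differ.
try:
    from quote_extractor import extract_quotes  # hypothetical function name
    QUOTE_EXTRACTOR_AVAILABLE = True
except ImportError:
    extract_quotes = None
    QUOTE_EXTRACTOR_AVAILABLE = False  # app still runs; quote extraction is skipped
```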


## 🚀 Option 1: Automated Preparation (Recommended)

Run this script to prepare a clean deployment package:

```bash
python prepare_for_spaces.py
```

This will:

- Create a `spaces_deployment/` directory
- Copy only the required files
- Exclude `.env` and `test_*.py` files
- Show a summary of what's included

Then upload everything from `spaces_deployment/` to your Space.
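
For transparency, here is a minimal sketch of what a packaging step like this involves, using the required-file list from Option 2 below; it is an illustration, not the actual `prepare_for_spaces.py`:

```python
# Illustrative sketch only; the real prepare_for_spaces.py may differ.
import shutil
from pathlib import Path

REQUIRED_FILES = [
    "app.py", "llm.py", "extractors.py", "tagging.py", "chunking.py",
    "validation.py", "reporting.py", "dashboard.py",
    "production_logger.py", "quote_extractor.py", "requirements.txt",
]

def prepare(dest: str = "spaces_deployment") -> None:
    out = Path(dest)
    out.mkdir(exist_ok=True)
    missing = []
    for name in REQUIRED_FILES:
        src = Path(name)
        if src.exists():
            # .env and test_*.py are never in the list, so they are never copied.
            shutil.copy2(src, out / src.name)
        else:
            missing.append(name)
    print(f"Packaged {len(REQUIRED_FILES) - len(missing)} files into {out}/")
    if missing:
        print("WARNING, missing: " + ", ".join(missing))

if __name__ == "__main__":
    prepare()
```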


## 📋 Option 2: Manual Upload

Upload these files to your HuggingFace Space:

### Required Files (must have)

```
app.py
llm.py
extractors.py
tagging.py
chunking.py
validation.py
reporting.py
dashboard.py
production_logger.py
quote_extractor.py
requirements.txt
```

### Optional Files

```
README.md
HUGGINGFACE_SPACES_SETUP.md
```

### Do NOT Upload

- `.env` file
- `test_*.py` files
- `logs/` directory
- `outputs/` directory

## 🔧 Space Configuration

### 1. Create Space

Create a new Space on HuggingFace and select the Gradio SDK.

### 2. Upload Files

- Drag and drop all files from the list above
- OR connect a Git repository

### 3. Configure (Optional)

Go to **Settings → Variables** and add:

| Variable | Value | When to Use |
|----------|-------|-------------|
| `DEBUG_MODE` | `True` | To see detailed logs |
| `LOCAL_MODEL` | `TinyLlama/TinyLlama-1.1B-Chat-v1.0` | For faster (but lower-quality) processing |
| `LLM_TEMPERATURE` | `0.5` | For more deterministic outputs |

**Note:** All settings have defaults; you don't need to configure anything.
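
These Variables reach the app as ordinary environment variables. A hedged sketch of how they can be read with defaults; the default values below are assumptions, except the Phi-3-mini model name, which matches the startup logs shown later:

```python
import os

# Assumed defaults; only the Phi-3-mini model name is confirmed by the logs.
DEBUG_MODE = os.getenv("DEBUG_MODE", "False").lower() == "true"
LOCAL_MODEL = os.getenv("LOCAL_MODEL", "microsoft/Phi-3-mini-4k-instruct")
LLM_TEMPERATURE = float(os.getenv("LLM_TEMPERATURE", "0.7"))
```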


## ⏱️ First Deployment

### What to Expect

1. **Build time:** 2-5 minutes (installing dependencies)
2. **Model download:** 2-5 minutes (first time only; downloads Phi-3-mini)
3. **Subsequent starts:** 30-60 seconds

### Watch the Logs

Click the **Logs** tab to see:

```
✅ Configuration loaded for HuggingFace Spaces
🚀 TranscriptorAI Enterprise - LLM Backend: local
[Local Model] Loading microsoft/Phi-3-mini-4k-instruct...
Downloading (…)lve/main/config.json: 100%
[Local Model] ✅ Model loaded on cuda:0
Running on local URL:  http://0.0.0.0:7860
```
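
The "Loading" and "Model loaded on cuda:0" lines correspond to a standard `transformers` load. A minimal sketch of that step; the app's actual loading code in llm.py may differ:

```python
# Minimal sketch of the loading step implied by the logs; not the app's code.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "microsoft/Phi-3-mini-4k-instruct"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.float16,  # half precision fits comfortably on a T4
    device_map="auto",          # lands on cuda:0 when a GPU is available
)
print(f"[Local Model] Model loaded on {model.device}")
```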

## 🧪 Test Your Space

1. Wait for the "Running on local URL" message
2. Upload a sample transcript (DOCX or PDF)
3. Select "HCP" as the interviewee type
4. Click "Analyze Transcripts"

Expected:

- Processing time: 5-10 minutes (depending on transcript length)
- Quality score: 0.7-1.0
- CSV and PDF downloads available

πŸ› Troubleshooting

Error: ModuleNotFoundError: No module named 'quote_extractor'

Status: βœ… FIXED - This is now optional

Error: ModuleNotFoundError: No module named 'xyz'

Solution: Upload the missing xyz.py file
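
A quick way to catch a missing file before the app even starts is to check the upload list locally; a small self-contained check:

```python
# Verifies that every file from the required-upload list is present.
from pathlib import Path

REQUIRED_FILES = [
    "app.py", "llm.py", "extractors.py", "tagging.py", "chunking.py",
    "validation.py", "reporting.py", "dashboard.py",
    "production_logger.py", "quote_extractor.py", "requirements.txt",
]
missing = [f for f in REQUIRED_FILES if not Path(f).exists()]
print("All files present" if not missing else "Missing: " + ", ".join(missing))
```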

### Error: CUDA out of memory

**Solution:**

- Change model: add the Variable `LOCAL_MODEL=TinyLlama/TinyLlama-1.1B-Chat-v1.0`
- OR upgrade to a larger GPU

### Error: Very slow processing

**Check:**

- Is GPU hardware selected (not CPU)?
- Look for "Model loaded on cuda:0" in the logs
- If you see "cpu" instead, upgrade to a GPU tier (you can also verify from code, as shown below)
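
The check below uses only standard PyTorch calls and works both locally and in a Space:

```python
import torch

# Reports which device the Space actually sees.
if torch.cuda.is_available():
    print(f"GPU: {torch.cuda.get_device_name(0)}")  # e.g. "Tesla T4"
else:
    print("CPU only - processing will be very slow")
```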

### Quality score still 0.00

**Debug:**

1. Set `DEBUG_MODE=True` in Variables
2. Check the logs for "[Local Model] ✅ Generated X characters"
3. Look for "[LLM Debug] Successfully extracted JSON"
4. If you see `[Error]` messages, share them when asking for help
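
The "Successfully extracted JSON" step refers to pulling a JSON object out of raw model output. A hypothetical sketch of what that step typically looks like (the app's actual parser may differ):

```python
# Hedged sketch: grab the first {...} span in the model output and parse it.
import json
import re

def extract_json(raw_output: str):
    match = re.search(r"\{.*\}", raw_output, re.DOTALL)
    if match is None:
        return None  # upstream code would log an [Error] here
    try:
        return json.loads(match.group(0))
    except json.JSONDecodeError:
        return None
```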

## 💡 Tips

### Reduce Costs

- The Space sleeps after 48 hours of inactivity (free)
- You only pay for GPU time while the Space is active
- A T4 GPU costs about $0.60/hour

### Improve Speed

- Use a smaller model (TinyLlama)
- Reduce max tokens (edit llm.py line 410; see the sketch below)
- Process fewer chunks
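
The max-tokens tip maps to the `max_new_tokens` parameter of a `transformers` generation call. A hedged sketch, reusing the `model` and `tokenizer` from the loading sketch earlier; the values shown are assumptions, not the app's settings:

```python
# Sketch only; llm.py's actual generation call and values may differ.
prompt = "Summarize the key points of this transcript chunk: ..."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(
    **inputs,
    max_new_tokens=512,  # lower this to speed up each chunk
    temperature=0.7,     # raise for more varied output (see Improve Quality)
    do_sample=True,
)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```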

### Improve Quality

- Use a larger model (Mistral-7B)
- Increase `LLM_TEMPERATURE` for more creative outputs
- Keep the default Phi-3-mini for the best speed/quality balance

## 📞 Need Help?

1. **Check the logs first** - most issues show clear error messages
2. **Read HUGGINGFACE_SPACES_SETUP.md** - detailed troubleshooting
3. **Test locally first** - run `python test_local_model.py`

## ✨ You're Ready!

Run the preparation script:

```bash
python prepare_for_spaces.py
```

Then upload to HuggingFace Spaces and you're done! 🎉


*Last updated: October 2025*