# Deploy to HuggingFace Spaces - Quick Start

## ✅ Issue Fixed

**The `quote_extractor` import error has been fixed!** The app now works even if the file is missing.

---

## 🚀 Option 1: Automated Preparation (Recommended)

Run this script to prepare a clean deployment package:

```bash
python prepare_for_spaces.py
```

This will:

- Create a `spaces_deployment/` directory
- Copy only the required files
- Remove any `.env` or test files
- Show a summary of what's included

Then upload everything from `spaces_deployment/` to your Space.
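If you prefer to script this step yourself, the copy-and-filter logic can be sketched in a few lines. This is an illustrative sketch, not the contents of `prepare_for_spaces.py`, and the abridged `REQUIRED` list stands in for the full "Required Files" list in Option 2:

```python
import shutil
from pathlib import Path

# Abridged; the real script covers the full Required Files list.
REQUIRED = ["app.py", "llm.py", "requirements.txt"]

def prepare(src_dir: str, dest_dir: str = "spaces_deployment") -> list[str]:
    """Copy only the required files into a clean deployment directory,
    leaving .env, test files, logs/, and outputs/ behind."""
    src, dest = Path(src_dir), Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    copied = []
    for name in REQUIRED:
        if (src / name).is_file():
            shutil.copy(src / name, dest / name)
            copied.append(name)
    return copied
```

Because it only copies an explicit allow-list, secrets and test artifacts can never end up in the upload set by accident.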
---

## 📝 Option 2: Manual Upload

Upload these files to your HuggingFace Space:

### Required Files (Must Have)

```
app.py
llm.py
extractors.py
tagging.py
chunking.py
validation.py
reporting.py
dashboard.py
production_logger.py
quote_extractor.py
requirements.txt
```

### Optional Files

```
README.md
HUGGINGFACE_SPACES_SETUP.md
```

**DO NOT upload:**

- `.env` file
- `test_*.py` files
- `logs/` directory
- `outputs/` directory

---

## 🔧 Space Configuration

### 1. Create Space

- Go to https://huggingface.co/new-space
- Name: `transcriptor-ai` (or your choice)
- SDK: **Gradio**
- Hardware: **GPU (T4 or better)** (important!)

### 2. Upload Files

- Drag and drop all files from the list above
- OR connect a Git repository

### 3. Configure (Optional)

Go to **Settings → Variables** and add:

| Variable | Value | When to Use |
|----------|-------|-------------|
| `DEBUG_MODE` | `True` | To see detailed logs |
| `LOCAL_MODEL` | `TinyLlama/TinyLlama-1.1B-Chat-v1.0` | For faster (but lower-quality) processing |
| `LLM_TEMPERATURE` | `0.5` | For more deterministic outputs |

**Note:** All settings have defaults, so you don't need to configure anything!
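"Defaults unless overridden" typically looks like the sketch below. The variable names mirror the table above, but the default values shown here are assumptions for illustration, not necessarily the app's actual defaults:

```python
import os

# Read Space Variables with fallbacks; every setting still works
# if the variable is absent. Defaults shown are illustrative.
DEBUG_MODE = os.getenv("DEBUG_MODE", "False").lower() == "true"
LOCAL_MODEL = os.getenv("LOCAL_MODEL", "microsoft/Phi-3-mini-4k-instruct")
LLM_TEMPERATURE = float(os.getenv("LLM_TEMPERATURE", "0.7"))
```

Anything you set under **Settings → Variables** is exposed to the app as an environment variable, which is why no code change is needed to reconfigure a Space.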
---

## ⏱️ First Deployment

### What to Expect

1. **Build time:** 2-5 minutes (installing dependencies)
2. **Model download:** 2-5 minutes (first time only - downloads Phi-3-mini)
3. **Subsequent starts:** 30-60 seconds

### Watch the Logs

Click the **Logs** tab to see:

```
✓ Configuration loaded for HuggingFace Spaces
🚀 TranscriptorAI Enterprise - LLM Backend: local
[Local Model] Loading microsoft/Phi-3-mini-4k-instruct...
Downloading (…)lve/main/config.json: 100%
[Local Model] ✓ Model loaded on cuda:0
Running on local URL: http://0.0.0.0:7860
```

---

## 🧪 Test Your Space

1. Wait for the "Running on local URL" message
2. Upload a sample transcript (DOCX or PDF)
3. Select "HCP" as the interviewee type
4. Click "Analyze Transcripts"

**Expected:**

- Processing time: 5-10 minutes (depending on transcript length)
- Quality score: 0.7-1.0
- CSV and PDF downloads available

---

## 🐛 Troubleshooting

### Error: `ModuleNotFoundError: No module named 'quote_extractor'`

**Status:** ✅ FIXED - this module is now optional

### Error: `ModuleNotFoundError: No module named 'xyz'`

**Solution:** Upload the missing `xyz.py` file

### Error: `CUDA out of memory`

**Solution:**

- Change the model: add the Variable `LOCAL_MODEL=TinyLlama/TinyLlama-1.1B-Chat-v1.0`
- OR upgrade to a larger GPU

### Error: Very slow processing

**Check:**

- Is GPU hardware selected? (Not CPU)
- Look for "Model loaded on cuda:0" in the logs
- If you see "cpu", upgrade to a GPU tier
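The check above boils down to one decision at startup. A minimal sketch of that logic, where the function name and message wording are assumptions rather than the app's actual code:

```python
def pick_device(cuda_available: bool) -> str:
    """Choose the device string the model loads on, warning loudly on
    CPU fallback - the usual cause of very slow processing."""
    if cuda_available:
        return "cuda:0"
    print("[Local Model] WARNING: no GPU detected, falling back to CPU. "
          "Check the Space hardware tier if processing is very slow.")
    return "cpu"
```

In the real app the boolean would come from `torch.cuda.is_available()`; the point is that a CPU fallback is silent unless something logs it, which is why the log line matters.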
### Quality Score Still 0.00

**Debug:**

1. Set `DEBUG_MODE=True` in Variables
2. Check the logs for "[Local Model] ✓ Generated X characters"
3. Look for "[LLM Debug] Successfully extracted JSON"
4. If you see `[Error]` messages, share them
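The "extracted JSON" log line refers to pulling structured output out of raw model text, a common failure point behind a 0.00 score: the model answers correctly but wraps the JSON in prose. A hedged sketch of that kind of extraction (not the app's actual implementation):

```python
import json
import re

def extract_json(raw: str):
    """Return the first JSON object embedded in model output, or None.
    Models often surround JSON with prose, so search instead of parsing
    the whole string directly."""
    match = re.search(r"\{.*\}", raw, re.DOTALL)
    if not match:
        return None
    try:
        return json.loads(match.group(0))
    except json.JSONDecodeError:
        return None
```

If a function like this returns `None` for every chunk, every per-chunk score defaults to zero, which is exactly the symptom described above.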
---

## 💡 Tips

### Reduce Costs

- The Space sleeps after 48 hours of inactivity (free)
- You only pay for GPU time while the Space is active
- ~$0.60/hour for a T4 GPU

### Improve Speed

- Use a smaller model (TinyLlama)
- Reduce max tokens (edit llm.py line 410)
- Process fewer chunks

### Improve Quality

- Use a larger model (Mistral-7B)
- Increase the temperature for more creative outputs
- Keep the default Phi-3-mini for the best balance

---

## 📞 Need Help?

1. **Check the logs first** - most issues show clear error messages
2. **Read HUGGINGFACE_SPACES_SETUP.md** - detailed troubleshooting
3. **Test locally first** - run `python test_local_model.py`

---

## ✨ You're Ready!

Run the preparation script:

```bash
python prepare_for_spaces.py
```

Then upload to HuggingFace Spaces and you're done! 🎉

---

**Last Updated:** October 2025