# Required Files for HuggingFace Spaces Deployment
## ✅ CRITICAL - Must Upload These Files
### Main Application
- `app.py` - Main Gradio application
### Core Processing Modules
- `llm.py` - LLM inference (local model support)
- `extractors.py` - DOCX/PDF text extraction
- `tagging.py` - Speaker identification
- `chunking.py` - Semantic text chunking
- `validation.py` - Quality scoring and validation
- `reporting.py` - CSV/PDF report generation
- `dashboard.py` - Dashboard generation
- `production_logger.py` - Session logging
### Optional but Recommended
- `quote_extractor.py` - Market research quote extraction (now optional)
### Configuration
- `requirements.txt` - Python dependencies
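- `README.md` - Documentation (optional but good practice); on Spaces its YAML header also holds the Space configuration, so keep that header if you replace the generated file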
---
## ❌ DO NOT Upload These Files
### Local Development Only
- `.env` - Contains local secrets (set them as Secrets or Variables in the Space settings instead)
- `*.log` - Log files
- `logs/` - Log directory
- `outputs/` - Output directory
- `__pycache__/` - Python cache
- `.git/` - Git repository
### Test Files (Not Needed)
- `test_*.py` - All test scripts
- `check_*.py` - Check scripts
- `debug_*.py` - Debug scripts
- `verify_*.py` - Verification scripts
- `fix_*.py` - Fix scripts
- `patch_*.py` - Patch scripts
- `create_sample_*.py` - Sample creation
### Documentation (Optional)
- `*.md` files - Helpful but not required for app to run
- You can upload them if you want documentation in your Space
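The exclusion lists above can be checked mechanically before an upload. The following is a minimal sketch, assuming a flat project folder; `should_upload`, `EXCLUDE_PATTERNS`, and `EXCLUDE_DIRS` are illustrative names, not part of the app.

```python
from fnmatch import fnmatchcase

# File-name patterns copied from the "DO NOT Upload" lists above.
EXCLUDE_PATTERNS = [
    ".env", "*.log", "test_*.py", "check_*.py", "debug_*.py",
    "verify_*.py", "fix_*.py", "patch_*.py", "create_sample_*.py",
]
# Directories whose whole contents stay local.
EXCLUDE_DIRS = {"logs", "outputs", "__pycache__", ".git"}


def should_upload(relative_path: str) -> bool:
    """Return True if a file (path relative to the project root) may be uploaded."""
    parts = relative_path.replace("\\", "/").split("/")
    # Reject anything that lives inside an excluded directory.
    if any(part in EXCLUDE_DIRS for part in parts[:-1]):
        return False
    # Reject file names that match an excluded pattern.
    return not any(fnmatchcase(parts[-1], pattern) for pattern in EXCLUDE_PATTERNS)
```

For example, `should_upload("app.py")` is true, while `should_upload("test_llm.py")` and `should_upload("logs/session.log")` are false.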
---
## 📦 Minimal File List (Absolute Minimum)
If you want the smallest deployment, upload only these:
```
app.py
llm.py
extractors.py
tagging.py
chunking.py
validation.py
reporting.py
dashboard.py
production_logger.py
requirements.txt
```
**Quote extraction will be disabled** but everything else will work.
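An optional module like this is usually handled with a guarded import. The sketch below shows the general pattern only; it is not taken from `app.py`, and `extract_quotes` is a hypothetical function name used for illustration.

```python
# Guarded import: the app keeps running when quote_extractor.py is absent.
try:
    import quote_extractor  # only present if quote_extractor.py was uploaded
    QUOTES_ENABLED = True
except ImportError:
    quote_extractor = None
    QUOTES_ENABLED = False


def extract_quotes_if_available(text: str) -> list:
    """Return extracted quotes, or an empty list when the feature is disabled."""
    if not QUOTES_ENABLED:
        return []
    # Hypothetical call; the real module's API may differ.
    return quote_extractor.extract_quotes(text)
```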
---
## 📋 Complete File List (Recommended)
Upload all core files plus quote extraction:
```
app.py
llm.py
extractors.py
tagging.py
chunking.py
validation.py
reporting.py
dashboard.py
production_logger.py
quote_extractor.py
requirements.txt
README.md (optional)
```
---
## πŸ” How to Check What's Missing
If you get `ModuleNotFoundError: No module named 'xyz'`, you need to upload `xyz.py`.
**Common missing modules:**
- `quote_extractor` → Upload `quote_extractor.py`
- `production_logger` → Upload `production_logger.py`
- `dashboard` → Upload `dashboard.py`
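Rather than waiting for the error, you can compare a local folder against the minimal file list before uploading. This is a small sketch under the assumption that all files sit in one folder; `missing_files` is an illustrative helper name.

```python
from pathlib import Path

# The minimal file list from this document.
REQUIRED_FILES = [
    "app.py", "llm.py", "extractors.py", "tagging.py", "chunking.py",
    "validation.py", "reporting.py", "dashboard.py",
    "production_logger.py", "requirements.txt",
]


def missing_files(folder: str) -> list[str]:
    """Return the required file names that are not present in `folder`."""
    root = Path(folder)
    return [name for name in REQUIRED_FILES if not (root / name).is_file()]
```

Calling `missing_files(".")` from the project root returns an empty list when every required file is present.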
---
## πŸ“ Folder Structure on HuggingFace Spaces
Your Space should look like:
```
your-space/
├── app.py
├── llm.py
├── extractors.py
├── tagging.py
├── chunking.py
├── validation.py
├── reporting.py
├── dashboard.py
├── production_logger.py
├── quote_extractor.py (optional)
├── requirements.txt
└── README.md (optional)
```
**Do NOT create subdirectories** - keep all Python files in the root.
---
## 🚀 Quick Upload Checklist
Before uploading to Spaces:
- [ ] `app.py` - Main file
- [ ] All imported modules (llm, extractors, etc.)
- [ ] `requirements.txt` - Dependencies
- [ ] Selected **GPU** hardware in Spaces settings
- [ ] No `.env` file included
- [ ] No test/debug files included
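To confirm the "all imported modules" item, you can list which local `.py` files `app.py` imports directly. The sketch below uses the standard `ast` module; `imported_names` and `local_modules` are illustrative names, and the check covers only direct imports of the file you pass in, not imports made by those modules in turn.

```python
import ast
from pathlib import Path


def imported_names(source: str) -> set[str]:
    """Collect top-level module names from absolute import statements."""
    names = set()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            names.update(alias.name.split(".")[0] for alias in node.names)
        elif isinstance(node, ast.ImportFrom) and node.module and node.level == 0:
            names.add(node.module.split(".")[0])
    return names


def local_modules(app_path: str) -> list[str]:
    """Return the .py files next to `app_path` that it imports directly."""
    app = Path(app_path)
    names = imported_names(app.read_text(encoding="utf-8"))
    return sorted(f"{n}.py" for n in names if (app.parent / f"{n}.py").is_file())
```

Run `local_modules("app.py")` in the project folder and make sure every returned file is part of the upload.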
---
## 🔧 Troubleshooting Import Errors
### Error: `ModuleNotFoundError: No module named 'quote_extractor'`
**Fixed!** The app now treats this module as optional and runs without it. Upload `quote_extractor.py` only if you want quote extraction enabled.
### Error: `ModuleNotFoundError: No module named 'extractors'`
**Solution:** Upload `extractors.py`
### Error: `ModuleNotFoundError: No module named 'production_logger'`
**Solution:** Upload `production_logger.py`
### Error: `ModuleNotFoundError: No module named 'transformers'`
**Solution:** Check `requirements.txt` is uploaded and correct
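The cases above follow one rule: a missing project module means a missing file, and anything else points to `requirements.txt`. A minimal sketch of that rule follows; `suggest_fix` and `LOCAL_MODULES` are illustrative names, and the module set is copied from the file lists in this document.

```python
import re

# Project modules named in this document (file name without ".py").
LOCAL_MODULES = {
    "llm", "extractors", "tagging", "chunking", "validation",
    "reporting", "dashboard", "production_logger", "quote_extractor",
}


def suggest_fix(error_message: str) -> str:
    """Map a ModuleNotFoundError message to the fix described above."""
    match = re.search(r"No module named '([^']+)'", error_message)
    if not match:
        return "Not a missing-module error."
    name = match.group(1).split(".")[0]
    if name in LOCAL_MODULES:
        return f"Upload {name}.py to the Space root."
    return f"Check that requirements.txt is uploaded and lists the package providing '{name}'."
```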
---
## πŸ“ Alternative: Use Git Repository
Instead of manual upload, you can:
1. Create a Git repository with only required files
2. Connect it to your HuggingFace Space
3. Auto-deploy on push
**Create `.gitignore` to exclude:**
```
.env
*.log
logs/
outputs/
__pycache__/
test_*.py
debug_*.py
*.pyc
```
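A scripted upload is a third route, different from the Git workflow above, that can apply the same exclusions. The sketch below assumes the third-party `huggingface_hub` package is installed and that you are already authenticated; `upload_space` is an illustrative helper, and the `repo_id` in the usage note is a placeholder.

```python
# Same exclusions as the .gitignore above, written as glob patterns.
IGNORE_PATTERNS = [
    ".env", "*.log", "logs/*", "outputs/*", "__pycache__/*",
    "test_*.py", "debug_*.py", "*.pyc",
]


def upload_space(folder: str, repo_id: str) -> None:
    """Upload `folder` to an existing Space, skipping ignored files."""
    # Imported here so the module loads even without huggingface_hub installed.
    from huggingface_hub import HfApi

    HfApi().upload_folder(
        folder_path=folder,
        repo_id=repo_id,
        repo_type="space",
        ignore_patterns=IGNORE_PATTERNS,
    )
```

A call would look like `upload_space(".", "your-username/your-space")`, with the placeholder replaced by your own Space ID.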
---
## Last Updated
October 2025