# 🚀 Real-Time Progress - SkyPilot Fine-tuning
**Status**: ⏳ IN PROGRESS
**Started**: 2025-12-02 13:00 UTC
**Cluster**: sky-33ba-marcos
---
## 📊 Current Job: Fine-tuning
### Machine Provisioned:
```
Provider: Vast.ai (Czechia, CZ, EU)
Instance: A100 SXM4
vCPUs: 32 cores
RAM: 64GB
GPU: A100 (1x)
Cost: $0.00/hr ✨ FREE!
```
### What's Running:
1. ✅ Machine provisioned
2. ⏳ Installing dependencies (torch, transformers, librosa)
3. ⏳ Cloning repository
4. ⏳ Creating synthetic data (50 samples/emotion)
5. ⏳ Preparing dataset
6. ⏳ Fine-tuning emotion2vec (10 epochs)
7. ⏳ Testing model
### Estimated Time:
- Setup: ~5min
- Data generation: ~1min
- Fine-tuning: ~20-30min
- Testing: ~2min
- **Total**: ~30-40min
### Expected Output:
```
✅ Fine-tuning complete!
Model saved to: models/emotion/emotion2vec_finetuned_synthetic/
```
---
## 📝 How to Monitor
### Check logs in real-time:
```bash
sky logs sky-33ba-marcos  # streams (follows) the job log by default
```
### Check status:
```bash
sky status
```
### SSH to machine (while running):
```bash
ssh sky-33ba-marcos  # SkyPilot registers the cluster name in ~/.ssh/config
# Inside:
cd ensemble-tts-annotation
watch -n 1 nvidia-smi # Monitor GPU usage
```
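
`watch` is interactive, so its output is lost once the SSH session ends. To record GPU usage alongside the training logs, a minimal sketch could call `nvidia-smi` in CSV query mode and parse the result. The function names and dictionary keys here are illustrative and not part of the repository.

```python
import subprocess

# Fields requested from nvidia-smi; memory values are reported in MiB.
QUERY = "utilization.gpu,memory.used,memory.total"


def parse_gpu_stats(csv_text):
    """Parse `nvidia-smi --format=csv,noheader,nounits` output, one dict per GPU."""
    stats = []
    for line in csv_text.strip().splitlines():
        util, used, total = (field.strip() for field in line.split(","))
        stats.append(
            {
                "util_pct": int(util),
                "mem_used_mib": int(used),
                "mem_total_mib": int(total),
            }
        )
    return stats


def query_gpu_stats():
    """Run nvidia-smi once and return the parsed stats (requires an NVIDIA driver)."""
    result = subprocess.run(
        ["nvidia-smi", f"--query-gpu={QUERY}", "--format=csv,noheader,nounits"],
        capture_output=True,
        text=True,
        check=True,
    )
    return parse_gpu_stats(result.stdout)
```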
---
## 💰 Cost Tracking
| Item | Cost |
|------|------|
| Validation test | $0.00 |
| Fine-tuning (current) | $0.00 (Vast.ai spot) |
| **Total so far** | **$0.00** ✨ |
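
Each row in the table is an hourly rate multiplied by the run time. A small sketch of that arithmetic follows; `job_cost_usd` is a hypothetical helper, and the non-zero rate in the comments is an example value only, not a price from this run.

```python
def job_cost_usd(hourly_rate_usd, minutes):
    """Return the cost of a run billed at an hourly rate, rounded to cents."""
    return round(hourly_rate_usd * minutes / 60, 2)


# This run: $0.00/hr for 3 minutes.
current_run = job_cost_usd(0.00, 3)

# Hypothetical paid instance at $1.50/hr for the 40-minute upper estimate.
paid_estimate = job_cost_usd(1.50, 40)
```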
---
## 🎯 After This Completes
### Next Steps:
1. **Download model**:
```bash
rsync -Pavz sky-33ba-marcos:~/ensemble-tts-annotation/models/emotion/emotion2vec_finetuned_synthetic ./models/
```
2. **Test locally**:
```python
from ensemble_tts import EnsembleAnnotator
annotator = EnsembleAnnotator(mode='balanced', device='cuda')
result = annotator.annotate('audio.wav')
```
3. **Cleanup**:
```bash
sky down sky-33ba-marcos
```
4. **Then run**:
- Multi-GPU test (optional)
- OR Full Orpheus annotation (118k samples)
---
## πŸ“ˆ Progress Updates
### ✅ Job Completed - Partial Success
**Time**: 2025-12-02 13:03 UTC
**Duration**: 3 minutes
**Status**: ✅ SUCCEEDED (with an error in model loading)
#### What Worked ✅
- ✅ Machine provisioned (A100 SXM4, 32 vCPUs, 64GB RAM)
- ✅ Dependencies installed (torch, transformers, librosa)
- ✅ Repository cloned
- ✅ **350 synthetic samples created** (50/emotion)
- ✅ **Dataset prepared** (data/prepared/synthetic_prepared)
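
The sample count implies seven emotion classes (350 / 50 = 7). The repository's actual generator is not shown in this log, so the following is only a stand-in sketch of a "50 samples per emotion" layout, using sine tones written with the standard library. The label names, directory layout, and tone frequencies are all assumptions.

```python
import math
import struct
import wave
from pathlib import Path

# Hypothetical label set: the log only implies that there are 7 classes.
EMOTIONS = ["neutral", "happy", "sad", "angry", "fearful", "disgusted", "surprised"]


def write_tone(path, freq_hz, seconds=1.0, sample_rate=16000):
    """Write a mono 16-bit PCM sine tone as a placeholder for synthetic speech."""
    n_frames = int(seconds * sample_rate)
    frames = b"".join(
        struct.pack(
            "<h", int(0.3 * 32767 * math.sin(2 * math.pi * freq_hz * i / sample_rate))
        )
        for i in range(n_frames)
    )
    with wave.open(str(path), "wb") as wav:
        wav.setnchannels(1)
        wav.setsampwidth(2)  # 2 bytes = 16-bit samples
        wav.setframerate(sample_rate)
        wav.writeframes(frames)


def make_dataset(root, samples_per_emotion=50):
    """Create root/<emotion>/<index>.wav and return a list of (path, label) pairs."""
    manifest = []
    for label_idx, emotion in enumerate(EMOTIONS):
        out_dir = Path(root) / emotion
        out_dir.mkdir(parents=True, exist_ok=True)
        for i in range(samples_per_emotion):
            path = out_dir / f"{i:03d}.wav"
            # Vary the pitch per class and per sample so files are not identical.
            write_tone(path, freq_hz=200 + 40 * label_idx + i)
            manifest.append((path, emotion))
    return manifest
```

With the default `samples_per_emotion=50`, `make_dataset` writes 7 x 50 = 350 files, matching the count reported above.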
#### Issues Found ❌
- ❌ emotion2vec model loading failed
- ❌ Model requires `funasr` library (not standard transformers)
- ❌ Fine-tuning didn't execute
- ❌ Model testing failed
#### Next Steps 🔧
1. Update the emotion2vec implementation to load a wav2vec2-compatible model
2. Re-run fine-tuning with corrected code
3. Or: Install funasr for native emotion2vec support
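
The job finished setup and data generation before failing at model loading, so one option is a preflight check that runs before any work starts. The sketch below reports which modules are missing and shows one way to load a wav2vec2 classifier through `transformers`. `missing_modules` and `load_wav2vec2_classifier` are hypothetical helpers, the `facebook/wav2vec2-base` checkpoint is an assumed example rather than the model the repository uses, and `num_labels=7` is inferred from 350 samples at 50 per emotion.

```python
import importlib.util


def missing_modules(names):
    """Return the top-level module names from `names` that cannot be imported."""
    return [name for name in names if importlib.util.find_spec(name) is None]


def load_wav2vec2_classifier(checkpoint="facebook/wav2vec2-base", num_labels=7):
    """Load a wav2vec2 model with a fresh classification head.

    The checkpoint name is an assumption; substitute the model the project
    settles on. Importing lazily keeps the preflight check usable even when
    transformers is not installed.
    """
    from transformers import Wav2Vec2ForSequenceClassification

    return Wav2Vec2ForSequenceClassification.from_pretrained(
        checkpoint, num_labels=num_labels
    )


def preflight(required=("torch", "transformers", "librosa")):
    """Fail fast with a clear message if a required dependency is absent."""
    missing = missing_modules(required)
    if missing:
        raise RuntimeError(f"Missing dependencies: {', '.join(missing)}")
```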
**Last update**: 2025-12-02 13:07 UTC - Completed with model loading error