---
tags:
- vanitas
- spoken-dialogue
- mamba-ssm
- flow-matching
license: mit
datasets:
- kyutai/DailyTalkContiguous
---
# Vanitas SFT Model
Vanitas SFT is a supervised fine-tuned model for real-time spoken dialogue, trained on [kyutai/DailyTalkContiguous](https://huggingface.co/datasets/kyutai/DailyTalkContiguous).
## Architecture
- **Perception Stream**: Mamba-2 SSM (4 layers, d=256)
- **Cognition Core**: Sparse Attention (4 layers, d=256)
- **Production Stream**: Mamba-2 + Flow Matching (4 layers, d=256)
## Training
- **Dataset**: kyutai/DailyTalkContiguous (2,286 dialogues)
- **Epochs**: 50
- **Batch Size**: 16
- **Learning Rate**: 2e-4
- **Hardware**: NVIDIA A100 (Modal Cloud)
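
As a rough sanity check on the numbers above, the sketch below estimates how many optimizer steps the run would take. It assumes one dialogue per training sample, that the last partial batch is kept, and that all 2,286 dialogues are used for training; none of this is stated in this card, and any held-out validation split would lower the count. `TrainingSetup` is a hypothetical name used only for illustration.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class TrainingSetup:
    """Hyperparameters copied from the Training section of this card."""
    num_dialogues: int = 2286
    epochs: int = 50
    batch_size: int = 16
    learning_rate: float = 2e-4

    def steps_per_epoch(self) -> int:
        # Assumption: one dialogue per sample, last partial batch kept.
        return math.ceil(self.num_dialogues / self.batch_size)

    def total_steps(self) -> int:
        # Assumption: one optimizer step per batch (no gradient accumulation).
        return self.steps_per_epoch() * self.epochs


setup = TrainingSetup()
print(setup.steps_per_epoch(), setup.total_steps())  # 143 7150
```

Under these assumptions the run is an upper bound of about 7,150 steps.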
## Files
- `best_model.pt` — Checkpoint with the lowest validation loss
- `final_model.pt` — Checkpoint after completing all 50 epochs
- `config.json` — Model configuration
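
The files listed above can be inspected with a few lines of Python. This is a minimal sketch, not the project's own loading code: the internal layout of the `.pt` checkpoints and the keys in `config.json` are not documented here, so the helpers only read the files and report what they contain. The helper names are hypothetical, and `repo_id` must be supplied by the caller since this card does not state it.

```python
import json


def download_file(repo_id: str, filename: str) -> str:
    """Fetch one file from the Hub and return its local path."""
    # Imported lazily so the other helpers work without huggingface_hub.
    from huggingface_hub import hf_hub_download
    return hf_hub_download(repo_id=repo_id, filename=filename)


def load_config(path: str) -> dict:
    """Read config.json; its schema is not documented in this card."""
    with open(path, "r", encoding="utf-8") as f:
        return json.load(f)


def load_checkpoint(path: str):
    """Load a .pt file on CPU. Requires PyTorch to be installed."""
    import torch
    # Recent PyTorch versions default to weights_only=True, which may
    # reject checkpoints that store arbitrary Python objects.
    return torch.load(path, map_location="cpu")


def describe_checkpoint(ckpt) -> list:
    """List top-level keys if the checkpoint is a dict, else its type name."""
    if isinstance(ckpt, dict):
        return sorted(str(k) for k in ckpt)
    return [type(ckpt).__name__]
```

A typical session would call `download_file(repo_id, "best_model.pt")`, pass the returned path to `load_checkpoint`, and use `describe_checkpoint` to see which keys are present before trying to build a model from them.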