How to use md13/vanitas-sft with MambaSSM:
from mamba_ssm import MambaLMHeadModel
model = MambaLMHeadModel.from_pretrained("md13/vanitas-sft")
metadata
tags:
- vanitas
- spoken-dialogue
- mamba-ssm
- flow-matching
license: mit
datasets:
- kyutai/DailyTalkContiguous
Vanitas SFT Model
Supervised fine-tuned model for real-time spoken dialogue, trained on kyutai/DailyTalkContiguous.
Architecture
- Perception Stream: Mamba-2 SSM (4 layers, d=256)
- Cognition Core: Sparse Attention (4 layers, d=256)
- Production Stream: Mamba-2 + Flow Matching (4 layers, d=256)
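The three stages listed above can be written down as a small configuration sketch. The class and field names (`StageSpec`, `kind`, `n_layers`, `d_model`) are hypothetical and are not taken from the repository's config.json; only the layer counts and widths come from the list.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class StageSpec:
    """One stage of the pipeline (hypothetical structure, for illustration only)."""
    name: str
    kind: str
    n_layers: int
    d_model: int


# Layer counts and widths are copied from the architecture list;
# the stage and kind labels are assumed names, not identifiers from the repo.
STAGES = (
    StageSpec("perception", "mamba2", 4, 256),
    StageSpec("cognition", "sparse_attention", 4, 256),
    StageSpec("production", "mamba2+flow_matching", 4, 256),
)

total_layers = sum(stage.n_layers for stage in STAGES)
shared_width = {stage.d_model for stage in STAGES}
print(total_layers, shared_width)  # 12 {256}
```

All three stages share the same width, so the 12 layers can in principle be stacked without projection layers between them; whether the actual model does so is not stated in this card.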
Training
- Dataset: kyutai/DailyTalkContiguous (2,286 dialogues)
- Epochs: 50
- Batch Size: 16
- Learning Rate: 2e-4
- Hardware: NVIDIA A100 (Modal Cloud)
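As a rough sense of scale, the hyperparameters above imply an upper bound on the number of optimizer steps. The sketch below assumes that each dialogue is one training example, that all 2,286 dialogues are used for training, that the last partial batch is kept, and that there is no gradient accumulation. None of these is stated in the card, and a held-out validation split would lower the numbers.

```python
import math


def optimizer_steps(n_examples: int, batch_size: int, epochs: int) -> tuple[int, int]:
    """Return (steps per epoch, total steps), keeping the last partial batch."""
    steps_per_epoch = math.ceil(n_examples / batch_size)
    return steps_per_epoch, steps_per_epoch * epochs


# 2,286 dialogues, batch size 16, 50 epochs (values from the list above).
per_epoch, total = optimizer_steps(2286, 16, 50)
print(per_epoch, total)  # 143 7150
```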
Files
- best_model.pt: Checkpoint with the lowest validation loss
- final_model.pt: Checkpoint after completing all 50 epochs
- config.json: Model configuration
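One way to fetch and inspect these files is sketched below. It uses `hf_hub_download` from the `huggingface_hub` package to download a single file from the repository and the standard `json` module to read config.json. The helper names `download_file` and `read_config` are hypothetical. The keys inside config.json and the internal layout of the .pt checkpoints are not documented in this card, so the sketch does not assume any of them.

```python
import json


def download_file(filename: str, repo_id: str = "md13/vanitas-sft") -> str:
    """Download one file from the Hub and return its local cache path."""
    # Imported lazily: huggingface_hub is a third-party package and the
    # call needs network access.
    from huggingface_hub import hf_hub_download

    return hf_hub_download(repo_id=repo_id, filename=filename)


def read_config(path: str) -> dict:
    """Parse a JSON configuration file into a dictionary."""
    with open(path, encoding="utf-8") as handle:
        return json.load(handle)


# Example usage (requires network access):
#   config = read_config(download_file("config.json"))
#   print(sorted(config))  # list the available keys before relying on any
```

The same `download_file` helper can fetch best_model.pt or final_model.pt. Those checkpoints would typically be opened with `torch.load(path, map_location="cpu")`, after which it is worth inspecting the returned object before assuming a particular structure.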