Paper: OmniFlatten: An End-to-end GPT Model for Seamless Voice Conversation (arXiv:2410.17799)
A full-duplex spoken interaction model with fewer than 200M parameters and 200 ms turn-taking, built on SmolLM2-135M and CosyVoice. See ARCHITECTURE.md for the complete PRD.
This model repository was generated by ML Intern, an agent for machine learning research and development on the Hugging Face Hub.
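from transformers import AutoModelForCausalLM, AutoTokenizer

# Repository ID on the Hugging Face Hub.
model_id = "PranavHarshan/SmolDuplex"
# Download (or load from the local cache) the tokenizer and the model weights.
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)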
For non-causal architectures, replace AutoModelForCausalLM with the appropriate AutoModel class.
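One way to pick the loader class automatically is to inspect the `architectures` field of the checkpoint's config. The sketch below is a heuristic, not part of this repository: `pick_auto_class_name` and `load_model` are hypothetical helper names, and the rule "a class name ending in `ForCausalLM` maps to `AutoModelForCausalLM`, anything else falls back to `AutoModel`" is an assumption that covers common cases only.

```python
from typing import Optional, Sequence


def pick_auto_class_name(architectures: Optional[Sequence[str]]) -> str:
    """Return the name of a transformers Auto class for the given architectures.

    Heuristic (assumption): any architecture whose class name ends in
    "ForCausalLM" is loaded with AutoModelForCausalLM; everything else,
    including a missing architectures field, falls back to AutoModel.
    """
    for name in architectures or []:
        if name.endswith("ForCausalLM"):
            return "AutoModelForCausalLM"
    return "AutoModel"


def load_model(model_id: str):
    """Load a checkpoint with the Auto class chosen by the heuristic above."""
    # Imported lazily so the helper above can be used without transformers installed.
    import transformers

    config = transformers.AutoConfig.from_pretrained(model_id)
    class_name = pick_auto_class_name(getattr(config, "architectures", None))
    auto_cls = getattr(transformers, class_name)
    return auto_cls.from_pretrained(model_id)
```

Calling `load_model("PranavHarshan/SmolDuplex")` would then download the config first and load the weights with whichever class the heuristic selects. For architectures the heuristic does not recognise, choosing the class by hand remains the safer option.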