Automatic Speech Recognition
PEFT
Safetensors
audio
lora
voxtral
voxtral-realtime
affect-tagging
expressive-tags
half-duplex
elevenlabs-tags
raft
rejection-sampling
rlhf
How to use YongkangZOU/evoxtral-realtime-rl with PEFT:

    from peft import PeftModel
    from transformers import AutoModelForCausalLM

    base_model = AutoModelForCausalLM.from_pretrained("mistralai/Voxtral-Mini-4B-Realtime-2602")
    model = PeftModel.from_pretrained(base_model, "YongkangZOU/evoxtral-realtime-rl")
RL: RAFT (Reward rAnked FineTuning) on Recipe I. Tag F1 28%, tag recall 50%, hallucination down 5 pp vs. SFT. Pair with the base model for ASR; see the Mode B hybrid in serve_modal.py. https://github.com/YongkangZOU/evoxtral-realtime
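The selection step behind RAFT-style rejection sampling can be sketched as follows: sample several candidate transcripts per utterance, score each with a reward, and keep only the top-ranked ones as fine-tuning targets. This is a minimal illustration, not the repository's training code. The bracketed tag syntax (e.g. `[laughs]`), the use of tag F1 as the reward, and the names `extract_tags`, `tag_f1`, and `raft_select` are all assumptions made for the example.

```python
import re
from collections import Counter

# Assumption: expressive tags are written in square brackets, e.g. "[laughs]".
TAG_RE = re.compile(r"\[([^\[\]]+)\]")


def extract_tags(text):
    """Return the lowercased bracketed tags found in text, in order."""
    return [t.strip().lower() for t in TAG_RE.findall(text)]


def tag_f1(hypothesis, reference):
    """Multiset F1 over tags; a hypothetical stand-in for the real reward."""
    hyp = Counter(extract_tags(hypothesis))
    ref = Counter(extract_tags(reference))
    overlap = sum((hyp & ref).values())  # tags present in both, with counts
    precision = overlap / sum(hyp.values()) if hyp else 0.0
    recall = overlap / sum(ref.values()) if ref else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def raft_select(candidates, reference, keep=1):
    """Rank sampled candidates by reward and keep the best `keep` of them."""
    ranked = sorted(candidates, key=lambda c: tag_f1(c, reference), reverse=True)
    return ranked[:keep]


# Toy example: four sampled candidates for one utterance.
reference = "[laughs] That was close. [sighs]"
candidates = [
    "That was close.",
    "[laughs] That was close.",
    "[laughs] That was close. [sighs]",
    "[gasps] That was close. [whispers]",
]
best = raft_select(candidates, reference, keep=1)
```

In a full RAFT loop the kept candidates would be collected across the dataset and used for another round of supervised fine-tuning; a real reward would likely also penalize hallucinated tags and transcription errors.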
acfd33d verified
{
  "feature_extractor": {
    "feature_extractor_type": "VoxtralRealtimeFeatureExtractor",
    "feature_size": 128,
    "global_log_mel_max": 1.5,
    "hop_length": 160,
    "n_fft": 400,
    "padding_side": "right",
    "padding_value": 0.0,
    "return_attention_mask": true,
    "sampling_rate": 16000,
    "win_length": 400
  },
  "processor_class": "VoxtralRealtimeProcessor"
}
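The timing implied by these values can be worked out with a little arithmetic. The sketch below assumes the usual short-time Fourier transform reading of the fields (`hop_length` and `win_length` in samples, `feature_size` as the number of mel bins); the helper name `frame_geometry` is hypothetical and is not part of the `transformers` API.

```python
# Values copied from the feature_extractor block above.
config = {
    "feature_size": 128,
    "hop_length": 160,
    "n_fft": 400,
    "sampling_rate": 16000,
    "win_length": 400,
}


def frame_geometry(cfg):
    """Derive frame timing from an STFT-style feature extractor config."""
    sr = cfg["sampling_rate"]
    return {
        # Time between consecutive frames, in milliseconds.
        "hop_ms": 1000 * cfg["hop_length"] / sr,
        # Duration of audio covered by one analysis window, in milliseconds.
        "window_ms": 1000 * cfg["win_length"] / sr,
        # Number of feature frames produced per second of audio.
        "frames_per_second": sr / cfg["hop_length"],
        # One-sided FFT bins before the mel projection.
        "fft_bins": cfg["n_fft"] // 2 + 1,
        # Assumed to be the mel dimension of each frame.
        "mel_bins": cfg["feature_size"],
    }


geometry = frame_geometry(config)
print(geometry)
```

Under these assumptions the extractor uses a 25 ms window advanced every 10 ms, giving 100 frames per second of 16 kHz audio, each with 128 features.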