Commit History
BREAKTHROUGH: Fix PAD token issue with correct STT start token 8f3b8dc
Enable audio boosting for quiet speech (RMS < 0.12) d065e75
Remove cross-attention from v0_1_vision config for Q8 model 8ac2fe2
Use v0_1_vision() config for exact Q8 model dimension match de3ee9c
Fix Q8 model shape mismatch - let GGUF auto-configure dimensions 018577d
Fix GGUF model loading - HeaderTooLarge error resolved f0ee083
Fix transformer config field names (validated) - v1.6.4 23055ba
Fix LM config compilation error - v1.6.3 ce00088
Fix model dimensions + pre-download moshiko-candle-bf16 models - v1.6.2 bd7015b
Implement exact official Moshi backend configuration - v1.6.0 a39a349
Switch to kyutai/moshiko-candle-q8 quantized model - v1.5.0 ac08335
Fix Rust borrowing error in bounds checking - v1.4.40 9a275d7
Add defensive bounds checking for Moshi lm_state.step() - v1.4.39 7c0151f
v1.4.38: CRITICAL FIX - Moshi dual-stream configuration for STT 0a9e40c
DEFINITIVE FIX: Configure Mimi for exactly 8 codebooks to match STT v0_1_one_way() v1.4.18 939920e
Use proper STT-only configuration: v0_1_one_way() 7385d5b
Fix array bounds: limit codes to 8 elements instead of 16 093fc01
Force Rust binary rebuild with v1.4.13 - includes Moshi panic fix 0590246
Fix codes mutability for truncate - make codes mutable 76a3c4f
Fix Rust compilation warning: remove unused mut 5f68dd7
Fix critical Moshi panic: correct audio codebook configuration aecbce5
v1.4.8: Add language conditioning for multilingual STT model 26d8204
Peter Michael Gits Claude commited on
v1.4.7: BREAKTHROUGH - Fix extremely quiet audio causing pad tokens c638914
v1.4.6: SYSTEMATIC DEBUGGING - Audio analysis + step limit fix 73f0350
v1.4.5: Follow moshi library patterns - Remove competing normalization a3c5d00
v1.4.4: CRITICAL FIX - Audio normalization destroying speech recognition a4207ab
v1.4.3: Debug token filtering - Show all generated tokens a6c9652
CRITICAL FIX: Resolve index out of bounds panic in audio processing a1ef79c
CRITICAL FIX: Correct vocab size mismatch in 1B model configuration e8b026b
REVERT: Switch back to 1B multilingual model for T4 GPU compatibility 5d40667
Fix vocab_size scope error by adding to MoshiAsrModel struct 55a8e6e
Fix Rust compilation errors in model.rs ba9d4d9
MAJOR: Switch to 2.6B English STT model to match unmute.sh 81597c8
Fix lm_generate_multistream::State constructor 593a7cc
Fix STT implementation to use proper moshi-backend patterns 902e9ab
Add debugging for LM model text generation 7f194a3
Fix all candle_core references to use renamed candle crate a5fc257
Fix candle dependency using moshi pattern 3d7da2c
Fix candle imports to match moshi asr.rs patterns 2d3cafd
Implement A+ plan: Real text generation from LmModel cb2aec6
MAJOR: Pre-load models into Docker image for instant startup 290d157
REVERT: Restore to last known working state (pre-load_streaming) 2404b60
Fix compilation errors with moshi 0.6.3 API 2805311
Use official moshi-backend loading pattern: load_streaming() 6bdc0ee
Fix audio tensor rank: 2D → 3D for Mimi encoder compatibility 0770155
Fix moshi Config field names: vocab_size_in → text_in_vocab_size dc4ad94
Fix model vocab size mismatch: 48001 → 8001 tokens 75da5dd
Fix CUDA compatibility for T4 GPU by using F32 instead of BF16 21cada0
Fix root cause: eliminate multiple process startup v0.8.0 cbe173b