Frontend simplification (4→2 tabs) + lazy imports for HF Spaces 78caafb KinetoLabs Claude Opus 4.5 committed on Jan 11
Replace dual 8B with single 30B-A3B FP8 vision model 706520f KinetoLabs Claude Opus 4.5 committed on Jan 11
Reduce thinking model max_new_tokens to fix slow inference 0699c5f KinetoLabs Claude Opus 4.5 committed on Jan 11
Replace 30B MoE with dual 8B models (Thinking + Instruct) 333c083 KinetoLabs Claude Opus 4.5 committed on Jan 11
Implement lazy model loading to prevent CUDA OOM on 4xL4 GPUs 5f0db1e KinetoLabs Claude Opus 4.5 committed on Jan 10
Fix multi-GPU compatibility issues (6 locations) d1901ae KinetoLabs Claude Opus 4.5 committed on Jan 10
Fix critical model implementations and add sample scenarios f3ebc82 KinetoLabs Claude Opus 4.5 committed on Jan 10