Frontend simplification (4→2 tabs) + lazy imports for HF Spaces 78caafb KinetoLabs Claude Opus 4.5 committed on 2 days ago
Reduce thinking model max_new_tokens to fix slow inference 0699c5f KinetoLabs Claude Opus 4.5 committed on 2 days ago
Replace 30B MoE with dual 8B models (Thinking + Instruct) 333c083 KinetoLabs Claude Opus 4.5 committed on 2 days ago
Implement lazy model loading to prevent CUDA OOM on 4xL4 GPUs 5f0db1e KinetoLabs Claude Opus 4.5 committed on 2 days ago
Fix critical model implementations and add sample scenarios f3ebc82 KinetoLabs Claude Opus 4.5 committed on 2 days ago