Fix delete_revisions import with fallback cache cleanup 7a2a590 Alikestocode committed on Nov 10, 2025
Fix delete_revisions import - use fallback cache cleanup method 4be72e0 Alikestocode committed on Nov 10, 2025
Fix AWQModifier import path: use modifiers.awq instead of modifiers.quantization f0033ab Alikestocode committed on Nov 10, 2025
Fix LLM Compressor package name: llmcompressor (no hyphen) 2326498 Alikestocode committed on Nov 10, 2025
Remove duplicate LLM Compressor section - now primary method d4bc333 Alikestocode committed on Nov 10, 2025
Replace AutoAWQ with LLM Compressor (vLLM native) in Colab notebook ae07f77 Alikestocode committed on Nov 10, 2025
Add disk space cleanup after quantization in Colab notebook 24107f3 Alikestocode committed on Nov 10, 2025
Fix linter error: use %pip instead of !pip in Colab notebook 2dff966 Alikestocode committed on Nov 10, 2025
Add Colab notebook for AWQ quantization of router models a79bc8f Alikestocode committed on Nov 10, 2025
Clarify LLM Compressor optional status - vLLM has native AWQ support b2bf767 Alikestocode committed on Nov 10, 2025
Fix vLLM token parameter and improve streaming error handling b4fd5e9 Alikestocode committed on Nov 10, 2025
Add debug logging for model loading and generation issues 54880b1 Alikestocode committed on Nov 9, 2025
Fix streaming loop break condition - only break when finished is True d6f9002 Alikestocode committed on Nov 9, 2025
Add permission fix guide for spherical-gate-477614-q7 project 162c75a Alikestocode committed on Nov 8, 2025
Add Cloud Build deployment script and permission setup helper fd26b3d Alikestocode committed on Nov 8, 2025
Fix Gradio UI structure and add comprehensive fallback logging 03689e3 Alikestocode committed on Nov 8, 2025
Fix syntax error: correct indentation in BitsAndBytes fallback block f43bdac Alikestocode committed on Nov 8, 2025
Suppress AutoAWQ deprecation warnings and improve vLLM logging 83a232d Alikestocode committed on Nov 8, 2025
Implement vLLM with LLM Compressor and performance optimizations a79facb Alikestocode committed on Nov 8, 2025
Fix: Pre-create GPU wrappers at module load time for startup detection cdac920 Alikestocode committed on Nov 8, 2025
Make GPU duration slider functional with dynamic wrapper creation fc0ab14 Alikestocode committed on Nov 8, 2025
Fix indentation errors in _generate_router_plan_streaming_internal c454e43 Alikestocode committed on Nov 8, 2025
Fix: Remove context manager usage for spaces.GPU decorator a217627 Alikestocode committed on Nov 8, 2025
Add user-configurable GPU duration slider (60-1800 seconds) 9a4d6d3 Alikestocode committed on Nov 8, 2025
Fix: Move trim_at_stop_sequences function before it's used 597f1a9 Alikestocode committed on Nov 8, 2025
Improve streaming with incremental JSON parsing and plan end token f5a609d Alikestocode committed on Nov 7, 2025
Update app.py and requirements.txt for CourseGPT-Pro router models 4c3d05b Alikestocode committed on Nov 7, 2025
Update README: Focus on CourseGPT-Pro router checkpoints 4706b45 Alikestocode committed on Nov 7, 2025