fix: Modify code to ensure the input dtype/shape/key is what the ONNX model expects 200ed70 m97j committed on Dec 20, 2025
fix: Modify code to ensure the input dtype/shape/key is what the ONNX model expects ed9e701 m97j committed on Dec 20, 2025
fix: Modify code to ensure the input dtype/shape/key is what the ONNX model expects 683b339 m97j committed on Dec 20, 2025
fix: Modify code to ensure the input dtype/shape/key is what the ONNX model expects a7532d6 m97j committed on Dec 19, 2025
fix(llm_model): align token chunking and prefix handling with engine deb604d m97j committed on Dec 14, 2025
feat(inference engine): add input normalization and attention_mask support e923fc2 m97j committed on Dec 14, 2025
fix: make on_message_submit a generator and yield streaming tuples to match Gradio outputs 02e187b m97j committed on Dec 13, 2025
refactor: pass language textbox component to UI rendering and event binding for proper i18n support 70d32f4 m97j committed on Dec 13, 2025
refactor: pass language textbox component to UI rendering and event binding for proper i18n support 77f5dc6 m97j committed on Dec 13, 2025
refactor: move LLM model initialization from global scope to function-level for lazy loading f62140d m97j committed on Dec 13, 2025
refactor: move LLM model initialization from global scope to function-level for lazy loading 6b694d6 m97j committed on Dec 13, 2025
refactor: switch to lazy model/prefix initialization to improve startup and UI responsiveness 320147d m97j committed on Dec 13, 2025
fix: configure Gradio launch with explicit server_name/port and disable SSR for stable UI rendering 207ed35 m97j committed on Dec 13, 2025
Add torchao to requirements and fix torch version to match torchao 093c3c6 m97j committed on Dec 13, 2025
fix: use weights_only=True in torch.load to safely load state_dict f6e3bea m97j committed on Dec 13, 2025
Refactor model initialization to use hf_hub_download cache paths 9b58d8f m97j committed on Dec 13, 2025
Remove manual local_dir and local path variables from config.py ca0e5cc m97j committed on Dec 13, 2025
Refactor load_llm: use AutoConfig and direct state_dict loading 7d823a8 m97j committed on Dec 13, 2025