fix: Modify code to ensure the input dtype/shape/key is what the ONNX model expects 200ed70 m97j committed on Dec 20, 2025
fix: Modify code to ensure the input dtype/shape/key is what the ONNX model expects ed9e701 m97j committed on Dec 20, 2025
fix: Modify code to ensure the input dtype/shape/key is what the ONNX model expects 683b339 m97j committed on Dec 20, 2025
fix: Modify code to ensure the input dtype/shape/key is what the ONNX model expects a7532d6 m97j committed on Dec 19, 2025
fix(llm_model): align token chunking and prefix handling with engine deb604d m97j committed on Dec 14, 2025
feat(inference engine): add input normalization and attention_mask support e923fc2 m97j committed on Dec 14, 2025
refactor: move LLM model initialization from global scope to function-level for lazy loading f62140d m97j committed on Dec 13, 2025
fix: use weights_only=True in torch.load to safely load state_dict f6e3bea m97j committed on Dec 13, 2025
Refactor model initialization to use hf_hub_download cache paths 9b58d8f m97j committed on Dec 13, 2025
Refactor load_llm: use AutoConfig and direct state_dict loading 7d823a8 m97j committed on Dec 13, 2025