Commit History

fix: update onnxruntime version for Python 3.13 compatibility
b2a89ad

m97j committed on

add decoding interface
fe9397b

m97j committed on

edit: update datetime codes
83d11f8

m97j committed on

Update: Refine logic is now disabled
2462006

m97j committed on

Update: Refine logic is now disabled
4a1ae09

m97j committed on

Update: Query-based logic is now disabled.
ea0ccf7

m97j committed on

fix: Modify code to ensure the input dtype/shape/key is what the ONNX model expects
200ed70

m97j committed on

fix: Modify code to ensure the input dtype/shape/key is what the ONNX model expects
ed9e701

m97j committed on

fix: Modify code to ensure the input dtype/shape/key is what the ONNX model expects
683b339

m97j committed on

fix: Modify code to ensure the input dtype/shape/key is what the ONNX model expects
a7532d6

m97j committed on

edit planning step in main pipeline
a5e5e3d

m97j committed on

edit: downgrade torch version to match with torchao
81bbf68

m97j committed on

edit planning step in main pipeline
3dbea5c

m97j committed on

edit planning step in main pipeline
d1581d3

m97j committed on

fix(llm_model): align token chunking and prefix handling with engine
deb604d

m97j committed on

feat(initializer): store prefix cache as input_ids tensors
ac17ed0

m97j committed on

feat(inference engine): add input normalization and attention_mask support
e923fc2

m97j committed on

edit process request codes
8668f91

m97j committed on

apdate ui structure
627de06

m97j committed on

apdate ui structure
273aee7

m97j committed on

add model init at container startup
a677b03

m97j committed on

fix: make on_message_submit a generator and yield streaming tuples to match Gradio outputs
02e187b

m97j committed on

refactor: pass language textbox component to UI rendering and event binding for proper i18n support
70d32f4

m97j committed on

refactor: pass language textbox component to UI rendering and event binding for proper i18n support
77f5dc6

m97j committed on

edit login_btn.click handler's input
4d559f1

m97j committed on

edit format to match translations' placeholder
a807343

m97j committed on

add initial ui rendering
b1d8951

m97j committed on

update torch version
60163cc

m97j committed on

update import block
2256134

m97j committed on

refactor: move LLM model initialization from global scope to function-level for lazy loading
f62140d

m97j committed on

refactor: move LLM model initialization from global scope to function-level for lazy loading
6b694d6

m97j committed on

refactor: switch to lazy model/prefix initialization to improve startup and UI responsiveness
320147d

m97j committed on

fix: configure Gradio launch with explicit server_name/port and disable SSR for stable UI rendering
207ed35

m97j committed on

Edit : switch model path
3375419

m97j committed on

Add acelerate to requirements.txt
6125a3b

m97j committed on

Edit : add model load method
bf2f314

m97j committed on

Edit : add model load method
7b942de

m97j committed on

fix: update torch version
e472713

m97j committed on

fix: update torchao version
ff8b9e9

m97j committed on

update initializer.py
662eb29

m97j committed on

Add torchao to requirements and fix torch version to match torchao
093c3c6

m97j committed on

fix: use weights_only=True in torch.load to safely load state_dict
f6e3bea

m97j committed on

Edit llm model filename
8062148

m97j committed on

Edit wrong code
4f39165

m97j committed on

Refactor model initialization to use hf_hub_download cache paths
9b58d8f

m97j committed on

Remove manual local_dir and local path variables from config.py
ca0e5cc

m97j committed on

Refactor load_llm: use AutoConfig and direct state_dict loading
7d823a8

m97j committed on

Update initializer.py to use explicit Hub filenames
4e65de4

m97j committed on

Refactor config.py to separate Hub and local paths
3e43d9a

m97j committed on

Refactor import paths in app.py to remove 'app.' prefix
596fcaa

m97j committed on