future-html / gemmademo /_model.py

Commit History

bug fixes
e96d38d

aadya1762 committed on

bug fixes
d99243b

aadya1762 committed on

add sliders
e9e9e0c

aadya1762 committed on

add examples functionality
3a14fb3

aadya1762 committed on

remove unuseful model imports and comments
a251128

aadya1762 committed on

remove unnecessary models
7f1341d

aadya1762 committed on

remove unnecessary model
827ddeb

aadya1762 committed on

bug fixes
581c860

aadya1762 committed on

Add Gemma3-1B Quantized Model
0304bfe

aadya1762 committed on

handle batched response for inference
28295c6

aadya1762 committed on

handle streaming properly
b709bb5

aadya1762 committed on

Stream LLM responses
d24a753

aadya1762 committed on

port to gradio
8cc5c82

aadya1762 committed on

use 4 bit quantized models for faster inference
5ca1c38

aadya1762 committed on

Add model config sliders
c1e7456

aadya1762 committed on

Update _model.py
bc32324

Aadya Chinubhai committed on

use llama.cpp
bdca525

aadya1762 committed on

increase cache limit -> fewer recompilations by pytorch
5160420

aadya1762 committed on

remove torch.compile
9038c58

aadya1762 committed on

initial commit
b4ecb60

aadya1762 committed on