Spaces: vongole83/LinkedIn-Content-ZeroGPU

Commit History

perf: drop bnb 4-bit and torch.compile for faster ZeroGPU inference

23c94ae

vongole83 and Claude Sonnet 4.6 committed 24 days ago

fix: move demo.load inside Blocks context

21c8ff7

vongole83 committed on Apr 30

add page-load warmup to pre-load models and burn torch.compile on first visit

ef1de57

vongole83 committed on Apr 30

add torch.compile with reduce-overhead mode

02bac93

vongole83 committed on Apr 30

add streaming + sdpa attention for faster generation UX

df41ead

vongole83 committed on Apr 30

add bitsandbytes to runtime install so 4-bit quantization actually applies

392778d

vongole83 committed on Apr 30

fix inference: use return_dict=True and unpack inputs for generate

7cbc66f

vongole83 committed on Apr 30

bypass adapter_config.json by downloading weights-only snapshot

60717df

vongole83 committed on Apr 30

load fine-tune directly as merged model, drop peft dependency

a854c2a

vongole83 committed on Apr 30

add 4-bit quantization to bring model size back to ~3GB each

7a27688

vongole83 committed on Apr 30

fix dtype deprecation warning

4131f41

vongole83 committed on Apr 30

programmatic install as workaround for requirements.txt not being picked up

f2fee3b

vongole83 committed on Apr 30

fix requirements: remove spaces (pre-installed), pin transformers>=4.51 for Gemma 4

6a17476

vongole83 committed on Apr 30

force rebuild

eb89851

vongole83 committed on Apr 30

First Commit

bcd34ff

vongole83 committed on Apr 30

initial commit

41cbcc0
verified

vongole83 committed on Apr 29