Commit History
try to load gemma again
5a0f6f1
John Ho
committed on
testing fps for internvl3 processor
3feaae8
John Ho
committed on
added do_sample to generate
9137c51
John Ho
committed on
make sure temperature is float
a792463
John Ho
committed on
skipping the use of gemma model for now
96a7d4d
John Ho
committed on
added temp and testing gemma
035a7ef
John Ho
committed on
updated case matching
e13ff04
John Ho
committed on
testing model quantization
6ec3c10
John Ho
committed on
skip loading of internvl3 8b model
25dfae5
John Ho
committed on
adding more internvl3 models
b89bc96
John Ho
committed on
debugging internvl3
d9bd1e8
John Ho
committed on
debugging internvl3
3eaf3ec
John Ho
committed on
debugging internvl3
4dc5aed
John Ho
committed on
fixed push logic in the last step
c5ab6e1
John Ho
committed on
updated to not generate requirements if it already exists
ca46eec
John Ho
committed on
added different inference code for internvl3
d673ad7
John Ho
committed on
added intern video model
15ef0c9
John Ho
committed on
added control for fps and max tokens
ff0b093
John Ho
committed on
updated app to load multiple models
4361fd1
John Ho
committed on
Update requirements.txt
ba43302
verified
Update requirements.txt
2de01bc
verified
pinning transformers to try to fix flash attention error
1aab8b2
verified
added in extra requirements
5b5395e
verified
switch back to previous app
2592317
verified
attempt to slim down req
3555eab
verified
Update requirements.txt
200f657
verified
Update requirements.txt
802b23c
verified
chore: update requirements.txt [auto-generated by CI]
d5f0d6d
github-actions[bot]
committed on
updated torch dependency
e6a0ef4
John Ho
committed on
trying newer torch versions
110f151
John Ho
committed on
trying a different inference script
6dd8fb2
John Ho
committed on
trying a different inference script
1df8e73
John Ho
committed on
try quantization again
88958c8
John Ho
committed on
try quantization again
4e1e198
John Ho
committed on
trying to load model and processors outside space decorator
a83f12f
John Ho
committed on
trying model quantization
a7fd61f
John Ho
committed on
trying model quantization
d9d1598
John Ho
committed on
fixing issue with device map for the inputs
c7e712e
John Ho
committed on
added low_cpu_mem_usage and move input to device also
c697b34
John Ho
committed on
added low_cpu_mem_usage and move input to device also
8edc124
John Ho
committed on
testing more efficient model loading
b3db9ce
John Ho
committed on
testing more efficient model loading
f18bd0f
John Ho
committed on
pinning transformers version
bd916de
John Ho
committed on
make sure DTYPE is used
8b3dcea
John Ho
committed on
testing cuda for processor
1679d51
John Ho
committed on
added back in input.to(device)
ce0e222
John Ho
committed on
make flash attention an input
f10889a
John Ho
committed on
try use flash attention again
f87fafd
John Ho
committed on
trying to set device outside of spaces.GPU decorator
2a9891d
John Ho
committed on