Spaces:
Runtime error
Commit History
try to load gemma again 5a0f6f1
John Ho committed on
testing fps for internvl3 processor 3feaae8
John Ho committed on
added do_sample to generate 9137c51
John Ho committed on
make sure temperature is float a792463
John Ho committed on
skipping the use of gemma model for now 96a7d4d
John Ho committed on
added temp and testing gemma 035a7ef
John Ho committed on
updated case matching e13ff04
John Ho committed on
testing model quantization 6ec3c10
John Ho committed on
skip loading of internvl3 8b model 25dfae5
John Ho committed on
adding more internvl3 models b89bc96
John Ho committed on
debugging internvl3 d9bd1e8
John Ho committed on
debugging internvl3 3eaf3ec
John Ho committed on
debugging internvl3 4dc5aed
John Ho committed on
fixed push logic in the last step c5ab6e1
John Ho committed on
updated to not generate requirements if it already exists ca46eec
John Ho committed on
added different inference code for internvl3 d673ad7
John Ho committed on
added intern video model 15ef0c9
John Ho committed on
added control for fps and max tokens ff0b093
John Ho committed on
updated app to load multiple models 4361fd1
John Ho committed on
Update requirements.txt ba43302 verified
Update requirements.txt 2de01bc verified
pinning transformers to try to fix flash attention error 1aab8b2 verified
added in extra requirements 5b5395e verified
switch back to previous app 2592317 verified
attempt to slim down req 3555eab verified
Update requirements.txt 200f657 verified
Update requirements.txt 802b23c verified
chore: update requirements.txt [auto-generated by CI] d5f0d6d
github-actions[bot] committed on
updated torch dependency e6a0ef4
John Ho committed on
trying newer torch versions 110f151
John Ho committed on
trying a different inference script 6dd8fb2
John Ho committed on
trying a different inference script 1df8e73
John Ho committed on
try quantization again 88958c8
John Ho committed on
try quantization again 4e1e198
John Ho committed on
trying to load model and processors outside space decorator a83f12f
John Ho committed on
trying model quantization a7fd61f
John Ho committed on
trying model quantization d9d1598
John Ho committed on
fixing issue with device map for the inputs c7e712e
John Ho committed on
added low_cpu_mem_usage and move input to device also c697b34
John Ho committed on
added low_cpu_mem_usage and move input to device also 8edc124
John Ho committed on
testing more efficient model loading b3db9ce
John Ho committed on
testing more efficient model loading f18bd0f
John Ho committed on
pinning transformers version bd916de
John Ho committed on
make sure DTYPE is used 8b3dcea
John Ho committed on
testing cuda for processor 1679d51
John Ho committed on
added back in input.to(device) ce0e222
John Ho committed on
make flash attention an input f10889a
John Ho committed on
try use flash attention again f87fafd
John Ho committed on
trying to set device outside of spaces.GPU decorator 2a9891d
John Ho committed on