Joe Rocca PRO
rocca
AI & ML interests
None yet
Recent Activity
liked a model about 11 hours ago
kingJulio/nanoG1
liked a model 3 days ago
zai-org/GLM-5.2
liked a Space 5 days ago
multimodalart/Boogu-Image
Organizations
Minimal working setup for SGLang (and vLLM?)
#2 opened 8 months ago by rocca
Which model to be loaded?
11
#3 opened 11 months ago by Skymid
FP4 quantization, INT4 crash on RTX 50XX
🔥 1
7
#2 opened about 1 year ago by Askd234
Any chance of creating a v30 version to hold over until real support?
👍 3
9
#1 opened about 1 year ago by Todokete
Does FlashMLA support kv cache fp8 dtype and how to enable FlashMLA?
10
#6 opened about 1 year ago by CharlesLincoln
Diffusers Roadmap?
👍 1
4
#5 opened over 1 year ago by Impulse2000
70B model?
5
#1 opened about 2 years ago by rocca
vLLM cannot infer this model (other 70B GPTQ models are OK)
15
#1 opened over 2 years ago by tutu329
Is `--speculate` needed for Medusa in TGI?
2
#1 opened about 2 years ago by rocca
Double-hyphen in "gptq-3bit--1g-actorder_True" branch name is invalid according to TGI
1
#1 opened over 2 years ago by rocca
Conversion to ONNX
4
#1 opened over 3 years ago by roseman
Delete the outdated rwkv-4-pile-430m-webgl.onnx
1
#4 opened almost 4 years ago by AXKuhta
Upload rwkv-4-pile-430m-webgl.onnx
1
#3 opened almost 4 years ago by AXKuhta
Upload rwkv-4-pile-430m-webgl.onnx
1
#2 opened almost 4 years ago by AXKuhta
Upload rwkv-4-pile-169m-webgl.onnx
1
#1 opened almost 4 years ago by AXKuhta
This is really exciting
2
#1 opened about 4 years ago by victor