Add native TPU-inference support for Qwen3-VL, Qwen3_5
#11 opened 5 days ago by Thisusernamealreadyexists00
Running Qwen3-VL-2B-Instruct on real security camera feeds — impressive results at IQ2 quantization
#10 opened 2 months ago by SharpAI
Batch vs individual inference output mismatch
#9 opened 4 months ago by E1eMental
torch.OutOfMemoryError: CUDA out of memory
#8 opened 4 months ago by shadowT
Inference seems to be very slow on A100 even when flash_attn is enabled
#7 opened 5 months ago by boydcheung
Are these variables implicitly read by the transformers library, or do I need to incorporate them into the generate function?
#6 opened 5 months ago by boydcheung
Why are the outputs different?
#5 opened 5 months ago by AAsuka
How different are its hardware requirements from those of the Qwen2-VL-2B?
#4 opened 6 months ago by likewendy
Finetune Its Brain on Text
#3 opened 6 months ago by VINAY-UMRETHE
GGUFs are here. Tutorials to run locally.
#2 opened 6 months ago by alanzhuly
Local Installation Video and Testing - Step by Step
#1 opened 7 months ago by fahdmirzac