---
license: mit
base_model:
- Qwen/Qwen3-0.6B
pipeline_tag: text-generation
tags:
- onnx
- onnxruntime-genai
- oga
---
My Tests (Tesla P4)
- CUDA int4: 1153 MiB, 12 TPS (tokens per second)
- CUDA fp16: 2179 MiB, 29 TPS
- CUDA fp32: DNF (did not finish)
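
For anyone wanting to reproduce a TPS number on their own hardware, here is a minimal sketch of one way to time a token stream. It is not necessarily how the figures above were measured. `measure_tps` and `fake_stream` are hypothetical names introduced for illustration, and `fake_stream` only stands in for a real decode loop (for example, one driven by onnxruntime-genai).

```python
import time
from typing import Callable, Iterable, Iterator


def measure_tps(
    token_stream: Iterable[object],
    clock: Callable[[], float] = time.perf_counter,
) -> float:
    """Consume a token stream and return tokens per second.

    Hypothetical helper: the stream can be any iterable that yields one
    item per generated token.
    """
    start = clock()
    count = 0
    for _ in token_stream:
        count += 1
    elapsed = clock() - start
    if elapsed <= 0:
        raise ValueError("elapsed time must be positive")
    return count / elapsed


def fake_stream(n_tokens: int, delay_s: float) -> Iterator[int]:
    """Assumed stand-in for a model's decode loop; sleeps per token."""
    for i in range(n_tokens):
        time.sleep(delay_s)
        yield i


if __name__ == "__main__":
    # Replace fake_stream with a generator wrapping the real model.
    print(f"{measure_tps(fake_stream(20, 0.01)):.1f} TPS")
```

If the stream also covers prompt processing, the result will be lower than the pure decode rate, so it helps to start the timer after the first token when comparing against decode-only numbers.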