| license: mit | |
| base_model: | |
| - Qwen/Qwen3-0.6B | |
| pipeline_tag: text-generation | |
| tags: | |
| - onnx | |
| - onnxruntime-genai | |
| - oga | |
| My Tests (Tesla P4) | |
| - CUDA int4: 1153 MiB, 12 TPS (tokens per second) | |
| - CUDA fp16: 2179 MiB, 29 TPS | |
| - CUDA fp32: DNF (did not finish) | |
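
The card does not say how the TPS figures were measured. As a rough sketch of one way to reproduce such a number, the helper below times a generic per-token callback. `measure_tps`, `step`, and `fake_step` are hypothetical names introduced here for illustration; in a real run `step` would wrap whatever call produces one token in the runtime being tested, and the reported memory would have to be read separately (for example from `nvidia-smi`).

```python
import time
from typing import Callable, Tuple


def measure_tps(step: Callable[[], bool], max_tokens: int) -> Tuple[int, float]:
    """Call step() until it returns False or max_tokens is reached.

    step() is assumed to generate exactly one token per call and to
    return False once generation has finished.
    Returns (tokens_generated, tokens_per_second).
    """
    tokens = 0
    start = time.perf_counter()
    while tokens < max_tokens and step():
        tokens += 1
    elapsed = time.perf_counter() - start
    # Guard against a zero-length interval on very coarse clocks.
    tps = tokens / elapsed if elapsed > 0 else float("inf")
    return tokens, tps


def fake_step() -> bool:
    """Stand-in for a real decoder step: pretend each token takes ~1 ms."""
    time.sleep(0.001)
    return True


if __name__ == "__main__":
    n, tps = measure_tps(fake_step, max_tokens=50)
    print(f"{n} tokens, {tps:.1f} TPS")
```

Timing only the decode loop, as here, excludes model load and prompt processing; whether the figures above include those phases is not stated in the card.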