---
license: apache-2.0
base_model:
- Qwen/Qwen3-Coder-30B-A3B-Instruct
---
## Description
NVFP4 quantization of [Qwen/Qwen3-Coder-30B-A3B-Instruct](https://huggingface.co/Qwen/Qwen3-Coder-30B-A3B-Instruct), produced with [TensorRT-Model-Optimizer](https://github.com/NVIDIA/Model-Optimizer). The KV cache is quantized to FP8 for broader compatibility with inference backends.
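As a usage sketch, a checkpoint like this can typically be served with vLLM, which reads the ModelOpt quantization config from the checkpoint; the repository id below is a placeholder, and the exact flags depend on your vLLM version and GPU support for NVFP4:

```shell
# Hypothetical serving command; assumes a vLLM build with ModelOpt NVFP4 support
# and a GPU that supports FP4/FP8 kernels (e.g. Blackwell).
# Replace <repo-id> with this model's Hugging Face repository id.
vllm serve <repo-id> --kv-cache-dtype fp8
```

The `--kv-cache-dtype fp8` flag matches the FP8 KV-cache quantization described above; the weight quantization format itself is usually picked up automatically from the checkpoint's quantization config.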