---
base_model:
- arcee-ai/Trinity-Large-Thinking
tags:
- afmoe
- nvfp4
- vllm
- compressed-tensors
name: RedHatAI/Trinity-Large-Thinking-NVFP4
---

# RedHatAI/Trinity-Large-Thinking-NVFP4
|
|
This is a preliminary (and subject to change) NVFP4-quantized version of the [arcee-ai/Trinity-Large-Thinking](https://huggingface.co/arcee-ai/Trinity-Large-Thinking) model.
Both weights and activations are quantized to the NVFP4 format using [vllm-project/llm-compressor](https://github.com/vllm-project/llm-compressor).
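For reference, below is a minimal sketch of how an NVFP4 checkpoint like this one can be produced with llm-compressor, following the shape of its public NVFP4 examples. This is not the exact recipe used for this model: the calibration dataset, sample counts, and ignore list are illustrative assumptions, and an MoE model may need additional modules excluded from quantization.

```python
# Hedged sketch (not the published recipe): one-shot NVFP4 quantization of
# weights and activations with llm-compressor. Requires a GPU machine with
# enough memory to load the full-precision model.
from transformers import AutoModelForCausalLM, AutoTokenizer
from llmcompressor import oneshot
from llmcompressor.modifiers.quantization import QuantizationModifier

MODEL_ID = "arcee-ai/Trinity-Large-Thinking"  # base model from this card

model = AutoModelForCausalLM.from_pretrained(
    MODEL_ID, torch_dtype="auto", trust_remote_code=True
)
tokenizer = AutoTokenizer.from_pretrained(MODEL_ID, trust_remote_code=True)

# NVFP4 quantizes both weights and activations to FP4 with per-group scales;
# the lm_head is commonly kept in higher precision (assumed ignore list).
recipe = QuantizationModifier(
    targets="Linear", scheme="NVFP4", ignore=["lm_head"]
)

# Activation scales need calibration data; dataset and counts are assumptions.
oneshot(
    model=model,
    dataset="ultrachat_200k",
    recipe=recipe,
    max_seq_length=2048,
    num_calibration_samples=512,
)

model.save_pretrained("Trinity-Large-Thinking-NVFP4", save_compressed=True)
tokenizer.save_pretrained("Trinity-Large-Thinking-NVFP4")
```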
|
|
It is compatible with and has been tested against vLLM `main`. Serve it with:

```shell
vllm serve RedHatAI/Trinity-Large-Thinking-NVFP4 --trust-remote-code
```
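Once the server is up, vLLM exposes an OpenAI-compatible API. A minimal client sketch, assuming vLLM's default host and port (`localhost:8000`) and the `openai` Python package:

```python
# Hedged sketch: query the locally served model through vLLM's
# OpenAI-compatible endpoint. Assumes the `vllm serve` command above is
# running with default settings (port 8000, no API key required).
from openai import OpenAI

client = OpenAI(base_url="http://localhost:8000/v1", api_key="EMPTY")

response = client.chat.completions.create(
    model="RedHatAI/Trinity-Large-Thinking-NVFP4",
    messages=[
        {"role": "user", "content": "Briefly explain NVFP4 quantization."}
    ],
    max_tokens=256,
)
print(response.choices[0].message.content)
```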
|
|