
dsikka committed on Apr 3

Commit 7657f94 · verified · 1 Parent(s): 6dd2b6d

Create README.md

Files changed (1)

README.md ADDED (+17 -0)

	@@ -0,0 +1,17 @@

+---
+base_model:
+- arcee-ai/Trinity-Large-Thinking
+tags:
+- afmoe
+- nvfp4
+- vllm
+- compressed-tensors
+name: RedHatAI/Trinity-Large-Thinking-NVFP4
+---
+# NVFP4 Quantized RedHatAI/Trinity-Large-Thinking-NVFP4
+This is a preliminary version (subject to change) of the NVFP4-quantized [arcee-ai/Trinity-Large-Thinking](https://huggingface.co/arcee-ai/Trinity-Large-Thinking/tree/main) model.
+Both weights and activations are quantized to the NVFP4 format with [vllm-project/llm-compressor](https://github.com/vllm-project/llm-compressor).
+It is compatible with, and has been tested against, the vLLM `main` branch.
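
For readers unfamiliar with the format named in the README, the block-scaled idea behind NVFP4 can be sketched roughly as follows. This is a simplified pure-Python illustration, not the llm-compressor or vLLM implementation. It assumes the commonly described layout: 4-bit E2M1 values sharing one scale per block of 16 elements. The helper names (`quantize_block`, `quantize`) are hypothetical, the block scale is kept as a plain Python float rather than rounded to FP8, and the additional per-tensor scale is omitted.

```python
import math

# Magnitudes representable by a 4-bit E2M1 float (the sign bit is handled separately).
E2M1_GRID = (0.0, 0.5, 1.0, 1.5, 2.0, 3.0, 4.0, 6.0)
BLOCK_SIZE = 16  # assumed block size: one shared scale per 16 consecutive values


def quantize_block(block):
    """Fake-quantize one block: scale, snap to the E2M1 grid, rescale.

    Returns (dequantized_values, scale).
    """
    amax = max(abs(x) for x in block)
    if amax == 0.0:
        return [0.0] * len(block), 0.0
    # Map the largest magnitude in the block onto the largest grid value (6.0).
    # Simplification: real kernels store this scale in a low-precision format.
    scale = amax / E2M1_GRID[-1]
    out = []
    for x in block:
        # Nearest grid point; on an exact tie min() keeps the smaller magnitude,
        # which may differ from hardware rounding.
        mag = min(E2M1_GRID, key=lambda g: abs(abs(x) / scale - g))
        out.append(math.copysign(mag * scale, x))
    return out, scale


def quantize(values):
    """Apply block-wise fake quantization to a flat list of floats."""
    result = []
    for i in range(0, len(values), BLOCK_SIZE):
        dequantized, _ = quantize_block(values[i:i + BLOCK_SIZE])
        result.extend(dequantized)
    return result
```

Values that already sit on the scaled grid pass through unchanged, while everything else is rounded to the nearest representable point within its block. This is why the per-block scale matters: one outlier only coarsens the 16 values that share its scale.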