File size: 677 Bytes

---
base_model:
- arcee-ai/Trinity-Large-Thinking
tags:
- afmoe
- nvfp4
- vllm
- compressed-tensors
name: RedHatAI/Trinity-Large-Thinking-NVFP4
---

# NVFP4 Quantized RedHatAI/Trinity-Large-Thinking-NVFP4

This is a preliminary version (and subject to change) of NVFP4 quantized [arcee-ai/Trinity-Large-Thinking ](https://huggingface.co/arcee-ai/Trinity-Large-Thinking/tree/main ) model. 
The model has both weights and activations quantized to NVFP4 format with [vllm-project/llm-compressor](https://github.com/vllm-project/llm-compressor).

It is compatible and tested against vllm main. Run it with ```vllm serve RedHatAI/Trinity-Large-Thinking-NVFP4 --trust-remote-code```