flaviusburca committed · Commit 0d8b9bd · verified · 1 parent: 3282744

Update README.md

Files changed (1)
  1. README.md +5 -3
README.md CHANGED
```diff
@@ -9,14 +9,16 @@ base_model:
 
 # Qwen3.5-9B
 
-<img width="400px" src="https://qianwen-res.oss-accelerate.aliyuncs.com/logo_qwen3.5.png">
+<img width="400px" src="https://huggingface.co/surogate/Qwen3.5-4B-FP8/resolve/main/Surogate-Logo-White.png">
 
-[![Qwen Chat](https://img.shields.io/badge/%F0%9F%92%9C%EF%B8%8F%20Qwen%20Chat%20-536af5)](https://chat.qwen.ai)
+**This Qwen3.5 variant is recommended for Surogate. Check out [http://surogate.ai](http://surogate.ai)**
 
 > [!Note]
-> This repository contains model weights and configuration files for the post-trained model in the Hugging Face Transformers format.
+> This repository contains FP8-quantized model weights and configuration files for the post-trained model in the Hugging Face Transformers format.
 >
 > These artifacts are compatible with Hugging Face Transformers, vLLM, SGLang, KTransformers, etc.
+>
+> The quantization method is fine-grained FP8 quantization with a block size of 128, and its performance is nearly identical to that of the original model.
 
 Over recent months, we have intensified our focus on developing foundation models that deliver exceptional utility and performance. Qwen3.5 represents a significant leap forward, integrating breakthroughs in multimodal learning, architectural efficiency, reinforcement learning scale, and global accessibility to empower developers and enterprises with unprecedented capability and efficiency.
```
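The "fine-grained FP8 quantization with a block size of 128" mentioned in the diff can be illustrated with a small simulation. This is a sketch only: real FP8 checkpoints store e4m3 weight tensors alongside a separate scale tensor, and the exact block layout here (1-D groups of 128 along the last axis, integer-grid rounding as a stand-in for e4m3 rounding) is an assumption for illustration, not the repository's actual quantization code.

```python
import numpy as np

FP8_E4M3_MAX = 448.0  # largest finite magnitude in the e4m3 format
BLOCK = 128           # block size named in the quantization note

def fake_quant_blockwise(w: np.ndarray) -> np.ndarray:
    """Quantize/dequantize with one scale per 128-element group along the
    last axis (a simplified, 1-D reading of 'block size of 128')."""
    pad = (-w.shape[-1]) % BLOCK
    x = np.pad(w, [(0, 0)] * (w.ndim - 1) + [(0, pad)]).astype(np.float32)
    groups = x.reshape(*x.shape[:-1], -1, BLOCK)
    # Each group's scale maps its largest magnitude onto the FP8 range.
    scales = np.abs(groups).max(axis=-1, keepdims=True) / FP8_E4M3_MAX
    scales = np.where(scales == 0.0, 1.0, scales)
    # Round on the scaled grid (a coarse stand-in for true e4m3 rounding).
    q = np.round(groups / scales)
    return (q * scales).reshape(x.shape)[..., : w.shape[-1]]
```

Because each group of 128 values gets its own scale, a single outlier only degrades precision within its own block rather than across the whole tensor, which is why block-wise FP8 tends to track the original model's quality closely.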