This is Alibaba-Apsara/DASD-4B-Thinking quantized with LLM Compressor with FP8 W8A8. The model has been created and tested by The Valdanito. The model is compatible with SGlang v0.5.7 & vLLM v0.14.0. Tested with an NVIDIA H800.
- Downloads last month
- 18
Model tree for valdanito/DASD-4B-Thinking-FP8-DYNAMIC
Base model
Alibaba-Apsara/DASD-4B-Thinking