---
license: apache-2.0
language:
- en
pipeline_tag: image-text-to-text
tags:
- multimodal
library_name: transformers
base_model:
- Qwen/Qwen2.5-VL-7B-Instruct
---

FP8 quantization of Qwen/Qwen2.5-VL-7B-Instruct, produced with [llm-compressor](https://github.com/vllm-project/llm-compressor). To reproduce the quantization:

```bash
# Create a dedicated Python env
python3 -m venv llmcompressor
source llmcompressor/bin/activate

# Install llm-compressor and additional required libs
pip install llmcompressor qwen_vl_utils torchvision

# Download the model into the HF cache
hf download Qwen/Qwen2.5-VL-7B-Instruct

# Fetch the FP8 example script (raw file, not the GitHub HTML page) and run the quantization
wget https://raw.githubusercontent.com/vllm-project/llm-compressor/main/examples/quantization_w8a8_fp8/qwen_2_5_vl_example.py -O qwen_2_5_vl_fp8.py
python3 qwen_2_5_vl_fp8.py
```
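
Once the script finishes, the quantized checkpoint can be served with vLLM, which supports compressed-tensors FP8 checkpoints. A minimal sketch, assuming the example script saves to a directory named `Qwen2.5-VL-7B-Instruct-FP8-Dynamic` (an assumption; check the save directory actually used by the script before running):

```bash
# Install vLLM in the same env (or a fresh one)
pip install vllm

# Serve the quantized checkpoint.
# The directory name below is an assumption; replace it with the
# output directory produced by qwen_2_5_vl_fp8.py.
vllm serve ./Qwen2.5-VL-7B-Instruct-FP8-Dynamic
```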