
🤝 Open to Collab

4

HyperAI

Hyper-AI


AI & ML interests

lightvl is a lightweight quantization toolkit for Vision-Language Models (VLMs) that supports the FP8, INT8, and FP8-Block formats. It integrates with vLLM for high-throughput inference and supports Qwen3-VL, Qwen3.5, InternVL-Chat, and Gemma-4 models. To quantize a model quickly: 1. Run pip3 install lightvl. 2. Run lightvl YOUR_HF_MODEL_PATH.
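As a rough illustration of what the INT8 option means, the sketch below shows symmetric per-tensor quantization in plain Python. It is not taken from lightvl, whose internals are not described here; the function names `quantize_int8` and `dequantize_int8` are hypothetical, and real toolkits typically work on tensors per channel or per block rather than on a flat list.

```python
# Minimal sketch of symmetric per-tensor INT8 quantization.
# Assumption: this is a generic textbook scheme, not lightvl's implementation.

def quantize_int8(values):
    """Map floats to integer codes in [-127, 127] using one shared scale."""
    max_abs = max(abs(v) for v in values)
    # Guard against an all-zero input, where any scale would do.
    scale = max_abs / 127.0 if max_abs > 0 else 1.0
    codes = [max(-127, min(127, round(v / scale))) for v in values]
    return codes, scale


def dequantize_int8(codes, scale):
    """Recover approximate floats from the integer codes."""
    return [c * scale for c in codes]


weights = [0.5, -1.27, 0.0, 1.0]
codes, scale = quantize_int8(weights)
restored = dequantize_int8(codes, scale)
print(codes, scale, restored)
```

Each restored value differs from the original by at most half a quantization step (`scale / 2`), which is the precision traded for storing one byte per weight.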

Recent Activity

updated a model 23 days ago

Hyper-AI/Qwen3-VL-Embedding-8B-fp8

updated a model 23 days ago

Hyper-AI/gemma-4-31B-it-fp8

updated a model 23 days ago

Hyper-AI/gemma-4-E4B-it-fp8


Organizations

None yet

updated 4 models 23 days ago

Hyper-AI/Qwen3-VL-Embedding-8B-fp8

Feature Extraction • 8B • Updated 23 days ago • 26 • 2

Hyper-AI/gemma-4-31B-it-fp8

Image-Text-to-Text • 31B • Updated 23 days ago • 58 • 2

Hyper-AI/gemma-4-E4B-it-fp8

Any-to-Any • Updated 23 days ago • 265

Hyper-AI/Qwen3.5-9B-fp8

Image-Text-to-Text • 10B • Updated 23 days ago • 129k • 3

updated a collection 3 months ago

gemma-4-fp8

FP8 quantization of gemma-4 models: nearly halves memory use, gives a 30% speedup, and runs with vllm serve • 2 items • Updated Apr 8
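The "nearly half" memory figure follows from simple arithmetic, sketched below under the assumption that the unquantized checkpoint stores weights in a 16-bit format (2 bytes per parameter) while FP8 stores 1 byte per parameter. The helper `weight_memory_gib` is hypothetical, and the 31B parameter count is taken from the model card above. Actual savings land a little under half because quantization scales add overhead and some layers may stay in higher precision.

```python
# Back-of-the-envelope weight memory estimate.
# Assumption: 16-bit baseline (2 bytes/param) versus FP8 (1 byte/param).

def weight_memory_gib(num_params, bytes_per_param):
    """Memory needed for the weights alone, in GiB."""
    return num_params * bytes_per_param / 2**30


params = 31e9  # the 31B model listed above
bf16_gib = weight_memory_gib(params, 2)
fp8_gib = weight_memory_gib(params, 1)
print(f"16-bit: {bf16_gib:.1f} GiB, FP8: {fp8_gib:.1f} GiB")
```

This counts weights only; activations and the KV cache at inference time are not included.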

published a model 3 months ago

Hyper-AI/gemma-4-E4B-it-fp8

Any-to-Any • Updated 23 days ago • 265

liked a model 3 months ago

Hyper-AI/gemma-4-31B-it-fp8

Image-Text-to-Text • 31B • Updated 23 days ago • 58 • 2

updated 4 collections 3 months ago

qwen3.5-fp8

FP8 quantization of qwen3.5 models: nearly halves memory use, gives a 30% speedup, and runs with vllm serve • 1 item • Updated Apr 8

qwen3-vl-fp8

FP8 quantization of qwen3-vl models: nearly halves memory use, gives a 30% speedup, and runs with vllm serve • 1 item • Updated Apr 8

qwen3-vl-embedding-fp8

FP8 quantization of qwen3-vl-embedding models: nearly halves memory use, gives a 30% speedup, and runs with vllm serve • 1 item • Updated Apr 8

gemma-4-fp8

FP8 quantization of gemma-4 models: nearly halves memory use, gives a 30% speedup, and runs with vllm serve • 2 items • Updated Apr 8

published a model 3 months ago

Hyper-AI/gemma-4-31B-it-fp8

Image-Text-to-Text • 31B • Updated 23 days ago • 58 • 2

updated a model 3 months ago

Hyper-AI/Qwen3-VL-2B-Instruct-fp8

Image-Text-to-Text • 2B • Updated Apr 8 • 4 • 1

liked 3 models 3 months ago

Hyper-AI/Qwen3-VL-2B-Instruct-fp8

Image-Text-to-Text • 2B • Updated Apr 8 • 4 • 1

Hyper-AI/Qwen3.5-9B-fp8

Image-Text-to-Text • 10B • Updated 23 days ago • 129k • 3

Hyper-AI/Qwen3-VL-Embedding-8B-fp8

Feature Extraction • 8B • Updated 23 days ago • 26 • 2

published 3 models 3 months ago

Hyper-AI/Qwen3.5-9B-fp8

Image-Text-to-Text • 10B • Updated 23 days ago • 129k • 3

Hyper-AI/Qwen3-VL-Embedding-8B-fp8

Feature Extraction • 8B • Updated 23 days ago • 26 • 2

Hyper-AI/Qwen3-VL-2B-Instruct-fp8

Image-Text-to-Text • 2B • Updated Apr 8 • 4 • 1