wheel
streamlit
ddgs
gradio>=5.0.0
torch>=2.8.0
transformers>=4.53.3
spaces
sentencepiece
accelerate
vllm>=0.6.0
# llm-compressor is optional - only needed for quantizing models, not for loading pre-quantized AWQ.
# vLLM has native AWQ support built-in.
# llmcompressor>=0.1.0  # Commented out - not needed for loading pre-quantized models
autoawq
flash-attn>=2.5.0
timm
compressed-tensors
bitsandbytes