Restore original fused float16 model.safetensors for MLX (4.2GB) 0f60d3e verified efops committed about 1 hour ago
v0.5.8: GPTQ W4A16 quantized model for vLLM CPU (~4GB) 6f37080 verified efops committed about 14 hours ago
Delete model.safetensors.index.json with huggingface_hub 3ccfa93 verified efops committed about 14 hours ago
Delete model-00004-of-00004.safetensors with huggingface_hub 7829d1c verified efops committed about 14 hours ago
Delete model-00003-of-00004.safetensors with huggingface_hub 8a3dca8 verified efops committed about 14 hours ago
Delete model-00002-of-00004.safetensors with huggingface_hub 82c8c01 verified efops committed about 14 hours ago
Delete model-00001-of-00004.safetensors with huggingface_hub 2a6918f verified efops committed about 14 hours ago
v0.5.8: Replace MLX-quantized with proper dequantized safetensors for llm-compressor a692ea7 verified efops committed about 23 hours ago
Fix config.json: remove invalid GGML quantization fields 37ba332 verified efops committed about 24 hours ago
Fix tokenizer_class: TokenizersBackend → PreTrainedTokenizerFast 47fef26 verified efops committed 1 day ago