Commit History

Restore original fused float16 model.safetensors for MLX (4.2GB)
0f60d3e
verified

efops committed on

Remove GPTQ model.safetensors.index.json
3fba9c1
verified

efops committed on

Remove GPTQ model-00002-of-00002.safetensors
5c95498
verified

efops committed on

Remove GPTQ model-00001-of-00002.safetensors
9ab0e79
verified

efops committed on

v0.5.9: semantic intent routing
9cd2a84
verified

efops committed on

v0.5.8: 3-tier inference (MLX/vLLM/llama.cpp)
be9c653
verified

efops committed on

v0.5.8: GPTQ W4A16 quantized model for vLLM CPU (~4GB)
6f37080
verified

efops committed on

Delete model.safetensors.index.json with huggingface_hub
3ccfa93
verified

efops committed on

Delete model-00004-of-00004.safetensors with huggingface_hub
7829d1c
verified

efops committed on

Delete model-00003-of-00004.safetensors with huggingface_hub
8a3dca8
verified

efops committed on

Delete model-00002-of-00004.safetensors with huggingface_hub
82c8c01
verified

efops committed on

Delete model-00001-of-00004.safetensors with huggingface_hub
2a6918f
verified

efops committed on

Delete model.safetensors with huggingface_hub
a356bcf
verified

efops committed on

v0.5.8: Replace MLX-quantized with proper dequantized safetensors for llm-compressor
a692ea7
verified

efops committed on

Fix config.json: remove invalid GGML quantization fields
37ba332
verified

efops committed on

v0.5.7
49979de
verified

efops committed on

Fix tokenizer_class: TokenizersBackend → PreTrainedTokenizerFast
47fef26
verified

efops committed on

v0.5.6
5385906
verified

efops committed on

v0.5.5
c4e795f
verified

efops committed on

v0.5.4: vLLM CPU pre-built wheel, bfloat16, TCMalloc
6996b5a
verified

efops committed on

v0.5.3: fix CPU device detection, consolidate changelog
b9ce948
verified

efops committed on

Update README to v0.5.2
6379b15
verified

efops committed on

Update README to v0.5.1
d1deb59
verified

efops committed on

Initial release v0.5.0 (Clean history)
ed718b3

efops committed on