---
license: mit
tags:
- bigsmall
- compressed
- lossless
---

[![DOI](https://zenodo.org/badge/DOI/10.5281/zenodo.20279247.svg)](https://doi.org/10.5281/zenodo.20279247)

# Phi-3.5 Mini Instruct: Lossless Compressed

> **7.12 GB → 4.67 GB (34% smaller). Bit-identical weights. Drop-in replacement.**

## Use it in 2 lines

```bash
pip install "bigsmall>=3.14.4"
```

```python
from transformers import AutoModelForCausalLM
model = AutoModelForCausalLM.from_pretrained("wpferrell/phi-3.5-mini-instruct-bigsmall")
```

It works exactly like loading the original model. No code changes needed.

## Size comparison

| | Size |
|---|---|
| Original ([microsoft/Phi-3.5-mini-instruct](https://huggingface.co/microsoft/Phi-3.5-mini-instruct)) | 7.12 GB |
| This compressed version | 4.67 GB |
| Saved | 2.45 GB (34%) |

## What "lossless" means

Every weight is mathematically identical to the original model.

- **Not quantized.** Quantization rounds weights and changes model behaviour.
- **Not pruned.** Pruning removes parts of the model.
- **Bit-for-bit identical.** MD5 is verified on every tensor at decompression.

## Low-VRAM streaming

```python
from bigsmall import BigSmallStreamingModel

model = BigSmallStreamingModel.from_pretrained(
    "wpferrell/phi-3.5-mini-instruct-bigsmall",
    device="cuda",
    lru_max_vram_gb=2.0,
)
```

Uses up to ~12× less VRAM than standard loading by streaming layers on demand.

## Stream straight from the Hub (no disk)

```python
import bigsmall

state_dict = bigsmall.stream_from_hub("wpferrell/phi-3.5-mini-instruct-bigsmall", device="cpu")
```

Decompresses directly from the HuggingFace CDN over HTTP range requests. With the default `cache=False`, no `.bs` file is ever written to disk (V10).

## Decompress to safetensors

```python
import bigsmall
from safetensors.torch import save_file

# bigsmall decompress works on local .bs files, not Hub repos, so
# stream the weights from the Hub and write them out as safetensors.
state_dict = bigsmall.stream_from_hub("wpferrell/phi-3.5-mini-instruct-bigsmall", device="cpu")
save_file(state_dict, "phi-3.5-mini-instruct-bigsmall.safetensors")
```

## Original model

This is a lossless-compressed copy of [microsoft/Phi-3.5-mini-instruct](https://huggingface.co/microsoft/Phi-3.5-mini-instruct). All credit to the original authors. The weights are unchanged.

## Want to compress your own model?

```bash
pip install "bigsmall>=3.14.4"
bigsmall compress my-model/ -o my-model.bs
```

See [github.com/wpferrell/Bigsmall](https://github.com/wpferrell/Bigsmall) for the full docs.

## License

- **Model weights:** MIT, same as [microsoft/Phi-3.5-mini-instruct](https://huggingface.co/microsoft/Phi-3.5-mini-instruct).
- **BigSmall format:** [Elastic License 2.0](https://github.com/wpferrell/Bigsmall/blob/main/LICENSE), free for personal, research, and commercial use.
- **Commercial SaaS licensing:** wpferrell@gmail.com

## Citation

```bibtex
@misc{bigsmall2026,
  title={BigSmall: Lossless Neural Network Weight Compression},
  author={Ferrell, Will},
  year={2026},
  doi={10.5281/zenodo.20279247},
  url={https://doi.org/10.5281/zenodo.20279247}
}
```

## Requires

`bigsmall >= 3.14.4` for the latest features. Earlier versions (>= 3.0.0) can still decode this model.
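
## Checking bit-identity yourself

The bit-for-bit claim in the "lossless" section can also be checked independently of the library. The sketch below is not BigSmall's own verification code: `md5_per_tensor` and `find_mismatches` are hypothetical helper names, and the inputs are assumed to be plain mappings from tensor name to raw bytes. How you extract those bytes from the original and decompressed checkpoints is left out here.

```python
import hashlib
from typing import Dict, List, Mapping


def md5_per_tensor(raw_tensors: Mapping[str, bytes]) -> Dict[str, str]:
    """Map each tensor name to the MD5 hex digest of its raw bytes."""
    return {name: hashlib.md5(raw).hexdigest() for name, raw in raw_tensors.items()}


def find_mismatches(original: Mapping[str, bytes], restored: Mapping[str, bytes]) -> List[str]:
    """Return sorted names that are missing on one side or whose bytes differ."""
    a = md5_per_tensor(original)
    b = md5_per_tensor(restored)
    return sorted(name for name in a.keys() | b.keys() if a.get(name) != b.get(name))


# Toy stand-ins for two checkpoints (hypothetical names and bytes).
original = {"layer.0.weight": b"\x00\x01\x02\x03", "layer.0.bias": b"\x10\x20"}
lossless_copy = {"layer.0.weight": b"\x00\x01\x02\x03", "layer.0.bias": b"\x10\x20"}
rounded_copy = {"layer.0.weight": b"\x00\x01\x02\x04", "layer.0.bias": b"\x10\x20"}

print(find_mismatches(original, lossless_copy))  # []
print(find_mismatches(original, rounded_copy))   # ['layer.0.weight']
```

An empty list means every tensor hashes to the same digest on both sides. A single changed byte, as a quantized or rounded copy would produce, shows up under that tensor's name.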