# Pytorch to Safetensor Converter
---
A simple converter that converts PyTorch .bin tensor files (usually named "pytorch_model.bin" or "pytorch_model-xxxx-of-xxxx.bin") to safetensors files. Why?

~~Because it's cool!~~

Because the safetensors format decreases the loading time of large LLMs, and is currently supported in [oobabooga's text-generation-webui](https://github.com/oobabooga/text-generation-webui). It also supports in-place loading, which effectively decreases the memory required to load an LLM.

Note: most of the code originates from [Convert to Safetensors - a Hugging Face Space by safetensors](https://huggingface.co/spaces/safetensors/convert), and this code cannot deal with files that are not named "pytorch_model.bin" or "pytorch_model-xxxx-of-xxxx.bin".
### Limitations:
The program requires **a lot** of memory. To be specific, your free memory should be **at least** twice the size of your largest ".bin" file; otherwise the program will run out of memory and fall back to your swap... and that would be **slow!**

This program **will not** re-shard (i.e. break down) the model; you'll need to do that yourself using other tools.
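As a quick sanity check before converting, you can measure your largest shard and compare it against your free memory. This is a minimal stdlib-only sketch; the helper name `largest_shard_bytes` is made up for illustration and is not part of this repository:

```python
from pathlib import Path

def largest_shard_bytes(model_dir):
    """Return the size (in bytes) of the largest pytorch .bin shard
    in model_dir, or 0 if there are none."""
    shards = Path(model_dir).glob("pytorch_model*.bin")
    return max((f.stat().st_size for f in shards), default=0)

# Rule of thumb from the note above: you want free memory of at least
# 2 * largest_shard_bytes(model_dir) before running the converter.
```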
### Usage:
After installing Python (Python 3.10.x is suggested), clone the repository, ``cd`` into it, and install the dependencies:
```
git clone https://github.com/Silver267/pytorch-to-safetensor-converter.git
cd pytorch-to-safetensor-converter
pip install -r requirements.txt
```
Copy **all contents** of your model's folder into this repository, then run:
```
python convert_to_safetensor.py
```
Follow the instructions in the program. Remember to use the **full path** to the model directory (something like ``E:\models\xxx-fp16`` that contains all the model files). Wait a while, and you're good to go. The program will automatically copy all other files to your destination folder. Enjoy!
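Since the converter only recognizes the standard shard names mentioned above, it helps to know how an output name relates to its input. The sketch below shows one common naming convention ("pytorch_model*.bin" to "model*.safetensors"); it is an illustration, not the repository's exact code:

```python
import re

def safetensors_name(bin_name):
    """Map a pytorch shard filename to a conventional safetensors name,
    e.g. 'pytorch_model.bin' -> 'model.safetensors' and
    'pytorch_model-00001-of-00002.bin' -> 'model-00001-of-00002.safetensors'."""
    m = re.fullmatch(r"pytorch_model(-\d+-of-\d+)?\.bin", bin_name)
    if m is None:
        raise ValueError(f"unrecognized shard name: {bin_name}")
    return f"model{m.group(1) or ''}.safetensors"
```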
### Precision stuff
If your original model is fp32, don't forget to edit ``"torch_dtype": "float32",`` to ``"torch_dtype": "float16",`` in ``config.json``.
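The `config.json` edit can also be done programmatically. A minimal stdlib-only sketch (the helper name `set_torch_dtype` is made up for illustration):

```python
import json
from pathlib import Path

def set_torch_dtype(config_path, dtype="float16"):
    """Rewrite the "torch_dtype" field of a Hugging Face config.json in place."""
    path = Path(config_path)
    cfg = json.loads(path.read_text())
    cfg["torch_dtype"] = dtype
    path.write_text(json.dumps(cfg, indent=2))
```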
#### Note that this operation might (on rare occasions) cause the LLM to output NaN during inference, since it reduces the precision to fp16.
If you're worried about that, simply change the line ``loaded = {k: v.contiguous().half() for k, v in loaded.items()}`` in ``convert_to_safetensor.py`` to ``loaded = {k: v.contiguous() for k, v in loaded.items()}``, and you'll get a full-precision model.
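To see why halving precision can lead to NaN: fp16 can only represent magnitudes up to 65504, so larger values overflow to infinity, and subsequent arithmetic on infinities (e.g. ``inf - inf``) produces NaN. The round-trip below uses Python's stdlib ``struct`` 'e' (IEEE half-precision) format to mimic what ``.half()`` does to a single value:

```python
import math
import struct

def fp16_roundtrip(x):
    """Convert a Python float to IEEE half precision and back,
    mimicking the precision loss of tensor.half()."""
    try:
        return struct.unpack('<e', struct.pack('<e', x))[0]
    except OverflowError:
        # Values beyond the fp16 range behave like +/-inf, as in fp16 hardware.
        return math.inf if x > 0 else -math.inf

print(fp16_roundtrip(65504.0))  # largest finite fp16 value, survives intact
print(fp16_roundtrip(70000.0))  # overflows to inf
print(fp16_roundtrip(70000.0) - fp16_roundtrip(70000.0))  # inf - inf -> nan
```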