| | --- |
| | base_model: unsloth/llama-3-8b-bnb-4bit |
| | language: |
| | - en |
| | license: apache-2.0 |
| | tags: |
| | - text-generation-inference |
| | - transformers |
| | - unsloth |
| | - llama |
| | - trl |
| | --- |
| | |
| | # Uploaded model |
| |
|
| | - **Developed by:** tykiww |
| | - **License:** apache-2.0 |
| | - **Finetuned from model:** unsloth/llama-3-8b-bnb-4bit |
| |
|
| | This Llama model was trained 2x faster with [Unsloth](https://github.com/unslothai/unsloth) and Hugging Face's TRL library. |
| |
|
| | [<img src="https://raw.githubusercontent.com/unslothai/unsloth/main/images/unsloth%20made%20with%20love.png" width="200"/>](https://github.com/unslothai/unsloth) |
| |
|
| |
|
| | --------------------------------------------- |
| |
|
| | # Setting up and testing a custom Endpoint Handler |
| |
|
| | Sources: |
| |
|
| | - https://www.philschmid.de/custom-inference-handler |
| | - https://discuss.huggingface.co/t/model-wont-load-on-custom-inference-endpoint/91780 |
| | - https://huggingface.co/docs/inference-endpoints/guides/custom_handler |
| | |
| | |
| | ### Setup Environment |
| | |
| | Install the packages needed to set up and test the endpoint handler. |
| | |
| | ``` |
| | # install git-lfs to interact with the repository |
| | sudo apt-get update |
| | sudo apt-get install git-lfs |
| | # install transformers (not needed for inference since it is installed by default in the container) |
| | pip install "transformers[sklearn,sentencepiece,audio,vision]" |
| | ``` |
| | |
| | Clone the model repository, including its weights. |
| | |
| | ``` |
| | git lfs install |
| | git clone https://huggingface.co/tykiww/llama3-8b-quantized |
| | ``` |
| | |
| | Log in to Hugging Face. |
| | |
| | ``` |
| | # set up the CLI with your access token and let git store the credential |
| | huggingface-cli login |
| | git config --global credential.helper store |
| | ``` |
| | |
| | Confirm the login if you are unsure it succeeded. |
| | |
| | ``` |
| | huggingface-cli whoami |
| | ``` |
| | |
| | Navigate to the cloned repo and create a `handler.py` file. |
| | |
| | ``` |
| | cd llama3-8b-quantized && touch handler.py |
| | ``` |
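The contents of `handler.py` are not shown here, so the following is only a minimal sketch of what one could look like, not the handler shipped with this model. It follows the `EndpointHandler` interface described in the Hugging Face custom handler guide listed under Sources (`__init__(path)` and `__call__(data)`). The `generate_fn` argument, the `max_seq_length=2048` value, and the `max_new_tokens` default of 128 are assumptions added for illustration; `generate_fn` exists only so the class can be exercised locally without a GPU.

```python
from typing import Any, Callable, Dict, List, Optional


class EndpointHandler:
    """Sketch of a custom handler; Inference Endpoints looks for this class in handler.py."""

    def __init__(
        self,
        path: str = "",
        generate_fn: Optional[Callable[[str, int], str]] = None,
    ) -> None:
        # generate_fn is a hypothetical test hook, not part of the handler contract.
        if generate_fn is None:
            generate_fn = self._load_unsloth(path)
        self.generate_fn = generate_fn

    @staticmethod
    def _load_unsloth(path: str) -> Callable[[str, int], str]:
        # Imported lazily so the class can be imported on a machine without a GPU.
        from unsloth import FastLanguageModel

        model, tokenizer = FastLanguageModel.from_pretrained(
            model_name=path,
            max_seq_length=2048,  # assumed value; adjust to your use case
            load_in_4bit=True,
        )
        FastLanguageModel.for_inference(model)  # switch Unsloth to inference mode

        def generate(prompt: str, max_new_tokens: int) -> str:
            inputs = tokenizer([prompt], return_tensors="pt").to("cuda")
            output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)
            return tokenizer.batch_decode(output_ids, skip_special_tokens=True)[0]

        return generate

    def __call__(self, data: Dict[str, Any]) -> List[Dict[str, Any]]:
        # Request payloads carry the prompt under "inputs" and options under "parameters".
        prompt = data.get("inputs", "")
        parameters = data.get("parameters") or {}
        max_new_tokens = int(parameters.get("max_new_tokens", 128))
        return [{"generated_text": self.generate_fn(prompt, max_new_tokens)}]


if __name__ == "__main__":
    # Local smoke test with a fake generator, so no model or GPU is needed.
    handler = EndpointHandler(generate_fn=lambda prompt, n: prompt.upper())
    print(handler({"inputs": "hello", "parameters": {"max_new_tokens": 8}}))
```

On a real endpoint the class is constructed with only `path`, so the Unsloth branch runs and the packages from `requirements.txt` must be installed.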
| | |
| | Create a `requirements.txt` file with the following items: |
| | |
| | ``` |
| | huggingface_hub |
| | unsloth[colab-new] @ git+https://github.com/unslothai/unsloth.git |
| | xformers |
| | trl<0.9.0 |
| | peft==0.11.1 |
| | bitsandbytes |
| | transformers==4.41.2 # this exact version must be used |
| | ``` |
| | |
| | The endpoint must run on a GPU that is compatible with Unsloth. |
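One way to check a machine before deploying is to look at the GPU's CUDA compute capability. The sketch below assumes a minimum capability of 7.0, which is the figure given in the Unsloth README at the time of writing; verify it against the current README. The function names are hypothetical, and `torch` is imported lazily so the script also runs (and reports `False`) where PyTorch is not installed.

```python
from typing import Tuple

# Assumption: minimum CUDA compute capability taken from the Unsloth README.
MIN_CAPABILITY: Tuple[int, int] = (7, 0)


def is_supported(
    capability: Tuple[int, int],
    minimum: Tuple[int, int] = MIN_CAPABILITY,
) -> bool:
    """Return True if a (major, minor) compute capability meets the minimum."""
    return tuple(capability) >= tuple(minimum)


def check_local_gpu() -> bool:
    """Return True if the first visible CUDA device meets the minimum capability."""
    try:
        import torch  # lazy import: PyTorch may not be installed locally
    except ImportError:
        return False
    if not torch.cuda.is_available():
        return False
    return is_supported(torch.cuda.get_device_capability(0))


if __name__ == "__main__":
    print("Unsloth-compatible GPU found:", check_local_gpu())
```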