---
base_model:
- Qwen/Qwen2.5-Math-1.5B
pipeline_tag: text-generation
library_name: transformers
tags:
- rknn
- rkllm
- chat
- rk3588
---
## 3ib0n's RKLLM Guide
These models and binaries require an RK3588 board running rknpu driver version 0.9.7 or above.

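The driver requirement can be checked on the board before copying anything over. The following is a minimal sketch, assuming the driver reports its version in `/sys/kernel/debug/rknpu/version` in a form like `RKNPU driver: v0.9.7` (the path and the string format are assumptions and may differ between images; reading debugfs usually needs root). The helper names are hypothetical.

```python
import re
from pathlib import Path

# Minimum rknpu driver version stated in this guide.
MIN_DRIVER = (0, 9, 7)
# Assumption: debugfs location of the driver version; adjust for your image.
VERSION_FILE = Path("/sys/kernel/debug/rknpu/version")


def parse_driver_version(text):
    """Extract (major, minor, patch) from a string such as 'RKNPU driver: v0.9.7'."""
    match = re.search(r"v?(\d+)\.(\d+)\.(\d+)", text)
    if match is None:
        raise ValueError(f"no version number found in {text!r}")
    return tuple(int(part) for part in match.groups())


def driver_is_supported(text, minimum=MIN_DRIVER):
    """Return True when the reported version is at least the minimum."""
    return parse_driver_version(text) >= minimum


if __name__ == "__main__":
    try:
        raw = VERSION_FILE.read_text()
    except OSError as err:
        # Missing file or insufficient permissions: fall back to a manual check.
        print(f"could not read {VERSION_FILE}: {err}")
    else:
        print(raw.strip(), "-> supported:", driver_is_supported(raw))
```

Comparing tuples of integers avoids the string-comparison trap where `0.10.0` would sort below `0.9.7`.
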
## Steps to reproduce conversion
```shell
# Download and set up miniforge3
curl -L -O "https://github.com/conda-forge/miniforge/releases/latest/download/Miniforge3-$(uname)-$(uname -m).sh"
bash Miniforge3-$(uname)-$(uname -m).sh

# activate the base environment
source ~/miniforge3/bin/activate

# create and activate a python 3.8 environment
conda create -n rknn-llm-1.1.4 python=3.8
conda activate rknn-llm-1.1.4

# clone the latest rknn-llm toolkit
git clone https://github.com/airockchip/rknn-llm.git

# install dependencies for the toolkit
pip install transformers accelerate torchvision rknn-toolkit2==2.2.1
pip install --upgrade torch pillow

# install rkllm from the wheel in the cloned repo
# (the relative path below resolves from the demo directory)
cd rknn-llm/examples/rkllm_multimodel_demo
pip install ../../rkllm-toolkit/packages/rkllm_toolkit-1.1.4-cp38-cp38-linux_x86_64.whl

# edit or create a script to export rkllm models
nano export/export_rkllm.py # update input and output paths
python export/export_rkllm.py
```

Example export_rkllm.py, modified from https://github.com/airockchip/rknn-llm/blob/main/examples/rkllm_multimodel_demo/export/export_rkllm.py
```python
import os
from rkllm.api import RKLLM

# Python does not expand "~" on its own, so expand it explicitly
modelpath = os.path.expanduser("~/models/Qwen/Qwen2.5-Math-1.5B-Instruct/")  ## UPDATE HERE
savepath = './Qwen2.5-Math-1.5B-Instruct.rkllm'  ## UPDATE HERE
llm = RKLLM()

# Load model
# Loading runs on the CPU here; when loading on a GPU instead,
# use 'export CUDA_VISIBLE_DEVICES=2' to specify the GPU device
ret = llm.load_huggingface(model=modelpath, device='cpu')
if ret != 0:
    print('Load model failed!')
    exit(ret)

# Build model
qparams = None

## Do not pass the dataset parameter: this is a pure text model, not a multimodal one
ret = llm.build(do_quantization=True, optimization_level=1, quantized_dtype='w8a8',
                quantized_algorithm='normal', target_platform='rk3588', num_npu_core=3, extra_qparams=qparams)

if ret != 0:
    print('Build model failed!')
    exit(ret)

# Export rkllm model
ret = llm.export_rkllm(savepath)
if ret != 0:
    print('Export model failed!')
    exit(ret)
```

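Paths that begin with `~` are not expanded by Python itself, and the output name has to be kept in sync with the model directory by hand. A minimal sketch of one way to handle both, assuming the output file should simply be named after the model directory; `resolve_paths` is a hypothetical helper, not part of the rkllm toolkit:

```python
import os


def resolve_paths(model_dir, out_dir="."):
    """Return (absolute model path, .rkllm output path) for a model directory.

    Hypothetical helper: expands "~" and derives the output file name
    from the last component of the model directory.
    """
    model_path = os.path.abspath(os.path.expanduser(model_dir))
    model_name = os.path.basename(os.path.normpath(model_path))
    save_path = os.path.join(out_dir, model_name + ".rkllm")
    return model_path, save_path


if __name__ == "__main__":
    modelpath, savepath = resolve_paths("~/models/Qwen/Qwen2.5-Math-1.5B-Instruct/")
    print(modelpath)
    print(savepath)
```

The two returned values can be passed to `load_huggingface` and `export_rkllm` in place of the hard-coded strings.
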
## Steps to build the demo

```shell
# Download the correct toolchain for working with rkllm
# Documentation here: https://github.com/airockchip/rknn-llm/blob/main/doc/Rockchip_RKLLM_SDK_EN_1.1.0.pdf
wget https://developer.arm.com/-/media/Files/downloads/gnu-a/10.2-2020.11/binrel/gcc-arm-10.2-2020.11-x86_64-aarch64-none-linux-gnu.tar.xz
tar -xf gcc-arm-10.2-2020.11-x86_64-aarch64-none-linux-gnu.tar.xz

# ensure that the gcc compiler path is set to the location where the toolchain downloaded earlier is unpacked
nano deploy/build-linux.sh # update the gcc compiler path

# compile the demo app
cd deploy/
./build-linux.sh
```

## Steps to run the app
More information and original guide: https://github.com/airockchip/rknn-llm/tree/main/examples/rkllm_multimodel_demo
```shell
# push install dir to device
adb push ./install/demo_Linux_aarch64 /data
# push model file to device
adb push Qwen2.5-Math-1.5B-Instruct.rkllm /data/models

adb shell
cd /data/demo_Linux_aarch64
# export lib path
export LD_LIBRARY_PATH=./lib
# soft link models dir
ln -s /data/models .
# run llm (pure text example)
./llm models/Qwen2.5-Math-1.5B-Instruct.rkllm 128 512
```
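
For repeated runs from the host, the on-device steps can be folded into a single `adb shell` invocation. This is a rough sketch under stated assumptions: the two trailing numbers are taken to be the maximum number of new tokens and the maximum context length (that reading of `128 512` is an assumption, not confirmed by this guide), the `models` symlink already exists on the device, and `build_run_command` is a hypothetical helper.

```python
import shlex


def build_run_command(model_file, max_new_tokens=128, max_context_len=512,
                      demo_dir="/data/demo_Linux_aarch64"):
    """Build an argv list that runs the demo on the device through adb.

    The parameter names are assumptions about what the two numeric
    arguments of ./llm mean; the values mirror the command in this guide.
    """
    inner = " && ".join([
        f"cd {shlex.quote(demo_dir)}",
        "export LD_LIBRARY_PATH=./lib",
        f"./llm {shlex.quote('models/' + model_file)} "
        f"{int(max_new_tokens)} {int(max_context_len)}",
    ])
    return ["adb", "shell", inner]


if __name__ == "__main__":
    cmd = build_run_command("Qwen2.5-Math-1.5B-Instruct.rkllm")
    # Print the command instead of running it; pass cmd to subprocess.run to execute.
    print(cmd)
```

Returning an argv list rather than a single string keeps host-side quoting out of the picture; only the part executed on the device goes through a shell.
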