Update README.md
README.md

@@ -28,7 +28,7 @@ Then, use the prepare-script and the finetuning script in the files list of this
 
 # How to use it
 You can directly download the final model as ONNX format - so it runs without the need to install a huge Python environment with PyTorch, CUDA, etc... - as INT8 and in full precision.
-Use `inference.py` for local inference on CUDA or CPU!
+Use `inference.py` for local inference on CUDA or CPU! First, run `pip install onnxruntime-gpu tiktoken numpy nvidia-cudnn-cu12 nvidia-cublas-cu12` (inside a Python venv for Linux users).
 
 
 Have fun! :D
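For readers wondering what the INT8 variant of the ONNX model implies, here is a minimal, illustrative sketch of symmetric INT8 weight quantization. This is not code from the repo, and the function names are hypothetical; it only shows the idea of trading a little precision for a much smaller model:

```python
# Hypothetical sketch of symmetric INT8 quantization (not the repo's code):
# floats are mapped to the [-127, 127] integer range with one shared scale.

def quantize_int8(values):
    """Map floats to INT8 using a single symmetric scale factor."""
    scale = max(abs(v) for v in values) / 127.0
    return [round(v / scale) for v in values], scale

def dequantize_int8(quantized, scale):
    """Recover approximate float values from INT8 plus the scale factor."""
    return [q * scale for q in quantized]

weights = [0.5, -1.27, 0.03]
q, scale = quantize_int8(weights)   # q == [50, -127, 3]
approx = dequantize_int8(q, scale)  # close to, but not exactly, the originals
```

The full-precision download skips this step entirely; the INT8 one applies a transformation like the above to the weights, which is why it is smaller and usually faster on CPU at a small accuracy cost.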