Update README.md
README.md CHANGED
@@ -88,6 +88,17 @@ python setup_env.py --hf-repo tiiuae/Falcon-E-3B-Instruct -q i2_s
 python run_inference.py -m models/Falcon-E-3B-Instruct/ggml-model-i2_s.gguf -p "You are a helpful assistant" -cnv
 ```
 
+#### mlx-lm
+
+```
+pip install -U mlx-lm
+```
+
+Then:
+```
+mlx_lm.generate --model tiiuae/Falcon-E-3B-Instruct --prompt "Implement bubble sort" --max-tokens 100 --temp 0.1
+```
+
 ### Fine-tuning
 
 For fine-tuning the model, you should load the `prequantized` revision of the model and use the `onebitllms` Python package:
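For reference, the mlx-lm step added above can also be run from Python. Below is a minimal sketch using the `mlx_lm` package's `load`/`generate` helpers; it is not part of this commit, and the chat-template handling is an assumption for the instruct model.

```
from mlx_lm import load, generate

# Download / load the model and tokenizer from the Hugging Face Hub.
model, tokenizer = load("tiiuae/Falcon-E-3B-Instruct")

# Format the request with the model's chat template before generating.
messages = [{"role": "user", "content": "Implement bubble sort"}]
prompt = tokenizer.apply_chat_template(
    messages, tokenize=False, add_generation_prompt=True
)

print(generate(model, tokenizer, prompt=prompt, max_tokens=100))
```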
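The fine-tuning section that closes this hunk points to the `prequantized` revision and the `onebitllms` package. A minimal sketch of loading that revision for quantization-aware fine-tuning follows; it assumes `onebitllms` exposes a `replace_linear_with_bitnet_linear` helper, and both that helper name and the training setup are assumptions rather than content from this commit.

```
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
# Assumed helper name; check the onebitllms package for the exact API.
from onebitllms import replace_linear_with_bitnet_linear

model_id = "tiiuae/Falcon-E-3B-Instruct"

# Load the non-quantized weights published under the `prequantized` revision.
tokenizer = AutoTokenizer.from_pretrained(model_id, revision="prequantized")
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    revision="prequantized",
    torch_dtype=torch.bfloat16,
)

# Swap nn.Linear layers for BitNet-style layers before fine-tuning.
model = replace_linear_with_bitnet_linear(model)

# ... fine-tune `model` with your usual trainer, then re-quantize for inference ...
```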