Update README.md
README.md
@@ -18,7 +18,7 @@ Fine-tuned [Llama 3.1 8B Instruct](https://huggingface.co/meta-llama/Llama-3.1-8
 ## Training
 
 - **Dataset:** 900 examples from [Salesforce/xlam-function-calling-60k](https://huggingface.co/datasets/Salesforce/xlam-function-calling-60k)
-- **Method:** LoRA
+- **Method:** LoRA
 - **Trainable params:** 42M / 8B (0.52%)
 - **Epochs:** 1
 - **Loss:** 0.66 → 0.63
@@ -62,4 +62,9 @@ You are a helpful assistant with access to the following tools or function calls
 
 - Trained on 900 examples (proof of concept)
 - May have argument variations vs ground truth
-- Best for single/simple tool calls
+- Best for single/simple tool calls
+
+## Training Details
+- **Framework:** Unsloth 2025.11.2 + TRL
+- **Hardware:** RTX 5090 (32GB)
+- **Method:** LoRA (r=16, alpha=16)
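
The `42M / 8B (0.52%)` trainable-parameter figure can be sanity-checked against the `r=16` setting with a little arithmetic. The sketch below is illustrative only and rests on two assumptions that the README does not state: that LoRA adapters were attached to all seven linear projections in every decoder layer, and that the base model has roughly 8.03B parameters. The layer dimensions are the published Llama 3.1 8B values; the helper names are hypothetical.

```python
# Sanity check of the LoRA trainable-parameter count reported in the README.
# ASSUMPTION: adapters on all seven linear projections per decoder layer
# (the README does not list the target modules).

HIDDEN = 4096          # Llama 3.1 8B hidden size
INTERMEDIATE = 14336   # MLP intermediate size
KV_DIM = 1024          # 8 key/value heads x 128 head dim (grouped-query attention)
NUM_LAYERS = 32        # decoder layers
RANK = 16              # LoRA rank from the README (r=16)

# (input features, output features) of each adapted projection
PROJECTIONS = {
    "q_proj": (HIDDEN, HIDDEN),
    "k_proj": (HIDDEN, KV_DIM),
    "v_proj": (HIDDEN, KV_DIM),
    "o_proj": (HIDDEN, HIDDEN),
    "gate_proj": (HIDDEN, INTERMEDIATE),
    "up_proj": (HIDDEN, INTERMEDIATE),
    "down_proj": (INTERMEDIATE, HIDDEN),
}


def lora_params(d_in: int, d_out: int, rank: int) -> int:
    """A LoRA pair adds A (rank x d_in) and B (d_out x rank) weights."""
    return rank * (d_in + d_out)


def total_lora_params(rank: int = RANK) -> int:
    """Sum adapter weights over all projections and all decoder layers."""
    per_layer = sum(lora_params(i, o, rank) for i, o in PROJECTIONS.values())
    return per_layer * NUM_LAYERS


# ASSUMPTION: approximate base parameter count for the 8B model.
BASE_PARAMS = 8_030_261_248

trainable = total_lora_params()
percent = 100 * trainable / BASE_PARAMS
print(f"{trainable:,} trainable parameters ({percent:.2f}% of base)")
```

Under these assumptions the count comes to 41,943,040, which rounds to the 42M and 0.52% quoted in the Training section. Note that `alpha=16` only scales the adapter output (by `alpha / r`, here 1.0) and does not affect the parameter count.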