Update README.md
README.md CHANGED
@@ -105,7 +105,7 @@ target | q, k, v, o, gate | q, k, v, o, gate
 
 
 ### List of Training arguments
-based on [Transformer
+based on [Transformer Reinforcement Learning (TRL)](https://github.com/huggingface/trl)
 
 Parameter | SFT | DPO
 :------:| :------:| :------: