### Training Data

The training data consists of questions and answers generated using the head-to-tail pipeline with a DBpedia script. See the paper and the GitHub repository for more details.

The model was trained on 3,000 Unknown questions, with 10 additional HighlyKnown questions per Unknown question.
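The 1:10 mix of Unknown to HighlyKnown questions described above can be sketched as follows. This is an illustrative sketch only, not the actual head-to-tail pipeline code; the function name and placeholder data are hypothetical.

```python
# Illustrative sketch (not the actual pipeline code): interleave each
# Unknown question with a fixed number of HighlyKnown questions,
# mirroring the 1:10 mix described above.
def build_training_mix(unknown, highly_known, ratio=10):
    mix, hk_idx = [], 0
    for question in unknown:
        mix.append(question)
        for _ in range(ratio):
            # Cycle through the HighlyKnown pool.
            mix.append(highly_known[hk_idx % len(highly_known)])
            hk_idx += 1
    return mix

unknown = [f"unknown_{i}" for i in range(3)]
highly_known = [f"highly_known_{i}" for i in range(30)]
mix = build_training_mix(unknown, highly_known)
print(len(mix))  # 3 Unknown * (1 + 10) = 33 items
```

With the full 3,000 Unknown questions, the same construction yields 33,000 training items.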
### Training Procedure

The model was fine-tuned using LoRA.
#### Training Hyperparameters

- LR = 1e-3
- BS = 8
- EPOCHS = 10
- LoRA:
  - lora_rank = 1
  - lora_alpha = 2
  - use_rslora = True
  - lora_dropout = 0.1
  - bias = "none"
  - target_modules = ["down_proj", "gate_proj", "up_proj"]
  - task_type = "CAUSAL_LM"
## Evaluation

For evaluation, you can use the [notebooks](https://github.com/AIRI-Institute/knowledge-packing/tree/main/notebooks) from the GitHub repository.
## Environmental Impact