Update README.md
README.md
CHANGED
@@ -107,7 +107,7 @@ logger.info(f"Base Model output: {tokenizer.decode(out_ids[0], skip_special_toke
 
 - **Training Data:** WikiText-103
 - **Training Objective:** Hybrid KL divergence and language modeling loss
-- **Supervision Signal:** kNN distributions from GPT2-xl
+- **Supervision Signal:** kNN distributions from GPT2-xl, it is suggested to use the finetuned version of GPT2-xl [here](https://huggingface.co/Clover-Hill/gpt2-xl-finetuned-wikitext103).
 - **Hyperparameters:**
 - Learning rate: 1e-3
 - Beta (loss balance): 0.5
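
For readers unfamiliar with the "Hybrid KL divergence and language modeling loss" and the "Beta (loss balance)" setting listed in the README lines of this diff, here is a minimal sketch of one plausible form of such an objective. The README does not give the formula, so the weighting `beta * KL + (1 - beta) * NLL`, the KL direction (kNN distribution as the target), and the names `softmax` and `hybrid_loss` are all assumptions made for illustration, not the repository's actual code.

```python
import math


def softmax(logits):
    """Convert raw scores into a probability distribution (numerically stable)."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def hybrid_loss(logits, knn_probs, target_id, beta=0.5):
    """Hypothetical hybrid objective for a single token position.

    ASSUMPTION: loss = beta * KL(knn || model) + (1 - beta) * NLL(target).
    The README only names the two terms and a balance factor; the exact
    combination and KL direction are not stated there.
    """
    probs = softmax(logits)
    # KL divergence from the kNN (teacher) distribution to the model distribution;
    # entries where the kNN probability is zero contribute nothing.
    kl = sum(q * math.log(q / p) for q, p in zip(knn_probs, probs) if q > 0)
    # Standard language modeling loss: negative log-likelihood of the gold token.
    nll = -math.log(probs[target_id])
    return beta * kl + (1.0 - beta) * nll
```

Under this assumed form, `beta=0.5` weights the two terms equally, `beta=1.0` trains only against the kNN distributions, and `beta=0.0` reduces to ordinary language modeling.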