Update README.md
README.md CHANGED
@@ -11,7 +11,7 @@ tags:
 - cardiology
 ---
 
-Llama-3.2-1B-Instruct, with domain-adaptive pretraining (DAPT), also called Continuous Pre-training (CPT), on a Dutch medical corpus.
+Llama-3.2-1B-Instruct, with domain-adaptive pretraining (DAPT), also called Continuous Pre-training (CPT), on a Dutch medical corpus, slightly biased towards cardiology.
 
 Training for one full epoch, with a batch size of 256, a maximum sequence length of 768, and a linear-cosine schedule (details to follow).
 
@@ -19,4 +19,4 @@ This model will be further pre-trained on 5 million cardiology records from the
 
 The perplexity was around 5 on the validation set.
 
-Note: this model is not instruction tuned and does not generate an EOS token.
+Note: this model is not instruction tuned and does not generate an EOS token. An update is coming.
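For context on the training line in the diff (one full epoch, effective batch size 256, 768-token maximum sequence length, a "linear-cosine" schedule), a rough sketch of such a continued pre-training run with Hugging Face transformers might look as follows. The stand-in corpus, the warmup ratio, the per-device/accumulation split of the batch size, and the reading of "linear-cosine" as linear warmup into cosine decay are all assumptions, not the author's actual configuration; the base model is also gated on the Hub.

```python
# Hypothetical sketch of the CPT setup described in the diff above; the tiny
# stand-in corpus, warmup ratio, and batch-size split are assumptions.
from datasets import Dataset
from transformers import (
    AutoModelForCausalLM,
    AutoTokenizer,
    DataCollatorForLanguageModeling,
    Trainer,
    TrainingArguments,
)

base = "meta-llama/Llama-3.2-1B-Instruct"  # base model named in the README
tokenizer = AutoTokenizer.from_pretrained(base)
tokenizer.pad_token = tokenizer.eos_token  # Llama tokenizers ship without a pad token
model = AutoModelForCausalLM.from_pretrained(base)

# Stand-in for the Dutch medical corpus.
corpus = Dataset.from_dict({"text": ["De patiënt meldt hartkloppingen bij inspanning."]})
tokenized = corpus.map(
    lambda batch: tokenizer(batch["text"], truncation=True, max_length=768),
    batched=True,
    remove_columns=["text"],
)

args = TrainingArguments(
    output_dir="llama-3.2-1b-dutch-medical-cpt",
    num_train_epochs=1,              # one full epoch, per the README
    per_device_train_batch_size=32,  # 32 x 8 accumulation steps = effective 256
    gradient_accumulation_steps=8,
    lr_scheduler_type="cosine",      # assumed: linear warmup into cosine decay
    warmup_ratio=0.03,               # assumption; not stated in the README
)

trainer = Trainer(
    model=model,
    args=args,
    train_dataset=tokenized,
    data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
)
trainer.train()
```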
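The note about the missing EOS token matters at inference time: without it, generation only stops when the token budget runs out. A minimal sketch, assuming a placeholder repo id, that caps output length explicitly:

```python
# Minimal generation sketch; the repo id below is a placeholder, not the
# model's actual path on the Hub.
from transformers import AutoModelForCausalLM, AutoTokenizer

repo = "your-org/llama-3.2-1b-dutch-medical-cpt"  # placeholder
tokenizer = AutoTokenizer.from_pretrained(repo)
model = AutoModelForCausalLM.from_pretrained(repo)

# Dutch prompt: "Treatment of atrial fibrillation consists of"
inputs = tokenizer("De behandeling van atriumfibrilleren bestaat uit", return_tensors="pt")

# The model does not emit an EOS token, so decoding would otherwise run to
# the context limit; cap the output with max_new_tokens.
output = model.generate(**inputs, max_new_tokens=100)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```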