updated model card
README.md
base_model:
- common-pile/comma-v0.1-2t
pipeline_tag: text-generation
---
# Munin-7B-Open-pt

Munin-7B-open-pt is a 7 billion parameter language model continually pre-trained from [Comma v0.1-2T](https://huggingface.co/common-pile/comma-v0.1-2t/) on 30B tokens drawn from a mix of the [Danish Dynaword](https://huggingface.co/datasets/danish-foundation-models/danish-dynaword) and [Comma v0.1](https://huggingface.co/datasets/common-pile/comma_v0.1_training_dataset) datasets, comprising only public domain and openly licensed data.

Munin-7B-open-pt is a base model that can be used as a starting point for fine-tuning and post-training. It has not been instruction-tuned and should not be expected to function as a chat model.
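As a minimal usage sketch with the `transformers` library: the repo id below is a placeholder, since the actual HF path for each pre-training stage is listed in the table; substitute the path of the checkpoint you want.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Placeholder repo id -- replace with the HF path from the stage table.
MODEL_ID = "danish-foundation-models/munin-7b-open-pt"

def complete(prompt: str, max_new_tokens: int = 50) -> str:
    """Plain text continuation with the base model (no chat template)."""
    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    # device_map="auto" requires the accelerate package to be installed.
    model = AutoModelForCausalLM.from_pretrained(
        MODEL_ID, torch_dtype="auto", device_map="auto"
    )
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    outputs = model.generate(**inputs, max_new_tokens=max_new_tokens, do_sample=False)
    return tokenizer.decode(outputs[0], skip_special_tokens=True)

# Example: complete("Danmark er et land i")
```

Because this is a base model, prompts are continued as plain text rather than answered as chat turns.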

Munin-7B-open-pt has been trained using the [maester](https://github.com/rlrs/maester) framework developed as part of the [Danish Foundation Models project](https://foundationmodels.dk/). The three pre-training stages are detailed in the following table:

| Stage | Batch size | Steps | HF path | Data mix | Comments |
|-|-|-|-|-|-|