ltgoslo committed on
Commit 3e5cb22 · verified · 1 Parent(s): 0f44edf

model card

Files changed (1)
  1. README.md +27 -0
README.md ADDED
@@ -0,0 +1,27 @@
+ ---
+ license: apache-2.0
+ datasets:
+ - HPLT/HPLT3.0
+ - allenai/olmo-mix-1124
+ - allenai/MADLAD-400
+ - HuggingFaceFW/finepdfs
+ language:
+ - fi
+ base_model:
+ - allenai/OLMo-2-1124-13B
+ library_name: transformers
+ tags:
+ - finnish
+ - suomi
+ - HPLT
+ ---
+
+ ![](https://hplt-project.org/_next/static/media/logo-hplt.8765d2d4.svg)
+
+ This is a base (not instruction-tuned) large language model, continually pre-trained on Finnish data starting from the English [OLMo2-13B](https://huggingface.co/allenai/OLMo-2-1124-13B) model.
+
+ Our training data mixture included the Finnish portions of [HPLTv3](https://huggingface.co/datasets/HPLT/HPLT3.0), FinePDFs, and MADLAD-400, as well as OLMo-Mix.
+
+ Training was conducted as part of the [HPLT project](https://hplt-project.org/).
+
+ _This project has received funding from the European Union’s Horizon Europe research and innovation programme under grant agreement No 101070350 and from UK Research and Innovation (UKRI) under the UK government’s Horizon Europe funding guarantee [grant number 10052546]._