---
datasets:
- togethercomputer/RedPajama-Data-V2
- LSX-UniWue/LLaMmlein-Dataset
language:
- de
pipeline_tag: text-generation
library_name: transformers
license: other
new_version: LSX-UniWue/LLaMmlein_120M
---

# LLäMmlein 120M

This is a German TinyLlama 120M language model, trained from scratch using the [TinyLlama](https://github.com/jzhang38/TinyLlama) codebase on the German portion of [RedPajama V2](https://huggingface.co/datasets/togethercomputer/RedPajama-Data-V2).

Find more details on our [project page](https://www.informatik.uni-wuerzburg.de/datascience/projects/nlp/llammlein/) and in our [preprint](https://arxiv.org/abs/2411.11171).

Alongside the final model, we publish intermediate training checkpoints of our base models as separate branches of this model repository. They can be accessed via the drop-down menu labeled "main" in the top left corner of the "Files and versions" section.
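
With `transformers`, a specific checkpoint can be loaded by passing its branch name through the `revision` argument of `from_pretrained`. This is a minimal sketch; the branch name used here is a placeholder, so substitute one of the branch names listed in the drop-down menu.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# "some-checkpoint-branch" is a placeholder: replace it with one of the
# branch names listed in the "Files and versions" drop-down menu.
model = AutoModelForCausalLM.from_pretrained(
    "LSX-UniWue/LLaMmlein_120M",
    revision="some-checkpoint-branch",
)
tokenizer = AutoTokenizer.from_pretrained(
    "LSX-UniWue/LLaMmlein_120M",
    revision="some-checkpoint-branch",
)
```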

### Usage

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Load the model and its tokenizer from the Hugging Face Hub
model = AutoModelForCausalLM.from_pretrained("LSX-UniWue/LLaMmlein_120M")
tokenizer = AutoTokenizer.from_pretrained("LSX-UniWue/LLaMmlein_120M")
```
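
A minimal text-generation sketch building on the snippet above; the prompt and sampling settings are illustrative, not prescribed by the model card.

```python
import torch

# Assumes `model` and `tokenizer` were loaded as shown above.
prompt = "Die Würzburger Residenz ist"  # illustrative German prompt
inputs = tokenizer(prompt, return_tensors="pt")

# Sample a short continuation from the model.
with torch.no_grad():
    output_ids = model.generate(**inputs, max_new_tokens=50, do_sample=True, top_p=0.9)

print(tokenizer.decode(output_ids[0], skip_special_tokens=True))
```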

### Performance

We evaluated our model on the [SuperGLEBer](https://lsx-uniwue.github.io/SuperGLEBer-site/) benchmark.

For data take-down requests, see our [Data Take Down](https://www.informatik.uni-wuerzburg.de/datascience/projects/nlp/llammlein/) page.