LLaMmlein_120M_prerelease / README.md

metadata

datasets:
  - togethercomputer/RedPajama-Data-V2
  - LSX-UniWue/LLaMmlein-Dataset
language:
  - de
pipeline_tag: text-generation
library_name: transformers
license: other
new_version: LSX-UniWue/LLaMmlein_120M

LLäMmlein 120M

This is a German TinyLlama 120M language model, trained from scratch with the TinyLlama codebase on the German portion of RedPajama V2. More details are available on our page and in our preprint.

Alongside the final model, we publish intermediate training checkpoints for our base models as separate branches of the model repository. They can be selected via the drop-down menu labeled "main" in the top left corner of the "Files and versions" section.
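Because the intermediate checkpoints live on separate branches, they can presumably also be loaded programmatically by passing a branch name as the `revision` argument of `from_pretrained`. The following is a minimal sketch under that assumption; the helper names `list_checkpoint_branches` and `load_checkpoint` are hypothetical, and the actual branch names are not listed here, so they have to be looked up first.

```python
def list_checkpoint_branches(repo_id):
    """Return the branch names of a model repository on the Hugging Face Hub."""
    # Imported lazily so this helper can be defined without huggingface_hub installed.
    from huggingface_hub import list_repo_refs

    refs = list_repo_refs(repo_id)
    return [branch.name for branch in refs.branches]


def load_checkpoint(repo_id, revision="main"):
    """Load model and tokenizer from a given branch (assumed to hold one checkpoint)."""
    # Imported lazily for the same reason as above.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    # `revision` accepts a branch name, tag, or commit hash.
    model = AutoModelForCausalLM.from_pretrained(repo_id, revision=revision)
    tokenizer = AutoTokenizer.from_pretrained(repo_id, revision=revision)
    return model, tokenizer
```

Calling `list_checkpoint_branches("LSX-UniWue/LLaMmlein_120M")` should return the available branches, and any of those names can then be passed as `revision` to `load_checkpoint`. Both calls need network access to the Hub.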

Usage

from transformers import AutoModelForCausalLM, AutoTokenizer

# Load the model weights and the matching tokenizer from the Hugging Face Hub.
model = AutoModelForCausalLM.from_pretrained("LSX-UniWue/LLaMmlein_120M")
tokenizer = AutoTokenizer.from_pretrained("LSX-UniWue/LLaMmlein_120M")
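Once the model and tokenizer are loaded, text can be generated with the standard `generate` method. The sketch below is illustrative only: the helper `generate_text`, the German prompt, and the generation settings (greedy decoding, 30 new tokens) are assumptions, not recommendations from the authors.

```python
def generate_text(model, tokenizer, prompt, max_new_tokens=30):
    """Continue `prompt` with greedy decoding and return the decoded text."""
    # Tokenize the prompt into PyTorch tensors.
    inputs = tokenizer(prompt, return_tensors="pt")
    # Greedy decoding (do_sample=False) keeps the output deterministic.
    output_ids = model.generate(
        **inputs, max_new_tokens=max_new_tokens, do_sample=False
    )
    # The first sequence holds the prompt followed by the continuation.
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)


def main():
    """Example run against the released model (downloads weights from the Hub)."""
    from transformers import AutoModelForCausalLM, AutoTokenizer

    model = AutoModelForCausalLM.from_pretrained("LSX-UniWue/LLaMmlein_120M")
    tokenizer = AutoTokenizer.from_pretrained("LSX-UniWue/LLaMmlein_120M")
    # Hypothetical prompt, chosen only for illustration.
    print(generate_text(model, tokenizer, "Die Hauptstadt von Deutschland ist"))
```

Calling `main()` runs the example end to end. Since this is a base model rather than an instruction-tuned one, plain text continuation prompts are likely the most suitable input.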

Performance

We evaluated our model on the SuperGLEBer benchmark.

Data Take Down