---
datasets:
- togethercomputer/RedPajama-Data-V2
- LSX-UniWue/LLaMmlein-Dataset
language:
- de
pipeline_tag: text-generation
library_name: transformers
license: other
new_version: LSX-UniWue/LLaMmlein_120M
---

# LLäMmlein 120M

This is a German TinyLlama 120M language model, trained from scratch using the [TinyLlama](https://github.com/jzhang38/TinyLlama) codebase on the German portion of [RedPajama V2](https://huggingface.co/datasets/togethercomputer/RedPajama-Data-V2).

Find more details on our [project page](https://www.informatik.uni-wuerzburg.de/datascience/projects/nlp/llammlein/) and in our [preprint](https://arxiv.org/abs/2411.11171).

Alongside the final model, we publish intermediate training checkpoints of our base models as separate branches of this model repository. They can be accessed via the drop-down menu labeled "main" in the top left corner of the "Files and versions" section.
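
With `transformers`, a specific checkpoint can be loaded by passing its branch name through the `revision` argument of `from_pretrained`. This is a minimal sketch; the branch name used here is a placeholder, so substitute one of the branch names listed in the drop-down menu.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# "some-checkpoint-branch" is a placeholder: replace it with one of the
# branch names listed in the "Files and versions" drop-down menu.
model = AutoModelForCausalLM.from_pretrained(
    "LSX-UniWue/LLaMmlein_120M",
    revision="some-checkpoint-branch",
)
tokenizer = AutoTokenizer.from_pretrained(
    "LSX-UniWue/LLaMmlein_120M",
    revision="some-checkpoint-branch",
)
```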

### Usage

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Load the model and its tokenizer from the Hugging Face Hub
model = AutoModelForCausalLM.from_pretrained("LSX-UniWue/LLaMmlein_120M")
tokenizer = AutoTokenizer.from_pretrained("LSX-UniWue/LLaMmlein_120M")
```
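
A minimal text-generation sketch building on the snippet above; the prompt and sampling settings are illustrative, not prescribed by the model card.

```python
import torch

# Assumes `model` and `tokenizer` were loaded as shown above.
prompt = "Die Würzburger Residenz ist"  # illustrative German prompt
inputs = tokenizer(prompt, return_tensors="pt")

# Sample a short continuation from the model.
with torch.no_grad():
    output_ids = model.generate(**inputs, max_new_tokens=50, do_sample=True, top_p=0.9)

print(tokenizer.decode(output_ids[0], skip_special_tokens=True))
```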

### Performance

We evaluated our model on the [SuperGLEBer](https://lsx-uniwue.github.io/SuperGLEBer-site/) benchmark.

For data take-down requests, see our [Data Take Down](https://www.informatik.uni-wuerzburg.de/datascience/projects/nlp/llammlein/) page.