---
license: apache-2.0
language:
- en
- ko
- ja
- zh
---

# Tri-1.8B-Base

Tri-1.8B-Base is a 1.8-billion-parameter multilingual language model trained as an **early experimental run** ahead of the Tri-7B training run.

The model covers **English, Korean, Japanese, and Chinese**, with additional exposure to programming languages and mathematical reasoning. Pretrained on ~1.88 trillion tokens, it serves as a lightweight base model for research, fine-tuning, and open-source community use, especially for advancing Korean LLM development.

## Model Summary

* Architecture: decoder-only Transformer (LLaMA-style)
* Parameters: ~1.8B (untied embeddings and LM head)
* Layers / hidden size / attention heads: 25 / 2048 / 16
* Feedforward hidden size: 5,632 (SiLU-gated MLP)
* Context length: 4,096 tokens
* RoPE θ: 100,000
* Training precision: bfloat16
* Status: base pretraining only (no instruction tuning, no RLHF)

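The figures above can be sanity-checked with a back-of-the-envelope parameter count. The sketch below assumes a standard LLaMA-style layer (bias-free attention and a gate/up/down MLP) and a **hypothetical** vocabulary size of 128,256, which is not stated in this card; all other numbers come from the summary.

```python
# Rough parameter count for a LLaMA-style decoder with the dimensions above.
# VOCAB is an assumption for illustration; the true tokenizer size is not
# listed in this card.
VOCAB = 128_256   # hypothetical
HIDDEN = 2048
LAYERS = 25
FFN = 5632

attn = 4 * HIDDEN * HIDDEN      # q, k, v, o projections (16 heads, no biases)
mlp = 3 * HIDDEN * FFN          # gate, up, down projections (SiLU-gated MLP)
norms = 2 * HIDDEN              # two RMSNorm weight vectors per layer
per_layer = attn + mlp + norms

embed = VOCAB * HIDDEN          # input embeddings
lm_head = VOCAB * HIDDEN        # untied LM head (counted separately)
final_norm = HIDDEN

total = LAYERS * per_layer + embed + lm_head + final_norm
print(f"~{total / 1e9:.2f}B parameters")
```

Under these assumptions the total comes to roughly 1.81B, consistent with the stated ~1.8B; a different vocabulary size would shift the embedding and LM-head terms accordingly.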
## Intended Use

* As a **foundation** for downstream fine-tuning and alignment.
* Research on multilingual pretraining and adaptation.

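For continued-pretraining or fine-tuning experiments, documents are commonly concatenated and split into fixed-length blocks matching the 4,096-token context. A minimal, framework-free packing sketch (operating on plain token-ID lists; the `eos_id` default is hypothetical, not taken from this card):

```python
def pack_sequences(docs, block_size=4096, eos_id=2):
    """Concatenate tokenized docs (lists of token IDs), separated by an EOS
    token, then split the stream into fixed-length training blocks.
    eos_id=2 is a placeholder, not this model's actual EOS ID."""
    stream = []
    for ids in docs:
        stream.extend(ids)
        stream.append(eos_id)
    # Drop the trailing remainder that does not fill a full block.
    n_blocks = len(stream) // block_size
    return [stream[i * block_size:(i + 1) * block_size] for i in range(n_blocks)]
```

In practice a data-loading library would handle this, but the block sizes it emits should match the context length listed above.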
## Limitations

* As a base model with no instruction tuning or safety alignment, it may produce unsafe, incoherent, or factually incorrect outputs.

## Usage

```python
from transformers import AutoTokenizer, AutoModelForCausalLM

name = "trillionlabs/Tri-1.8B-Base"
tok = AutoTokenizer.from_pretrained(name)
model = AutoModelForCausalLM.from_pretrained(
    name,
    torch_dtype="bfloat16",
    device_map="auto",
)

prompt = "Write a short paragraph about Hangul."
x = tok(prompt, return_tensors="pt").to(model.device)
y = model.generate(
    **x,
    max_new_tokens=128,
    do_sample=True,
    temperature=0.8,
    top_p=0.95,
)
print(tok.decode(y[0], skip_special_tokens=True))
```

## License

This model is released under the **Apache 2.0 License**.
See [LICENSE](https://www.apache.org/licenses/LICENSE-2.0) for details.

---

## Citation

If you use this model, please cite it as:

```bibtex
@misc{trillionlabs_tri18b_base_2025,
  title  = {Tri-1.8B-Base},
  author = {Trillion Labs},
  year   = {2025},
  note   = {https://huggingface.co/trillionlabs/Tri-1.8B-Base}
}
```