|
|
--- |
|
|
license: apache-2.0 |
|
|
datasets: |
|
|
- faur-ai/fulg |
|
|
language: |
|
|
- ro |
|
|
--- |
|
|
|
|
|
# LLMic Model Card |
|
|
|
|
|
[LLMic: Romanian Foundation Language Model](https://arxiv.org/abs/2501.07721) |
|
|
|
|
|
## Model Summary |
|
|
|
|
|
LLMic is a bilingual Romanian-English foundation model: a 3B-parameter dense decoder-only Transformer based on the Llama 2 architecture.
|
|
|
|
|
This is v2 of the model, which preserves **casing** and **diacritics**.
|
|
|
|
|
## Architecture |
|
|
|
|
|
| Parameter | Value | |
|
|
|-----------|---------| |
|
|
| Sequence Length | 2048 | |
|
|
| Number of Layers | 24 | |
|
|
| Embedding Size | 2,560 | |
|
|
| FFN Hidden Size | 10,240 | |
|
|
| Number of Heads | 20 | |
|
|
| Number of KV Heads | 5 | |
|
|
| Activation Function | SiLU | |
|
|
| Position Encodings | RoPE (Θ=500,000) | |
|
|
| Layer Norm | RMSNorm (ε=10⁻⁵) | |
|
|
| Tied Embeddings | No | |
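
The table above pins down the non-embedding parameter count. As a rough sanity check, assuming a Llama-2-style layout (separate Q/K/V/O projections with grouped-query attention, and a SwiGLU MLP with gate, up, and down projections, no biases), the transformer layers alone account for roughly 2.28B parameters; the untied input and output embeddings contribute the remainder, depending on the vocabulary size (not listed here):

```python
# Back-of-the-envelope parameter count from the architecture table.
# Assumes a Llama-2-style block: GQA attention + SwiGLU MLP, no biases.
d_model = 2560      # Embedding Size
n_layers = 24       # Number of Layers
d_ffn = 10240       # FFN Hidden Size
n_heads = 20        # Number of Heads
n_kv_heads = 5      # Number of KV Heads
head_dim = d_model // n_heads  # 128

# Attention: Q and O are d_model x d_model; K and V are shrunk by GQA.
attn = 2 * d_model * d_model + 2 * d_model * (n_kv_heads * head_dim)

# SwiGLU MLP: gate, up, and down projections.
mlp = 3 * d_model * d_ffn

total = n_layers * (attn + mlp)
print(f"{total / 1e9:.2f}B non-embedding parameters")  # ~2.28B
```

This excludes the two embedding matrices (untied, per the table) and the small RMSNorm weights, so the grand total lands near the advertised 3B.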
|
|
|
|
|
## Intended Use |
|
|
|
|
|
Our model is designed to accelerate research on Romanian language models, serving as a building block for generative AI applications. |
|
|
|
|
|
## Use with transformers |
|
|
|
|
|
```python |
|
|
from transformers import AutoTokenizer, AutoModelForCausalLM, TextStreamer |
|
|
|
|
|
device = "cuda" |
|
|
model_id = "faur-ai/LLMic_v2" |
|
|
prompt = "Capitala României este" |
|
|
|
|
|
model = AutoModelForCausalLM.from_pretrained(model_id).to(device) |
|
|
tokenizer = AutoTokenizer.from_pretrained(model_id) |
|
|
streamer = TextStreamer(tokenizer) |
|
|
|
|
|
inputs = tokenizer.encode( |
|
|
prompt, |
|
|
add_special_tokens=False, |
|
|
return_tensors='pt', |
|
|
).to(device) |
|
|
|
|
|
outputs = model.generate( |
|
|
streamer=streamer, |
|
|
input_ids=inputs, |
|
|
temperature=0.8, |
|
|
do_sample=True |
|
|
) |
|
|
``` |
|
|
|
|
|
|
|
|
## Citation |
|
|
|
|
|
**BibTeX:** |
|
|
|
|
|
``` |
|
|
@misc{bădoiu2025llmicromanianfoundationlanguage, |
|
|
title={LLMic: Romanian Foundation Language Model}, |
|
|
author={Vlad-Andrei Bădoiu and Mihai-Valentin Dumitru and Alexandru M. Gherghescu and Alexandru Agache and Costin Raiciu}, |
|
|
year={2025}, |
|
|
eprint={2501.07721}, |
|
|
archivePrefix={arXiv}, |
|
|
primaryClass={cs.CL}, |
|
|
url={https://arxiv.org/abs/2501.07721}, |
|
|
} |
|
|
``` |
|
|
|
|
|
|