---
license: apache-2.0
datasets:
- faur-ai/fulg
language:
- ro
---

# LLMic Model Card

[LLMic: Romanian Foundation Language Model](https://arxiv.org/abs/2501.07721)

## Model Summary

LLMic is a bilingual Romanian-English foundation model. It is a 3B-parameter dense decoder-only Transformer based on the Llama 2 architecture. This is v2 of the model, which preserves **casing** and **diacritics**.

## Architecture

| Parameter | Value |
|-----------|-------|
| Sequence Length | 2048 |
| Number of Layers | 24 |
| Embedding Size | 2,560 |
| FFN Hidden Size | 10,240 |
| Number of Heads | 20 |
| Number of KV Heads | 5 |
| Activation Function | SiLU |
| Position Encodings | RoPE (Θ=500,000) |
| Layer Norm | RMSNorm (ε=10⁻⁵) |
| Tied Embeddings | No |

## Intended Use

Our model is designed to accelerate research on Romanian language models, serving as a building block for generative AI applications.

## Use with transformers

```python
from transformers import AutoTokenizer, AutoModelForCausalLM, TextStreamer

device = "cuda"
model_id = "faur-ai/LLMic_v2"
prompt = "Capitala României este"  # "The capital of Romania is"

model = AutoModelForCausalLM.from_pretrained(model_id).to(device)
tokenizer = AutoTokenizer.from_pretrained(model_id)

# Stream generated tokens to stdout as they are produced.
streamer = TextStreamer(tokenizer)

inputs = tokenizer.encode(
    prompt,
    add_special_tokens=False,
    return_tensors="pt",
).to(device)

outputs = model.generate(
    input_ids=inputs,
    streamer=streamer,
    max_new_tokens=64,  # without this, generate() stops after a short default length
    temperature=0.8,
    do_sample=True,
)
```

## Citation

**BibTeX:**

```
@misc{bădoiu2025llmicromanianfoundationlanguage,
  title={LLMic: Romanian Foundation Language Model},
  author={Vlad-Andrei Bădoiu and Mihai-Valentin Dumitru and Alexandru M. Gherghescu and Alexandru Agache and Costin Raiciu},
  year={2025},
  eprint={2501.07721},
  archivePrefix={arXiv},
  primaryClass={cs.CL},
  url={https://arxiv.org/abs/2501.07721},
}
```
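## Inspecting the Configuration

The hyperparameters in the Architecture table can be cross-checked without downloading the weights. Below is a minimal sketch, assuming the repository exposes a standard Llama-style configuration; the attribute names are the usual `LlamaConfig` fields and are not confirmed from the repository itself:

```python
from transformers import AutoConfig

# Load only the model configuration (no weights are downloaded).
config = AutoConfig.from_pretrained("faur-ai/LLMic_v2")

# Standard LlamaConfig fields; the expected values come from the
# Architecture table above and are assumptions, not verified output.
print("Sequence Length:", config.max_position_embeddings)  # expected 2048
print("Number of Layers:", config.num_hidden_layers)       # expected 24
print("Embedding Size:", config.hidden_size)               # expected 2560
print("FFN Hidden Size:", config.intermediate_size)        # expected 10240
print("Number of Heads:", config.num_attention_heads)      # expected 20
print("Number of KV Heads:", config.num_key_value_heads)   # expected 5
print("RoPE Theta:", config.rope_theta)                    # expected 500000
print("RMSNorm epsilon:", config.rms_norm_eps)             # expected 1e-5
print("Tied Embeddings:", config.tie_word_embeddings)      # expected False
```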