---
language:
- it
- en
license: apache-2.0
library_name: transformers
pipeline_tag: text-generation
base_model:
- meta-llama/Llama-3.1-8B
---

# Llama-3.1-8B-Italian-FVT

<div align="center">

<img src="https://github.com/Andrew-Wyn/images/blob/master/sava/italian_adapt-img.jpg?raw=true" width="400" height="400" style="border-radius:10%" />

</div>
The **Llama-3.1-8B-Adapted** collection of large language models (LLMs) is a set of generative models at the 8B size (text in/text out), adapted from **Llama-3.1-8B**.
*Llama-3.1-8B-Italian-FVT* is a Llama model continually trained after substituting its tokenizer.

After adaptation, the tokenizer of this model is the same as that of [Minerva-3B](https://huggingface.co/sapienzanlp/Minerva-3B-base-v1.0).
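Vocabulary-transfer initialization of the kind used in FVT-style adaptation builds embeddings for the new tokenizer's tokens from the original model's embedding matrix. The toy NumPy sketch below illustrates one common scheme, averaging the embeddings of the old sub-tokens that spell out each new token; the vocabularies, mapping, and values here are invented for illustration and are not the released model's actual weights:

```python
import numpy as np

# Toy setup: the "old" (Llama) tokenizer splits an Italian word into pieces,
# while the "new" (Minerva) tokenizer keeps it whole. Everything below is
# illustrative, not taken from the real tokenizers or checkpoints.
old_vocab = {"gior": 0, "nata": 1, "sole": 2}
old_emb = np.arange(12, dtype=np.float64).reshape(3, 4)  # (old vocab size, dim)

# Each new token mapped to the old tokens that tokenize its surface form.
new_to_old = {"giornata": ["gior", "nata"], "sole": ["sole"]}

# A new token's embedding is the mean of its old sub-token embeddings.
new_emb = np.stack([
    old_emb[[old_vocab[t] for t in pieces]].mean(axis=0)
    for pieces in new_to_old.values()
])

print(new_emb.shape)  # (2, 4)
```

Tokens the new vocabulary shares with the old one (like `sole` above) simply keep their original embedding, so only genuinely new tokens start from an averaged initialization.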

**Model developers:** SapienzaNLP, ISTI-CNR, ILC-CNR

**Model architecture:** Llama-3.1-8B-Adapted is an auto-regressive language model that uses an optimized transformer architecture.
## Data used for the adaptation

The **Llama-3.1-8B-Adapted** model was trained on a collection of Italian and English data extracted from [CulturaX](https://huggingface.co/datasets/uonlp/CulturaX).
The data were skewed toward Italian, with English accounting for one quarter of the mix: the first 9B tokens were taken from the Italian portion of CulturaX and the first 3B tokens from the English portion.
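The stated token budgets work out to the following proportions (simple arithmetic on the numbers above):

```python
# Token budgets for the adaptation mix described above.
italian_tokens = 9_000_000_000  # first 9B tokens of CulturaX (Italian)
english_tokens = 3_000_000_000  # first 3B tokens of CulturaX (English)
total = italian_tokens + english_tokens

english_share = english_tokens / total                 # 0.25: English is one quarter of the mix
italian_per_english = italian_tokens / english_tokens  # 3.0: three Italian tokens per English token
print(english_share, italian_per_english)
```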

## Use with Transformers
You can run inference using the Transformers pipeline abstraction or by leveraging the Auto classes with the `generate()` function.

Make sure to update your `transformers` installation via `pip install --upgrade transformers`.
```python
import transformers
import torch

model_id = "SemanticAlignment/Llama-3.1-8B-Italian-FVT"

# Load the model in bfloat16 and spread it across the available devices.
pipeline = transformers.pipeline(
    "text-generation",
    model=model_id,
    model_kwargs={"torch_dtype": torch.bfloat16},
    device_map="auto",
)

pipeline("Cosa si può fare in una bella giornata di sole?")
```
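The Auto-class route mentioned above can be sketched as follows; the generation parameters are illustrative defaults, not recommended settings:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "SemanticAlignment/Llama-3.1-8B-Italian-FVT"

# Load tokenizer and model; bfloat16 halves memory use versus float32.
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype=torch.bfloat16, device_map="auto"
)

# Tokenize the Italian prompt and generate a continuation.
inputs = tokenizer(
    "Cosa si può fare in una bella giornata di sole?", return_tensors="pt"
).to(model.device)
outputs = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```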

Code: https://github.com/SapienzaNLP/sava
## Citation

If you use any part of this work, please consider citing the paper as follows:
```bibtex
@misc{moroni2025optimizingllmsitalianreducing,
      title={Optimizing LLMs for Italian: Reducing Token Fertility and Enhancing Efficiency Through Vocabulary Adaptation},
      author={Luca Moroni and Giovanni Puccetti and Pere-Lluis Huguet Cabot and Andrei Stefan Bejgu and Edoardo Barba and Alessio Miaschi and Felice Dell'Orletta and Andrea Esuli and Roberto Navigli},
      year={2025},
      eprint={2504.17025},
      archivePrefix={arXiv},
      primaryClass={cs.CL},
      url={https://arxiv.org/abs/2504.17025},
}
```