---
license: apache-2.0
language:
- it
- en
---

# Llama-3.1-8B-Italian-LAPT

<div align="center">

<img src="https://github.com/Andrew-Wyn/images/blob/master/sava/italian_adapt-img.jpg?raw=true" width="400" height="400" style="border-radius:10%" />

</div>

The **Llama-3.1-8B-Adapted** collection of large language models (LLMs) is a family of 8B-parameter generative models (text in / text out) adapted from **Llama-3.1-8B**.

*Llama-3.1-8B-Italian-LAPT* is a continually pre-trained **Llama-3.1-8B** model.

**Model developer:** SapienzaNLP, ISTI-CNR, ILC-CNR

**Model Architecture:** Llama-3.1-8B-Italian-LAPT is an auto-regressive language model that uses an optimized transformer architecture.

## Data used for the adaptation

The **Llama-3.1-8B-Italian-LAPT** model is trained on a collection of Italian and English data extracted from [CulturaX](https://huggingface.co/datasets/uonlp/CulturaX).
The data are skewed toward Italian with a three-to-one ratio: the first 9B tokens of the Italian portion of CulturaX and the first 3B tokens of its English portion.
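The 9B/3B split above corresponds to a fixed total budget divided between the two languages. A minimal sketch of that arithmetic (the `token_budget` helper is hypothetical, not part of the actual training code):

```python
def token_budget(total_tokens, italian_share=0.75):
    """Split a total token budget between Italian and English.

    With a three-quarters Italian share, a 12B-token budget yields
    9B Italian tokens and 3B English tokens.
    """
    italian = int(total_tokens * italian_share)
    english = total_tokens - italian
    return italian, english

it_tokens, en_tokens = token_budget(12_000_000_000)  # → 9B Italian, 3B English
```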


## Use with Transformers

You can run conversational inference using the Transformers pipeline abstraction or by leveraging the Auto classes with the `generate()` function.

Make sure to update your Transformers installation via `pip install --upgrade transformers`.

```python
import transformers
import torch

model_id = "SemanticAlignment/Llama-3.1-8B-Italian-LAPT"

pipeline = transformers.pipeline(
    "text-generation",
    model=model_id,
    model_kwargs={"torch_dtype": torch.bfloat16},
    device_map="auto",
)

pipeline("Cosa si può fare in una bella giornata di sole?")
```
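The Auto-classes route mentioned above can be sketched as follows (a minimal illustration, assuming standard `AutoTokenizer`/`AutoModelForCausalLM` usage; the `generate_reply` helper is hypothetical and not part of this repository):

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "SemanticAlignment/Llama-3.1-8B-Italian-LAPT"


def generate_reply(prompt, max_new_tokens=64):
    """Generate a completion for `prompt` with the Auto classes and generate()."""
    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(
        model_id, torch_dtype=torch.bfloat16, device_map="auto"
    )
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    outputs = model.generate(**inputs, max_new_tokens=max_new_tokens)
    # Decode only the newly generated tokens, skipping the echoed prompt.
    new_tokens = outputs[0][inputs["input_ids"].shape[-1]:]
    return tokenizer.decode(new_tokens, skip_special_tokens=True)


if __name__ == "__main__":
    print(generate_reply("Cosa si può fare in una bella giornata di sole?"))
```

Note that loading the 8B model in `bfloat16` requires roughly 16 GB of accelerator memory; `device_map="auto"` lets Accelerate place the weights across the available devices.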

## Citation

If you use any part of this work, please consider citing the paper as follows:

```bibtex
@misc{moroni2025optimizingllmsitalianreducing,
      title={Optimizing LLMs for Italian: Reducing Token Fertility and Enhancing Efficiency Through Vocabulary Adaptation},
      author={Luca Moroni and Giovanni Puccetti and Pere-Lluis Huguet Cabot and Andrei Stefan Bejgu and Edoardo Barba and Alessio Miaschi and Felice Dell'Orletta and Andrea Esuli and Roberto Navigli},
      year={2025},
      eprint={2504.17025},
      archivePrefix={arXiv},
      primaryClass={cs.CL},
      url={https://arxiv.org/abs/2504.17025},
}
```