Mattimax committed on
Commit 7716e29 · verified · 1 Parent(s): 4ba7177

Update README.md

Files changed (1)
  1. README.md +126 -0
README.md CHANGED
@@ -144,4 +144,130 @@ Se utilizzi **Mattimax/DACMini-IT** in un progetto, un articolo o qualsiasi lavo
  year = {2025},
  note = {License: MIT. Se usi questo modello, per favore citane la fonte originale.}
  }
  ```

# English version

## Description

**DACMini-IT** is a compact, instruction-tuned language model for **Italian chat and dialogue**.
Based on the **GPT-2 Small (Italian adaptation)** architecture, it is designed to be fast, lightweight, and easily deployable on low-resource devices.

Compared to the “base” DACMini, **DACMini-IT** is trained on Italian conversational datasets structured in *user-assistant* format, optimizing its ability to follow instructions and handle natural multi-turn conversations.

---

## Size and technical specs

* **Parameters:** 109M
* **Architecture:** GPT-2 Small (Italian adaptation)
* **Max context length:** 512 tokens
* **Number of layers:** 12
* **Number of attention heads:** 12
* **Embedding size:** 768
* **Vocabulary:** ~50,000 tokens
* **Quantization:** supported (optional 8-bit / 4-bit via `bitsandbytes`); a loading sketch follows this list.

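As a rough illustration of that last bullet, the sketch below loads the model with 8-bit weights through `transformers` and `bitsandbytes`. The model ID comes from this card; the specific `BitsAndBytesConfig` settings are assumptions for demonstration, not a recipe validated for this repository.

```python
# Hedged sketch: loading DACMini-IT with optional bitsandbytes quantization.
# Requires `transformers`, `accelerate`, `bitsandbytes`, and a CUDA-capable GPU.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

model_path = "Mattimax/DACMini-IT"

# 8-bit loading shown here; switch to load_in_4bit=True for 4-bit weights.
bnb_config = BitsAndBytesConfig(
    load_in_8bit=True,
    # load_in_4bit=True,
    # bnb_4bit_compute_dtype=torch.float16,
)

tokenizer = AutoTokenizer.from_pretrained(model_path)
model = AutoModelForCausalLM.from_pretrained(
    model_path,
    quantization_config=bnb_config,
    device_map="auto",  # place the (small) model on the available device
)
model.eval()
```

At 109M parameters the full-precision weights already fit in a few hundred megabytes, so quantization mainly matters on very constrained devices.
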
---

## Training dataset

Trained on [**Mattimax/DATA-AI_Conversation_ITA**](https://huggingface.co/datasets/Mattimax/DATA-AI_Conversation_ITA), an Italian instruction-tuned conversational dataset containing structured *prompt-response* pairs designed to promote coherent, natural, and grammatically correct answers.

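To get a feel for that data, a minimal sketch along these lines could work with the `datasets` library. The split name and the `user` / `assistant` field names are assumptions to check against the dataset card; the prompt markers match the inference example further down.

```python
# Hedged sketch: inspecting the training dataset and mapping one record into the
# <|user|> ... <|assistant|> format used elsewhere on this card.
from datasets import load_dataset

# The "train" split and field names below are assumptions; inspect the printed
# example to see the actual schema before relying on them.
ds = load_dataset("Mattimax/DATA-AI_Conversation_ITA", split="train")
example = ds[0]
print(example)

def to_prompt(record, user_key="user", assistant_key="assistant"):
    # Hypothetical field names; adjust them to the real column names.
    return f"<|user|> {record[user_key].strip()} <|assistant|> {record[assistant_key].strip()}"

print(to_prompt(example))
```
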
---

## Objectives

* Provide an Italian-language chatbot with instruction-following capabilities.
* Produce concise, clear, and natural responses in multi-turn contexts.
* Support lightweight or offline applications where model size is a constraint.

---

## Warnings and limitations

* **Experimental** model: may produce logical errors or irrelevant answers.
* Not trained on sensitive topics or specialized content.
* Limited performance on very long conversations or complex prompts.
* Not intended for commercial use without further validation.

---

## Recommended use

* Lightweight or offline Italian chatbot applications.
* Prototyping and testing of Italian NLP pipelines.
* Synthetic response generation and datasets for training or evaluation (see the sketch after this list).

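For the last use case, here is a small, hedged sketch of batch generation with the `transformers` pipeline. The Italian prompts and the sampling settings are illustrative only, and the output file name is arbitrary.

```python
# Hedged sketch: generating a tiny synthetic response set with DACMini-IT
# and writing it to JSONL. Prompts and generation settings are examples only.
import json
from transformers import pipeline

generator = pipeline("text-generation", model="Mattimax/DACMini-IT")

prompts = [
    "Qual è la capitale d'Italia?",
    "Spiegami in una frase cos'è un modello linguistico.",
]

with open("synthetic_responses.jsonl", "w", encoding="utf-8") as f:
    for prompt in prompts:
        formatted = f"<|user|> {prompt} <|assistant|>"
        out = generator(
            formatted,
            max_new_tokens=100,
            do_sample=True,
            temperature=0.7,
            return_full_text=False,  # keep only the generated continuation
        )
        record = {"prompt": prompt, "response": out[0]["generated_text"].strip()}
        f.write(json.dumps(record, ensure_ascii=False) + "\n")
```
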
---

## Example inference code

```python
from transformers import AutoTokenizer, AutoModelForCausalLM
import torch

# 1. Load trained model and tokenizer
model_path = "Mattimax/DACMini-IT"
tokenizer = AutoTokenizer.from_pretrained(model_path)
model = AutoModelForCausalLM.from_pretrained(model_path)
model.eval()

# 2. Generation function
def chat_inference(prompt, max_new_tokens=150, temperature=0.7, top_p=0.9):
    # Build input in the format used during training
    formatted_prompt = f"<|user|> {prompt.strip()} <|assistant|>"

    # Tokenize
    inputs = tokenizer(formatted_prompt, return_tensors="pt")

    # Generate response
    with torch.no_grad():
        output = model.generate(
            **inputs,
            max_new_tokens=max_new_tokens,
            temperature=temperature,
            top_p=top_p,
            do_sample=True,
            pad_token_id=tokenizer.pad_token_id or tokenizer.eos_token_id,
        )

    # Decode only the newly generated tokens so the prompt is never echoed back,
    # whether or not the <|user|>/<|assistant|> markers are registered as special tokens
    prompt_length = inputs["input_ids"].shape[1]
    response = tokenizer.decode(output[0][prompt_length:], skip_special_tokens=True).strip()
    return response

# 3. Usage example
if __name__ == "__main__":
    while True:
        user_input = input("👤 User: ")
        if user_input.lower() in ["exit", "quit"]:
            break
        response = chat_inference(user_input)
        print(f"🤖 Assistant: {response}\n")
```

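The function above treats every message as a fresh, single-turn prompt. Since the card targets multi-turn dialogue, here is a hedged extension that reuses the `tokenizer` and `model` loaded above, keeps the conversation history in the same `<|user|>` / `<|assistant|>` format, and trims old turns to stay inside the 512-token context. How turns were actually concatenated during training is an assumption, so adjust the formatting if quality drops.

```python
# Hedged sketch: simple multi-turn chat on top of the loading code above.
# Assumes turns can be concatenated with <|user|>/<|assistant|> markers;
# the real training-time concatenation scheme may differ.
MAX_CONTEXT_TOKENS = 512

def chat_with_history(history, user_message, max_new_tokens=150):
    """history is a list of (user, assistant) pairs from earlier turns."""
    turns = [f"<|user|> {u.strip()} <|assistant|> {a.strip()}" for u, a in history]
    turns.append(f"<|user|> {user_message.strip()} <|assistant|>")
    prompt = " ".join(turns)

    # Drop the oldest turns until prompt + new tokens fit the context window
    while len(turns) > 1 and len(tokenizer(prompt)["input_ids"]) > MAX_CONTEXT_TOKENS - max_new_tokens:
        turns.pop(0)
        prompt = " ".join(turns)

    inputs = tokenizer(prompt, return_tensors="pt")
    with torch.no_grad():
        output = model.generate(
            **inputs,
            max_new_tokens=max_new_tokens,
            do_sample=True,
            temperature=0.7,
            top_p=0.9,
            pad_token_id=tokenizer.pad_token_id or tokenizer.eos_token_id,
        )
    reply = tokenizer.decode(output[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True).strip()
    history.append((user_message, reply))
    return reply
```
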
---

## References

* Dataset: [Mattimax/DATA-AI_Conversation_ITA](https://huggingface.co/datasets/Mattimax/DATA-AI_Conversation_ITA)
* Base model: [DACMini](https://huggingface.co/Mattimax/DACMini)
* Organization: [M.INC](https://huggingface.co/MINC01)
* Collection: [Little_DAC Collection](https://huggingface.co/collections/Mattimax/little-dac-collection-68e11d19a5949d08e672b312)

---

## Citation

If you use **Mattimax/DACMini-IT** in a project, paper, or any work, please cite it using the `CITATION.bib` file included in the repository:

```bibtex
@misc{mattimax2025dacminiit,
  title = {{Mattimax/DACMini-IT}: An open-source language model},
  author = {Mattimax},
  howpublished = {\url{https://huggingface.co/Mattimax/DACMini-IT}},
  year = {2025},
  note = {License: MIT. If you use this model, please cite the original source.}
}
```