Mattimax committed on
Commit 7716e29 · verified · 1 Parent(s): 4ba7177

Update README.md

Files changed (1)
  1. README.md +126 -0
README.md CHANGED
@@ -144,4 +144,130 @@ Se utilizzi **Mattimax/DACMini-IT** in un progetto, un articolo o qualsiasi lavo
  year = {2025},
  note = {License: MIT. Se usi questo modello, per favore citane la fonte originale.}
  }
  ```

# English version

## Description

**DACMini-IT** is a compact, instruction-tuned language model for **Italian chat and dialogue**.
Based on the **GPT-2 Small (Italian adaptation)** architecture, it is designed to be fast, lightweight, and easily deployable on low-resource devices.

Compared to the “base” DACMini, **DACMini-IT** is trained on Italian conversational datasets structured in *user-assistant* format, optimizing its ability to follow instructions and handle natural multi-turn conversations.

---

## Size and technical specs

* **Parameters:** 109M
* **Architecture:** GPT-2 Small (Italian adaptation)
* **Max context length:** 512 tokens
* **Number of layers:** 12
* **Number of attention heads:** 12
* **Embedding size:** 768
* **Vocabulary:** ~50,000 tokens
* **Quantization:** supported (optional 8-bit / 4-bit via `bitsandbytes`); a loading sketch follows this list.

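As a rough illustration of that last bullet, the sketch below loads the model with 8-bit weights through `transformers` and `bitsandbytes`. The model ID comes from this card; the specific `BitsAndBytesConfig` settings are assumptions for demonstration, not a recipe validated for this repository.

```python
# Hedged sketch: loading DACMini-IT with optional bitsandbytes quantization.
# Requires `transformers`, `accelerate`, `bitsandbytes`, and a CUDA-capable GPU.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

model_path = "Mattimax/DACMini-IT"

# 8-bit loading shown here; switch to load_in_4bit=True for 4-bit weights.
bnb_config = BitsAndBytesConfig(
    load_in_8bit=True,
    # load_in_4bit=True,
    # bnb_4bit_compute_dtype=torch.float16,
)

tokenizer = AutoTokenizer.from_pretrained(model_path)
model = AutoModelForCausalLM.from_pretrained(
    model_path,
    quantization_config=bnb_config,
    device_map="auto",  # place the (small) model on the available device
)
model.eval()
```

At 109M parameters the full-precision weights already fit in a few hundred megabytes, so quantization mainly matters on very constrained devices.
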
---

## Training dataset

Trained on [**Mattimax/DATA-AI_Conversation_ITA**](https://huggingface.co/datasets/Mattimax/DATA-AI_Conversation_ITA), an Italian instruction-tuned conversational dataset containing structured *prompt-response* pairs designed to promote coherent, natural, and grammatically correct answers.

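To get a feel for that data, a minimal sketch along these lines could work with the `datasets` library. The split name and the `user` / `assistant` field names are assumptions to check against the dataset card; the prompt markers match the inference example further down.

```python
# Hedged sketch: inspecting the training dataset and mapping one record into the
# <|user|> ... <|assistant|> format used elsewhere on this card.
from datasets import load_dataset

# The "train" split and field names below are assumptions; inspect the printed
# example to see the actual schema before relying on them.
ds = load_dataset("Mattimax/DATA-AI_Conversation_ITA", split="train")
example = ds[0]
print(example)

def to_prompt(record, user_key="user", assistant_key="assistant"):
    # Hypothetical field names; adjust them to the real column names.
    return f"<|user|> {record[user_key].strip()} <|assistant|> {record[assistant_key].strip()}"

print(to_prompt(example))
```
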
---

## Objectives

* Provide an Italian-language chatbot with instruction-following capabilities.
* Produce concise, clear, and natural responses in multi-turn contexts.
* Support lightweight or offline applications where model size is a constraint.

---

## Warnings and limitations

* **Experimental** model: may produce logical errors or irrelevant answers.
* Not trained on sensitive topics or specialized content.
* Limited performance on very long conversations or complex prompts.
* Not intended for commercial use without further validation.

---

## Recommended use

* Lightweight or offline Italian chatbot applications.
* Prototyping and testing of Italian NLP pipelines.
* Synthetic response generation and datasets for training or evaluation (see the sketch after this list).

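For the last use case, here is a small, hedged sketch of batch generation with the `transformers` pipeline. The Italian prompts and the sampling settings are illustrative only, and the output file name is arbitrary.

```python
# Hedged sketch: generating a tiny synthetic response set with DACMini-IT
# and writing it to JSONL. Prompts and generation settings are examples only.
import json
from transformers import pipeline

generator = pipeline("text-generation", model="Mattimax/DACMini-IT")

prompts = [
    "Qual è la capitale d'Italia?",
    "Spiegami in una frase cos'è un modello linguistico.",
]

with open("synthetic_responses.jsonl", "w", encoding="utf-8") as f:
    for prompt in prompts:
        formatted = f"<|user|> {prompt} <|assistant|>"
        out = generator(
            formatted,
            max_new_tokens=100,
            do_sample=True,
            temperature=0.7,
            return_full_text=False,  # keep only the generated continuation
        )
        record = {"prompt": prompt, "response": out[0]["generated_text"].strip()}
        f.write(json.dumps(record, ensure_ascii=False) + "\n")
```
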
---

## Example inference code

```python
from transformers import AutoTokenizer, AutoModelForCausalLM
import torch

# 1. Load trained model and tokenizer
model_path = "Mattimax/DACMini-IT"
tokenizer = AutoTokenizer.from_pretrained(model_path)
model = AutoModelForCausalLM.from_pretrained(model_path)
model.eval()

# 2. Generation function
def chat_inference(prompt, max_new_tokens=150, temperature=0.7, top_p=0.9):
    # Build input in the format used during training
    formatted_prompt = f"<|user|> {prompt.strip()} <|assistant|>"

    # Tokenize
    inputs = tokenizer(formatted_prompt, return_tensors="pt")

    # Generate response
    with torch.no_grad():
        output = model.generate(
            **inputs,
            max_new_tokens=max_new_tokens,
            temperature=temperature,
            top_p=top_p,
            do_sample=True,
            pad_token_id=tokenizer.pad_token_id or tokenizer.eos_token_id,
        )

    # Decode only the newly generated tokens so the prompt is never echoed back,
    # whether or not the <|user|>/<|assistant|> markers are registered as special tokens
    prompt_length = inputs["input_ids"].shape[1]
    response = tokenizer.decode(output[0][prompt_length:], skip_special_tokens=True).strip()
    return response

# 3. Usage example
if __name__ == "__main__":
    while True:
        user_input = input("👤 User: ")
        if user_input.lower() in ["exit", "quit"]:
            break
        response = chat_inference(user_input)
        print(f"🤖 Assistant: {response}\n")
```

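The function above treats every message as a fresh, single-turn prompt. Since the card targets multi-turn dialogue, here is a hedged extension that reuses the `tokenizer` and `model` loaded above, keeps the conversation history in the same `<|user|>` / `<|assistant|>` format, and trims old turns to stay inside the 512-token context. How turns were actually concatenated during training is an assumption, so adjust the formatting if quality drops.

```python
# Hedged sketch: simple multi-turn chat on top of the loading code above.
# Assumes turns can be concatenated with <|user|>/<|assistant|> markers;
# the real training-time concatenation scheme may differ.
MAX_CONTEXT_TOKENS = 512

def chat_with_history(history, user_message, max_new_tokens=150):
    """history is a list of (user, assistant) pairs from earlier turns."""
    turns = [f"<|user|> {u.strip()} <|assistant|> {a.strip()}" for u, a in history]
    turns.append(f"<|user|> {user_message.strip()} <|assistant|>")
    prompt = " ".join(turns)

    # Drop the oldest turns until prompt + new tokens fit the context window
    while len(turns) > 1 and len(tokenizer(prompt)["input_ids"]) > MAX_CONTEXT_TOKENS - max_new_tokens:
        turns.pop(0)
        prompt = " ".join(turns)

    inputs = tokenizer(prompt, return_tensors="pt")
    with torch.no_grad():
        output = model.generate(
            **inputs,
            max_new_tokens=max_new_tokens,
            do_sample=True,
            temperature=0.7,
            top_p=0.9,
            pad_token_id=tokenizer.pad_token_id or tokenizer.eos_token_id,
        )
    reply = tokenizer.decode(output[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True).strip()
    history.append((user_message, reply))
    return reply
```
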
---

## References

* Dataset: [Mattimax/DATA-AI_Conversation_ITA](https://huggingface.co/datasets/Mattimax/DATA-AI_Conversation_ITA)
* Base model: [DACMini](https://huggingface.co/Mattimax/DACMini)
* Organization: [M.INC](https://huggingface.co/MINC01)
* Collection: [Little_DAC Collection](https://huggingface.co/collections/Mattimax/little-dac-collection-68e11d19a5949d08e672b312)

---

## Citation

If you use **Mattimax/DACMini-IT** in a project, paper, or any work, please cite it using the `CITATION.bib` file included in the repository:

```bibtex
@misc{mattimax2025dacminiit,
  title = {{Mattimax/DACMini-IT}: An open-source language model},
  author = {Mattimax},
  howpublished = {\url{https://huggingface.co/Mattimax/DACMini-IT}},
  year = {2025},
  note = {License: MIT. If you use this model, please cite the original source.}
}
```