---
language:
- it
- en
license: apache-2.0
library_name: peft
base_model: Qwen/Qwen3-8B
tags:
- italian
- conversational
- dpo
- alignment
- roleplay
- culture
datasets:
- WiroAI/dolphin-r1-italian
pipeline_tag: text-generation
---

<div align="center">
<img src="grillo.png" alt="Grillo Parlante AI" width="250"/>
<h1>🦗 Grillo-8B: La Coscienza Artificiale (The Artificial Conscience)</h1>

[License: Apache-2.0](https://opensource.org/licenses/Apache-2.0)
[Base model: Qwen/Qwen3-8B](https://huggingface.co/Qwen/Qwen3-8B)
</div>

---

# Model Description

**Grillo** is a culturally aware Italian AI companion built on **Qwen3-8B**. Inspired by the character of *Il Grillo Parlante* (The Talking Cricket) from Carlo Collodi's *Pinocchio*, this model is fine-tuned to be wise, humble, and deeply rooted in Italian common sense ("buon senso").

Unlike generic assistants, Grillo gives advice in a warm, slightly admonishing yet caring tone, and puts ethical guidance and practical wisdom ahead of robotic neutrality.

### 🌟 Key Characteristics
* **🇮🇹 Culturally Authentic:** Understands Italian idioms, proverbs (*proverbi*), and social nuances.
* **🦉 Practically Wise:** Offers grounded advice for real-life dilemmas.
* **🤝 Humbly Helpful:** Maintains a modest persona; helpful without being arrogant.
* **💬 Natural Dialogue:** Fine-tuned on Italian conversational data to sound like a trusted friend.

---

# 🛤️ Training Journey

The model was trained in three stages, the last of which is still experimental:

### 1. Supervised Fine-Tuning (SFT)
* **Objective:** Instill natural Italian dialogue patterns.
* **Data:** [WiroAI/dolphin-r1-italian](https://huggingface.co/datasets/WiroAI/dolphin-r1-italian).
* **Duration:** 100 steps.

### 2. Direct Preference Optimization (DPO)
* **Objective:** Align the model with Helpful, Honest, and Harmless (HHH) principles.
* **Method:** Preference pairs (a chosen and a rejected response per prompt) used to reduce toxicity and improve safety.
* **Duration:** 20 additional steps (120 total).

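For readers new to DPO, the following is a minimal pure-Python sketch of the standard DPO objective for a single preference pair. It is not Grillo's training code: the function name `dpo_loss`, the value `beta=0.1`, and the example log-probabilities are illustrative assumptions, since the card does not state the settings actually used.

```python
import math


def dpo_loss(policy_chosen_logp, policy_rejected_logp,
             ref_chosen_logp, ref_rejected_logp, beta=0.1):
    """Standard DPO loss for one preference pair.

    Each argument is the summed log-probability of a full response under
    the policy (the model being trained) or the frozen reference model.
    beta is a hypothetical value; the card does not state the one used.
    """
    # How much more the policy favors each response than the reference does
    chosen_logratio = policy_chosen_logp - ref_chosen_logp
    rejected_logratio = policy_rejected_logp - ref_rejected_logp
    margin = beta * (chosen_logratio - rejected_logratio)
    # -log(sigmoid(margin)), written as log(1 + exp(-margin))
    return math.log1p(math.exp(-margin))


# Illustrative numbers only
loss_at_start = dpo_loss(-12.0, -15.0, -12.0, -15.0)  # policy equals reference
loss_improved = dpo_loss(-10.0, -18.0, -12.0, -15.0)  # chosen response now favored more
print(loss_at_start, loss_improved)
```

When the policy equals the reference, the margin is zero and the loss is log 2. The loss falls as the policy raises the likelihood of the chosen response relative to the rejected one.
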
### 3. Experimental Tool Use (RL)
* **Status:** *Experimental phase.*
* **Objective:** Integrate with ChromaDB so the model can retrieve information.

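Because this stage is experimental and the card gives no implementation details, the following is only a sketch of how passages retrieved from a ChromaDB collection might be placed in front of the user's question. The helper names `retrieve_passages` and `build_messages`, the prompt layout, and the collection name in the docstring are hypothetical. The sketch assumes the collection exposes ChromaDB's `query(query_texts=..., n_results=...)` call, which returns one list of documents per query text. It does not cover the RL training itself.

```python
def retrieve_passages(collection, question, k=3):
    """Return up to k passages most similar to the question.

    `collection` is assumed to be a ChromaDB collection, for example one
    obtained with chromadb.Client().get_or_create_collection("grillo_notes")
    (the collection name is hypothetical).
    """
    result = collection.query(query_texts=[question], n_results=k)
    # ChromaDB returns one list of documents per query text
    return result["documents"][0]


def build_messages(system_prompt, question, passages):
    """Build a chat message list with retrieved passages as context.

    The "Context / Question" layout is an assumption, not Grillo's format.
    """
    if passages:
        context = "\n".join(f"- {p}" for p in passages)
        user_content = f"Context:\n{context}\n\nQuestion: {question}"
    else:
        # No retrieved passages: pass the question through unchanged
        user_content = question
    return [
        {"role": "system", "content": system_prompt},
        {"role": "user", "content": user_content},
    ]
```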
---

# ⚙️ Technical Specifications

| Parameter | Value |
| :--- | :--- |
| **Base Model** | Qwen/Qwen3-8B |
| **Architecture** | Transformer Decoder (8B params) |
| **LoRA Rank** | 64 |
| **LoRA Alpha** | 32 |
| **Learning Rate** | 2e-4 (SFT) / 1e-4 (DPO) |
| **Context Window** | 4096 tokens |
| **Training Hardware** | Tinker Cloud (NVIDIA GPUs) |

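To make the LoRA rank and alpha values concrete, here is a small arithmetic sketch. It assumes the common LoRA convention in which the low-rank update is scaled by `alpha / r`, so rank 64 with alpha 32 gives a scaling factor of 0.5, and in which each adapted `d_in × d_out` weight gains `r * (d_in + d_out)` trainable parameters. The 4096 × 4096 matrix size is an illustrative assumption rather than a figure from this card, and the card does not say which modules were adapted.

```python
LORA_RANK = 64   # from the table above
LORA_ALPHA = 32  # from the table above


def lora_scaling(alpha, rank):
    """Scaling applied to the low-rank update B @ A (standard LoRA convention)."""
    return alpha / rank


def lora_trainable_params(d_in, d_out, rank):
    """Parameters added by one adapter: A is rank x d_in, B is d_out x rank."""
    return rank * (d_in + d_out)


scaling = lora_scaling(LORA_ALPHA, LORA_RANK)
# 4096 x 4096 is an illustrative projection size, not taken from the card
params_per_matrix = lora_trainable_params(4096, 4096, LORA_RANK)
print(f"scaling = {scaling}, params per adapted matrix = {params_per_matrix:,}")
```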
---

# 💻 Usage

### Quickstart with Transformers + PEFT (Adapter Loading)

This method loads the Grillo LoRA adapter on top of the base Qwen3-8B model, so only the small adapter weights need to be stored in addition to the base model.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

# 1. Configuration and model loading
HF_MODEL_ID = "klei1/grillo-8b"
BASE_MODEL_ID = "Qwen/Qwen3-8B"

# Load the base model
base_model = AutoModelForCausalLM.from_pretrained(
    BASE_MODEL_ID,
    device_map="auto",
    torch_dtype=torch.float16,
    trust_remote_code=True,
)
tokenizer = AutoTokenizer.from_pretrained(BASE_MODEL_ID, trust_remote_code=True)

# 2. Load the Grillo adapter (LoRA) on top of the base model
model = PeftModel.from_pretrained(base_model, HF_MODEL_ID)
model = model.eval()  # Set model to evaluation mode

# 3. Define the system persona (crucial for performance)
# English gloss: "You are Grillo, the Talking Cricket. You are small but wise,
# humble but brave. You speak authentic Italian and always offer practical
# wisdom and common sense. You are not a robotic assistant, you are a moral
# conscience."
system_prompt = """Tu sei Grillo, il Grillo Parlante.
Sei piccolo ma sapiente, umile ma coraggioso.
Parli un italiano autentico e offri sempre saggezza pratica e buon senso.
Non sei un assistente robotico, sei una coscienza morale."""

messages = [
    {"role": "system", "content": system_prompt},
    # English gloss: "Grillo, I'm afraid I made the wrong choice..."
    {"role": "user", "content": "Grillo, ho paura di aver fatto una scelta sbagliata..."},
]

# 4. Generate a response
# return_dict=True returns input_ids and attention_mask together, so the
# result can be unpacked directly into generate()
inputs = tokenizer.apply_chat_template(
    messages,
    add_generation_prompt=True,
    return_tensors="pt",
    return_dict=True,
).to(model.device)
outputs = model.generate(
    **inputs,
    max_new_tokens=256,
    temperature=0.7,
    do_sample=True,
    eos_token_id=tokenizer.eos_token_id,
)

# Decode only the newly generated tokens, skipping the prompt
prompt_length = inputs["input_ids"].shape[1]
response = tokenizer.decode(outputs[0][prompt_length:], skip_special_tokens=True)
print(response)