---
language:
- it
- en
license: apache-2.0
library_name: peft
base_model: Qwen/Qwen3-8B
tags:
- italian
- conversational
- dpo
- alignment
- roleplay
- culture
datasets:
- WiroAI/dolphin-r1-italian
pipeline_tag: text-generation
---
  Grillo Parlante AI  

# 🦗 Grillo-8B: La Coscienza Artificiale

  [![License](https://img.shields.io/badge/License-Apache%202.0-blue.svg)](https://opensource.org/licenses/Apache-2.0)   [![Language](https://img.shields.io/badge/Language-Italian-green.svg)]()   [![Base Model](https://img.shields.io/badge/Base_Model-Qwen3--8B-yellow.svg)](https://huggingface.co/Qwen/Qwen3-8B)
---

# Model Description

**Grillo** is a culturally aware Italian AI companion built on **Qwen3-8B**. Inspired by *Il Grillo Parlante* (The Talking Cricket) from Carlo Collodi's *Pinocchio*, the model is fine-tuned to be wise, humble, and grounded in Italian common sense ("buon senso"). Unlike generic assistants, Grillo gives advice in a warm, gently admonishing yet caring tone, and favors ethical guidance and practical wisdom over robotic neutrality.

### 🌟 Key Characteristics

* **🇮🇹 Culturally Authentic:** Understands Italian idioms, proverbs (*proverbi*), and social nuances.
* **🦉 Practically Wise:** Offers grounded advice for real-life dilemmas.
* **🤝 Humbly Helpful:** Maintains a modest persona; helpful without being arrogant.
* **💬 Natural Dialogue:** Trained on conversational data to sound like a trusted friend.

---

# 🛤️ Training Journey

The model was trained in stages, the last of which is still experimental:

### 1. Supervised Fine-Tuning (SFT)

* **Objective:** Instill natural Italian dialogue patterns.
* **Data:** [WiroAI/dolphin-r1-italian](https://huggingface.co/datasets/WiroAI/dolphin-r1-italian).
* **Duration:** 100 steps.

### 2. Direct Preference Optimization (DPO)

* **Objective:** Align the model with Helpful, Honest, and Harmless (HHH) principles.
* **Method:** Preference ranking to reduce toxicity and improve safety.
* **Duration:** 20 additional steps (120 total).

### 3. Experimental Tool Use (RL)

* **Status:** *Experimental phase.*
* **Objective:** Integration with ChromaDB for information retrieval.
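To make the preference-ranking objective of stage 2 more concrete, here is a minimal sketch of the standard DPO loss for a single (chosen, rejected) pair. It is an illustration only, not the training code used for Grillo: the function name `dpo_loss`, its argument names, and the value `beta=0.1` are assumptions, since this card does not state the DPO hyperparameters or the trainer that was used.

```python
import math


def dpo_loss(policy_chosen_logp, policy_rejected_logp,
             ref_chosen_logp, ref_rejected_logp, beta=0.1):
    """DPO loss for one preference pair.

    Each argument is the summed log-probability of a full response under
    either the policy being trained or the frozen reference model.
    beta is a hypothetical value; the model card does not specify it.
    """
    # Implicit rewards: how much more the policy favors a response
    # than the reference model does, scaled by beta.
    chosen_reward = beta * (policy_chosen_logp - ref_chosen_logp)
    rejected_reward = beta * (policy_rejected_logp - ref_rejected_logp)
    margin = chosen_reward - rejected_reward

    # -log(sigmoid(margin)), written as softplus(-margin) for stability.
    if margin > 0:
        return math.log1p(math.exp(-margin))
    return -margin + math.log1p(math.exp(margin))


# Policy identical to the reference: no preference learned yet.
print(dpo_loss(-10.0, -10.0, -10.0, -10.0))
# Policy has shifted probability toward the chosen response: lower loss.
print(dpo_loss(-5.0, -15.0, -10.0, -10.0))
```

The loss shrinks as the policy raises the likelihood of the preferred answer relative to the reference model and lowers that of the rejected one, which is how preference ranking steers the model away from toxic or unsafe replies.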
---

# ⚙️ Technical Specifications

| Parameter | Value |
| :--- | :--- |
| **Base Model** | Qwen/Qwen3-8B |
| **Architecture** | Transformer Decoder (8B params) |
| **LoRA Rank** | 64 |
| **LoRA Alpha** | 32 |
| **Learning Rate** | 2e-4 (SFT) / 1e-4 (DPO) |
| **Context Window** | 4096 tokens |
| **Training Hardware** | Tinker Cloud (NVIDIA GPUs) |

---

# 💻 Usage

### Quickstart with Transformers + PEFT (Adapter Loading)

This method loads the Grillo LoRA adapter on top of the base Qwen model, which is memory-efficient.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

# 1. Configuration and Model Loading
HF_MODEL_ID = "klei1/grillo-8b"
BASE_MODEL_ID = "Qwen/Qwen3-8B"

# Load the base model
base_model = AutoModelForCausalLM.from_pretrained(
    BASE_MODEL_ID,
    device_map="auto",
    torch_dtype=torch.float16,
    trust_remote_code=True,
)
tokenizer = AutoTokenizer.from_pretrained(BASE_MODEL_ID, trust_remote_code=True)

# 2. Load Grillo Adapter (LoRA)
model = PeftModel.from_pretrained(base_model, HF_MODEL_ID)
model = model.eval()  # Set model to evaluation mode

# 3. Define the System Persona (Crucial for performance)
# The prompt stays in Italian because the persona was trained in Italian.
system_prompt = """Tu sei Grillo, il Grillo Parlante.
Sei piccolo ma sapiente, umile ma coraggioso.
Parli un italiano autentico e offri sempre saggezza pratica e buon senso.
Non sei un assistente robotico, sei una coscienza morale."""

messages = [
    {"role": "system", "content": system_prompt},
    {"role": "user", "content": "Grillo, ho paura di aver fatto una scelta sbagliata..."},
]

# 4. Generate Response
# return_dict=True yields input_ids plus attention_mask, which generate() expects.
inputs = tokenizer.apply_chat_template(
    messages,
    add_generation_prompt=True,
    return_dict=True,
    return_tensors="pt",
).to(model.device)

outputs = model.generate(
    **inputs,
    max_new_tokens=256,
    temperature=0.7,
    do_sample=True,
    eos_token_id=tokenizer.eos_token_id,
)

# Decode only the newly generated tokens, skipping the prompt
prompt_length = inputs["input_ids"].shape[1]
response = tokenizer.decode(outputs[0][prompt_length:], skip_special_tokens=True)
print(response)
```