Stefano-M-Community
/

aixpa_w_ground

@@ -10,27 +10,21 @@ tags:
 - trl
 ---
-# Model Card for Model ID
-<!-- Provide a quick summary of what the model is/does. -->
 ## Model Details
 ### Model Description
-<!-- Provide a longer summary of what this model is. -->
-- **Developed by:** [More Information Needed]
-- **Funded by [optional]:** [More Information Needed]
-- **Shared by [optional]:** [More Information Needed]
-- **Model type:** [More Information Needed]
-- **Language(s) (NLP):** [More Information Needed]
-- **License:** [More Information Needed]
-- **Finetuned from model [optional]:** [More Information Needed]
 ### Model Sources [optional]
@@ -42,168 +36,225 @@ tags:
 ## Uses
-<!-- Address questions around how the model is intended to be used, including the foreseeable users of the model and those affected by the model. -->
 ### Direct Use
-<!-- This section is for the model use without fine-tuning or plugging into a larger ecosystem/app. -->
-[More Information Needed]
-### Downstream Use [optional]
-<!-- This section is for the model use when fine-tuned for a task, or when plugged into a larger ecosystem/app -->
-[More Information Needed]
 ### Out-of-Scope Use
-<!-- This section addresses misuse, malicious use, and uses that the model will not work well for. -->
-[More Information Needed]
 ## Bias, Risks, and Limitations
-<!-- This section is meant to convey both technical and sociotechnical limitations. -->
-[More Information Needed]
 ### Recommendations
-<!-- This section is meant to convey recommendations with respect to the bias, risk, and technical limitations. -->
-Users (both direct and downstream) should be made aware of the risks, biases and limitations of the model. More information needed for further recommendations.
 ## How to Get Started with the Model
-Use the code below to get started with the model.
-[More Information Needed]
 ## Training Details
 ### Training Data
-<!-- This should link to a Dataset Card, perhaps with a short stub of information on what the training data is all about as well as documentation related to data pre-processing or additional filtering. -->
-[More Information Needed]
 ### Training Procedure
-<!-- This relates heavily to the Technical Specifications. Content here should link to that section when it is relevant to the training procedure. -->
-#### Preprocessing [optional]
-[More Information Needed]
 #### Training Hyperparameters
-- **Training regime:** [More Information Needed] <!--fp32, fp16 mixed precision, bf16 mixed precision, bf16 non-mixed precision, fp16 non-mixed precision, fp8 mixed precision -->
-#### Speeds, Sizes, Times [optional]
-<!-- This section provides information about throughput, start/end time, checkpoint size if relevant, etc. -->
-[More Information Needed]
 ## Evaluation
-<!-- This section describes the evaluation protocols and provides the results. -->
 ### Testing Data, Factors & Metrics
 #### Testing Data
-<!-- This should link to a Dataset Card if possible. -->
-[More Information Needed]
 #### Factors
-<!-- These are the things the evaluation is disaggregating by, e.g., subpopulations or domains. -->
-[More Information Needed]
 #### Metrics
-<!-- These are the evaluation metrics being used, ideally with a description of why. -->
-[More Information Needed]
 ### Results
-[More Information Needed]
 #### Summary
-## Model Examination [optional]
-<!-- Relevant interpretability work for the model goes here -->
-[More Information Needed]
-## Environmental Impact
-<!-- Total emissions (in grams of CO2eq) and additional considerations, such as electricity usage, go here. Edit the suggested text below accordingly -->
 Carbon emissions can be estimated using the [Machine Learning Impact calculator](https://mlco2.github.io/impact#compute) presented in [Lacoste et al. (2019)](https://arxiv.org/abs/1910.09700).
-- **Hardware Type:** [More Information Needed]
-- **Hours used:** [More Information Needed]
-- **Cloud Provider:** [More Information Needed]
-- **Compute Region:** [More Information Needed]
-- **Carbon Emitted:** [More Information Needed]
-## Technical Specifications [optional]
 ### Model Architecture and Objective
-[More Information Needed]
 ### Compute Infrastructure
-[More Information Needed]
 #### Hardware
-[More Information Needed]
 #### Software
-[More Information Needed]
-## Citation [optional]
-<!-- If there is a paper or blog post introducing the model, the APA and Bibtex information for that should go in this section. -->
 **BibTeX:**
-[More Information Needed]
 **APA:**
-[More Information Needed]
-## Glossary [optional]
-<!-- If relevant, include terms and calculations in this section that can help readers understand the model or model card. -->
-[More Information Needed]
-## More Information [optional]
-[More Information Needed]
-## Model Card Authors [optional]
-[More Information Needed]
 ## Model Card Contact
-[More Information Needed]
 ### Framework versions
 - PEFT 0.17.1

 - trl
 ---
+# AiXPA Fine-tuned Llama 3.1 8B Model (With Ground Document)
+This model is a fine-tuned version of Meta-Llama-3.1-8B-Instruct, specialized for the AiXPA project in the domain of Italian Public Administration (PA). It was trained using supervised fine-tuning (SFT) with LoRA (Low-Rank Adaptation) techniques on a dialogue dataset between an assistant and a PA user, with reference documents as context.
 ## Model Details
 ### Model Description
+This model is based on Meta-Llama-3.1-8B-Instruct and has been fine-tuned using the Stefano-M-Community/final_all dataset for Italian Public Administration dialogue tasks. The model uses 4-bit quantization and LoRA adapters for efficient training and inference, making it suitable for deployment on consumer hardware while maintaining strong performance in PA-specific conversations with reference documents as context.
+- **Developed by:** LanD (FBK)
+- **Model type:** Causal Language Model (Fine-tuned)
+- **Language(s) (NLP):** Italian (primarily)
+- **License:** Please refer to the original Llama 3.1 license
+- **Finetuned from model:** meta-llama/Meta-Llama-3.1-8B-Instruct
 ### Model Sources [optional]
 ## Uses
 ### Direct Use
+This model can be used directly for text generation tasks, particularly those related to the domain it was fine-tuned on. The model maintains the instruction-following capabilities of the base Llama 3.1 model while being specialized for specific use cases defined in the training dataset.
+### Downstream Use
+The model can be further fine-tuned for specific tasks or integrated into larger applications that require text generation capabilities. The LoRA adapters make it easy to switch between different specialized versions.
 ### Out-of-Scope Use
+This model should not be used for generating harmful, misleading, or inappropriate content. It may not perform well on tasks significantly different from its training domain without additional fine-tuning.
 ## Bias, Risks, and Limitations
+This model inherits the biases and limitations present in the base Llama 3.1 model and may have additional biases introduced through the fine-tuning dataset. Key considerations include:
+- **Domain Specificity:** The model has been fine-tuned on a specific dataset and may not generalize well to domains outside its training scope
+- **Quantization Effects:** 4-bit quantization may introduce minor degradation in model performance compared to full precision
+- **Context Limitations:** Maximum context length of 4,200 tokens may limit performance on very long documents
+- **Language Bias:** Primarily trained on Italian content, may have limited performance in other languages
 ### Recommendations
+- Thoroughly evaluate the model on your specific use case before deployment
+- Consider the potential for biased outputs and implement appropriate safeguards
+- Monitor model performance and outputs in production environments
+- Be aware of the model's training domain when applying to new tasks
+- Consider additional fine-tuning for specialized applications outside the training domain
 ## How to Get Started with the Model
+Use the code below to get started with the model:
+```python
+from transformers import AutoTokenizer, AutoModelForCausalLM
+from peft import PeftModel
+import torch
+# Load the base model and tokenizer
+base_model_id = "meta-llama/Meta-Llama-3.1-8B-Instruct"
+tokenizer = AutoTokenizer.from_pretrained(base_model_id)
+base_model = AutoModelForCausalLM.from_pretrained(
+    base_model_id,
+    torch_dtype=torch.float16,
+    device_map="auto"
+)
+# Load the LoRA adapter
+model = PeftModel.from_pretrained(base_model, "path/to/your/lora/adapter")
+# Generate text
+prompt = "Your prompt here"
+inputs = tokenizer(prompt, return_tensors="pt")
+with torch.no_grad():
+    outputs = model.generate(**inputs, max_new_tokens=100, do_sample=True, temperature=0.7)
+response = tokenizer.decode(outputs[0], skip_special_tokens=True)
+print(response)
+```
 ## Training Details
 ### Training Data
+The model was fine-tuned on the `Stefano-M-Community/final_all` dataset from Hugging Face, which contains Italian Public Administration dialogue data between an assistant and PA users. This dataset was used for both training and evaluation.
 ### Training Procedure
+The model was trained using supervised fine-tuning (SFT) with LoRA (Low-Rank Adaptation) techniques. The training utilized 4-bit quantization for memory efficiency and multi-GPU training with 4 processes.
 #### Training Hyperparameters
+- **Training regime:** Mixed precision training with 4-bit quantization
+- **LoRA Configuration:**
+  - Rank: 16
+  - Alpha: 32
+  - Dropout: 0.0
+- **Sequence Length:** 4,200 tokens
+- **Learning Rate:** 5e-5
+- **Scheduler:** Cosine annealing
+- **Batch Size:** 4 (training), 1 (evaluation)
+- **Gradient Accumulation Steps:** 2
+- **Number of Epochs:** 10
+- **Weight Decay:** 0.01
+- **Warmup Ratio:** 0.03
+- **Early Stopping Patience:** 5 epochs
+#### Training Infrastructure
+- **Hardware:** Multi-GPU setup (4 processes)
+- **Framework:**
+  - Accelerate for distributed training
+  - DeepSpeed for optimization
+  - PEFT for LoRA implementation
+- **Logging:** Weights & Biases (WandB)
+- **Evaluation Frequency:** Every 35 steps
+- **Checkpoint Saving:** Every 35 steps
 ## Evaluation
 ### Testing Data, Factors & Metrics
 #### Testing Data
+The model was evaluated using the same dataset used for training: `Stefano-M-Community/final_all`. Evaluation was performed every 35 training steps to monitor training progress and prevent overfitting.
 #### Factors
+- **Training Progress:** Monitored throughout training with early stopping patience of 5 epochs
+- **Loss Metrics:** Custom loss function implementation for supervised fine-tuning
+- **Computational Efficiency:** Evaluated performance with 4-bit quantization
 #### Metrics
+- **Training Loss:** Monitored during training with logging every 10 steps
+- **Evaluation Loss:** Computed every 35 steps on the evaluation dataset
+- **Early Stopping:** Implemented with patience of 5 epochs to prevent overfitting
 ### Results
+Evaluation results are logged in Weights & Biases during training. The model was trained for up to 10 epochs with early stopping mechanism to ensure optimal performance without overfitting.
+**Evaluation Loss Performance:**
+![Evaluation Loss Curve](eval_loss_with_ground.png)
+- The model (red line in eval/loss graph) shows a steep decrease from ~1.2 at step 35 to ~0.8 at step 160
+- Minimum loss achieved: approximately 0.8 around step 160
+- Final loss: approximately 0.89 at step 350
+- The model demonstrates good convergence with early stopping preventing overfitting
 #### Summary
+The fine-tuned model demonstrates improved performance on Italian Public Administration dialogue tasks while maintaining the general capabilities of the base Llama 3.1 model. The LoRA adaptation approach allows for efficient fine-tuning while preserving most of the original model's knowledge. This variant is specifically optimized for PA conversations with reference documents as context.
+## Model Examination
+The model uses LoRA (Low-Rank Adaptation) which allows for parameter-efficient fine-tuning. This approach:
+- Preserves the original model weights while adding small adapter modules
+- Enables efficient switching between different task-specific adaptations
+- Reduces memory requirements during training and inference
+- Maintains interpretability by keeping the base model architecture intact
+## Environmental Impact
 Carbon emissions can be estimated using the [Machine Learning Impact calculator](https://mlco2.github.io/impact#compute) presented in [Lacoste et al. (2019)](https://arxiv.org/abs/1910.09700).
+The environmental impact of this model is reduced compared to training from scratch due to:
+- **Efficient Training:** LoRA adaptation requires significantly less compute than full model training
+- **4-bit Quantization:** Reduces memory usage and energy consumption during training
+- **Hardware Type:** Multi-GPU setup (specific hardware configuration may vary)
+- **Training Approach:** Parameter-efficient fine-tuning reduces overall computational requirements
+*Note: Specific carbon emission calculations would require detailed hardware specifications and training duration measurements.*
+## Technical Specifications
 ### Model Architecture and Objective
+- **Base Architecture:** Llama 3.1 (8B parameters)
+- **Adaptation Method:** LoRA (Low-Rank Adaptation)
+- **Objective:** Supervised Fine-tuning for Italian Public Administration dialogue tasks with reference documents as context
+- **Quantization:** 4-bit quantization for efficient training and inference
+- **Maximum Context Length:** 4,200 tokens
 ### Compute Infrastructure
 #### Hardware
+- **Training Setup:** Multi-GPU configuration (4 processes)
+- **Memory Optimization:** 4-bit quantization with LoRA adapters
+- **Distributed Training:** Accelerate framework for multi-GPU coordination
 #### Software
+- **Framework:** PyTorch with Transformers library
+- **Training Libraries:**
+  - PEFT 0.17.1 (Parameter-Efficient Fine-Tuning)
+  - Accelerate (distributed training)
+  - DeepSpeed (optimization)
+  - TRL (Transformer Reinforcement Learning)
+- **Monitoring:** Weights & Biases (WandB)
+- **Configuration Management:** DeepSpeed configuration for memory optimization
+## Citation
 **BibTeX:**
+```bibtex
+@misc{aixpa_llama31_8b_lora,
+  title={AiXPA Fine-tuned Llama 3.1 8B Model (With Ground Document)},
+  author={LanD (FBK)},
+  year={2025},
+  howpublished={Hugging Face Model Repository},
+  note={Fine-tuned from meta-llama/Meta-Llama-3.1-8B-Instruct using LoRA, trained on Italian Public Administration dialogue data with reference documents}
+}
+```
 **APA:**
+AiXPA Team. (2025). *AiXPA Fine-tuned Llama 3.1 8B Model*. Hugging Face Model Repository. Fine-tuned from meta-llama/Meta-Llama-3.1-8B-Instruct using LoRA.
+## Glossary
+- **LoRA (Low-Rank Adaptation):** A parameter-efficient fine-tuning technique that adds trainable low-rank matrices to existing model weights
+- **SFT (Supervised Fine-Tuning):** Training method using labeled data to improve model performance on specific tasks
+- **4-bit Quantization:** Technique to reduce model memory usage by representing weights with 4-bit precision
+- **Multi-GPU Training:** Distributed training approach using multiple GPUs to accelerate training
+## Model Card Authors
+LanD (FBK)
 ## Model Card Contact
+For questions or issues regarding this model, please contact the AiXPA team through the appropriate channels.
 ### Framework versions
 - PEFT 0.17.1

eval_loss_with_ground.png ADDED Viewed