---
language: en
license: apache-2.0
library_name: transformers
tags:
- commonsense-reasoning
- winogrande
- fine-tuned
- llama
- reasoning
datasets:
- allenai/winogrande
metrics:
- accuracy
- loss
base_model:
- PleIAs/Monad
---

## Model Details

### Model Description

This model is PleIAs/Monad fine-tuned on the WinoGrande dataset, which tests the ability to resolve pronouns and make logical inferences in everyday scenarios.

### Model Sources

- **Base Model:** https://huggingface.co/PleIAs/Monad

### Training Data

Dataset: WinoGrande (allenai/winogrande)

- **Size:** 9,248 training examples, 1,267 validation examples
- **Task:** Commonsense reasoning with pronoun resolution
- **Format:** Multiple-choice questions requiring logical reasoning

A loading sketch for this split appears under "Loading the Training Data" below.

## Training Hyperparameters

| Epochs | Batch Size | Learning Rate | Warmup Ratio | Warmup Steps | Weight Decay | Max Gradient Norm | Evaluation Steps | Save Steps | Early Stopping Patience |
|--------|------------|---------------|--------------|--------------|--------------|-------------------|------------------|------------|-------------------------|
| 5      | 16         | 1e-05         | 0.05         | 144          | 0.01         | 1.0               | 150              | 150        | 7                       |

A sketch mapping these values onto `transformers.TrainingArguments` appears under "Training Configuration (Sketch)" below.

## Training Results

| Metric              | Value                   |
|---------------------|-------------------------|
| Final Training Loss | 0.9143                  |
| Training Time       | 1,526.9 s (≈25 minutes) |

## Validation Performance

Validation loss stabilized between 0.83 and 0.86 throughout training.

#### Summary

Training converged smoothly:

- **Final training loss:** 0.9143
- **Evaluation loss:** ~0.834 (final checkpoint)
- **Training completed:** all 5 epochs, with early stopping monitored throughout

### Compute Infrastructure

#### Hardware

- **GPU:** single NVIDIA A10G (24 GB VRAM)
- **Platform:** Modal.com
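
## Loading the Training Data

The card quotes a 9,248 / 1,267 train/validation split, which matches the `winogrande_debiased` configuration of `allenai/winogrande`; the config name is inferred from those sizes, not stated by the authors. A minimal loading sketch under that assumption:

```python
from datasets import load_dataset

# "winogrande_debiased" is inferred from the 9,248 / 1,267 split sizes
# quoted above; adjust the config name if a different subset was used.
# Older versions of `datasets` may also need trust_remote_code=True.
dataset = load_dataset("allenai/winogrande", "winogrande_debiased")

example = dataset["train"][0]
# Each row is a fill-in-the-blank sentence with two candidate answers
# and a label ("1" or "2") marking the correct option.
print(example["sentence"])
print(example["option1"], "|", example["option2"], "->", example["answer"])
```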
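
## Training Configuration (Sketch)

The original training script is not published with this card; the sketch below shows how the hyperparameter table plausibly maps onto `transformers.TrainingArguments`. The output directory, strategy flags, and best-model settings are assumptions, not the authors' recipe.

```python
from transformers import EarlyStoppingCallback, TrainingArguments

args = TrainingArguments(
    output_dir="monad-winogrande",    # hypothetical path, not the original
    num_train_epochs=5,
    per_device_train_batch_size=16,
    learning_rate=1e-5,
    warmup_ratio=0.05,
    warmup_steps=144,                 # when > 0, this overrides warmup_ratio
    weight_decay=0.01,
    max_grad_norm=1.0,
    eval_strategy="steps",            # "evaluation_strategy" on older transformers
    eval_steps=150,
    save_strategy="steps",
    save_steps=150,
    load_best_model_at_end=True,      # required by EarlyStoppingCallback
    metric_for_best_model="eval_loss",
)

# Patience of 7 evaluations, matching the hyperparameter table.
early_stop = EarlyStoppingCallback(early_stopping_patience=7)
```

Passing `early_stop` via `Trainer(..., callbacks=[early_stop])`, together with the tokenized dataset and a data collator, completes the setup; those pieces depend on preprocessing choices the card does not specify, so they are omitted here.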
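
## How to Use

WinoGrande items are fill-in-the-blank sentences with two candidate answers, so one simple way to query the model is to score both completions and keep the likelier one. The repo id below is a placeholder for wherever this checkpoint is hosted, and the likelihood-ranking approach is an illustration, not the authors' evaluation protocol.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Placeholder repo id: replace with the actual location of this checkpoint.
repo_id = "your-username/monad-winogrande"
tokenizer = AutoTokenizer.from_pretrained(repo_id)
model = AutoModelForCausalLM.from_pretrained(repo_id)
model.eval()

def log_likelihood(text: str) -> float:
    """Total log-likelihood the model assigns to `text`."""
    ids = tokenizer(text, return_tensors="pt").input_ids
    with torch.no_grad():
        loss = model(ids, labels=ids).loss  # mean NLL per predicted token
    return -loss.item() * (ids.size(1) - 1)

# Classic Winograd-style item: fill the blank with each option and keep
# the completion the model finds more likely.
sentence = "The trophy doesn't fit in the suitcase because the _ is too small."
options = ["trophy", "suitcase"]
scores = [log_likelihood(sentence.replace("_", o)) for o in options]
print("Predicted:", options[scores.index(max(scores))])  # expected: suitcase
```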