---
language: en
license: apache-2.0
library_name: transformers
tags:
- commonsense-reasoning
- winogrande
- fine-tuned
- llama
- reasoning
datasets:
- allenai/winogrande
metrics:
- accuracy
- loss
base_model:
- PleIAs/Monad
---
## Model Details
### Model Description
This model is a fine-tune of PleIAs/Monad on the WinoGrande dataset, which tests the ability to resolve pronouns and make logical inferences in everyday scenarios.
### Model Sources
- **Base Model:** https://huggingface.co/PleIAs/Monad
### Training Data
Dataset: WinoGrande (allenai/winogrande)
- Size: 9,248 training examples, 1,267 validation examples
- Task: Commonsense reasoning with pronoun resolution
- Format: Multiple choice questions requiring logical reasoning
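The exact prompt template used for this fine-tune is not documented, so the following is only an illustrative sketch of how a WinoGrande item (a sentence with a `_` placeholder plus two answer options, following the field names of the allenai/winogrande dataset) can be expanded into the two candidate sentences a model is asked to choose between:

```python
# Sketch: expand a WinoGrande-style item into its two candidate sentences.
# Field names follow allenai/winogrande; the actual training prompt format
# for this model is an assumption, not documented in this card.

def expand_options(example: dict) -> list[str]:
    """Fill the '_' placeholder with each answer option."""
    sentence = example["sentence"]
    return [
        sentence.replace("_", option)
        for option in (example["option1"], example["option2"])
    ]

# A typical WinoGrande-style item.
item = {
    "sentence": "The trophy doesn't fit in the suitcase because _ is too large.",
    "option1": "the trophy",
    "option2": "the suitcase",
    "answer": "1",  # "1" means option1 is the correct filler
}

candidates = expand_options(item)
gold = candidates[int(item["answer"]) - 1]
```

Scoring each candidate (e.g. by sequence log-likelihood) and picking the higher-scoring one is the standard way to evaluate causal language models on this task.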
## Training Hyperparameters
| Epochs | Batch Size | Learning Rate | Warmup Ratio | Warmup Steps | Weight Decay | Max Gradient Norm | Evaluation Steps | Save Steps | Early Stopping Patience |
|--------|------------|---------------|--------------|--------------|--------------|-------------------|------------------|------------|-------------------------|
| 5 | 16 | 1e-05 | 0.05 | 144 | 0.01 | 1.0 | 150 | 150 | 7 |
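The warmup figures above are internally consistent: with 9,248 training examples, batch size 16, and 5 epochs (assuming no gradient accumulation, which this card does not mention), a 0.05 warmup ratio lands on the reported 144 warmup steps:

```python
import math

# Sanity-check the warmup steps against the other hyperparameters.
# Assumes batch size 16 with no gradient accumulation (an assumption;
# the card does not state an accumulation factor).
train_examples = 9248
batch_size = 16
epochs = 5
warmup_ratio = 0.05

steps_per_epoch = math.ceil(train_examples / batch_size)  # 578
total_steps = steps_per_epoch * epochs                    # 2890
warmup_steps = int(total_steps * warmup_ratio)            # 144
```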
## Training Results
| Metric | Value |
|--------|-------|
| Final Training Loss | 0.9143 |
| Training Time | 1,526.9 s |
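For a rough sense of throughput, the reported training time can be divided by the implied optimizer step count (2,890 steps, assuming batch size 16 over 9,248 examples for 5 epochs with no gradient accumulation; this is derived, not a logged figure):

```python
# Implied per-step time from the reported training duration.
# The step count is an assumption derived from the hyperparameter table,
# not a value logged during training.
training_time_s = 1526.9
total_steps = (9248 // 16) * 5  # 2890 optimizer steps
seconds_per_step = training_time_s / total_steps
```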
## Validation Performance
Validation loss stabilized between 0.83 and 0.86 throughout training.
### Summary
The model achieved strong convergence during training:
- **Final training loss:** 0.9143
- **Evaluation loss:** ~0.834 (final checkpoint)
- **Training:** completed all 5 epochs under early-stopping monitoring (patience 7)
### Compute Infrastructure
#### Hardware
- **GPU:** Single NVIDIA A10G (24GB VRAM)
- **Platform:** Modal.com