
Microaggression Text Reframing Model - Training Report

Model Configuration

  • Base Model: t5-base
  • Total Parameters: 222,903,552
  • Trainable Parameters: 222,903,552
  • Training Epochs: 5
  • Batch Size: 8
  • Learning Rate: 5e-05
  • Max Sequence Length: 256
  • GPUs Used: 2
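
The settings above can be captured in a small configuration object for reuse. This is only a sketch: the class name TrainingConfig and the helper samples_per_step are hypothetical, and the report does not say whether the batch size of 8 is per GPU or global, so the helper takes that as an explicit argument rather than guessing.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TrainingConfig:
    """Hyperparameters copied from the report (the class itself is hypothetical)."""
    base_model: str = "t5-base"
    num_epochs: int = 5
    batch_size: int = 8
    learning_rate: float = 5e-5
    max_seq_length: int = 256
    num_gpus: int = 2

    def samples_per_step(self, batch_is_per_gpu: bool) -> int:
        """Samples consumed per optimizer step.

        Assumption: no gradient accumulation. The report does not state
        whether batch_size is per GPU, so the caller must decide.
        """
        if batch_is_per_gpu:
            return self.batch_size * self.num_gpus
        return self.batch_size


config = TrainingConfig()
```

If the batch size was per GPU, config.samples_per_step(True) gives 16 samples per step; if it was global, config.samples_per_step(False) gives 8.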

Training Results

  • Final Training Loss: nan
  • Final Validation Loss: 1.1424
  • Best Validation Loss: 1.0700 (Epoch 3)
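
A final training loss of nan alongside a finite validation loss is worth checking against the per-epoch loss history; the report does not say whether the value was never logged or reflects a numerical problem. The sketch below shows one way to locate the best epoch and any nan entries in a list of per-epoch losses. The function names are hypothetical, and how the losses are read from the saved history files is left open because their layout is not documented here.

```python
import math
from typing import List, Optional, Sequence, Tuple


def best_epoch(val_losses: Sequence[float]) -> Optional[Tuple[int, float]]:
    """Return (1-based epoch, loss) for the lowest finite loss, or None."""
    finite = [
        (index + 1, loss)
        for index, loss in enumerate(val_losses)
        if math.isfinite(loss)
    ]
    if not finite:
        return None
    # min() keeps the earliest epoch when two losses tie
    return min(finite, key=lambda pair: pair[1])


def nan_epochs(losses: Sequence[float]) -> List[int]:
    """Return the 1-based epochs whose recorded loss is nan."""
    return [index + 1 for index, loss in enumerate(losses) if math.isnan(loss)]
```

Applied to the validation losses from this run, best_epoch should agree with the best value reported above (1.0700 at epoch 3), and nan_epochs on the training losses would show whether only the final epoch is affected.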

Evaluation Metrics

  • Average BLEU Score: 0.1307
  • Average ROUGE-1: 0.4270
  • Average ROUGE-2: 0.1896
  • Average ROUGE-L: 0.3932
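
For readers unfamiliar with these scores, ROUGE-N measures n-gram overlap between a generated reframing and its reference. The report does not name the library or tokenization used, so the following is only an illustrative sketch of a ROUGE-N F1 score, assuming lowercased whitespace tokenization; it will not necessarily reproduce the numbers above.

```python
from collections import Counter
from typing import List


def ngram_counts(tokens: List[str], n: int) -> Counter:
    """Count the n-grams in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def rouge_n_f1(reference: str, candidate: str, n: int = 1) -> float:
    """ROUGE-N F1 with clipped n-gram overlap.

    Assumption: lowercased whitespace tokenization, no stemming.
    """
    ref_counts = ngram_counts(reference.lower().split(), n)
    cand_counts = ngram_counts(candidate.lower().split(), n)
    # Counter intersection keeps the minimum count of each shared n-gram
    overlap = sum((ref_counts & cand_counts).values())
    if overlap == 0:
        return 0.0
    precision = overlap / sum(cand_counts.values())
    recall = overlap / sum(ref_counts.values())
    return 2 * precision * recall / (precision + recall)
```

Averaging rouge_n_f1 over all (reference, prediction) pairs with n=1 and n=2 gives corpus-level figures comparable in kind to the ROUGE-1 and ROUGE-2 averages listed here.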

Files Saved

  • Model files: pytorch_model.bin, config.json
  • Tokenizer files: tokenizer.json, spiece.model
  • Training history: training_history.json, training_history.pkl
  • Evaluation results: evaluation_results.json
  • Detailed predictions: detailed_predictions.csv
  • Visualization plots: training_plots.png, training_plots.pdf, evaluation_plots.png, evaluation_plots.pdf
  • Statistics: score_statistics.csv, score_correlation_matrix.csv
  • Loss history: loss_history.csv
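
The column layout of the CSV files is not documented in this report, so it helps to inspect a file before relying on specific column names. A minimal sketch using only the standard library follows; the helper name peek_csv is hypothetical, and it assumes each file starts with a header row.

```python
import csv
from typing import List, Tuple


def peek_csv(path: str) -> Tuple[List[str], int]:
    """Return (column names, number of data rows) for a CSV with a header row."""
    with open(path, newline="", encoding="utf-8") as handle:
        reader = csv.DictReader(handle)
        row_count = sum(1 for _ in reader)
        # fieldnames is None for an empty file, so fall back to an empty list
        columns = list(reader.fieldnames or [])
    return columns, row_count
```

For example, peek_csv("loss_history.csv") returns the header names and the number of recorded rows, which can then guide any further analysis.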

Usage Example

from transformers import T5Tokenizer, T5ForConditionalGeneration

# Directory holding the saved model and tokenizer files. This is the Kaggle
# working directory from the original run; replace it with the location of your copy.
model_dir = '/kaggle/working/microaggression_reframing_model'

# Load the tokenizer and the model
tokenizer = T5Tokenizer.from_pretrained(model_dir)
model = T5ForConditionalGeneration.from_pretrained(model_dir)

# Prepend the "rephrase: " task prefix and tokenize, truncating to 256 tokens
input_text = "Your microaggressive text here"
prefixed_text = f"rephrase: {input_text}"
inputs = tokenizer(prefixed_text, return_tensors='pt', max_length=256, truncation=True)

# Generate the reframed text with beam search and decode it back to a string
outputs = model.generate(**inputs, max_length=256, num_beams=4, early_stopping=True)
reframed_text = tokenizer.decode(outputs[0], skip_special_tokens=True)
print(reframed_text)

Report generated on: 2025-10-22 00:02:37.399982