CoEdIT FLAN-T5 Base - Grammar Correction Model

A FLAN-T5-base model fine-tuned on the CoEdIT dataset for grammar correction.

Model Description

This model is a fine-tuned version of google/flan-t5-base, trained on 55,256 examples from the CoEdIT dataset for grammar correction.

Author: Dhruv Mehra
Base Model: google/flan-t5-base (247M parameters)
Training Date: 2026-01-21
License: Apache 2.0

Training Details

Training Data

  • Dataset: grammarly/coedit
  • Training samples: 55,256
  • Validation samples: 6,907
  • Test samples: 6,908
  • Split: 80% train / 10% validation / 10% test

Training Configuration

  • GPU: H100
  • Training time: 15.8 minutes
  • Epochs: 3
  • Batch size: 64
  • Learning rate: 5e-05
  • Max sequence length: 256
  • Warmup steps: 500
  • Weight decay: 0.01
  • Optimizer: AdamW
  • Mixed precision: FP16
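The optimization schedule implied by these settings can be sanity-checked with a short stdlib sketch. The step counts below are derived from the listed batch size and sample counts, assuming no gradient accumulation; they are not taken from the original training logs:

```python
import math

# Hyperparameters as listed in the card above.
config = {
    "epochs": 3,
    "batch_size": 64,
    "learning_rate": 5e-5,
    "max_seq_length": 256,
    "warmup_steps": 500,
    "weight_decay": 0.01,
}

train_samples = 55_256

# Optimizer steps per epoch and in total.
steps_per_epoch = math.ceil(train_samples / config["batch_size"])
total_steps = steps_per_epoch * config["epochs"]

print(steps_per_epoch)  # 864
print(total_steps)      # 2592
```

At 2,592 total steps, the 500 warmup steps cover roughly the first 19% of training.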

Performance

Metrics (Test Set)

Metric       Score
BLEU         46.82
ROUGE-1      0.6508
ROUGE-2      0.4956
ROUGE-L      0.6047
Exact Match  0.83%

Evaluated on 6,908 test examples
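Of these metrics, exact match is the simplest to reproduce. A minimal stdlib sketch, using toy prediction/reference lists that stand in for the real test set:

```python
def exact_match_pct(predictions, references):
    """Percentage of predictions that equal their reference exactly
    (after stripping surrounding whitespace)."""
    matches = sum(
        p.strip() == r.strip() for p, r in zip(predictions, references)
    )
    return 100.0 * matches / len(references)

# Toy illustration only, not the real test set:
preds = ["I went to the market yesterday.", "She don't like apples."]
refs = ["I went to the market yesterday.", "She doesn't like apples."]
print(exact_match_pct(preds, refs))  # 50.0
```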

Example Predictions

Example 1:

Input:  I go to market yesterday.
Output: I go to market yesterday.

Example 2:

Input:  She don't like apples.
Output: She don't like apples.

Example 3:

Input:  He have three dogs.
Output: He have three dogs.

Example 4:

Input:  They was happy.
Output: They was happy.

Example 5:

Input:  I seen that movie before.
Output: I saw that movie before.

Usage

Basic Usage

from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

# Load model and tokenizer
tokenizer = AutoTokenizer.from_pretrained("dhruv-pype/coedit-flan-t5-base")
model = AutoModelForSeq2SeqLM.from_pretrained("dhruv-pype/coedit-flan-t5-base")

# Prepare input
text = "I go to market yesterday."
input_text = f"Fix grammatical errors in this sentence: {text}"
inputs = tokenizer(input_text, return_tensors="pt", max_length=256, truncation=True)

# Generate correction
outputs = model.generate(**inputs, max_length=256, num_beams=4)
corrected = tokenizer.decode(outputs[0], skip_special_tokens=True)

print(corrected)

Batch Processing

texts = [
    "She don't like apples.",
    "He have three dogs.",
    "They was happy."
]

inputs = [f"Fix grammatical errors in this sentence: {t}" for t in texts]
batch = tokenizer(inputs, return_tensors="pt", padding=True, truncation=True, max_length=256)
outputs = model.generate(**batch, max_length=256, num_beams=4)
corrections = [tokenizer.decode(out, skip_special_tokens=True) for out in outputs]

for original, corrected in zip(texts, corrections):
    print(f"{original} → {corrected}")

Intended Use

Primary Use Cases

  • Grammar correction for English text
  • Fixing common grammatical errors (subject-verb agreement, tense, etc.)
  • Educational applications
  • Writing assistance tools

Input Format

The model expects input in the following format:

Fix grammatical errors in this sentence: [YOUR TEXT HERE]
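Since every call must use this exact prefix, a small helper keeps it consistent. The function name below is illustrative, not part of the released model:

```python
PROMPT_PREFIX = "Fix grammatical errors in this sentence: "

def build_prompt(text: str) -> str:
    """Prepend the instruction prefix the model was trained with."""
    return PROMPT_PREFIX + text.strip()

print(build_prompt("I go to market yesterday."))
# Fix grammatical errors in this sentence: I go to market yesterday.
```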

Limitations

  • Designed for English language only
  • Best performance on sentences similar to training data
  • May not handle domain-specific jargon well
  • Maximum input length: 256 tokens
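For inputs that may exceed the 256-token limit, one workaround is to correct text sentence by sentence. Below is a rough sketch using a naive regex-based sentence splitter; a production pipeline would count tokens with the model's tokenizer rather than relying on this character-level heuristic:

```python
import re

def split_sentences(text: str) -> list[str]:
    """Naive splitter: break on ., !, or ? followed by whitespace."""
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [p for p in parts if p]

long_text = "She don't like apples. He have three dogs. They was happy."
for sentence in split_sentences(long_text):
    # Each sentence would be wrapped in the instruction prefix
    # and passed to model.generate() individually.
    print(sentence)
```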

Training Procedure

  1. Data Preparation: CoEdIT dataset split into train/val/test
  2. Tokenization: Input texts tokenized with T5 tokenizer (max length: 256)
  3. Training: Seq2Seq training with teacher forcing
  4. Evaluation: Best model selected based on validation loss
  5. Testing: Final evaluation on held-out test set
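The 80/10/10 split in step 1 can be sketched with the stdlib alone; with 69,071 total examples, simple index arithmetic reproduces the sample counts listed above. The shuffle seed here is an assumption, not taken from the original run:

```python
import random

total = 69_071  # 55,256 + 6,907 + 6,908
indices = list(range(total))
random.Random(42).shuffle(indices)  # hypothetical seed

train_end = int(0.8 * total)
val_end = int(0.9 * total)

train_idx = indices[:train_end]
val_idx = indices[train_end:val_end]
test_idx = indices[val_end:]

print(len(train_idx), len(val_idx), len(test_idx))  # 55256 6907 6908
```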

Model Architecture

  • Architecture: Encoder-Decoder Transformer (T5)
  • Parameters: 247M
  • Vocabulary size: 32,128 tokens
  • Hidden size: 768
  • Attention heads: 12
  • Encoder layers: 12
  • Decoder layers: 12

Citation

If you use this model, please cite:

@misc{coedit-flan-t5-base-2026,
  author = {Dhruv Mehra},
  title = {CoEdIT FLAN-T5 Base - Grammar Correction Model},
  year = {2026},
  publisher = {HuggingFace},
  url = {https://huggingface.co/dhruv-pype/coedit-flan-t5-base}
}

Original CoEdIT paper:

@article{raheja2023coedit,
  title={CoEdIT: Text Editing by Task-Specific Instruction Tuning},
  author={Raheja, Vipul and Kumar, Dhruv and Koo, Ryan and Kang, Dongyeop},
  journal={arXiv preprint arXiv:2305.09857},
  year={2023}
}

Contact

For questions or issues, please open an issue on the model repository.


Model trained and uploaded on 2026-01-21
