---
license: apache-2.0
language:
  - en
base_model:
  - answerdotai/ModernBERT-base
pipeline_tag: fill-mask
tags:
  - bert
  - diffusion
---

# ModernBERT-Diffusion: Fine-Tuned for a 50% MASK Ratio

This repository provides a fine-tuned version of ModernBERT, adapted for masked language modeling at a 50% MASK ratio. The model is trained on instruction-following data in which roughly half of the tokens in the assistant's response are randomly masked, encouraging it to reconstruct missing information and produce coherent, contextually appropriate completions in dialogue settings.
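The training corruption described above can be sketched in plain Python. This is an illustrative sketch, not the repository's actual preprocessing code: `MASK_ID`, `mask_assistant_tokens`, and the `assistant_positions` argument are all hypothetical names; in practice you would use `tokenizer.mask_token_id` and the standard Hugging Face convention of `-100` labels for positions the loss should ignore.

```python
import random

MASK_ID = 50284  # hypothetical [MASK] id; use tokenizer.mask_token_id in practice

def mask_assistant_tokens(token_ids, assistant_positions, mask_ratio=0.5, seed=None):
    """Randomly replace ~mask_ratio of the assistant-response tokens with [MASK].

    token_ids: full conversation token ids (prompt + response)
    assistant_positions: indices of tokens in the assistant's reply
    Returns (corrupted_ids, labels), where labels is -100 everywhere except
    at masked positions, following the usual MLM label convention.
    """
    rng = random.Random(seed)
    n_mask = max(1, int(len(assistant_positions) * mask_ratio))
    masked = set(rng.sample(assistant_positions, n_mask))

    corrupted, labels = [], []
    for i, tok in enumerate(token_ids):
        if i in masked:
            corrupted.append(MASK_ID)
            labels.append(tok)      # model must predict the original token here
        else:
            corrupted.append(tok)
            labels.append(-100)     # position ignored by the loss
    return corrupted, labels
```

Only assistant-response positions are candidates for masking, so the prompt always stays visible to the model.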

This project was inspired by johnowhitaker/modernbert-diffusion.

GitHub: BERT-Diffusion

## Training Details

  • Base Model: ModernBERT-large
  • Fine-tuning Objective: Masked language modeling with 50% of assistant tokens masked
  • Data Format: Instruction-style conversations
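A model trained with this objective can generate diffusion-style: start the response as all `[MASK]` tokens and fill in the most confident predictions over several passes. The sketch below illustrates that loop under stated assumptions; `predict_fn` is a hypothetical stand-in for a forward pass through the fine-tuned model, returning a `(token_id, confidence)` pair per position, and is not part of any real API.

```python
def iterative_unmask(tokens, mask_id, predict_fn, steps=4):
    """Diffusion-style decoding sketch: repeatedly predict all masked
    positions and commit only the highest-confidence half each pass.

    predict_fn(tokens) -> list of (token_id, confidence) per position
    (a hypothetical stand-in for the model's forward pass).
    """
    tokens = list(tokens)
    for _ in range(steps):
        masked_idx = [i for i, t in enumerate(tokens) if t == mask_id]
        if not masked_idx:
            break
        preds = predict_fn(tokens)
        # Unmask the top half of the remaining positions by confidence.
        k = max(1, len(masked_idx) // 2)
        best = sorted(masked_idx, key=lambda i: preds[i][1], reverse=True)[:k]
        for i in best:
            tokens[i] = preds[i][0]
    return tokens
```

Committing only the most confident predictions each pass lets later passes condition on the tokens already filled in, which is the intuition behind training at a high (50%) mask ratio.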