---
license: apache-2.0
language:
- en
base_model:
- answerdotai/ModernBERT-base
pipeline_tag: fill-mask
tags:
- bert
- diffusion
---
# ModernBERT-Diffusion: Fine-tuned for 50% MASK Ratio

This repository provides a fine-tuned version of [ModernBERT](https://huggingface.co/answerdotai/ModernBERT-large), specifically adapted for masked language modeling with a **50% MASK ratio**. The model is trained on instruction-following data, where approximately half of the tokens in the assistant's response are randomly masked. This approach encourages the model to effectively reconstruct missing information and generate coherent, contextually appropriate completions in dialogue settings.
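The masking objective described above can be sketched as follows. This is a minimal illustration, not the repository's actual training code; the helper name and the use of `-100` as the ignored-label value (the convention used by Hugging Face `transformers` MLM losses) are assumptions:

```python
import random

def mask_assistant_tokens(token_ids, mask_token_id, ratio=0.5, seed=None):
    """Randomly replace ~`ratio` of the tokens with the MASK token id.

    `token_ids` is assumed to cover only the assistant's response span;
    the prompt tokens would be left untouched by the caller.
    """
    rng = random.Random(seed)
    masked = list(token_ids)
    labels = [-100] * len(token_ids)  # -100 = position ignored by the MLM loss
    for i, tok in enumerate(token_ids):
        if rng.random() < ratio:
            labels[i] = tok           # predict the original token here
            masked[i] = mask_token_id
    return masked, labels
```

With `ratio=0.5`, roughly half of the assistant's tokens are hidden, and the loss is computed only on those hidden positions.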

This project was inspired by [johnowhitaker/modernbert-diffusion](https://huggingface.co/johnowhitaker/modernbert-diffusion).

GitHub: [BERT-Diffusion](https://github.com/Pr0fe5s0r/BERT-Diffusion)

## Training Details

- **Base Model:** ModernBERT-large
- **Fine-tuning Objective:** Masked language modeling with 50% of assistant tokens masked
- **Data Format:** Instruction-style conversations
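
A model trained this way can be used for diffusion-style generation: start from a heavily masked response and reveal tokens over several rounds, keeping the most confident predictions each time. The sketch below shows only the decoding loop; `predict` stands in for the fine-tuned model (which would supply per-position logits) and its signature is an assumption for illustration:

```python
def iterative_unmask(token_ids, mask_id, predict, steps=4):
    """Diffusion-style decoding sketch: repeatedly fill in masked positions.

    `predict(tokens)` is assumed to return a (token_id, confidence) pair for
    every position; in practice these would come from the model's softmaxed
    logits over the vocabulary.
    """
    tokens = list(token_ids)
    for _ in range(steps):
        masked_pos = [i for i, t in enumerate(tokens) if t == mask_id]
        if not masked_pos:
            break  # every position has been revealed
        preds = predict(tokens)
        # Reveal about half of the remaining masked positions per round,
        # choosing the ones the model is most confident about.
        k = max(1, len(masked_pos) // 2)
        best = sorted(masked_pos, key=lambda i: preds[i][1], reverse=True)[:k]
        for i in best:
            tokens[i] = preds[i][0]
    return tokens
```

Each round conditions on the tokens revealed so far, so later predictions can correct for context that was missing in earlier rounds.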