philipp-zettl/modernbert-diffusion-alpaca-ft

Model Summary

A diffusion-style masked language model fine-tuned from philipp-zettl/modernbert-diffusion-universal on the tatsu-lab/alpaca dataset.

Model Details

  • Format: Safetensors
  • Model size: 0.1B params
  • Tensor type: F32

Intended Use

Intended for Alpaca-style instruction following: generating short responses to single-turn instructions in the style of the tatsu-lab/alpaca dataset.

Example

from refinebert.diffusion_engine import MaskedDiffusionEngine

engine = MaskedDiffusionEngine("./refinebert-finetuned")
# The card lists the original prompt and generation parameters as N/A;
# the values below are illustrative placeholders.
prompt = "Give three tips for staying healthy."
output = engine.generate(prompt, num_new_tokens=64, steps=16, guidance_scale=1.0)
print(output)
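
If the refinebert package is unavailable, the same iterative mask-denoising idea can be approximated with plain transformers. This is a minimal sketch assuming the checkpoint loads as a standard masked LM; the confidence-based unmasking schedule and all parameter values are illustrative, not the engine's actual algorithm.

import math
import torch
from transformers import AutoTokenizer, AutoModelForMaskedLM

tokenizer = AutoTokenizer.from_pretrained("./refinebert-finetuned")
model = AutoModelForMaskedLM.from_pretrained("./refinebert-finetuned").eval()

prompt = "Give three tips for staying healthy."  # illustrative Alpaca-style instruction
num_new_tokens, steps = 64, 16                   # placeholders; the card lists these as N/A

prompt_ids = tokenizer(prompt, return_tensors="pt").input_ids
mask_id = tokenizer.mask_token_id
ids = torch.cat(
    [prompt_ids, torch.full((1, num_new_tokens), mask_id, dtype=torch.long)], dim=1
)

for step in range(steps):
    masked = ids == mask_id
    n_masked = int(masked.sum())
    if n_masked == 0:
        break
    with torch.no_grad():
        logits = model(input_ids=ids).logits
    conf, pred = logits.softmax(-1).max(-1)
    # Commit the most confident predictions, spreading the remaining masked
    # positions evenly over the denoising steps that are left.
    k = max(1, math.ceil(n_masked / (steps - step)))
    conf = conf.masked_fill(~masked, -1.0)
    commit = conf[0].topk(k).indices
    ids[0, commit] = pred[0, commit]

print(tokenizer.decode(ids[0, prompt_ids.shape[1]:], skip_special_tokens=True))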

Training Data

Single-dataset fine-tuning.

Dataset Mix

| Dataset | Proportion | Role |
| --- | --- | --- |
| tatsu-lab/alpaca | 100% | Fine-tuning target |

Fine-tuned specifically on the tatsu-lab/alpaca dataset.
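
To inspect the data, the dataset loads directly with datasets. The column names below are those of tatsu-lab/alpaca; the prompt template is the standard Alpaca format and an assumption about how examples were serialized for training.

from datasets import load_dataset

ds = load_dataset("tatsu-lab/alpaca", split="train")
example = ds[0]  # fields: instruction, input, output, text

# Standard Alpaca template for instruction-only examples (assumed, not
# taken from this model's training code).
prompt = (
    "Below is an instruction that describes a task. "
    "Write a response that appropriately completes the request.\n\n"
    f"### Instruction:\n{example['instruction']}\n\n### Response:\n"
)
print(prompt + example["output"])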

Training Procedure

  • Steps: 14630
  • Batch size: 8
  • Sequence length: 256
  • Learning rate: 5e-05
  • CFG dropout probability: N/A
  • Samples loaded into RAM: N/A
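
A minimal sketch of how the listed hyperparameters map onto a standard transformers Trainer configuration; the collator and any diffusion-specific masking schedule are assumptions, and only the explicit values come from this card.

from transformers import TrainingArguments

args = TrainingArguments(
    output_dir="./refinebert-finetuned",
    per_device_train_batch_size=8,  # Batch size: 8
    learning_rate=5e-5,             # Learning rate: 5e-05
    num_train_epochs=5,             # reaches global step 14630 here
)
# Sequence length 256 would be enforced at tokenization time, e.g.
# tokenizer(..., max_length=256, truncation=True, padding="max_length").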

Training Time & Hardware

  • Duration: 1h 39m 48s
  • Hardware: NVIDIA GeForce RTX 4070 Laptop GPU x1 (CUDA available)
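
For reference, 14630 optimizer steps in 5988 s works out to ≈ 2.4 steps/s, or roughly 19.5 sequences/s at batch size 8 (ignoring any gradient accumulation, which the card does not specify).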

Metrics (Training)

| Metric | Value |
| --- | --- |
| Training Loss | 2.1540 |
| Epochs | 5 |
| Global Step | 14630 |
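
If the training loss is a mean cross-entropy in nats over masked positions, the final value corresponds to a masked-token perplexity of exp(2.1540) ≈ 8.6 (not directly comparable to autoregressive perplexity).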

Limitations & Considerations

  • The model is trained with a masked-token diffusion objective and may not behave like an autoregressive LM.
  • Data sources may have licensing or content constraints—review source dataset cards before deployment.
  • Performance can vary substantially with the training mode (here, single-dataset fine-tuning) and with prompt structure.