ducanhdinh/jepa_proof_bert

BERT base, pretrained from scratch with two objectives:

  • Masked Language Modeling (MLM): mask probability 0.15, with the 80/10/10 replacement rule (80% [MASK], 10% random token, 10% unchanged)
  • Next Sentence Prediction (NSP)
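
The MLM corruption step described above can be sketched in plain Python. This is a minimal illustration of the standard 80/10/10 rule, not the training code used for this checkpoint: the function name `mask_tokens`, the toy vocabulary, and the whitespace-level tokens are all assumptions made for the example.

```python
import random

MASK_TOKEN = "[MASK]"
SPECIAL_TOKENS = {"[CLS]", "[SEP]", "[PAD]"}


def mask_tokens(tokens, vocab, mlm_probability=0.15, rng=None):
    """Hypothetical helper: apply BERT-style MLM corruption to a token list.

    Returns (corrupted_tokens, labels). labels[i] holds the original token
    at positions the model must predict, and None everywhere else.
    """
    rng = rng or random.Random()
    corrupted = list(tokens)
    labels = [None] * len(tokens)
    for i, token in enumerate(tokens):
        # Special tokens are never selected for prediction.
        if token in SPECIAL_TOKENS:
            continue
        # Select each remaining position with probability mlm_probability.
        if rng.random() >= mlm_probability:
            continue
        labels[i] = token
        roll = rng.random()
        if roll < 0.8:
            corrupted[i] = MASK_TOKEN          # 80%: replace with [MASK]
        elif roll < 0.9:
            corrupted[i] = rng.choice(vocab)   # 10%: replace with a random token
        # remaining 10%: keep the original token unchanged
    return corrupted, labels


# Toy vocabulary and sentence, for illustration only.
toy_vocab = ["hello", "world", "cat", "dog", "tree"]
sentence = ["[CLS]", "hello", "world", "cat", "dog", "[SEP]"]
corrupted, labels = mask_tokens(sentence, toy_vocab, rng=random.Random(0))
print(corrupted)
print(labels)
```

The loss is computed only at positions where the label is not None; keeping 10% of the selected tokens unchanged means the model cannot assume that every unmasked token is correct.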

Training hyperparameters

Parameter             Value
Max sequence length   256
Batch size            256
Epochs                10
Learning rate         0.0001
MLM probability       0.15

Usage

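from transformers import BertForPreTraining, BertTokenizerFast
import torch

# Load the tokenizer and the pretraining model (MLM + NSP heads) from the Hub.
tokenizer = BertTokenizerFast.from_pretrained("ducanhdinh/jepa_proof_bert")
model = BertForPreTraining.from_pretrained("ducanhdinh/jepa_proof_bert")
model.eval()  # disable dropout for inference

# Tokenize one sentence and run a forward pass without tracking gradients.
encoded = tokenizer("Hello world!", return_tensors="pt")
with torch.no_grad():
    output = model(**encoded)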
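
As a rough sketch of how to read the result: `BertForPreTraining` returns `prediction_logits` with shape (batch, sequence length, vocabulary size) for the MLM head and `seq_relationship_logits` with shape (batch, 2) for the NSP head. The example below uses random stand-in tensors with those shapes so it runs without downloading the checkpoint; the helper name `summarize_pretraining_output` and the vocabulary size of 100 are assumptions for illustration only.

```python
import torch


def summarize_pretraining_output(prediction_logits, seq_relationship_logits):
    """Hypothetical helper: turn the two pretraining heads into readable values."""
    # MLM head: most likely vocabulary id at every position.
    predicted_ids = prediction_logits.argmax(dim=-1)
    # NSP head: in transformers, index 0 means "sentence B continues sentence A"
    # and index 1 means "sentence B is a random sentence".
    nsp_probs = torch.softmax(seq_relationship_logits, dim=-1)
    return predicted_ids, nsp_probs


# Stand-in tensors; the sizes are made up, not this model's real dimensions.
torch.manual_seed(0)
batch_size, seq_len, vocab_size = 1, 4, 100
prediction_logits = torch.randn(batch_size, seq_len, vocab_size)
seq_relationship_logits = torch.randn(batch_size, 2)

predicted_ids, nsp_probs = summarize_pretraining_output(
    prediction_logits, seq_relationship_logits
)
print(predicted_ids.shape)  # one predicted id per input position
print(nsp_probs)            # two probabilities per example, summing to 1
```

With the real model, the same helper can be called as `summarize_pretraining_output(output.prediction_logits, output.seq_relationship_logits)`, and `tokenizer.convert_ids_to_tokens(predicted_ids[0].tolist())` maps the predicted ids back to tokens.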

Model details

Format        Safetensors
Model size    0.1B params
Tensor type   F32