ducanhdinh/jepa_proof_bert

BERT base, pretrained from scratch with two objectives:

Masked Language Modeling (MLM) — 80/10/10 replacement rule, mask probability 0.15
Next Sentence Prediction (NSP)
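The 80/10/10 rule above can be sketched in plain Python. This is a minimal illustration and not the training code used for this model: the function name `mask_tokens`, the toy vocabulary, and the choice to skip `[CLS]`, `[SEP]`, and `[PAD]` are assumptions borrowed from the standard BERT recipe. Each eligible token is selected with probability 0.15; a selected token becomes `[MASK]` 80% of the time, a random vocabulary token 10% of the time, and stays unchanged the remaining 10%.

```python
import random

MASK_TOKEN = "[MASK]"
# Assumption: special tokens are never selected for masking (standard BERT practice).
SPECIAL_TOKENS = {"[CLS]", "[SEP]", "[PAD]"}


def mask_tokens(tokens, vocab, mlm_probability=0.15, rng=None):
    """Apply 80/10/10 MLM corruption to a list of string tokens.

    Returns (masked_tokens, labels). labels[i] holds the original token at
    positions the model must predict, and None everywhere else.
    """
    if rng is None:
        rng = random.Random()
    masked = list(tokens)
    labels = [None] * len(tokens)
    for i, token in enumerate(tokens):
        # Skip special tokens and the ~85% of positions that are not selected.
        if token in SPECIAL_TOKENS or rng.random() >= mlm_probability:
            continue
        labels[i] = token  # prediction target is always the original token
        roll = rng.random()
        if roll < 0.8:
            masked[i] = MASK_TOKEN         # 80%: replace with [MASK]
        elif roll < 0.9:
            masked[i] = rng.choice(vocab)  # 10%: replace with a random token
        # remaining 10%: leave the token unchanged
    return masked, labels


if __name__ == "__main__":
    toy_vocab = ["hello", "world", "bert", "model"]  # hypothetical vocabulary
    sentence = ["[CLS]", "hello", "world", "[SEP]"]
    print(mask_tokens(sentence, toy_vocab, rng=random.Random(0)))
```

Keeping 10% of the selected tokens unchanged means the model cannot assume that every unmasked input token is already correct, which is the usual motivation given for this rule.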

Training hyperparameters

Parameter	Value
Max sequence length	256
Batch size	256
Epochs	10
Learning rate	0.0001
MLM probability	0.15

Usage

from transformers import BertForPreTraining, BertTokenizerFast
import torch

tokenizer = BertTokenizerFast.from_pretrained("ducanhdinh/jepa_proof_bert")
model     = BertForPreTraining.from_pretrained("ducanhdinh/jepa_proof_bert")

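# Tokenize a single sentence into PyTorch tensors
encoded = tokenizer("Hello world!", return_tensors="pt")
with torch.no_grad():  # inference only, no gradients needed
    output = model(**encoded)

# output.prediction_logits: MLM head scores, shape (batch, seq_len, vocab_size)
# output.seq_relationship_logits: NSP head scores, shape (batch, 2)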
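To read the NSP head, the two logits in `output.seq_relationship_logits` can be turned into probabilities with a softmax. The sketch below is dependency-free and takes the logits as a plain list; the helper names `softmax` and `nsp_probabilities` are hypothetical. It assumes the label convention of `BertForPreTraining` in `transformers`, where index 0 means sentence B continues sentence A and index 1 means B is a random sentence. Whether this checkpoint was trained with that convention is not stated here.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def nsp_probabilities(seq_relationship_logits):
    """Map the two NSP logits to named probabilities.

    Assumption: index 0 = "B follows A", index 1 = "B is random",
    as in the transformers BertForPreTraining convention.
    """
    p_next, p_random = softmax(seq_relationship_logits)
    return {"is_next": p_next, "not_next": p_random}


if __name__ == "__main__":
    # Made-up logits for illustration only, not real model output.
    print(nsp_probabilities([2.0, 0.0]))
```

With the model output above, the call would look like `nsp_probabilities(output.seq_relationship_logits[0].tolist())`. The NSP score is only meaningful for a sentence pair, for example one encoded with `tokenizer(sentence_a, sentence_b, return_tensors="pt")`.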


Model size: 0.1B params (Safetensors, tensor type F32)
