metadata
language: en
license: apache-2.0
tags:
- bert
- masked-language-modeling
- next-sentence-prediction
- pretraining
ducanhdinh/jepa_proof_bert
BERT base pretrained from scratch with two objectives:
- Masked Language Modeling (MLM): 80/10/10 replacement rule, mask probability 0.15
- Next Sentence Prediction (NSP)
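The 80/10/10 rule can be illustrated with a minimal pure-Python sketch. It is not the training code used for this model. The function name `mask_tokens`, the special-token ids, and the vocabulary size in the usage note are assumptions chosen for illustration. The label value `-100` follows the common PyTorch convention for positions ignored by the loss.

```python
import random

def mask_tokens(token_ids, mask_id, vocab_size, special_ids=(),
                mlm_probability=0.15, rng=None):
    """Return (inputs, labels) after applying BERT-style MLM masking.

    Hypothetical helper for illustration only.
    """
    rng = rng or random.Random()
    inputs = list(token_ids)
    labels = [-100] * len(inputs)  # -100 marks positions excluded from the loss
    for i, tok in enumerate(token_ids):
        # Never select special tokens; select other tokens with mlm_probability.
        if tok in special_ids or rng.random() >= mlm_probability:
            continue
        labels[i] = tok  # the model must predict the original token here
        roll = rng.random()
        if roll < 0.8:
            inputs[i] = mask_id                    # 80%: replace with [MASK]
        elif roll < 0.9:
            inputs[i] = rng.randrange(vocab_size)  # 10%: replace with a random token
        # remaining 10%: keep the original token unchanged
    return inputs, labels
```

For example, `mask_tokens([101, 7592, 2088, 999, 102], mask_id=103, vocab_size=30522, special_ids={101, 102})` leaves the first and last ids untouched and selects each of the middle three with probability 0.15. The ids here are placeholders; the real values depend on this model's tokenizer.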
Training hyperparameters
| Parameter | Value |
|---|---|
| Max sequence length | 256 |
| Batch size | 256 |
| Epochs | 10 |
| Learning rate | 0.0001 |
| MLM probability | 0.15 |
Usage
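from transformers import BertForPreTraining, BertTokenizerFast
import torch

# Load the tokenizer and the pretraining model (MLM and NSP heads) from the Hub.
tokenizer = BertTokenizerFast.from_pretrained("ducanhdinh/jepa_proof_bert")
model = BertForPreTraining.from_pretrained("ducanhdinh/jepa_proof_bert")

# Tokenize a sentence and run a forward pass without tracking gradients.
encoded = tokenizer("Hello world!", return_tensors="pt")
with torch.no_grad():
    output = model(**encoded)

# output.prediction_logits holds MLM scores; output.seq_relationship_logits holds NSP scores.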
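To show what the two output heads return, the sketch below runs a tiny, randomly initialized `BertForPreTraining` so that it works offline. The configuration values and token ids are arbitrary assumptions and have nothing to do with this checkpoint. With the real model, the same fields are read from `output` in the snippet above.

```python
import torch
from transformers import BertConfig, BertForPreTraining

# Tiny random model for illustration only; sizes are arbitrary assumptions.
config = BertConfig(
    vocab_size=100,
    hidden_size=32,
    num_hidden_layers=1,
    num_attention_heads=2,
    intermediate_size=64,
    max_position_embeddings=256,
)
model = BertForPreTraining(config).eval()

# A fake sentence pair: segment A gets token type 0, segment B gets token type 1.
input_ids = torch.tensor([[2, 10, 11, 3, 12, 13, 3]])
token_type_ids = torch.tensor([[0, 0, 0, 0, 1, 1, 1]])

with torch.no_grad():
    output = model(input_ids=input_ids, token_type_ids=token_type_ids)

# MLM head: one score per vocabulary entry at every position.
mlm_logits = output.prediction_logits           # shape (batch, seq_len, vocab_size)
predicted_ids = mlm_logits.argmax(dim=-1)       # most likely token id per position

# NSP head: two scores per sequence pair.
# In transformers, index 0 means "B follows A" and index 1 means "B is random".
nsp_probs = torch.softmax(output.seq_relationship_logits, dim=-1)  # shape (batch, 2)
```

Because the weights are random here, the predictions are meaningless; only the shapes and field names carry over to the pretrained checkpoint.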