ducanhdinh/jepa_proof_bert
BERT base pretrained from scratch with two objectives:
- Masked Language Modeling (MLM): 80/10/10 replacement rule, mask probability 0.15
- Next Sentence Prediction (NSP)
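The 80/10/10 rule can be illustrated with a small standard-library sketch. This is not the training code used for this model, which is not shown here. The helper name `mask_tokens` is hypothetical, and the token IDs (`101`, `102`, `103`) and vocabulary size (`30522`) are illustrative values borrowed from `bert-base-uncased`; this model's tokenizer may use different ones. The label value `-100` follows the common PyTorch convention for positions ignored by the loss.

```python
import random

def mask_tokens(token_ids, vocab_size, mask_id, special_ids=frozenset(),
                mlm_probability=0.15, rng=None):
    """Return (inputs, labels) after applying the 80/10/10 MLM rule."""
    if rng is None:
        rng = random.Random()
    inputs = list(token_ids)
    labels = [-100] * len(inputs)  # -100 marks positions ignored by the loss
    for i, tok in enumerate(token_ids):
        # Skip special tokens, and skip positions not selected for prediction.
        if tok in special_ids or rng.random() >= mlm_probability:
            continue
        labels[i] = tok  # the model must predict the original token here
        roll = rng.random()
        if roll < 0.8:
            inputs[i] = mask_id                    # 80%: replace with [MASK]
        elif roll < 0.9:
            inputs[i] = rng.randrange(vocab_size)  # 10%: replace with a random token
        # Remaining 10%: leave the original token in place.
    return inputs, labels

# Illustrative IDs only (assumed, not taken from this model's vocabulary).
ids = [101] + [2000 + i for i in range(20)] + [102]
inputs, labels = mask_tokens(ids, vocab_size=30522, mask_id=103,
                             special_ids={101, 102}, rng=random.Random(0))
```

Positions where `labels` is `-100` are left untouched in `inputs`; everywhere else the label holds the original token, whether the input was masked, randomized, or kept.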
Training parameters
| Parameter | Value |
|---|---|
| Max sequence length | 256 |
| Batch size | 256 |
| Epochs | 10 |
| Learning rate | 0.0001 |
| MLM probability | 0.15 |
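As a rough sanity check on these numbers, the sketch below derives how many token positions a batch contains and how many are expected to be selected for MLM. The `PretrainConfig` class is a hypothetical container for the table values, not part of this repository. The figures are upper bounds that assume every sequence is a full 256 tokens and ignore padding and special tokens.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class PretrainConfig:
    # Values copied from the training parameters table.
    max_seq_length: int = 256
    batch_size: int = 256
    epochs: int = 10
    learning_rate: float = 1e-4
    mlm_probability: float = 0.15

cfg = PretrainConfig()

# Token positions in one full batch: 256 * 256 = 65,536.
tokens_per_batch = cfg.max_seq_length * cfg.batch_size

# Expected positions selected for prediction (about 38.4 per full sequence).
selected_per_seq = cfg.max_seq_length * cfg.mlm_probability
selected_per_batch = tokens_per_batch * cfg.mlm_probability

# Expected split of the selected positions under the 80/10/10 rule.
split = {
    "mask": 0.8 * selected_per_seq,
    "random": 0.1 * selected_per_seq,
    "unchanged": 0.1 * selected_per_seq,
}
```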
Usage
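from transformers import BertForPreTraining, BertTokenizerFast
import torch

# Load the tokenizer and the pretraining model (MLM and NSP heads) from the Hub.
tokenizer = BertTokenizerFast.from_pretrained("ducanhdinh/jepa_proof_bert")
model = BertForPreTraining.from_pretrained("ducanhdinh/jepa_proof_bert")

encoded = tokenizer("Hello world!", return_tensors="pt")
with torch.no_grad():  # inference only, no gradients needed
    output = model(**encoded)
# output.prediction_logits: MLM scores, shape (batch, seq_len, vocab_size)
# output.seq_relationship_logits: NSP scores, shape (batch, 2)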