---
language: en
license: apache-2.0
tags:
- bert
- masked-language-modeling
- next-sentence-prediction
- pretraining
---

# ducanhdinh/jepa_proof_bert

BERT base pretrained from scratch with two objectives:
- **Masked Language Modeling (MLM)**: mask probability `0.15`, with the 80/10/10 replacement rule (80% `[MASK]`, 10% random token, 10% unchanged)
- **Next Sentence Prediction (NSP)**
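
The two objectives above can be sketched in plain Python. This is a minimal illustration and not the training code behind this checkpoint. The special-token ids, the vocabulary size, the 50/50 NSP split (taken from the original BERT recipe), and the function names `mask_tokens` and `make_nsp_pair` are all assumptions made for the example.

```python
import random

# Hypothetical ids and vocabulary size, for illustration only;
# the real values come from the checkpoint's tokenizer.
MASK_ID = 103
SPECIAL_IDS = {0, 101, 102}  # e.g. [PAD], [CLS], [SEP]
VOCAB_SIZE = 30522
IGNORE_INDEX = -100  # label value that PyTorch's cross-entropy loss skips by default


def mask_tokens(input_ids, mlm_probability=0.15, rng=None):
    """Return (masked_ids, labels) for one sequence under the 80/10/10 rule."""
    rng = rng or random.Random()
    masked_ids = list(input_ids)
    labels = [IGNORE_INDEX] * len(input_ids)
    for i, token_id in enumerate(input_ids):
        # Special tokens are never selected; other tokens are selected with
        # probability mlm_probability.
        if token_id in SPECIAL_IDS or rng.random() >= mlm_probability:
            continue
        labels[i] = token_id  # the model must recover the original token here
        roll = rng.random()
        if roll < 0.8:
            masked_ids[i] = MASK_ID  # 80%: replace with [MASK]
        elif roll < 0.9:
            # 10%: replace with a random id (simplified: it may coincide
            # with the original token or with a special token)
            masked_ids[i] = rng.randrange(VOCAB_SIZE)
        # remaining 10%: keep the original token unchanged
    return masked_ids, labels


def make_nsp_pair(documents, rng=None):
    """Return (sentence_a, sentence_b, label); label 0 means B really follows A.

    Assumes at least two documents, each a list of at least two sentences.
    """
    rng = rng or random.Random()
    doc_index = rng.randrange(len(documents))
    doc = documents[doc_index]
    i = rng.randrange(len(doc) - 1)
    if rng.random() < 0.5:
        return doc[i], doc[i + 1], 0  # true next sentence
    other = rng.choice([d for j, d in enumerate(documents) if j != doc_index])
    return doc[i], rng.choice(other), 1  # sentence drawn from another document
```

Positions whose label is `-100` do not contribute to the MLM loss, so only the selected tokens are scored.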

## Training hyperparameters

| Parameter | Value |
|---|---|
| Max sequence length | 256 |
| Batch size | 256 |
| Epochs | 10 |
| Learning rate | 0.0001 |
| MLM probability | 0.15 |
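
For a sense of scale, the values in the table can be collected in a small config object. This is only a sketch: the class `PretrainConfig` and its field names are hypothetical, and the derived numbers assume that the batch size counts sequences and that every sequence is filled to the maximum length.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PretrainConfig:
    """Values copied from the table; class and field names are hypothetical."""
    max_seq_length: int = 256
    batch_size: int = 256
    epochs: int = 10
    learning_rate: float = 1e-4
    mlm_probability: float = 0.15

    def max_tokens_per_batch(self) -> int:
        # Upper bound: every sequence in the batch is at full length.
        return self.max_seq_length * self.batch_size

    def expected_masked_per_sequence(self) -> float:
        # Upper bound: ignores special tokens and padding, which are not masked.
        return self.max_seq_length * self.mlm_probability


config = PretrainConfig()
print(config.max_tokens_per_batch())          # 65536
print(config.expected_masked_per_sequence())  # 38.4
```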

## Usage

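```python
from transformers import BertForPreTraining, BertTokenizerFast
import torch

# Load the tokenizer and the model with both pretraining heads (MLM and NSP).
tokenizer = BertTokenizerFast.from_pretrained("ducanhdinh/jepa_proof_bert")
model = BertForPreTraining.from_pretrained("ducanhdinh/jepa_proof_bert")

encoded = tokenizer("Hello world!", return_tensors="pt")
with torch.no_grad():  # inference only, no gradients needed
    output = model(**encoded)

# output.prediction_logits: MLM scores, shape (batch, seq_len, vocab_size)
# output.seq_relationship_logits: NSP scores, shape (batch, 2)
```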
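
The two output heads can be turned into readable predictions with a couple of small helpers. The sketch below is one possible way to do it; the helper names `top_k_token_ids` and `is_next_probability` are hypothetical. It assumes the checkpoint follows the usual Hugging Face BERT convention in which NSP class `0` means "sentence B follows sentence A", which may not hold for a model trained from scratch with a different label order.

```python
import torch


def top_k_token_ids(prediction_logits, position, k=5):
    """Top-k (token_id, probability) pairs at one position of the first sequence."""
    probs = torch.softmax(prediction_logits[0, position], dim=-1)
    top = torch.topk(probs, k)
    return list(zip(top.indices.tolist(), top.values.tolist()))


def is_next_probability(seq_relationship_logits):
    """Probability that sentence B follows sentence A (assumes class 0 = "is next")."""
    return torch.softmax(seq_relationship_logits, dim=-1)[0, 0].item()
```

With the `output` object from the usage snippet, `top_k_token_ids(output.prediction_logits, position=1)` returns candidate ids for the first token after `[CLS]`, and `tokenizer.convert_ids_to_tokens` maps those ids back to strings. `is_next_probability(output.seq_relationship_logits)` is only meaningful when the input was encoded as a sentence pair.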