Upload README.md with huggingface_hub
README.md
CHANGED
@@ -54,11 +54,11 @@ weights (`bert.*` prefix) are extracted directly; the MLM and NSP heads are disc

 ## Parity Verification

-
-
-
-
-expected BF16
+All verified on GPU with PyTorch 2.7 / CUDA 12:
+
+- **Hidden states (eager, sdpa):** identical to original at all 13 levels (max abs diff < 8e-6)
+- **MLM logits:** `BertForMaskedLM` logits identical to original `BertForPreTraining` (max abs diff < 9e-6)
+- **Flash attention 2:** verified against eager (bf16) at non-padding positions (max diff < 0.25, expected BF16 accumulation across 12 layers)

 ## Related Models

@@ -143,8 +143,9 @@ with torch.no_grad():
 logits = model_mlm(**enc).logits # (1, seq_len, 69)
 ```

-
-
+The MLM head weights are fully preserved: the prediction transform (dense + GELU +
+LayerNorm), the decoder weight (tied to the word embedding in the original, stored
+explicitly here), and the output bias are all converted from the original checkpoint.

 ### Fine-tuning

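The added Parity Verification lines quote "max abs diff" bounds per level. As a rough illustration of what such a check computes, here is a minimal sketch in plain Python. The helper names `max_abs_diff` and `check_parity` and the toy values are hypothetical and do not come from the README; the commit does not show the script that was actually used. Only the 8e-6 tolerance is taken from the hidden-state line above.

```python
def max_abs_diff(reference, candidate):
    """Largest element-wise absolute difference between two equal-length sequences."""
    reference = list(reference)
    candidate = list(candidate)
    if len(reference) != len(candidate):
        raise ValueError("sequences must have the same length")
    return max((abs(r - c) for r, c in zip(reference, candidate)), default=0.0)


def check_parity(reference_levels, candidate_levels, tol):
    """Return the per-level max abs diffs and whether every level is below tol."""
    if len(reference_levels) != len(candidate_levels):
        raise ValueError("both models must expose the same number of levels")
    diffs = [max_abs_diff(r, c) for r, c in zip(reference_levels, candidate_levels)]
    return diffs, all(d < tol for d in diffs)


# Toy values standing in for flattened hidden states at two levels
# (hypothetical data, not taken from the model).
reference = [[0.10, -0.20, 0.30], [0.40, 0.50, -0.60]]
candidate = [[0.10 + 1e-6, -0.20, 0.30], [0.40, 0.50 - 2e-6, -0.60]]

# 8e-6 is the hidden-state bound quoted in the README diff.
diffs, ok = check_parity(reference, candidate, tol=8e-6)
print(diffs, ok)
```

With real models, each level would first be flattened to a list of floats (for example from the per-layer hidden states of both checkpoints) before being passed in. A tensor library's own comparison utilities would be the more usual choice in practice.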