- **Layers / heads / width:** 19 encoder layers, 8 attention heads, hidden size 512; intermediate (MLP) size 768; GELU activations.
- **Attention:** Local window 128 with **global attention every 3 layers**; RoPE θ=160k (local & global).
- **Positional strategy:** `position_embedding_type: "sans_pos"`.
- **Dropout:** attention/embedding/MLP dropouts set to 0.0 in the published config.
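
To make the local/global pattern above concrete, here is a minimal sketch of how the per-layer attention masks could be materialized. This is not the model's actual implementation; it assumes "local window 128" means each token attends to positions within ±64 of itself, and that "every 3 layers" means layers 0, 3, 6, … use full (global) attention:

```python
import numpy as np

NUM_LAYERS = 19     # encoder layers, per the config above
LOCAL_WINDOW = 128  # local attention window size

def attention_mask(seq_len: int, layer_idx: int) -> np.ndarray:
    """Boolean mask, True where attention is allowed in this layer.

    Assumption: global layers are those with layer_idx % 3 == 0;
    local layers use a symmetric band of width LOCAL_WINDOW.
    """
    if layer_idx % 3 == 0:
        # Global layer: every token attends to every token.
        return np.ones((seq_len, seq_len), dtype=bool)
    # Local layer: band around the diagonal, half the window on each side.
    idx = np.arange(seq_len)
    return np.abs(idx[:, None] - idx[None, :]) <= LOCAL_WINDOW // 2

global_mask = attention_mask(seq_len=256, layer_idx=0)
local_mask = attention_mask(seq_len=256, layer_idx=1)
```

Under these assumptions, a token in the middle of a local layer sees at most 129 positions (itself plus 64 on each side), while every third layer restores full-sequence mixing.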
## Training data & procedure