Fill-Mask
Transformers
PyTorch
Safetensors
English
modernbert
ecommerce
e-commerce
retail
marketplace
shopping
amazon
ebay
alibaba
google
rakuten
bestbuy
walmart
flipkart
wayfair
shein
target
etsy
shopify
taobao
asos
carrefour
costco
overstock
pretraining
encoder
language-modeling
foundation-model
Instructions to use thebajajra/RexBERT-mini with libraries, inference providers, notebooks, and local apps. Follow these links to get started.
- Libraries
- Transformers
How to use thebajajra/RexBERT-mini with Transformers:

    # Use a pipeline as a high-level helper
    from transformers import pipeline

    pipe = pipeline("fill-mask", model="thebajajra/RexBERT-mini")

    # Load model directly
    from transformers import AutoTokenizer, AutoModelForMaskedLM

    tokenizer = AutoTokenizer.from_pretrained("thebajajra/RexBERT-mini")
    model = AutoModelForMaskedLM.from_pretrained("thebajajra/RexBERT-mini")

- Inference
- Notebooks
- Google Colab
- Kaggle
Update README.md
README.md
CHANGED
@@ -58,7 +58,6 @@ logits = model(**inputs).logits # use top-k on tok.mask_token_id
 - **Layers / heads / width:** 19 encoder layers, 8 attention heads, hidden size 512; intermediate (MLP) size 768; GELU activations.
 - **Attention:** Local window 128 with **global attention every 3 layers**; RoPE θ=160k (local & global).
 - **Positional strategy:** `position_embedding_type: "sans_pos"`.
-- **Dropout:** attention/embedding/MLP dropouts set to 0.0 in the published config.
 
 ## Training data & procedure
 
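The **Attention** bullet in the hunk above says global attention is applied every 3 layers across 19 encoder layers. The sketch below lists which layers that would make global. It is an illustration only: the helper `attention_schedule` is hypothetical, and it assumes the ModernBERT-style convention that layers whose 0-based index is divisible by the interval are global while all others use the local window. The published config may differ.

```python
def attention_schedule(num_layers: int, global_every: int) -> list[str]:
    """Label each encoder layer as 'global' or 'local'.

    Assumption (not stated in the README): a layer is global when its
    0-based index is divisible by `global_every`; otherwise it uses the
    local sliding window.
    """
    return [
        "global" if index % global_every == 0 else "local"
        for index in range(num_layers)
    ]


# Values taken from the README bullets: 19 layers, global attention every 3 layers.
schedule = attention_schedule(num_layers=19, global_every=3)
global_layers = [index for index, kind in enumerate(schedule) if kind == "global"]

print("global layers:", global_layers)
print(len(global_layers), "global /", schedule.count("local"), "local")
```

Under that assumption, 7 of the 19 layers attend globally and the remaining 12 are restricted to the 128-token local window.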