thebajajra committed · Commit 41d8281 · verified · 1 Parent(s): 393d1e8

Update README.md

Files changed (1):
  1. README.md +2 -17
README.md CHANGED
@@ -43,7 +43,7 @@ datasets:
  [![Data](https://img.shields.io/badge/🤗%20Training%20Data-Ecomniverse-yellow)](https://huggingface.co/datasets/thebajajra/Ecom-niverse)
  [![GitHub](https://img.shields.io/badge/GitHub-Code-blue)](https://github.com/bajajra/RexBERT)
 
- > **TL;DR**: An encoder-only transformer (BERT-style) for **e-commerce** applications, trained in three phases—**Pre-training**, **Context Extension**, and **Decay**—to power product search, attribute extraction, classification, and embeddings use cases. The model has been trained on 2.3T+ tokens along with 350B+ e-commerce-specific tokens
+ > **TL;DR**: An encoder-only transformer (ModernBERT-style) for **e-commerce** applications, trained in three phases—**Pre-training**, **Context Extension**, and **Decay**—to power product search, attribute extraction, classification, and embeddings use cases. The model has been trained on 2.3T+ tokens along with 350B+ e-commerce-specific tokens
 
  ---
 
@@ -220,7 +220,7 @@ trainer.train()
 
  ## Model Architecture & Compatibility
 
- - **Architecture:** Encoder-only, BERT-style **base** model.
+ - **Architecture:** Encoder-only, ModernBERT-style **base** model.
  - **Libraries:** Works with **🤗 Transformers**; supports **fill-mask** and **feature-extraction** pipelines.
  - **Context length:** Increased during the **Context Extension** phase—ensure `max_position_embeddings` in `config.json` matches your desired max length.
  - **Files:** `config.json`, tokenizer files, and (optionally) heads for MLM or classification.
@@ -246,19 +246,4 @@ trainer.train()
 
  - **Author/maintainer:** [Rahul Bajaj](https://huggingface.co/thebajajra)
 
- ---
-
- ## Citation
-
- If you use RexBERT-base in your work, please cite it:
-
- ```bibtex
- @software{rexbert_base_2025,
-   title = {RexBERT-base: An e-commerce domain encoder},
-   author = {Bajajra, Rahul Bajaj},
-   year = {2025},
-   url = {https://huggingface.co/thebajajra/RexBERT-base}
- }
- ```
-
  ---
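
The "Context length" bullet in the README advises that `max_position_embeddings` in `config.json` match the context length you intend to use. A minimal sketch of that check, using only the standard library — the `config.json` path and the target length are illustrative assumptions, not part of the commit:

```python
import json

def check_context_length(config_path: str, desired_max: int) -> bool:
    """Return True if the checkpoint's config.json advertises at least
    `desired_max` positions via its `max_position_embeddings` key."""
    with open(config_path) as f:
        cfg = json.load(f)
    # Missing key is treated as 0, i.e. the check fails conservatively.
    return cfg.get("max_position_embeddings", 0) >= desired_max
```

Running such a check before long-context fine-tuning catches a mismatched config early, instead of failing mid-training when inputs exceed the model's position table.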