bilalzafar committed
Commit 2cda2db · verified · 1 Parent(s): d1651dd

Update README.md

Files changed (1)
  1. README.md +2 -2
README.md CHANGED
@@ -15,12 +15,12 @@ license: mit
 
 ---
 
 ## Preprocessing & class imbalance
- Sentences were **lowercased** (no stemming/lemmatization) and tokenized with the base tokenizer from [`bilalzafar/cb-bert-mlm`](https://huggingface.co/bilalzafar/cb-bert-mlm) using **max\_length=320** with truncation and **dynamic padding** via `DataCollatorWithPadding`. To address imbalance, training used **Focal Loss (γ=1.0)** with **class weights** computed from the *train* split (`class_weight="balanced"`) applied in the loss, plus a **WeightedRandomSampler** with √(inverse-frequency) **per-sample weights**.
+ Sentences were **lowercased** (no stemming/lemmatization) and tokenized with the base tokenizer from [`bilalzafar/cb-bert-mlm`](https://huggingface.co/bilalzafar/cb-bert-mlm) using **max\_length=320** with truncation and **dynamic padding** via `DataCollatorWithPadding`. To address imbalance, training used *Focal Loss (γ=1.0)* with **class weights** computed from the *train* split (`class_weight="balanced"`) applied in the loss, plus a *WeightedRandomSampler* with √(inverse-frequency) *per-sample weights*.
 
 ---
 
 ## Training procedure
- Training used **[`bilalzafar/cb-bert-mlm`](https://huggingface.co/bilalzafar/cb-bert-mlm)** as the base, with a 3-label **`AutoModelForSequenceClassification`** head. Optimization was **AdamW** (HF Trainer) with **learning rate 2e-5**, **batch size 16** (train/eval), and up to **8 epochs** with **early stopping (patience=2)**—best epoch \~**6**. A **warmup ratio of 0.06**, **weight decay 0.01**, and **fp16** precision were applied. Runs were seeded (**42**) and executed on **Google Colab (T4)**.
+ Training used **[`bilalzafar/cb-bert-mlm`](https://huggingface.co/bilalzafar/cb-bert-mlm)** as the base, with a 3-label `AutoModelForSequenceClassification` head. Optimization was *AdamW* (HF Trainer) with *learning rate 2e-5*, *batch size 16* (train/eval), and up to *8 epochs* with *early stopping (patience=2)*—best epoch \~*6*. A *warmup ratio of 0.06*, *weight decay 0.01*, and *fp16* precision were applied. Runs were seeded (*42*) and executed on *Google Colab (T4)*.
 
 ---
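
The preprocessing and imbalance handling described in the README hunk above combines several standard pieces. The snippet below is a minimal, illustrative sketch (not the model's actual training code) of how tokenization with `max_length=320` plus `DataCollatorWithPadding`, `class_weight="balanced"` class weights, √(inverse-frequency) sampling weights, and a γ=1.0 focal loss could fit together; the `text` column name, the toy label array, and the `encode`/`focal_loss` helpers are assumptions made for the example.

```python
# Illustrative sketch only (not the repository's training code).
import numpy as np
import torch
import torch.nn.functional as F
from torch.utils.data import WeightedRandomSampler
from sklearn.utils.class_weight import compute_class_weight
from transformers import AutoTokenizer, DataCollatorWithPadding

tokenizer = AutoTokenizer.from_pretrained("bilalzafar/cb-bert-mlm")
collator = DataCollatorWithPadding(tokenizer)  # dynamic padding at batch time

def encode(batch):
    # Lowercase (per the README) and truncate to 320 tokens; padding is left to the collator.
    texts = [t.lower() for t in batch["text"]]  # "text" column name is an assumption
    return tokenizer(texts, truncation=True, max_length=320)

# Toy labels standing in for the real train-split labels (3 classes: 0, 1, 2).
train_labels = np.array([0, 0, 0, 0, 1, 1, 2, 0])

# Class weights as with sklearn's class_weight="balanced", computed on the train split.
class_weights = torch.tensor(
    compute_class_weight("balanced", classes=np.array([0, 1, 2]), y=train_labels),
    dtype=torch.float,
)

# sqrt(inverse-frequency) per-sample weights for the WeightedRandomSampler.
counts = np.bincount(train_labels, minlength=3)
sample_weights = np.sqrt(1.0 / counts[train_labels])
sampler = WeightedRandomSampler(
    weights=torch.as_tensor(sample_weights, dtype=torch.double),
    num_samples=len(sample_weights),
    replacement=True,
)

def focal_loss(logits, labels, gamma=1.0, weight=None):
    # Class-weighted cross-entropy, modulated by (1 - p_true)^gamma.
    ce = F.cross_entropy(logits, labels, weight=weight, reduction="none")
    p_true = torch.softmax(logits, dim=-1).gather(1, labels.unsqueeze(1)).squeeze(1)
    return ((1.0 - p_true) ** gamma * ce).mean()
```

In an actual run these pieces would still need to be wired into the HF `Trainer`, for example by overriding `compute_loss` and `get_train_dataloader` in a subclass; the README does not show that wiring.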
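Likewise, the training-procedure paragraph maps onto a fairly standard `Trainer` configuration. The sketch below is a hedged reconstruction under stated assumptions, not the actual script: the `build_trainer` helper, output directory, and early-stopping metric are invented for illustration, and older `transformers` releases spell the evaluation flag `evaluation_strategy` rather than `eval_strategy`.

```python
# Illustrative sketch only: Trainer setup with the hyperparameters stated in the README.
from transformers import (
    AutoModelForSequenceClassification,
    AutoTokenizer,
    DataCollatorWithPadding,
    EarlyStoppingCallback,
    Trainer,
    TrainingArguments,
)

BASE = "bilalzafar/cb-bert-mlm"

def build_trainer(train_ds, eval_ds):
    # train_ds / eval_ds are assumed to be already-tokenized HF datasets.
    tokenizer = AutoTokenizer.from_pretrained(BASE)
    model = AutoModelForSequenceClassification.from_pretrained(BASE, num_labels=3)
    args = TrainingArguments(
        output_dir="cb-bert-classifier",    # output path is an assumption
        learning_rate=2e-5,
        per_device_train_batch_size=16,
        per_device_eval_batch_size=16,
        num_train_epochs=8,
        warmup_ratio=0.06,
        weight_decay=0.01,
        fp16=True,
        seed=42,
        eval_strategy="epoch",              # "evaluation_strategy" on older transformers
        save_strategy="epoch",
        load_best_model_at_end=True,        # lets early stopping restore the best epoch
        metric_for_best_model="eval_loss",  # metric choice is an assumption
    )
    return Trainer(
        model=model,
        args=args,
        train_dataset=train_ds,
        eval_dataset=eval_ds,
        data_collator=DataCollatorWithPadding(tokenizer),
        callbacks=[EarlyStoppingCallback(early_stopping_patience=2)],
    )

# Usage: trainer = build_trainer(train_ds, eval_ds); trainer.train()
```

With per-epoch evaluation and `load_best_model_at_end=True`, `EarlyStoppingCallback(early_stopping_patience=2)` halts once the metric fails to improve for two consecutive epochs, which is consistent with the best epoch landing around 6 of the 8 allowed.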