---
language:
- en
metrics:
- f1
base_model:
- answerdotai/ModernBERT-base
pipeline_tag: text-classification
---

# ModernBERT Food Product Category Classification Model - Baseline

## Model Details

### Model Description

This model is fine-tuned for multi-class food product-category text classification, based on ModernBERT.

- **Developed by:** [DataScienceWFSR](https://huggingface.co/DataScienceWFSR)
- **Model type:** Text Classification
- **Language(s) (NLP):** English
- **Finetuned from model:** [answerdotai/ModernBERT-base](https://huggingface.co/answerdotai/ModernBERT-base)

### Model Sources

- **Repository:** [https://github.com/WFSRDataScience/SemEval2025Task9](https://github.com/WFSRDataScience/SemEval2025Task9)
- **Paper:** [https://arxiv.org/abs/2504.20703](https://arxiv.org/abs/2504.20703)

## How to Get Started With the Model

Use the code below to get started with the model in PyTorch.

```python
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification
from huggingface_hub import hf_hub_download
import pandas as pd

model_name, category, augmentation = 'modernbert', 'product-category', 'base'

repo_id = f"DataScienceWFSR/{model_name}-food-{category}-{augmentation}"

# Download and load the fitted label encoder used during training
lb_path = hf_hub_download(repo_id=repo_id, filename=f"labelencoder_{category}.pkl")
lb = pd.read_pickle(lb_path)

tokenizer = AutoTokenizer.from_pretrained(repo_id)
model = AutoModelForSequenceClassification.from_pretrained(repo_id)
model.eval()

sample = ('Case Number: 039-94 Date Opened: 10/20/1994 Date Closed: 03/06/1995 Recall Class: 1'
          ' Press Release (Y/N): N Domestic Est. Number: 07188 M Name: PREPARED FOODS Imported '
          'Product (Y/N): N Foreign Estab. Number: N/A City: SANTA TERESA State: NM Country: USA'
          ' Product: HAM, SLICED Problem: BACTERIA Description: LISTERIA '
          'Total Pounds Recalled: 3,920 Pounds Recovered: 3,920')

inputs = tokenizer(sample, return_tensors="pt")
with torch.no_grad():
    outputs = model(**inputs)
predictions = outputs.logits.argmax(dim=-1)
predicted_label = lb.inverse_transform(predictions.numpy())[0]
print(f"The predicted label is: {predicted_label}")
```
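If you also want a confidence score alongside the predicted label, the logits can be passed through a softmax. A minimal sketch with stand-in logits (the tensor values below are illustrative placeholders, not actual model output; in practice use `outputs.logits` from the snippet above):

```python
import torch

# Stand-in logits for a 4-class example (illustrative values only)
logits = torch.tensor([[1.0, 3.0, 0.5, -1.0]])

probs = torch.softmax(logits, dim=-1)    # normalize logits to probabilities
confidence, class_idx = probs.max(dim=-1)

print(class_idx.item())                  # index of the most likely class
print(round(confidence.item(), 3))       # its softmax probability
```

`class_idx` can then be mapped back to a label string with `lb.inverse_transform`, exactly as in the snippet above.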

## Training Details

### Training Data

Training and validation data were provided by the SemEval-2025 Task 9 organizers: the `Food Recall Incidents` dataset (English subset only) [link](https://github.com/food-hazard-detection-semeval-2025/food-hazard-detection-semeval-2025.github.io/tree/main/data)

### Training Procedure

#### Training Hyperparameters

- batch_size: `8`
- epochs: `10`
- lr_scheduler: `cosine`
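The cosine scheduler decays the learning rate from its peak to zero along half a cosine wave over the course of training. A minimal sketch of that decay curve (the peak learning rate and step count below are illustrative placeholders, not the values used for this model):

```python
import math

def cosine_lr(step, total_steps, peak_lr):
    """Cosine decay from peak_lr down to 0 over total_steps (no warmup)."""
    return 0.5 * peak_lr * (1 + math.cos(math.pi * step / total_steps))

total_steps = 1000   # placeholder: total number of optimizer steps
peak_lr = 2e-5       # placeholder: peak learning rate

print(cosine_lr(0, total_steps, peak_lr))             # starts at peak_lr
print(cosine_lr(total_steps // 2, total_steps, peak_lr))  # about peak_lr / 2 halfway
print(cosine_lr(total_steps, total_steps, peak_lr))   # reaches 0 at the end
```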

## Evaluation

<!-- This section describes the evaluation protocols and provides the results. -->

### Testing Data & Metrics

#### Testing Data

Test data: 997 samples ([link](https://github.com/food-hazard-detection-semeval-2025/food-hazard-detection-semeval-2025.github.io/blob/main/data/incidents_test.csv))

#### Metrics

F<sub>1</sub>-macro
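Macro-F<sub>1</sub> averages the per-class F<sub>1</sub> scores with equal weight per class, so rare categories count as much as frequent ones. A small illustration using scikit-learn (the label values below are made up for the example):

```python
from sklearn.metrics import f1_score

# Made-up gold and predicted class labels for illustration
y_true = [0, 0, 1, 1, 2]
y_pred = [0, 1, 1, 1, 2]

macro = f1_score(y_true, y_pred, average="macro")  # mean of per-class F1s: (2/3 + 4/5 + 1) / 3
micro = f1_score(y_true, y_pred, average="micro")  # global F1 over all decisions

print(round(macro, 3))
print(round(micro, 3))
```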

### Results

F<sub>1</sub>-macro scores on the official test set for each model, using the `text` field, per category and per subtask (ST1 and ST2), rounded to 3 decimals. This model's results are shown in bold.

| Model | hazard-category | product-category | hazard | product | ST1 | ST2 |
|----------------------|----------------:|-----------------:|-------:|--------:|------:|------:|
| BERT<sub>base</sub> | 0.747 | 0.757 | 0.581 | 0.170 | 0.753 | 0.382 |
| BERT<sub>CW</sub> | 0.760 | 0.761 | 0.671 | 0.280 | 0.762 | 0.491 |
| BERT<sub>SR</sub> | 0.770 | 0.754 | 0.666 | 0.275 | 0.764 | 0.478 |
| BERT<sub>RW</sub> | 0.752 | 0.757 | 0.651 | 0.275 | 0.756 | 0.467 |
| DistilBERT<sub>base</sub> | 0.761 | 0.757 | 0.593 | 0.154 | 0.760 | 0.378 |
| DistilBERT<sub>CW</sub> | 0.766 | 0.753 | 0.635 | 0.246 | 0.763 | 0.449 |
| DistilBERT<sub>SR</sub> | 0.756 | 0.759 | 0.644 | 0.240 | 0.763 | 0.448 |
| DistilBERT<sub>RW</sub> | 0.749 | 0.747 | 0.647 | 0.261 | 0.753 | 0.462 |
| RoBERTa<sub>base</sub> | 0.760 | 0.753 | 0.579 | 0.123 | 0.755 | 0.356 |
| RoBERTa<sub>CW</sub> | 0.773 | 0.739 | 0.630 | 0.000 | 0.760 | 0.315 |
| RoBERTa<sub>SR</sub> | 0.777 | 0.755 | 0.637 | 0.000 | 0.767 | 0.319 |
| RoBERTa<sub>RW</sub> | 0.757 | 0.611 | 0.615 | 0.000 | 0.686 | 0.308 |
| **ModernBERT<sub>base</sub>** | **0.781** | **0.745** | **0.667** | **0.275** | **0.769** | **0.485** |
| ModernBERT<sub>CW</sub> | 0.761 | 0.712 | 0.609 | 0.252 | 0.741 | 0.441 |
| ModernBERT<sub>SR</sub> | 0.790 | 0.728 | 0.591 | 0.253 | 0.761 | 0.434 |
| ModernBERT<sub>RW</sub> | 0.761 | 0.751 | 0.629 | 0.237 | 0.759 | 0.440 |

## Technical Specifications

### Compute Infrastructure

#### Hardware

NVIDIA A100 80GB and NVIDIA GeForce RTX 3070 Ti

#### Software

| Library | Version | URL |
|-------------------|--------:|---------------------------------------------------------------------|
| Transformers | 4.49.0 | https://huggingface.co/docs/transformers/index |
| PyTorch | 2.6.0 | https://pytorch.org/ |
| SpaCy | 3.8.4 | https://spacy.io/ |
| Scikit-learn | 1.6.0 | https://scikit-learn.org/stable/ |
| Pandas | 2.2.3 | https://pandas.pydata.org/ |
| Optuna | 4.2.1 | https://optuna.org/ |
| NumPy | 2.0.2 | https://numpy.org/ |
| NLP AUG | 1.1.11 | https://nlpaug.readthedocs.io/en/latest/index.html |
| BeautifulSoup4 | 4.12.3 | https://www.crummy.com/software/BeautifulSoup/bs4/doc/# |

## Citation

**BibTeX:**

For the original paper:
```bibtex
@inproceedings{brightcookies-semeval2025-task9,
    title = "BrightCookies at {S}em{E}val-2025 Task 9: Exploring Data Augmentation for Food Hazard Classification",
    author = "Papadopoulou, Foteini and Mutlu, Osman and Özen, Neris and van der Velden, Bas H. M. and Hendrickx, Iris and Hürriyetoğlu, Ali",
    booktitle = "Proceedings of the 19th International Workshop on Semantic Evaluation (SemEval-2025)",
    month = jul,
    year = "2025",
    address = "Vienna, Austria",
    publisher = "Association for Computational Linguistics",
}
```

For SemEval-2025 Task 9:
```bibtex
@inproceedings{semeval2025-task9,
    title = "{S}em{E}val-2025 Task 9: The Food Hazard Detection Challenge",
    author = "Randl, Korbinian and Pavlopoulos, John and Henriksson, Aron and Lindgren, Tony and Bakagianni, Juli",
    booktitle = "Proceedings of the 19th International Workshop on Semantic Evaluation (SemEval-2025)",
    month = jul,
    year = "2025",
    address = "Vienna, Austria",
    publisher = "Association for Computational Linguistics",
}
```

## Model Card Authors and Contact

Authors: Foteini Papadopoulou, Osman Mutlu, Neris Özen, Bas H.M. van der Velden, Iris Hendrickx, Ali Hürriyetoğlu

Contact: ali.hurriyetoglu@wur.nl