---
language: en
tags:
  - test-smells
  - roberta
  - multi-label-classification
  - software-quality
  - qa
  - transformers
license: mit
datasets:
  - custom
library_name: transformers
pipeline_tag: text-classification
widget:
  - text: "Verify that the system shows a confirmation message and then restart the service."
---

# 🧪 RoBERTa-Based Test Smell Classifier (Multi-Label)

This is a fine-tuned [`roberta-base`](https://huggingface.co/roberta-base) model for detecting test smells in natural-language test cases. It was trained as a **multi-label classifier** on structured manual test cases whose labels were generated automatically with LLaMA 3.1 Nemotron.

---

## 🔍 What It Detects

The model classifies a test case into one or more of the following **test smell categories**:

- **Ambiguous Test**: Vague or unclear instructions.
- **Conditional Test**: Outcome depends on runtime state or configuration.
- **Eager Action**: Multiple logically distinct actions in one step.
- **Redundant Step**: Duplicate or unnecessary steps.
- **Mixed Concerns**: Mixes setup, execution, and cleanup in a single step.
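
Because a test case can carry several smells at once, a training target is naturally a multi-hot vector rather than a single class index. The sketch below illustrates that encoding for the categories listed above. The label order and the `to_multi_hot` helper are assumptions made for illustration; the model's real mapping lives in its `config.id2label` and may differ.

```python
# Hypothetical label order; check the model's config.id2label for the real one.
LABELS = [
    "Ambiguous Test",
    "Conditional Test",
    "Eager Action",
    "Redundant Step",
    "Mixed Concerns",
]


def to_multi_hot(smells):
    """Encode a collection of smell names as a 0/1 vector aligned with LABELS."""
    unknown = set(smells) - set(LABELS)
    if unknown:
        raise ValueError(f"Unknown smell(s): {sorted(unknown)}")
    return [1.0 if label in smells else 0.0 for label in LABELS]


# Illustrative only (not a model prediction): a step tagged with two smells.
target = to_multi_hot({"Eager Action", "Mixed Concerns"})
print(target)  # [0.0, 0.0, 1.0, 0.0, 1.0]
```

An empty collection encodes to an all-zero vector, which is how a smell-free test case would be represented under this scheme.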

---

## 📊 Training and Evaluation

- Base Model: [`roberta-base`](https://huggingface.co/roberta-base)
- Format: Multi-label classification using sigmoid activation
- Dataset: ~1,600 real-world test cases from Ubuntu OS, XWiki, and OpenOffice (structured into plain text)
- Labeling: Zero-shot labeling via LLaMA 3.1 Nemotron
- Metrics on held-out test set:
  - **Micro-F1**: 0.7475
  - **Macro-F1**: 0.7079
  - **Precision**: 0.753
  - **Recall**: 0.800
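
To make the sigmoid-based output format and the two F1 variants concrete, here is a minimal, dependency-free sketch. It assumes a decision threshold of 0.5 and a hypothetical label order; the logits and the toy label matrices are made up and are not taken from the model or its test set. The card does not state which threshold was used, so treat `threshold` as a tunable assumption.

```python
import math

# Hypothetical label order; the real one is defined by the model's config.
LABELS = ["Ambiguous Test", "Conditional Test", "Eager Action",
          "Redundant Step", "Mixed Concerns"]


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def decode(logits, threshold=0.5):
    """Each label is scored independently, so any number of labels can fire."""
    return [label for label, z in zip(LABELS, logits) if sigmoid(z) >= threshold]


def f1(tp, fp, fn):
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def micro_macro_f1(y_true, y_pred):
    """y_true and y_pred are lists of 0/1 rows with one column per label."""
    n_labels = len(y_true[0])
    counts = []
    for j in range(n_labels):
        pairs = [(t[j], p[j]) for t, p in zip(y_true, y_pred)]
        tp = sum(1 for t, p in pairs if t == 1 and p == 1)
        fp = sum(1 for t, p in pairs if t == 0 and p == 1)
        fn = sum(1 for t, p in pairs if t == 1 and p == 0)
        counts.append((tp, fp, fn))
    # Micro: pool the counts over all labels, then compute a single F1.
    micro = f1(*(sum(c[i] for c in counts) for i in range(3)))
    # Macro: compute F1 per label, then take the unweighted mean.
    macro = sum(f1(*c) for c in counts) / n_labels
    return micro, macro


# Made-up logits for one test case.
print(decode([2.0, -1.0, 0.3, -3.0, -0.2]))  # ['Ambiguous Test', 'Eager Action']

# Toy 3-sample, 3-label example (columns are labels).
y_true = [[1, 0, 1], [0, 1, 0], [1, 1, 0]]
y_pred = [[1, 0, 0], [0, 1, 0], [1, 0, 1]]
micro, macro = micro_macro_f1(y_true, y_pred)
print(round(micro, 3), round(macro, 3))  # 0.667 0.556
```

In the toy data, the third label is never predicted correctly, which pulls the macro average down further than the micro average. A gap between the two scores is often a sign that per-label performance is uneven.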


## 📦 How to Use

```python
# Use a pipeline as a high-level helper
from transformers import pipeline

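# top_k=None returns a score for every label instead of only the top one;
# function_to_apply="sigmoid" scores each label independently (multi-label).
pipe = pipeline(
    "text-classification",
    model="lukassokcevic/test-smell-classifier",
    top_k=None,
    function_to_apply="sigmoid",
)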