---
language: en
tags:
- test-smells
- roberta
- multi-label-classification
- software-quality
- qa
- transformers
license: mit
datasets:
- custom
library_name: transformers
pipeline_tag: text-classification
widget:
- text: "Verify that the system shows a confirmation message and then restart the service."
---

# 🧪 RoBERTa-Based Test Smell Classifier (Multi-Label)

This is a fine-tuned [`roberta-base`](https://huggingface.co/roberta-base) model for detecting test smells in natural-language test cases. It was trained as a **multi-label classifier** on structured manual test cases that were labeled automatically with LLaMA 3.1 Nemotron.

---

## What It Detects

The model assigns a test case one or more of the following **test smell categories**:

- **Ambiguous Test**: Vague or unclear instructions.
- **Conditional Test**: Outcome depends on runtime state or configuration.
- **Eager Action**: Multiple logically distinct actions in one step.
- **Redundant Step**: Duplicate or unnecessary steps.
- **Mixed Concerns**: Mixes setup, execution, and cleanup in a single step.
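
Because the classifier is multi-label, each category receives its own independent score, so a test case can carry several smells at once or none. The sketch below illustrates that decoding step in plain Python. The label order, the 0.5 threshold, and the example logits are assumptions made for illustration and are not taken from the model; only the category names come from the list above.

```python
import math

# Category names from the list above. Their order in the real model's
# output head is an assumption made for this sketch.
LABELS = [
    "Ambiguous Test",
    "Conditional Test",
    "Eager Action",
    "Redundant Step",
    "Mixed Concerns",
]


def sigmoid(x: float) -> float:
    """Map a raw logit to an independent score in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def decode_labels(logits, threshold=0.5):
    """Return every label whose sigmoid score reaches the threshold."""
    if len(logits) != len(LABELS):
        raise ValueError("expected one logit per label")
    return [name for name, z in zip(LABELS, logits) if sigmoid(z) >= threshold]


# Hypothetical logits for a single test case, not real model output.
example_logits = [2.0, -1.5, 0.3, -3.0, -0.2]
predicted = decode_labels(example_logits)
print(predicted)
```

Raising the threshold trades recall for precision, so the value is worth tuning on validation data rather than fixing at 0.5.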

---

## Training and Evaluation

- Base Model: [`roberta-base`](https://huggingface.co/roberta-base)
- Format: Multi-label classification using sigmoid activation
- Dataset: ~1,600 real-world test cases from Ubuntu OS, XWiki, and OpenOffice (structured into plain text)
- Labeling: Zero-shot labeling via LLaMA 3.1 Nemotron
- Metrics on the held-out test set:
  - **Micro-F1**: 0.7475
  - **Macro-F1**: 0.7079
  - **Precision**: 0.753
  - **Recall**: 0.800
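
Micro-F1 and Macro-F1 differ in how they aggregate over labels: micro pools true positives, false positives, and false negatives across all labels before computing F1, while macro computes F1 per label and averages the results, so rare labels weigh as much as common ones. The following is a minimal sketch of both on a made-up toy set. It is not the evaluation script behind the numbers above, and the card does not state which averaging the precision and recall figures use.

```python
def f1_from_counts(tp: int, fp: int, fn: int) -> float:
    """F1 = 2*TP / (2*TP + FP + FN); defined as 0.0 when there is nothing to score."""
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def micro_macro_f1(y_true, y_pred):
    """Compute (micro_f1, macro_f1) for multi-hot label matrices (lists of 0/1 rows)."""
    n_labels = len(y_true[0])
    per_label = []
    for j in range(n_labels):
        tp = sum(1 for t, p in zip(y_true, y_pred) if t[j] == 1 and p[j] == 1)
        fp = sum(1 for t, p in zip(y_true, y_pred) if t[j] == 0 and p[j] == 1)
        fn = sum(1 for t, p in zip(y_true, y_pred) if t[j] == 1 and p[j] == 0)
        per_label.append((tp, fp, fn))

    # Macro: average the per-label F1 scores.
    macro = sum(f1_from_counts(*c) for c in per_label) / n_labels

    # Micro: pool the counts over all labels, then compute a single F1.
    total_tp = sum(c[0] for c in per_label)
    total_fp = sum(c[1] for c in per_label)
    total_fn = sum(c[2] for c in per_label)
    micro = f1_from_counts(total_tp, total_fp, total_fn)
    return micro, macro


# Toy data: 4 test cases, 3 hypothetical labels (not from the real dataset).
y_true = [[1, 0, 1], [1, 1, 0], [0, 0, 1], [1, 0, 0]]
y_pred = [[1, 0, 0], [1, 0, 0], [0, 0, 1], [1, 1, 0]]

micro, macro = micro_macro_f1(y_true, y_pred)
print(f"micro-F1={micro:.4f} macro-F1={macro:.4f}")
```

In this toy set the second label is never predicted correctly, which pulls the macro score well below the micro score. A gap in the same direction, as in the metrics above, usually indicates that some labels are harder than others.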

## 📦 How to Use

```python
# Use a pipeline as a high-level helper
from transformers import pipeline

pipe = pipeline("text-classification", model="lukassokcevic/test-smell-classifier")
