---
language: en
tags:
- test-smells
- roberta
- multi-label-classification
- software-quality
- qa
- transformers
license: mit
datasets:
- custom
library_name: transformers
pipeline_tag: text-classification
widget:
- text: "Verify that the system shows a confirmation message and then restart the service."
---

# 🧪 RoBERTa-Based Test Smell Classifier (Multi-Label)

This is a fine-tuned [`roberta-base`](https://huggingface.co/roberta-base) model for detecting test smells in natural language test cases. It was trained as a **multi-label classifier** on structured manual test cases, labeled automatically using LLaMA 3.1 Nemotron.

---

## 🔍 What It Detects

The model classifies a test case into one or more of the following **6 test smell categories**:

- **Ambiguous Test**: Vague or unclear instructions.
- **Conditional Test**: Outcome depends on runtime state or configuration.
- **Eager Action**: Multiple logically distinct actions in one step.
- **Redundant Step**: Duplicate or unnecessary steps.
- **Mixed Concerns**: Mixes setup, execution, and cleanup in a single step.

---

## 📊 Training and Evaluation

- Base Model: [`roberta-base`](https://huggingface.co/roberta-base)
- Format: Multi-label classification using sigmoid activation
- Dataset: ~1,600 real-world test cases from Ubuntu OS, XWiki, and OpenOffice (structured into plain text)
- Labeling: Zero-shot labeling via LLaMA 3.1 Nemotron
- Metrics on held-out test set:
  - **Micro-F1**: 0.7475
  - **Macro-F1**: 0.7079
  - **Precision**: 0.753
  - **Recall**: 0.800

## 📦 How to Use

```python
# Use a pipeline as a high-level helper
from transformers import pipeline

pipe = pipeline("text-classification", model="lukassokcevic/test-smell-classifier")
```
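
Because the model is multi-label with sigmoid outputs, each smell is scored independently and a single test case can trigger several at once. The sketch below shows one way to turn per-label scores into a list of detected smells. The helper names `select_smells` and `classify` and the `0.5` threshold are illustrative assumptions, not part of the model card, and the label strings the model actually returns depend on its configuration.

```python
def select_smells(scores, threshold=0.5):
    """Keep every label whose independent sigmoid score reaches the threshold.

    `scores` is a list of {"label": str, "score": float} dicts, the shape the
    text-classification pipeline returns when all labels are requested.
    The 0.5 default is an assumed cut-off, not a value from the model card.
    """
    kept = [(s["label"], s["score"]) for s in scores if s["score"] >= threshold]
    # Highest-scoring smells first.
    return sorted(kept, key=lambda pair: pair[1], reverse=True)


def classify(text, threshold=0.5):
    """Score one test case with the published model (downloads weights on first use)."""
    # Imported lazily so select_smells stays dependency-free.
    from transformers import pipeline

    pipe = pipeline("text-classification", model="lukassokcevic/test-smell-classifier")
    # top_k=None asks for a score for every label instead of only the best one;
    # passing a list returns one list of label scores per input text.
    per_label = pipe([text], top_k=None, function_to_apply="sigmoid")[0]
    return select_smells(per_label, threshold)
```

Calling `classify("Verify that the system shows a confirmation message and then restart the service.")` would return `(label, score)` pairs for every smell at or above the threshold. Lowering the threshold trades precision for recall, so it is worth tuning on your own test cases.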