---
language: en
license: apache-2.0
library_name: transformers
tags:
- text-classification
- github
- multi-label
- issue-classification
datasets:
- anasnadeem/github-issues
pipeline_tag: text-classification
---

## Model Card for the Automatic Issue Classifier (AIC)

This model card describes a RoBERTa-based model fine-tuned to assign GitHub issue reports one or more of three labels: **Bug**, **Enhancement**, and **Question**.

### Model Details

The Automatic Issue Classifier (AIC) is a transformer-based model designed for the multi-label classification of GitHub issue reports. It addresses the challenge that issue submitters can tag a single issue with more than one label.
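
In a multi-label setting each label is scored independently, so more than one label can be selected for a single issue. The following is a minimal sketch of that selection step, not the authors' implementation: the helper `select_labels`, the `0.5` threshold, and the score values are all assumptions made for illustration, since this card does not state a decision threshold.

```python
from typing import List

# Hypothetical cutoff; the model card does not specify a decision threshold.
THRESHOLD = 0.5


def select_labels(scores: List[dict], threshold: float = THRESHOLD) -> List[str]:
    """Return every label whose independent score meets the threshold."""
    return [entry["label"] for entry in scores if entry["score"] >= threshold]


# Placeholder per-label scores for one issue; the values are invented
# for illustration and are not measured model output.
example_scores = [
    {"label": "bug", "score": 0.80},
    {"label": "enhancement", "score": 0.75},
    {"label": "question", "score": 0.05},
]

selected = select_labels(example_scores)
print(selected)  # ['bug', 'enhancement']
```

Because the scores are not forced to sum to one, raising or lowering the threshold trades precision against recall for each label separately; a suitable value would need to be tuned on validation data.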

The model is a fine-tuned version of `roberta-base`, which is an optimized variant of BERT. It was developed to overcome the limitations of traditional keyword-based approaches, which often fail to capture the contextual relationships between words in issue reports.

The model and an associated industry tool were developed to automatically assign labels to newly reported issues, helping to streamline software maintenance workflows.

***

### Intended Use

This model is intended for automatically labeling new issue reports in GitHub repositories. Its primary benefits include:

* Helping development teams track and prioritize issues effectively.
* Routing issues to the correct developer or team member.
* Assisting project managers in optimizing resource allocation.

```python
from transformers import pipeline

# Load the pipeline; top_k=None returns a score for every label
# (it replaces the deprecated return_all_scores=True)
classifier = pipeline("text-classification", model="vinaykarman/robertabase", top_k=None)

# Example issue text (concatenated title and body)
issue_text = """
Title: USBhost: additional functions for keyboard appreciated
Body: for USBhost it would be fine to have additional functions to read the USB keyboard: kbhit() getch() getche() getchar() gets() scanf()
"""

# Get predictions
predictions = classifier(issue_text)
print(predictions)

# The output is a list of scores, one per label; the values below are illustrative:
# [[{'label': 'bug', 'score': 0.8}, {'label': 'enhancement', 'score': 0.75}, {'label': 'question', 'score': 0.05}]]