Update README.md
README.md
CHANGED
|
@@ -1,3 +1,53 @@
|
|
| 1 |
-
---
|
| 2 |
-
|
| 3 |
-
|
| 1 |
+
---
|
| 2 |
+
language: en
|
| 3 |
+
license: apache-2.0
|
| 4 |
+
library_name: transformers
|
| 5 |
+
tags:
|
| 6 |
+
- text-classification
|
| 7 |
+
- github
|
| 8 |
+
- multi-label
|
| 9 |
+
- issue-classification
|
| 10 |
+
datasets:
|
| 11 |
+
- anasnadeem/github-issues
|
| 12 |
+
pipeline_tag: text-classification
|
| 13 |
+
---
|
| 14 |
+
|
| 15 |
+
## Model Card for the Automatic Issue Classifier (AIC)
|
| 16 |
+
|
| 17 |
+
This model card describes a RoBERTa-based model fine-tuned to classify GitHub issue reports into three categories: **Bug**, **Enhancement**, and **Question**[cite: 13].
|
| 18 |
+
|
| 19 |
+
### Model Details
|
| 20 |
+
|
| 21 |
+
The Automatic Issue Classifier (AIC) is a transformer-based model designed for the multi-label classification of GitHub issue reports[cite: 17]. It addresses the challenge that issue submitters can tag a single issue with more than one label[cite: 16].
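
To make the multi-label setup concrete, here is a minimal sketch of how per-label scores can be turned into a set of labels. It assumes one logit per label, an independent sigmoid per label, and a 0.5 decision threshold; the threshold, the helper name `decode_multi_label`, and the example logits are illustrative assumptions, not values taken from the model.

```python
import math

# Label names as listed in this model card.
LABELS = ["bug", "enhancement", "question"]


def sigmoid(x: float) -> float:
    """Map a raw logit to an independent probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def decode_multi_label(logits, threshold=0.5):
    """Keep every label whose sigmoid probability meets the threshold.

    Unlike softmax, the probabilities are not forced to sum to 1,
    so several labels (or none) can be selected for one issue.
    The 0.5 threshold is an assumption made for this sketch.
    """
    probs = {label: sigmoid(z) for label, z in zip(LABELS, logits)}
    selected = [label for label, p in probs.items() if p >= threshold]
    return selected, probs


# Hypothetical logits for a single issue (not real model output).
logits = [1.4, 1.1, -2.9]
selected, probs = decode_multi_label(logits)
print(selected)  # ['bug', 'enhancement']
```

Because each label is scored independently, an issue can receive two labels at once, which a single softmax over the three classes would not allow.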
|
| 22 |
+
|
| 23 |
+
The model is a fine-tuned version of `roberta-base` [cite: 18, 195], which is an optimized variant of BERT[cite: 125]. It was developed to overcome the limitations of traditional keyword-based approaches, which often fail to capture the contextual relationships between words in issue reports[cite: 15].
|
| 24 |
+
|
| 25 |
+
The model and an associated industry tool were developed to automatically assign labels to newly reported issues, helping to streamline software maintenance workflows[cite: 21].
|
| 26 |
+
|
| 27 |
+
***
|
| 28 |
+
### Intended Use
|
| 29 |
+
|
| 30 |
+
This model is intended for automatically labeling new issue reports in GitHub repositories[cite: 21]. Its primary benefits include:
|
| 31 |
+
* Helping development teams to effectively track and prioritize issues[cite: 36].
|
| 32 |
+
* Routing issues to the correct developer or team member[cite: 70, 264].
|
| 33 |
+
* Assisting project managers in optimizing resource allocation[cite: 71].
|
| 34 |
+
|
| 35 |
+
```python
|
| 36 |
+
from transformers import pipeline
|
| 37 |
+
|
| 38 |
+
# Load the pipeline
|
| 39 |
+
# Replace 'YourUsername/aic-roberta-base' with the actual model repo on the Hub
|
| 40 |
+
classifier = pipeline("text-classification", model="YourUsername/aic-roberta-base", top_k=None, function_to_apply="sigmoid")
|
| 41 |
+
|
| 42 |
+
# Example issue text (concatenated title and body)
|
| 43 |
+
issue_text = """
|
| 44 |
+
Title: USBhost: additional functions for keyboard appreciated
|
| 45 |
+
Body: for USBhost it would be fine to have additional functions to read the USB keyboard: kbhit() getch() getche() getchar() gets() scanf()
|
| 46 |
+
"""
|
| 47 |
+
|
| 48 |
+
# Get predictions
|
| 49 |
+
predictions = classifier(issue_text)
|
| 50 |
+
print(predictions)
|
| 51 |
+
|
| 52 |
+
# Illustrative output: a list with one score per label (values shown are examples, not measured results):
|
| 53 |
+
# [[{'label': 'bug', 'score': 0.8}, {'label': 'enhancement', 'score': 0.75}, {'label': 'question', 'score': 0.05}]]
|