---
tags:
- multi-label-classification
- software-defect-prediction
metrics:
- hamming-loss
- f1
- precision-at-k
---

# Multi-Label Software Defect Prediction Models

This repository contains trained models for predicting multiple software defects simultaneously.

## Models Included

- **SVM**: Support Vector Machine with One-vs-Rest strategy
- **Logistic Regression**: Multi-label logistic regression with One-vs-Rest strategy
- **Perceptron**: Online-learning perceptron for multi-label classification
- **DNN**: Deep Neural Network with sigmoid output layer
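
A sigmoid output layer scores each defect label independently, so any number of labels can be active for one input. A minimal sketch of that decoding step follows, assuming a hypothetical 0.5 decision threshold and made-up logits (neither comes from this repository):

```python
import math


def sigmoid(z):
    """Map a raw score (logit) to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def decode_labels(logits, threshold=0.5):
    """Turn per-label logits into a binary indicator vector.

    threshold=0.5 is an assumed default, not a value from this repository.
    """
    return [1 if sigmoid(z) >= threshold else 0 for z in logits]


# Hypothetical logits for three defect labels
logits = [2.0, -1.0, 0.3]
labels = decode_labels(logits)
print(labels)
```

Because each label is thresholded on its own, the threshold could also be tuned per label if some defect types are rarer than others.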

## Features

The models take software metrics as input features, including:

- Code complexity metrics
- Size metrics
- Coupling metrics
- Cohesion metrics

## Performance

| Model | Hamming Loss | Micro-F1 | Macro-F1 |
|-------|--------------|----------|----------|
| SVM | 0.15 | 0.78 | 0.72 |
| Logistic Regression | 0.17 | 0.75 | 0.69 |
| Perceptron | 0.20 | 0.70 | 0.64 |
| DNN | 0.13 | 0.82 | 0.76 |
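
For reference, the three metrics in the table can be computed from binary indicator matrices roughly as sketched below. The tiny `y_true`/`y_pred` arrays are made-up illustration data, not the evaluation set behind the reported numbers, and the helper names are hypothetical.

```python
def hamming_loss(y_true, y_pred):
    """Fraction of individual label slots predicted incorrectly (lower is better)."""
    total = sum(len(row) for row in y_true)
    wrong = sum(t != p for rt, rp in zip(y_true, y_pred) for t, p in zip(rt, rp))
    return wrong / total


def _counts(y_true, y_pred, label):
    """True positives, false positives, false negatives for one label column."""
    tp = sum(1 for rt, rp in zip(y_true, y_pred) if rt[label] == 1 and rp[label] == 1)
    fp = sum(1 for rt, rp in zip(y_true, y_pred) if rt[label] == 0 and rp[label] == 1)
    fn = sum(1 for rt, rp in zip(y_true, y_pred) if rt[label] == 1 and rp[label] == 0)
    return tp, fp, fn


def _f1(tp, fp, fn):
    """F1 from raw counts; defined as 0.0 when there are no positives at all."""
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def micro_f1(y_true, y_pred):
    """Pool counts over all labels, then compute a single F1."""
    tp = fp = fn = 0
    for j in range(len(y_true[0])):
        a, b, c = _counts(y_true, y_pred, j)
        tp, fp, fn = tp + a, fp + b, fn + c
    return _f1(tp, fp, fn)


def macro_f1(y_true, y_pred):
    """Compute F1 per label, then take the unweighted mean."""
    n_labels = len(y_true[0])
    return sum(_f1(*_counts(y_true, y_pred, j)) for j in range(n_labels)) / n_labels


# Made-up data: 4 samples, 3 defect labels
y_true = [[1, 0, 1], [0, 1, 0], [1, 1, 0], [0, 0, 1]]
y_pred = [[1, 0, 0], [0, 1, 0], [1, 0, 0], [0, 0, 1]]

print(hamming_loss(y_true, y_pred), micro_f1(y_true, y_pred), macro_f1(y_true, y_pred))
```

Macro-F1 weights every label equally, so it is more sensitive than Micro-F1 to labels that are predicted poorly.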

## Usage

```python
import pickle

import numpy as np

# Load the trained model and the feature scaler.
# Only unpickle files from a source you trust: pickle can execute arbitrary code.
with open('defect_svm_model.pkl', 'rb') as f:
    model = pickle.load(f)
with open('defect_scaler.pkl', 'rb') as f:
    scaler = pickle.load(f)

# Prepare features as a single row, in the same column order used for training
features = np.array([...])  # Your feature vector
features_scaled = scaler.transform(features.reshape(1, -1))

# Predict: one binary indicator per defect label
predictions = model.predict(features_scaled)
# Per-label probabilities (only if the loaded model supports predict_proba)
probabilities = model.predict_proba(features_scaled)
```
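
Not every classifier exposes `predict_proba`. In scikit-learn, for instance, `LinearSVC` does not, and `SVC` only does when trained with `probability=True`. Whether that applies to `defect_svm_model.pkl` is not stated here, so the sketch below hedges by falling back to `decision_function`. The helper name `score_labels` and the stand-in class are hypothetical.

```python
def score_labels(model, X):
    """Return (kind, scores) using the best scoring method the model offers.

    kind is "probability" when predict_proba exists, otherwise "margin"
    (signed distances from decision_function, which are NOT probabilities).
    """
    if hasattr(model, "predict_proba"):
        return "probability", model.predict_proba(X)
    if hasattr(model, "decision_function"):
        return "margin", model.decision_function(X)
    raise AttributeError("model exposes neither predict_proba nor decision_function")


class MarginOnlyModel:
    """Stand-in for a margin-based classifier, for illustration only."""

    def decision_function(self, X):
        # One made-up margin per label: the row sum and its negation
        return [[sum(row), -sum(row)] for row in X]


kind, scores = score_labels(MarginOnlyModel(), [[0.5, 1.5]])
print(kind, scores)
```

With a real model, `score_labels(model, features_scaled)` would be called in place of the direct `predict_proba` call.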

## Training Details

- **Online Learning**: The Perceptron is trained online, with a weight update after each sample
- **Multi-label Strategy**: One-vs-Rest for SVM and Logistic Regression
- **Hyperparameter Tuning**: Grid search with cross-validation
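
The per-sample update mentioned for the Perceptron can be illustrated with one binary perceptron per label, each corrected immediately after a sample is seen. This is a from-scratch sketch under assumed settings (learning rate 1.0, zero-initialised weights, toy data); it is not the training code used for the released models.

```python
def predict(weights, bias, x):
    """Binary indicator vector: one thresholded linear score per label."""
    return [
        1 if sum(w_i * x_i for w_i, x_i in zip(w, x)) + b > 0 else 0
        for w, b in zip(weights, bias)
    ]


def train_online(samples, n_features, n_labels, epochs=10, lr=1.0):
    """Train one perceptron per label, updating weights after every sample."""
    weights = [[0.0] * n_features for _ in range(n_labels)]
    bias = [0.0] * n_labels
    for _ in range(epochs):
        for x, y in samples:
            y_hat = predict(weights, bias, x)
            for j in range(n_labels):
                error = y[j] - y_hat[j]  # -1, 0, or +1
                if error:
                    # Perceptron rule: shift the boundary toward the missed sample
                    for i in range(n_features):
                        weights[j][i] += lr * error * x[i]
                    bias[j] += lr * error
    return weights, bias


# Toy data: label 0 follows the first feature, label 1 follows the second
samples = [
    ([1.0, 0.0], [1, 0]),
    ([0.0, 1.0], [0, 1]),
    ([1.0, 1.0], [1, 1]),
    ([0.0, 0.0], [0, 0]),
]
weights, bias = train_online(samples, n_features=2, n_labels=2)
print([predict(weights, bias, x) for x, _ in samples])
```

Each label's weights only change when that label is mispredicted, which is what makes the scheme suitable for streaming data.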