Commit bba28e5 · Parent(s): 36092d6

add comprehensive project documentation including milestone summaries, a user guide, and design choices, and update the main README.

Files changed:

- README.md +130 -630
- docs/README.md +19 -7
- docs/design_choices.md +487 -0
- docs/docs/getting-started.md +0 -6
- docs/docs/index.md +0 -10
- docs/milestone_summaries.md +288 -0
- docs/user_guide.md +497 -0

## README.md (changed)
@@ -8,685 +8,185 @@ app_port: 7860

api_docs_url: /docs
---

The task involves analyzing the relationship between issue characteristics and required skills, developing effective feature-extraction methods that combine textual and code-context information, and implementing sophisticated multi-label classification approaches. Students may incorporate additional GitHub metadata to enhance model inputs, but must avoid using third-party classification engines or direct outputs from the provided database. The work requires careful attention to the multi-label nature of the problem, where each issue may require multiple different skills for resolution.
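To illustrate the multi-label setup (this is not the project's code): each issue maps to a binary vector over the skill labels, and a common baseline fits one classifier per label. A minimal scikit-learn sketch on toy data:

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.multioutput import MultiOutputClassifier

# Toy data: 6 issues x 4 numeric features, 3 skill labels per issue (binary indicators).
X = np.array([[1, 0, 2, 0], [0, 1, 0, 3], [2, 1, 0, 0],
              [0, 0, 1, 1], [1, 2, 0, 0], [0, 1, 3, 0]])
Y = np.array([[1, 0, 0], [0, 1, 1], [1, 0, 0],
              [0, 1, 0], [1, 0, 1], [0, 1, 1]])

# One random forest per label; predictions are again binary label vectors.
clf = MultiOutputClassifier(RandomForestClassifier(n_estimators=10, random_state=0))
clf.fit(X, Y)
Y_pred = clf.predict(X)
```

Each row of `Y_pred` is the set of skills predicted for one issue, which is why metrics like micro-F1 (used later in this README) are the natural evaluation choice.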

## Project Organization

```
├── LICENSE            <- Open-source license if one is chosen
├── Makefile           <- Makefile with convenience commands like `make data` or `make train`
├── README.md          <- The top-level README for developers using this project.
├── data
│   ├── external       <- Data from third party sources.
│   ├── interim        <- Intermediate data that has been transformed.
│   ├── processed      <- The final, canonical data sets for modeling.
│   └── raw            <- The original, immutable data dump.
│
├── docs               <- A default mkdocs project; see www.mkdocs.org for details
│
├── models             <- Trained and serialized models, model predictions, or model summaries
│
├── notebooks          <- Jupyter notebooks. Naming convention is a number (for ordering),
│                         the creator's initials, and a short `-` delimited description, e.g.
│                         `1.0-jqp-initial-data-exploration`.
│
├── pyproject.toml     <- Project configuration file with package metadata for
│                         hopcroft_skill_classification_tool_competition and configuration
│                         for tools like black
│
├── references         <- Data dictionaries, manuals, and all other explanatory materials.
│
├── reports            <- Generated analysis as HTML, PDF, LaTeX, etc.
│   └── figures        <- Generated graphics and figures to be used in reporting
│
├── requirements.txt   <- The requirements file for reproducing the analysis environment,
│                         e.g. generated with `pip freeze > requirements.txt`
│
├── setup.cfg          <- Configuration file for flake8
│
└── hopcroft_skill_classification_tool_competition  <- Source code for use in this project.
    │
    ├── __init__.py    <- Makes hopcroft_skill_classification_tool_competition a Python module
    │
    ├── config.py      <- Store useful variables and configuration
    │
    ├── dataset.py     <- Scripts to download or generate data
    │
    ├── features.py    <- Code to create features for modeling
    │
    ├── modeling
    │   ├── __init__.py
    │   ├── predict.py <- Code to run model inference with trained models
    │   └── train.py   <- Code to train models
    │
    └── plots.py       <- Code to create visualizations
```

--------

## Setup

### MLflow Credentials Configuration

Set up DagsHub credentials for MLflow tracking.

**Get your token:** [DagsHub](https://dagshub.com) → Profile → Settings → Tokens

#### Option 1: Using `.env` file (Recommended for local development)

```bash
# Copy the template
cp .env.example .env

# Edit .env with your credentials
```

Your `.env` file should contain:

```
MLFLOW_TRACKING_URI=https://dagshub.com/se4ai2526-uniba/Hopcroft.mlflow
MLFLOW_TRACKING_USERNAME=your_username
MLFLOW_TRACKING_PASSWORD=your_token
```
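Variables in this format can be loaded with `python-dotenv`, or with a few lines of stdlib Python; a minimal sketch of the latter (illustrative only, not the project's actual config code):

```python
import os

def load_env(path=".env"):
    """Read KEY=VALUE lines from a .env-style file into os.environ."""
    with open(path) as fh:
        for line in fh:
            line = line.strip()
            # Skip blank lines and comments.
            if not line or line.startswith("#"):
                continue
            key, _, value = line.partition("=")
            # setdefault: variables already set in the environment win.
            os.environ.setdefault(key.strip(), value.strip())
```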

> [!NOTE]
> The `.env` file is git-ignored for security. Never commit credentials to version control.

#### Option 2: Using Docker Compose

When using Docker Compose, the `.env` file is automatically loaded via the `env_file` directive in `docker-compose.yml`.

```bash
# Start the service (credentials loaded from .env)
docker compose up --build
```

--------

## CI Configuration

[](https://github.com/se4ai2526-uniba/Hopcroft/actions/workflows/ci.yml)

### Secrets

To enable DVC model pulling, configure these Repository Secrets:

- `DAGSHUB_USERNAME`: DagsHub username.
- `DAGSHUB_TOKEN`: DagsHub access token.

## Milestone Summary

### Milestone 1

We compiled the ML Canvas and defined:

- Problem: multi-label classification of skills for PRs/issues.
- Stakeholders and business/research goals.
- Data sources (SkillScope DB) and constraints (no external classifiers).
- Success metrics (micro-F1, imbalance handling, experiment tracking).
- Risks (label imbalance, text noise, multi-label complexity) and mitigations.

### Milestone 2

We implemented the essential end-to-end infrastructure to go from data to tracked modeling experiments:

1. Data Management
   - DVC setup (raw dataset and TF-IDF features tracked) with a DagsHub remote; dedicated gitignores for data/models.

2. Data Ingestion & EDA
   - `dataset.py` to download/extract SkillScope from Hugging Face (zip → SQLite) with cleanup.
   - Initial exploration notebook `notebooks/1.0-initial-data-exploration.ipynb` (schema, text stats, label distribution).

3. Feature Engineering
   - `features.py`: GitHub text cleaning (URL/HTML/markdown removal, normalization, Porter stemming) and TF-IDF (uni+bi-grams) saved as NumPy arrays (`features_tfidf.npy`, `labels_tfidf.npy`).

4. Central Config
   - `config.py` with project paths, training settings, RF param grid, MLflow URI/experiments, PCA/ADASYN, and feature constants.

5. Modeling & Experiments
   - Unified `modeling/train.py` with actions: baseline RF, MLSMOTE, ROS, ADASYN+PCA, LightGBM, LightGBM+MLSMOTE, and inference.
   - GridSearchCV (micro-F1), MLflow logging, removal of all-zero labels, multilabel-stratified splits (with fallback).

6. Imbalance Handling
   - Local `mlsmote.py` (multi-label oversampling) with fallback to `RandomOverSampler`; dedicated ADASYN+PCA pipeline.

7. Tracking & Reproducibility
   - Remote MLflow (DagsHub) with README credential setup; DVC-tracked models and auxiliary artifacts (e.g., PCA, kept label indices).

8. Tooling
   - Updated `requirements.txt` (lightgbm, imbalanced-learn, iterative-stratification, huggingface-hub, dvc, mlflow, nltk, seaborn, etc.) and extended Makefile targets (`data`, `features`).
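The cleaning and TF-IDF steps described above can be sketched roughly as follows (a simplified illustration, not the project's `features.py`; the real pipeline also strips HTML/markdown and applies Porter stemming via nltk):

```python
import re
import numpy as np
from sklearn.feature_extraction.text import TfidfVectorizer

def clean_github_text(text: str) -> str:
    """Minimal GitHub text cleaning: strip URLs and code spans, normalize case/whitespace."""
    text = re.sub(r"https?://\S+", " ", text)    # remove URLs
    text = re.sub(r"`[^`]*`", " ", text)         # remove inline code spans
    text = re.sub(r"[^a-zA-Z0-9\s]", " ", text)  # drop punctuation
    return re.sub(r"\s+", " ", text).strip().lower()

docs = [
    "Fix NullPointerException in login, see https://github.com/org/repo/issues/42",
    "Add `DataFrame` export endpoint to the REST API",
]
cleaned = [clean_github_text(d) for d in docs]

# Uni- and bi-grams, as in the project's TF-IDF setup.
vectorizer = TfidfVectorizer(ngram_range=(1, 2))
features = vectorizer.fit_transform(cleaned).toarray().astype(np.float32)
```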

### Milestone 3 (QA)

We implemented a comprehensive testing and validation framework to ensure data quality and model robustness:

1. **Data Cleaning Pipeline**
   - `data_cleaning.py`: Removes duplicates (481 samples), resolves label conflicts via majority voting (640 samples), filters sparse samples incompatible with SMOTE, and ensures train-test separation without leakage.
   - Final cleaned dataset: 6,673 samples (from 7,154 original), 80/20 stratified split.

2. **Great Expectations Validation** (10 tests)
   - Database integrity, feature matrix validation (no NaN/Inf, sparsity checks), label format validation (binary {0,1}), feature-label consistency.
   - Label distribution for stratification (min 5 occurrences), SMOTE compatibility (min 10 non-zero features), duplicate detection, train-test separation, label consistency.
   - All 10 tests pass on the cleaned data; comprehensive JSON reports in `reports/great_expectations/`.

3. **Deepchecks Validation** (24 checks across 2 suites)
   - Data Integrity Suite (92% score): validates duplicates, label conflicts, nulls, data types, feature correlation.
   - Train-Test Validation Suite (100% score): **zero data leakage**, proper train/test split, feature/label drift analysis.
   - Cleaned data achieved production-ready status (96% overall score).

4. **Behavioral Testing** (36 tests)
   - Invariance tests (9): typo robustness, synonym substitution, case insensitivity, punctuation/URL noise tolerance.
   - Directional tests (10): keyword addition effects, impact of technical detail on predictions.
   - Minimum Functionality Tests (17): basic skill predictions on clear examples (bug fixes, database work, API development, testing, DevOps).
   - All tests passed; comprehensive report in `reports/behavioral/`.

5. **Code Quality Analysis**
   - Ruff static analysis: 28 minor issues identified (unsorted imports, unused variables, f-strings), 100% fixable.
   - PEP 8 compliant, Black compatible (line length 88).

6. **Documentation**
   - Comprehensive `docs/testing_and_validation.md` with detailed test descriptions, execution commands, and analysis results.
   - Behavioral testing README with test categories, usage examples, and an extension guide.

7. **Tooling**
   - Makefile targets: `validate-gx`, `validate-deepchecks`, `test-behavioral`, `test-complete`.
   - Automated test execution and report generation.
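The invariance idea behind the behavioral tests can be sketched like this (a toy illustration with a stub predictor; the function names are hypothetical and the real tests exercise the trained model):

```python
import re

def predict_skills(text: str) -> set:
    """Stub predictor: keyword lookup standing in for the trained model."""
    normalized = re.sub(r"[^a-z\s]", " ", text.lower())
    skills = set()
    if "database" in normalized or "sql" in normalized:
        skills.add("databases")
    if "bug" in normalized or "crash" in normalized:
        skills.add("debugging")
    return skills

def check_invariance(original: str, perturbed: str) -> bool:
    """Invariance test: a label-preserving perturbation must not change predictions."""
    return predict_skills(original) == predict_skills(perturbed)

# Case changes and punctuation noise should leave predictions unchanged.
ok_case = check_invariance("Fix database bug", "FIX DATABASE BUG")
ok_noise = check_invariance("Fix database bug", "Fix database bug!!! ???")
```

Directional tests follow the same pattern in reverse: adding a strong keyword (e.g. "SQL migration") is expected to *change* the prediction in a specific direction.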

### Milestone 4 (API)

We implemented a production-ready FastAPI service for skill prediction with MLflow integration:

#### Features

- **REST API Endpoints**:
  - `POST /predict` - Predict skills for a GitHub issue (logs to MLflow)
  - `GET /predictions/{run_id}` - Retrieve a prediction by MLflow run ID
  - `GET /predictions` - List recent predictions with pagination
  - `GET /health` - Health check endpoint
- **Model Management**: Loads the trained Random Forest + TF-IDF vectorizer from `models/`
- **MLflow Tracking**: All predictions logged with metadata, probabilities, and timestamps
- **Input Validation**: Pydantic models for request/response validation
- **Interactive Docs**: Auto-generated Swagger UI and ReDoc
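The Pydantic request/response validation might look roughly like this. The field names and class names below are assumptions for illustration; this README does not show the service's actual schemas:

```python
from typing import Dict, Optional
from pydantic import BaseModel

class PredictionRequest(BaseModel):
    """Hypothetical request body for POST /predict."""
    text: str
    repo_name: Optional[str] = None  # optional metadata, as in the GUI's detailed mode

class PredictionResponse(BaseModel):
    """Hypothetical response: skill name -> predicted probability, plus the MLflow run."""
    probabilities: Dict[str, float]
    run_id: str  # usable with GET /predictions/{run_id}

req = PredictionRequest(text="Fix connection pooling bug in the SQL layer")
resp = PredictionResponse(probabilities={"databases": 0.91, "debugging": 0.84}, run_id="abc123")
```

FastAPI uses such models to reject malformed requests with a 422 before they reach the model.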

#### API Usage

```bash
# Development mode (auto-reload)
make api-dev

make api-run
```

Server starts at: [http://127.0.0.1:8000](http://127.0.0.1:8000)
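Once the server is up, `POST /predict` can be exercised from Python with only the stdlib. The `"text"` field name is an assumption about the request schema, and the request itself is commented out because it needs a running server:

```python
import json

# Hypothetical request payload for POST /predict.
payload = {"text": "Fix null pointer exception in the database connection pool"}
body = json.dumps(payload)

# To actually send it (requires the server from `make api-dev` to be running):
# import urllib.request
# req = urllib.request.Request(
#     "http://127.0.0.1:8000/predict",
#     data=body.encode("utf-8"),
#     headers={"Content-Type": "application/json"},
# )
# with urllib.request.urlopen(req) as resp:
#     print(json.load(resp))
```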

```bash
# Test all endpoints
make test-api-all

# Individual endpoints
make test-api-health   # Health check
make test-api-predict  # Single prediction
make test-api-list     # List predictions
```

#### Prerequisites

- Trained model: `models/random_forest_tfidf_gridsearch.pkl`
- TF-IDF vectorizer: `models/tfidf_vectorizer.pkl` (auto-saved during feature creation)
- Label names: `models/label_names.pkl` (auto-saved during feature creation)

#### MLflow Integration

- All predictions logged to: `https://dagshub.com/se4ai2526-uniba/Hopcroft.mlflow`
- Experiment: `skill_prediction_api`
- Tracked: input text, predictions, probabilities, metadata

Endpoints:

- Swagger UI: [http://localhost:8080/docs](http://localhost:8080/docs)
- Health check: [http://localhost:8080/health](http://localhost:8080/health)

### Milestone 5 (Deployment)

We implemented a complete containerized deployment pipeline for production-ready delivery:

1. **Docker Containerization**
   - `docker/Dockerfile`: Multi-stage Python 3.10 slim image with a non-root user, system dependencies (git, nginx, curl), DVC integration, and an automated startup script.
   - `docker/Dockerfile.streamlit`: Lightweight container for the Streamlit GUI with minimal dependencies.
   - `docker/.dockerignore`: Optimized build context excluding unnecessary files.

2. **Docker Compose Orchestration**
   - Multi-service architecture: API backend (`hopcroft-api`), Streamlit frontend (`hopcroft-gui`), and monitoring stack.
   - Bridge network (`hopcroft-net`) for inter-service communication.
   - Health checks with automatic restart policies.
   - Bind mounts for development hot-reload, named volumes for persistent storage (`hopcroft-logs`).

3. **Hugging Face Spaces Deployment**
   - Docker SDK configuration with port 7860.
   - `docker/scripts/start_space.sh`: Automated startup script that configures DVC credentials, pulls models from DagsHub, and starts FastAPI + Streamlit + Nginx.
   - Secrets management via HF Spaces Variables (`DAGSHUB_USERNAME`, `DAGSHUB_TOKEN`).
   - Live deployment: `https://huggingface.co/spaces/se4ai2526-uniba/Hopcroft`

4. **Nginx Reverse Proxy**
   - `docker/nginx.conf`: Routes traffic to the API (port 8000) and Streamlit (port 8501) on the single port 7860.
   - Path-based routing for API docs, metrics, and the web interface.

5. **Environment Configuration**
   - `.env.example` template with MLflow and DagsHub credentials.
   - Automatic environment variable injection via the `env_file` directive.
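Given the ports listed above, the path-based routing in `docker/nginx.conf` plausibly looks something like the following. This is an illustrative fragment under those assumptions, not the actual file:

```nginx
# Nginx listens on the single HF Spaces port and fans out by path (illustrative).
server {
    listen 7860;

    # API docs and endpoints -> FastAPI backend
    location /docs    { proxy_pass http://127.0.0.1:8000; }
    location /predict { proxy_pass http://127.0.0.1:8000; }
    location /health  { proxy_pass http://127.0.0.1:8000; }

    # Everything else -> Streamlit web interface
    location / { proxy_pass http://127.0.0.1:8501; }
}
```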

### Milestone 6 (Monitoring)

We implemented comprehensive observability and load-testing infrastructure:

1. **Prometheus Metrics Collection**
   - `prometheus.yml`: Scrape configuration for API metrics (10s interval), self-monitoring, and Pushgateway.
   - Custom metrics: `hopcroft_requests_total`, `hopcroft_request_duration_seconds`, `hopcroft_in_progress_requests`, `hopcroft_prediction_processing_seconds`.
   - PromQL queries for request rate, latency percentiles, and in-progress tracking.
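For example, the rate and latency queries above might be written as follows, assuming `hopcroft_request_duration_seconds` is exported as a Prometheus histogram (hence the `_bucket` suffix):

```
# Requests per second over the last 5 minutes
rate(hopcroft_requests_total[5m])

# 95th-percentile request latency
histogram_quantile(0.95, rate(hopcroft_request_duration_seconds_bucket[5m]))

# Currently in-flight requests
hopcroft_in_progress_requests
```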

2. **Grafana Dashboards**
   - Auto-provisioned datasources and dashboards via the `provisioning/` directory.
   - `hopcroft_dashboard.json`: Real-time visualization of API request rate, latency, drift status, and p-value metrics.
   - Credentials: `admin/admin` on port 3000.

3. **Alerting System**
   - `alert_rules.yml`: Prometheus alert rules for `ServiceDown`, `HighErrorRate` (>10% 5xx), `SlowRequests` (p95 > 2s).
   - Alertmanager configuration with severity-based routing and inhibition rules.
   - Webhook integration for alert notifications.

4. **Data Drift Detection**
   - `prepare_baseline.py`: Extracts a 1000-sample reference dataset from the training data.
   - `run_drift_check.py`: Kolmogorov-Smirnov two-sample test with Bonferroni correction (p < 0.05).
   - Metrics pushed to Pushgateway: `drift_detected`, `drift_p_value`, `drift_distance`, `drift_check_timestamp`.
   - JSON reports saved to `monitoring/drift/reports/`.
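The KS-based check can be sketched like this (a simplified version of the idea, not the project's `run_drift_check.py`, which additionally pushes its results to the Pushgateway):

```python
import numpy as np
from scipy.stats import ks_2samp

def check_drift(reference: np.ndarray, current: np.ndarray, alpha: float = 0.05) -> dict:
    """Per-feature two-sample KS test with a Bonferroni-corrected threshold."""
    n_features = reference.shape[1]
    threshold = alpha / n_features  # Bonferroni correction across features
    p_values = [ks_2samp(reference[:, j], current[:, j]).pvalue
                for j in range(n_features)]
    return {
        "drift_detected": any(p < threshold for p in p_values),
        "drift_p_value": min(p_values),  # most suspicious feature
    }
```

Because each feature is tested separately, the Bonferroni division keeps the overall false-alarm rate near `alpha` even with many features.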

5. **Locust Load Testing**
   - `locustfile.py`: Simulated user behavior with weighted tasks (60% single prediction, 20% batch, 20% monitoring).
   - Configurable wait times (1-5s) for realistic traffic simulation.
   - Web UI on port 8089, headless-mode support, CSV export for results.
   - Pre-configured for HF Spaces and local Docker environments.

6. **Uptime Monitoring (Better Stack)**
   - External monitoring of production endpoints (`/health`, `/openapi.json`, `/docs`).
   - Multi-location checks with email notifications.
   - Incident tracking and resolution screenshots in `monitoring/screenshots/`.

7. **CI/CD Pipeline**
   - `.github/workflows/ci.yml`: GitHub Actions workflow triggered on push/PR to main and feature branches.
   - Jobs: Ruff linting, pytest unit tests with HTML reports, DVC model pulling, Docker image build.
   - Secrets: `DAGSHUB_USERNAME`, `DAGSHUB_TOKEN` for model access.
   - Disk-space optimization for the CI runner.

8. **Pushgateway Integration**
   - Collects metrics from short-lived jobs (drift detection scripts).
   - Persistent storage with 5-minute intervals.
   - Scraped by Prometheus for long-term storage and Grafana visualization.
---

## Docker Compose

Docker Compose orchestrates both the **API backend** and **Streamlit GUI** services with proper networking and configuration.

Create a `.env` file from the template:

```bash
cp .env.example .env
```

```
MLFLOW_TRACKING_USERNAME=your_dagshub_username
MLFLOW_TRACKING_PASSWORD=your_dagshub_token
```

Get your token from: [https://dagshub.com/user/settings/tokens](https://dagshub.com/user/settings/tokens)

### Usage

#### 1. Build and Start All Services

Build both images and start the containers:

```bash
docker compose -f docker/docker-compose.yml up -d --build
```

Once the services are running, open:

- **API (FastAPI):** [http://localhost:8080/docs](http://localhost:8080/docs)
- **GUI (Streamlit):** [http://localhost:8501](http://localhost:8501)
- **Health Check:** [http://localhost:8080/health](http://localhost:8080/health)

#### 2. Stop All Services

Stop and remove containers and networks:

```bash
docker compose -f docker/docker-compose.yml down
```

| Flag | Description |
|------|-------------|
| `-v` | Also remove named volumes (e.g., `hopcroft-logs`): `docker-compose down -v` |
| `--rmi all` | Also remove images: `docker-compose down --rmi all` |

#### 3. Restart Services

```bash
docker compose -f docker/docker-compose.yml restart
```

Or recreate the containers from scratch:

```bash
docker compose -f docker/docker-compose.yml down
docker compose -f docker/docker-compose.yml up -d
```

#### 4. Check Service Status

View the status of all running services:

```bash
docker compose -f docker/docker-compose.yml ps
```

Or use Docker commands:

```bash
docker ps
```

#### 5. View Logs

```bash
docker compose -f docker/docker-compose.yml logs -f hopcroft-api
docker compose -f docker/docker-compose.yml logs -f hopcroft-gui
```

| Flag | Description |
|------|-------------|
| `-f` | Follow log output (stream new logs) |
| `--tail 100` | Show only last 100 lines: `docker-compose logs --tail 100` |

#### 6. Debug Inside the Container

Run commands inside the running container, e.g.:

```bash
python
ls -la /app/models/
printenv | grep MLFLOW
```

### Architecture

```
docker/docker-compose.yml
├── hopcroft-api (FastAPI Backend)
│   ├── Build: docker/Dockerfile
│   ├── Port: 8080:8080
│   ├── Network: hopcroft-net
│   ├── Environment: .env (MLflow credentials)
│   ├── Volumes:
│   │   ├── ./hopcroft_skill_classification_tool_competition (hot reload)
│   │   └── hopcroft-logs:/app/logs (persistent logs)
│   └── Health Check: /health endpoint
│
├── hopcroft-gui (Streamlit Frontend)
│   ├── Build: docker/Dockerfile.streamlit
│   ├── Port: 8501:8501
│   ├── Network: hopcroft-net
│   ├── Environment: API_BASE_URL=http://hopcroft-api:8080
│   ├── Volumes:
│   │   └── ./hopcroft_skill_classification_tool_competition/streamlit_app.py (hot reload)
│   └── Depends on: hopcroft-api (waits for health check)
│
└── hopcroft-net (bridge network)
```

**External Access:**

- API: http://localhost:8080
- GUI: http://localhost:8501

**Internal Communication:**

- GUI → API: http://hopcroft-api:8080 (via the Docker network)

### Services Description

**hopcroft-api (FastAPI Backend)**

- Purpose: FastAPI backend serving the ML model for skill classification
- Image: Built from `docker/Dockerfile`
- Port: 8080 (maps to host 8080)
- Features:
  - Random Forest model with embedding features
  - MLflow experiment tracking
  - Auto-reload in development mode
  - Health check endpoint

**hopcroft-gui (Streamlit Frontend)**

- Purpose: Streamlit web interface for interactive predictions
- Image: Built from `docker/Dockerfile.streamlit`
- Port: 8501 (maps to host 8501)
- Features:
  - User-friendly interface for skill prediction
  - Real-time communication with the API
  - Automatic reconnection on API restart
  - Depends on API health before starting

### Development vs Production

**Development (default):**

- Auto-reload enabled (`--reload`)
- Source code mounted with bind mounts
- Custom command with hot reload
- GUI → API via the Docker network

**Production:**

- Auto-reload disabled
- Use the built image only
- Use the Dockerfile's CMD
- GUI → API via the Docker network

For **production deployment**, modify `docker/docker-compose.yml` to remove bind mounts and disable reload.

### Troubleshooting

#### Issue: GUI shows "API is not available"

**Solution:**

1. Wait 30-60 seconds for the API to fully initialize and become healthy
2. Refresh the GUI page (F5)
3. Check API health: `curl http://localhost:8080/health`
4. Check logs: `docker compose -f docker/docker-compose.yml logs hopcroft-api`

#### Issue: "500 Internal Server Error" on predictions

**Solution:**

1. Verify the MLflow credentials in `.env` are correct
2. Restart services: `docker compose -f docker/docker-compose.yml down && docker compose -f docker/docker-compose.yml up -d`
3. Check environment variables: `docker exec hopcroft-api printenv | grep MLFLOW`

#### Issue: Changes to code not reflected

**Solution:**

- For Python code changes: auto-reload is enabled, wait a few seconds
- For Dockerfile changes: rebuild with `docker compose -f docker/docker-compose.yml up -d --build`
- For `.env` changes: restart with `docker compose -f docker/docker-compose.yml down && docker compose -f docker/docker-compose.yml up -d`

#### Issue: Port already in use

**Solution:**

```bash
#
#
#
```

## Hugging Face Spaces Deployment

This project is configured to run on [Hugging Face Spaces](https://huggingface.co/spaces) using Docker.

### 1. Setup Space

1. Create a new Space on Hugging Face.
2. Select **Docker** as the SDK.
3. Choose the **Blank** template or upload your code.

### 2. Configure Secrets

To enable the application to pull models from DagsHub via DVC, you must configure the following **Variables and Secrets** in your Space settings:

| Name | Type | Description |
|------|------|-------------|
| `DAGSHUB_USERNAME` | Secret | Your DagsHub username. |
| `DAGSHUB_TOKEN` | Secret | Your DagsHub access token (Settings -> Tokens). |

> [!IMPORTANT]
> These secrets are injected into the container at runtime. The `docker/scripts/start_space.sh` script uses them to authenticate DVC and pull the required model files (`.pkl`) before starting the API and GUI.

### 3. Automated Startup

The deployment follows this automated flow:

1. **docker/Dockerfile**: Builds the environment, installs dependencies, and sets up Nginx.
2. **docker/scripts/start_space.sh**:
   - Configures DVC with your secrets.
   - Pulls models from the DagsHub remote.
   - Starts the **FastAPI** backend (port 8000).
   - Starts the **Streamlit** frontend (port 8501).
   - Starts **Nginx** (port 7860) as a reverse proxy to route traffic.

### 4. Direct Access

Once deployed, your Space will be available at:
`https://huggingface.co/spaces/se4ai2526-uniba/Hopcroft`

The API documentation will be accessible at:
`https://huggingface.co/spaces/se4ai2526-uniba/Hopcroft/docs`

--------

## Demo UI (Streamlit)

The Streamlit GUI provides an interactive web interface for the skill classification API.

### Features

- Real-time skill prediction from GitHub issue text
- Top-5 predicted skills with confidence scores
- Full predictions table with all skills
- API connection status indicator
- Responsive design

### Usage

1. Ensure both services are running: `docker compose -f docker/docker-compose.yml up -d`
2. Open the GUI in your browser: [http://localhost:8501](http://localhost:8501)
3. Enter a GitHub issue description in the text area
4. Click "Predict Skills" to get predictions
5. View results in the predictions table

### Architecture

- **Frontend**: Streamlit (Python web framework)
- **Communication**: HTTP requests to the FastAPI backend via the Docker network
- **Independence**: GUI and API run in separate containers
- **Auto-reload**: GUI code changes are reflected immediately (bind mount)

> Both must run **simultaneously** in different terminals/containers.

### Quick Start

1. **Start the FastAPI backend:**

   ```bash
   fastapi dev hopcroft_skill_classification_tool_competition/main.py
   ```

2. **In a new terminal, start Streamlit:**

   ```bash
   streamlit run streamlit_app.py
   ```

3. **Open your browser:**
   - Streamlit UI: http://localhost:8501
   - FastAPI Docs: http://localhost:8000/docs

### Features

- Interactive web interface for skill prediction
- Real-time predictions with confidence scores
- Adjustable confidence threshold
- Multiple input modes (quick/detailed/examples)
- Visual result display
- API health monitoring

### Demo Walkthrough

#### Main Dashboard



- **Sidebar**: API health status, confidence threshold slider, model info
- **Three input modes**: Quick Input, Detailed Input, Examples

#### Quick Input Mode



Simply paste your GitHub issue text and click "Predict Skills"!

#### Prediction Results



View:

- **Top predictions** with confidence scores
- **Full predictions table** with filtering
- **Processing metrics** (time, model version)
- **Raw JSON response** (expandable)

#### Detailed Input Mode



Add optional metadata:

- Repository name
- PR number
- Detailed description

#### Example Gallery



Test with pre-loaded examples:

- Authentication bugs
- ML features
- Database issues
- UI enhancements

### Usage

1. Enter GitHub issue/PR text in the input area
2. (Optional) Add description, repo name, PR number
3. Click "Predict Skills"
4. View results with confidence scores
5. Adjust the threshold slider to filter predictions

---

# Hopcroft Skill Classification
|
| 12 |
|
| 13 |
[](https://github.com/se4ai2526-uniba/Hopcroft/actions/workflows/ci.yml)
|
| 14 |
+
[](https://huggingface.co/spaces/se4ai2526-uniba/Hopcroft)
|
| 15 |
+
[](https://dagshub.com/se4ai2526-uniba/Hopcroft.mlflow)
|
| 16 |
|
| 17 |
+
**Multi-label skill classification for GitHub issues and pull requests**: automatically identify the technical skills required to resolve software issues using machine learning.
|
| 18 |
|
| 19 |
+
---
|
| 20 |
|
| 21 |
+
## Overview
|
| 22 |
|
| 23 |
+
Hopcroft is an ML-enabled system that classifies GitHub issues into 217 technical skill categories, enabling automated developer assignment and optimized resource allocation. Built following professional MLOps and Software Engineering standards.
|
| 24 |
|
| 25 |
+
### Key Features
|
| 26 |
|
| 27 |
+
- π― **Multi-label Classification**: Predict multiple skills per issue
|
| 28 |
+
- π **REST API**: FastAPI with Swagger documentation
|
| 29 |
+
- π₯οΈ **Web Interface**: Streamlit GUI for interactive predictions
|
| 30 |
+
- π **Monitoring**: Prometheus/Grafana dashboards with drift detection
|
| 31 |
+
- π **CI/CD**: GitHub Actions with Docker deployment
|
| 32 |
+
- π **Experiment Tracking**: MLflow on DagsHub
|
| 33 |
|
| 34 |
+
---
|
| 35 |
|
| 36 |
+
## Architecture
|
| 37 |
+
|
| 38 |
+
```mermaid
|
| 39 |
+
graph TB
|
| 40 |
+
subgraph "Data Layer"
|
| 41 |
+
A[(SkillScope DB)] --> B[Feature Engineering]
|
| 42 |
+
B --> C[TF-IDF / Embeddings]
|
| 43 |
+
end
|
| 44 |
+
|
| 45 |
+
subgraph "ML Pipeline"
|
| 46 |
+
C --> D[Model Training]
|
| 47 |
+
D --> E[(MLflow Tracking)]
|
| 48 |
+
D --> F[Random Forest Model]
|
| 49 |
+
end
|
| 50 |
+
|
| 51 |
+
subgraph "Serving Layer"
|
| 52 |
+
F --> G[FastAPI Service]
|
| 53 |
+
G --> H[/predict]
|
| 54 |
+
G --> I[/predictions]
|
| 55 |
+
G --> J[/health]
|
| 56 |
+
end
|
| 57 |
+
|
| 58 |
+
subgraph "Frontend"
|
| 59 |
+
G --> K[Streamlit GUI]
|
| 60 |
+
end
|
| 61 |
+
|
| 62 |
+
subgraph "Monitoring"
|
| 63 |
+
G --> L[Prometheus]
|
| 64 |
+
L --> M[Grafana]
|
| 65 |
+
N[Drift Detection] --> L
|
| 66 |
+
end
|
| 67 |
+
|
| 68 |
+
subgraph "Deployment"
|
| 69 |
+
O[GitHub Actions] --> P[Docker Build]
|
| 70 |
+
P --> Q[HF Spaces]
|
| 71 |
+
end
|
| 72 |
```
|
| 73 |
|
| 74 |
---
|
| 75 |
|
| 76 |
+
## Documentation
|
| 77 |
|
| 78 |
+
| Document | Description |
|
| 79 |
+
|----------|-------------|
|
| 80 |
+
| π [Milestone Summaries](docs/milestone_summaries.md) | All 6 project phases documented |
|
| 81 |
+
| π [User Guide](docs/user_guide.md) | Setup, API, GUI, testing, monitoring |
|
| 82 |
+
| ποΈ [Design Choices](docs/design_choices.md) | Technical decisions & rationale |
|
| 83 |
+
| π― [ML Canvas](docs/ML%20Canvas.md) | Requirements engineering framework |
|
| 84 |
+
| β [Testing & Validation](docs/testing_and_validation.md) | QA strategy & results |
|
| 85 |
|
| 86 |
+
---
|
| 87 |
|
| 88 |
+
## Quick Start
|
| 89 |
|
| 90 |
+
### Docker (Recommended)
|
| 91 |
|
| 92 |
```bash
|
| 93 |
+
# Clone and configure
|
| 94 |
+
git clone https://github.com/se4ai2526-uniba/Hopcroft.git
|
| 95 |
+
cd Hopcroft
|
| 96 |
+
cp .env.example .env
|
| 97 |
+
# Edit .env with your DagsHub credentials
|
| 98 |
+
|
| 99 |
+
# Start services
|
| 100 |
docker compose -f docker/docker-compose.yml up -d --build
|
| 101 |
```
|
| 102 |
|
| 103 |
+
**Access:**
|
| 104 |
+
- π **API Docs**: http://localhost:8080/docs
|
| 105 |
+
- π₯οΈ **GUI**: http://localhost:8501
|
| 106 |
+
- β€οΈ **Health**: http://localhost:8080/health
|
| 107 |
|
| 108 |
+
### Local Development
|
| 109 |
|
| 110 |
```bash
|
| 111 |
+
# Setup environment
|
| 112 |
+
python -m venv venv && source venv/bin/activate # or venv\Scripts\activate on Windows
|
| 113 |
+
pip install -r requirements.txt && pip install -e .
|
| 114 |
|
| 115 |
+
# Start API
|
| 116 |
+
make api-dev
|
| 117 |
|
| 118 |
+
# Start GUI (new terminal)
|
| 119 |
+
streamlit run hopcroft_skill_classification_tool_competition/streamlit_app.py
|
| 120 |
```
|
| 121 |
|
| 122 |
+
---
|
| 123 |
|
| 124 |
+
## Project Structure
|
| 125 |
|
| 126 |
```
|
| 127 |
+
βββ hopcroft_skill_classification_tool_competition/
|
| 128 |
+
β βββ main.py # FastAPI application
|
| 129 |
+
β βββ streamlit_app.py # Streamlit GUI
|
| 130 |
+
β βββ features.py # Feature engineering
|
| 131 |
+
β βββ modeling/ # Training & prediction
|
| 132 |
+
β βββ config.py # Configuration
|
| 133 |
+
βββ data/ # DVC-tracked datasets
|
| 134 |
+
βββ models/ # DVC-tracked models
|
| 135 |
+
βββ tests/ # Pytest test suites
|
| 136 |
+
βββ monitoring/ # Prometheus, Grafana, Locust
|
| 137 |
+
βββ docker/ # Docker configurations
|
| 138 |
+
βββ docs/ # Documentation
|
| 139 |
+
βββ .github/workflows/ # CI/CD pipelines
|
| 140 |
```
|
| 141 |
|
| 142 |
+
---
|
| 143 |
|
| 144 |
+
## API Endpoints
|
| 145 |
|
| 146 |
+
| Method | Endpoint | Description |
|
| 147 |
+
|--------|----------|-------------|
|
| 148 |
+
| `POST` | `/predict` | Classify single issue |
|
| 149 |
+
| `POST` | `/predict/batch` | Batch classification |
|
| 150 |
+
| `GET` | `/predictions` | List recent predictions |
|
| 151 |
+
| `GET` | `/predictions/{id}` | Get by MLflow run ID |
|
| 152 |
+
| `GET` | `/health` | Health check |
|
| 153 |
+
| `GET` | `/metrics` | Prometheus metrics |
|
| 154 |
|
| 155 |
+
**Example:**
|
| 156 |
```bash
|
| 157 |
+
curl -X POST "http://localhost:8080/predict" \
|
| 158 |
+
-H "Content-Type: application/json" \
|
| 159 |
+
-d '{"issue_text": "Fix OAuth2 authentication bug"}'
|
| 160 |
+
```
|
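For programmatic access, the single-issue call above generalizes to the `/predict/batch` endpoint (max 100 issues per request). A minimal client sketch using only the standard library; the exact batch payload field names are an assumption based on the single-issue schema, so check `/docs` for the authoritative schema:

```python
import json
from urllib import request

API_URL = "http://localhost:8080"  # local Docker deployment


def build_batch_payload(issues):
    """Build a /predict/batch request body.

    Field names here are an assumption mirroring the single-issue
    {"issue_text": ...} schema; verify against the Swagger docs.
    """
    if len(issues) > 100:
        raise ValueError("batch endpoint accepts at most 100 issues")
    return {"issues": [{"issue_text": text} for text in issues]}


def predict_batch(issues):
    """POST the batch to the API and return the parsed JSON response."""
    body = json.dumps(build_batch_payload(issues)).encode()
    req = request.Request(
        f"{API_URL}/predict/batch",
        data=body,
        headers={"Content-Type": "application/json"},
    )
    with request.urlopen(req) as resp:  # requires a running API
        return json.load(resp)


payload = build_batch_payload(["Fix OAuth2 authentication bug", "Add caching layer"])
print(len(payload["issues"]))  # prints 2
```

The 100-issue guard mirrors the documented batch limit; `predict_batch` is only exercised when the API is up.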
| 161 |
|
| 162 |
+
---
|
| 163 |
|
| 164 |
+
## Live Deployment
|
| 165 |
|
| 166 |
+
- **Application**: https://huggingface.co/spaces/se4ai2526-uniba/Hopcroft
|
| 167 |
+
- **API Docs**: https://huggingface.co/spaces/se4ai2526-uniba/Hopcroft/docs
|
| 168 |
+
- **MLflow**: https://dagshub.com/se4ai2526-uniba/Hopcroft.mlflow
|
| 169 |
|
| 170 |
+
---
|
| 171 |
|
| 172 |
+
## Development
|
| 173 |
|
| 174 |
```bash
|
| 175 |
+
# Run tests
|
| 176 |
+
make test-all # All tests
|
| 177 |
+
make test-behavioral # ML behavioral tests
|
| 178 |
+
make validate-deepchecks # Data validation
|
| 179 |
|
| 180 |
+
# Lint & format
|
| 181 |
+
make lint # Check code style
|
| 182 |
+
make format # Auto-fix issues
|
| 183 |
|
| 184 |
+
# Training
|
| 185 |
+
make train-baseline-tfidf # Train baseline model
|
| 186 |
```
|
| 187 |
|
| 188 |
+
---
|
| 189 |
|
| 190 |
+
## License
|
| 191 |
|
| 192 |
+
This project was developed as part of the SE4AI 2025-26 course at the University of Bari.
|
docs/README.md
CHANGED
|
@@ -1,12 +1,24 @@
|
|
| 1 |
-
|
| 2 |
-
----------
|
| 3 |
|
| 4 |
-
|
| 5 |
|
| 6 |
-
|
| 7 |
|
| 8 |
-
|
|
| 9 |
|
| 10 |
-
|
| 11 |
|
| 12 |
-
|
| 1 |
+
# Documentation
|
| 2 |
|
| 3 |
+
This directory contains comprehensive documentation for the Hopcroft Skill Classification system.
|
| 4 |
|
| 5 |
+
## Contents
|
| 6 |
|
| 7 |
+
| Document | Description |
|
| 8 |
+
|----------|-------------|
|
| 9 |
+
| [Milestone Summaries](milestone_summaries.md) | Overview of all 6 project development phases |
|
| 10 |
+
| [User Guide](user_guide.md) | Setup, API, GUI, load testing, and monitoring instructions |
|
| 11 |
+
| [Design Choices](design_choices.md) | Technical justifications and architectural decisions |
|
| 12 |
+
| [ML Canvas](ML%20Canvas.md) | Machine Learning Canvas requirements framework |
|
| 13 |
+
| [Testing & Validation](testing_and_validation.md) | QA strategy with test results and commands |
|
| 14 |
|
| 15 |
+
## Quick Links
|
| 16 |
|
| 17 |
+
- **Getting Started**: See [User Guide - System Setup](user_guide.md#1-system-setup)
|
| 18 |
+
- **API Reference**: See [User Guide - API Usage](user_guide.md#2-api-usage)
|
| 19 |
+
- **Architecture**: See [Design Choices](design_choices.md)
|
| 20 |
+
- **Project History**: See [Milestone Summaries](milestone_summaries.md)
|
| 21 |
+
|
| 22 |
+
## Images
|
| 23 |
+
|
| 24 |
+
The `img/` directory contains screenshots for GUI documentation.
|
docs/design_choices.md
ADDED
|
@@ -0,0 +1,487 @@
|
| 1 |
+
# Design Choices
|
| 2 |
+
|
| 3 |
+
This document records the technical justification for the architectural and engineering decisions made during the Hopcroft project's development, following professional MLOps and software engineering standards.
|
| 4 |
+
|
| 5 |
+
---
|
| 6 |
+
|
| 7 |
+
## Table of Contents
|
| 8 |
+
|
| 9 |
+
1. [Inception (Requirements Engineering)](#1-inception-requirements-engineering)
|
| 10 |
+
2. [Reproducibility (Versioning & Pipelines)](#2-reproducibility-versioning--pipelines)
|
| 11 |
+
3. [Quality Assurance](#3-quality-assurance)
|
| 12 |
+
4. [API (Inference Service)](#4-api-inference-service)
|
| 13 |
+
5. [Deployment (Containerization & CI/CD)](#5-deployment-containerization--cicd)
|
| 14 |
+
6. [Monitoring](#6-monitoring)
|
| 15 |
+
|
| 16 |
+
---
|
| 17 |
+
|
| 18 |
+
## 1. Inception (Requirements Engineering)
|
| 19 |
+
|
| 20 |
+
### Machine Learning Canvas
|
| 21 |
+
|
| 22 |
+
The project adopted the **Machine Learning Canvas** framework to systematically define the problem space before implementation. This structured approach ensures alignment between business objectives and technical solutions.
|
| 23 |
+
|
| 24 |
+
| Canvas Section | Application |
|
| 25 |
+
|----------------|-------------|
|
| 26 |
+
| **Prediction Task** | Multi-label classification of 217 technical skills from GitHub issue text |
|
| 27 |
+
| **Decisions** | Automated developer assignment based on predicted skill requirements |
|
| 28 |
+
| **Value Proposition** | Reduced issue resolution time, optimized resource allocation |
|
| 29 |
+
| **Data Sources** | SkillScope DB (7,245 PRs from 11 Java repositories) |
|
| 30 |
+
| **Making Predictions** | Real-time classification upon issue creation |
|
| 31 |
+
| **Building Models** | Iterative improvement over RF+TF-IDF baseline |
|
| 32 |
+
| **Monitoring** | Continuous evaluation with drift detection |
|
| 33 |
+
|
| 34 |
+
The complete ML Canvas is documented in [ML Canvas.md](./ML%20Canvas.md).
|
| 35 |
+
|
| 36 |
+
### Functional vs Non-Functional Requirements
|
| 37 |
+
|
| 38 |
+
#### Functional Requirements
|
| 39 |
+
|
| 40 |
+
| Requirement | Target | Metric |
|
| 41 |
+
|-------------|--------|--------|
|
| 42 |
+
| **Precision** | β₯ Baseline | True positives / Predicted positives |
|
| 43 |
+
| **Recall** | β₯ Baseline | True positives / Actual positives |
|
| 44 |
+
| **Micro-F1** | > Baseline | Harmonic mean across all labels |
|
| 45 |
+
| **Multi-label Support** | 217 skills | Simultaneous prediction of multiple labels |
|
| 46 |
+
|
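Micro-F1 pools true positives, false positives, and false negatives across all 217 labels before computing precision, recall, and their harmonic mean. A minimal plain-Python sketch of the metric, for illustration only:

```python
def micro_f1(y_true, y_pred):
    """Micro-averaged F1 over a multi-label binary matrix.

    y_true, y_pred: lists of equal-length 0/1 rows (samples x labels).
    TP/FP/FN are pooled across every label before computing P, R, F1.
    """
    tp = fp = fn = 0
    for true_row, pred_row in zip(y_true, y_pred):
        for t, p in zip(true_row, pred_row):
            tp += 1 if (t and p) else 0
            fp += 1 if (not t and p) else 0
            fn += 1 if (t and not p) else 0
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Two samples, three labels: 2 TP, 1 FP, 1 FN -> P = R = F1 = 2/3
print(micro_f1([[1, 0, 1], [0, 1, 0]], [[1, 0, 0], [0, 1, 1]]))
```

Because counts are pooled, frequent labels dominate micro-F1, which suits the skewed skill distribution here better than macro averaging.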
| 47 |
+
#### Non-Functional Requirements
|
| 48 |
+
|
| 49 |
+
| Category | Requirement | Implementation |
|
| 50 |
+
|----------|-------------|----------------|
|
| 51 |
+
| **Reproducibility** | Auditable experiments | MLflow tracking, DVC versioning |
|
| 52 |
+
| **Explainability** | Interpretable predictions | Confidence scores per skill |
|
| 53 |
+
| **Performance** | Low latency inference | FastAPI async, model caching |
|
| 54 |
+
| **Scalability** | Batch processing | `/predict/batch` endpoint (max 100) |
|
| 55 |
+
| **Maintainability** | Clean code | Ruff linting, type hints, docstrings |
|
| 56 |
+
|
| 57 |
+
### System-First vs Model-First Development
|
| 58 |
+
|
| 59 |
+
The project adopted a **System-First** approach, prioritizing infrastructure and pipeline development before model optimization:
|
| 60 |
+
|
| 61 |
+
```
|
| 62 |
+
Timeline:
|
| 63 |
+
βββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββ
|
| 64 |
+
β Phase 1: Infrastructure β Phase 2: Model Development β
|
| 65 |
+
β - DVC/MLflow setup β - Feature engineering β
|
| 66 |
+
β - CI/CD pipeline β - Hyperparameter tuning β
|
| 67 |
+
β - Docker containers β - SMOTE/ADASYN experiments β
|
| 68 |
+
β - API skeleton β - Performance optimization β
|
| 69 |
+
βββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββ
|
| 70 |
+
```
|
| 71 |
+
|
| 72 |
+
**Rationale:**
|
| 73 |
+
- Enables rapid iteration once infrastructure is stable
|
| 74 |
+
- Ensures reproducibility from day one
|
| 75 |
+
- Reduces technical debt during model development
|
| 76 |
+
- Facilitates team collaboration with shared tooling
|
| 77 |
+
|
| 78 |
+
---
|
| 79 |
+
|
| 80 |
+
## 2. Reproducibility (Versioning & Pipelines)
|
| 81 |
+
|
| 82 |
+
### Code Versioning (Git)
|
| 83 |
+
|
| 84 |
+
Standard Git workflow with branch protection:
|
| 85 |
+
|
| 86 |
+
| Branch | Purpose |
|
| 87 |
+
|--------|---------|
|
| 88 |
+
| `main` | Production-ready code |
|
| 89 |
+
| `feature/*` | New development |
|
| 90 |
+
| `milestone/*` | Integration branch that groups features before merging into `main` |
|
| 91 |
+
|
| 92 |
+
### Data & Model Versioning (DVC)
|
| 93 |
+
|
| 94 |
+
**Design Decision:** Use DVC (Data Version Control) with DagsHub remote storage for large file management.
|
| 95 |
+
|
| 96 |
+
```
|
| 97 |
+
.dvc/config
|
| 98 |
+
βββ remote: origin
|
| 99 |
+
βββ url: https://dagshub.com/se4ai2526-uniba/Hopcroft.dvc
|
| 100 |
+
βββ auth: basic (credentials via environment)
|
| 101 |
+
```
|
| 102 |
+
|
| 103 |
+
**Tracked Artifacts:**
|
| 104 |
+
|
| 105 |
+
| File | Purpose |
|
| 106 |
+
|------|---------|
|
| 107 |
+
| `data/raw/skillscope_data.db` | Original SQLite database |
|
| 108 |
+
| `data/processed/*.npy` | TF-IDF and embedding features |
|
| 109 |
+
| `models/*.pkl` | Trained models and vectorizers |
|
| 110 |
+
|
| 111 |
+
**Versioning Workflow:**
|
| 112 |
+
```bash
|
| 113 |
+
# Track new data
|
| 114 |
+
dvc add data/raw/new_dataset.db
|
| 115 |
+
git add data/raw/.gitignore data/raw/new_dataset.db.dvc
|
| 116 |
+
|
| 117 |
+
# Push to remote
|
| 118 |
+
dvc push
|
| 119 |
+
git commit -m "Add new dataset version"
|
| 120 |
+
git push
|
| 121 |
+
```
|
| 122 |
+
|
| 123 |
+
### Experiment Tracking (MLflow)
|
| 124 |
+
|
| 125 |
+
**Design Decision:** Remote MLflow instance on DagsHub for collaborative experiment tracking.
|
| 126 |
+
|
| 127 |
+
| Configuration | Value |
|
| 128 |
+
|---------------|-------|
|
| 129 |
+
| Tracking URI | `https://dagshub.com/se4ai2526-uniba/Hopcroft.mlflow` |
|
| 130 |
+
| Experiments | `skill_classification`, `skill_prediction_api` |
|
| 131 |
+
|
| 132 |
+
**Logged Metrics:**
|
| 133 |
+
- Training: precision, recall, F1-score, training time
|
| 134 |
+
- Inference: prediction latency, confidence scores, timestamps
|
| 135 |
+
|
| 136 |
+
**Artifact Storage:**
|
| 137 |
+
- Model binaries (`.pkl`)
|
| 138 |
+
- Vectorizers and scalers
|
| 139 |
+
- Hyperparameter configurations
|
| 140 |
+
|
| 141 |
+
### Auditable ML Pipeline
|
| 142 |
+
|
| 143 |
+
The pipeline is designed for complete reproducibility:
|
| 144 |
+
|
| 145 |
+
```
|
| 146 |
+
ββββββββββββββββ ββββββββββββββββ ββββββββββββββββ
|
| 147 |
+
β dataset.py βββββΆβ features.py βββββΆβ train.py β
|
| 148 |
+
β (DVC pull) β β (TF-IDF) β β (MLflow) β
|
| 149 |
+
ββββββββββββββββ ββββββββββββββββ ββββββββββββββββ
|
| 150 |
+
β β β
|
| 151 |
+
βΌ βΌ βΌ
|
| 152 |
+
.dvc files .dvc files MLflow Run
|
| 153 |
+
```
|
| 154 |
+
|
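The three stages above can be wired together in a `dvc.yaml` so that `dvc repro` re-runs only what changed. This is an illustrative sketch, not the repository's actual configuration; stage names and output paths are assumptions based on the files listed elsewhere in this document:

```yaml
# Illustrative dvc.yaml sketch of the pipeline above.
stages:
  features:
    cmd: python hopcroft_skill_classification_tool_competition/features.py
    deps:
      - data/raw/skillscope_data.db
    outs:
      - data/processed/tfidf_features.npy
  train:
    cmd: python hopcroft_skill_classification_tool_competition/modeling/train.py
    deps:
      - data/processed/tfidf_features.npy
    outs:
      - models/baseline_rf.pkl
```

DVC hashes each `deps`/`outs` entry, so any change to the raw database invalidates downstream stages automatically.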
| 155 |
+
---
|
| 156 |
+
|
| 157 |
+
## 3. Quality Assurance
|
| 158 |
+
|
| 159 |
+
### Testing Strategy
|
| 160 |
+
|
| 161 |
+
#### Static Analysis (Ruff)
|
| 162 |
+
|
| 163 |
+
**Design Decision:** Use Ruff as the primary linter for speed and comprehensive rule coverage.
|
| 164 |
+
|
| 165 |
+
| Configuration | Value |
|
| 166 |
+
|---------------|-------|
|
| 167 |
+
| Line Length | 88 (Black compatible) |
|
| 168 |
+
| Target Python | 3.10+ |
|
| 169 |
+
| Rule Sets | PEP 8, isort, pyflakes |
|
| 170 |
+
|
| 171 |
+
**CI Integration:**
|
| 172 |
+
```yaml
|
| 173 |
+
- name: Lint with Ruff
|
| 174 |
+
run: make lint
|
| 175 |
+
```
|
| 176 |
+
|
| 177 |
+
#### Dynamic Testing (Pytest)
|
| 178 |
+
|
| 179 |
+
**Test Organization:**
|
| 180 |
+
|
| 181 |
+
```
|
| 182 |
+
tests/
|
| 183 |
+
βββ unit/ # Isolated function tests
|
| 184 |
+
βββ integration/ # Component interaction tests
|
| 185 |
+
βββ system/ # End-to-end tests
|
| 186 |
+
βββ behavioral/ # ML-specific tests
|
| 187 |
+
βββ deepchecks/ # Data validation
|
| 188 |
+
βββ great expectations/ # Schema validation
|
| 189 |
+
```
|
| 190 |
+
|
| 191 |
+
**Markers for Selective Execution:**
|
| 192 |
+
```python
|
| 193 |
+
@pytest.mark.unit
|
| 194 |
+
@pytest.mark.integration
|
| 195 |
+
@pytest.mark.system
|
| 196 |
+
@pytest.mark.slow
|
| 197 |
+
```
|
| 198 |
+
|
| 199 |
+
### Model Validation vs Model Verification
|
| 200 |
+
|
| 201 |
+
| Concept | Definition | Implementation |
|
| 202 |
+
|---------|------------|----------------|
|
| 203 |
+
| **Validation** | Does the model fit user needs? | Micro-F1 vs baseline comparison |
|
| 204 |
+
| **Verification** | Is the model correctly built? | Unit tests, behavioral tests |
|
| 205 |
+
|
| 206 |
+
### Behavioral Testing
|
| 207 |
+
|
| 208 |
+
**Design Decision:** Implement CheckList-inspired behavioral tests to evaluate model robustness beyond accuracy metrics.
|
| 209 |
+
|
| 210 |
+
| Test Type | Count | Purpose |
|
| 211 |
+
|-----------|-------|---------|
|
| 212 |
+
| **Invariance** | 9 | Stability under perturbations (typos, case changes) |
|
| 213 |
+
| **Directional** | 10 | Expected behavior with keyword additions |
|
| 214 |
+
| **Minimum Functionality** | 17 | Basic sanity checks on clear examples |
|
| 215 |
+
|
| 216 |
+
**Example Invariance Test:**
|
| 217 |
+
```python
|
| 218 |
+
def test_case_insensitivity():
|
| 219 |
+
"""Model should predict same skills regardless of case."""
|
| 220 |
+
assert predict("Fix BUG") == predict("fix bug")
|
| 221 |
+
```
|
| 222 |
+
|
| 223 |
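A directional test checks that adding a strongly indicative keyword does not lower the corresponding skill's confidence. Sketched here against a stand-in `predict_confidence` scorer (the real model interface may differ; the keyword map is purely illustrative):

```python
def predict_confidence(text, skill):
    """Stand-in scorer for illustration: keyword frequency as confidence."""
    keywords = {"database": ("sql", "database", "query")}
    words = text.lower().split()
    hits = sum(words.count(k) for k in keywords[skill])
    return hits / max(len(words), 1)


def test_directional_database_keyword():
    """Appending 'SQL query' should not lower the 'database' confidence."""
    base = predict_confidence("Fix slow page load", "database")
    boosted = predict_confidence("Fix slow page load SQL query", "database")
    assert boosted >= base


test_directional_database_keyword()
print("directional test passed")
```

The same pattern applies to the real model by swapping in its prediction function and asserting on the returned per-skill confidence scores.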
+
### Data Quality Checks
|
| 224 |
+
|
| 225 |
+
#### Great Expectations (10 Tests)
|
| 226 |
+
|
| 227 |
+
**Design Decision:** Validate data at pipeline boundaries to catch quality issues early.
|
| 228 |
+
|
| 229 |
+
| Validation Point | Tests |
|
| 230 |
+
|------------------|-------|
|
| 231 |
+
| Raw Database | Schema, row count, required columns |
|
| 232 |
+
| Feature Matrix | No NaN/Inf, sparsity, SMOTE compatibility |
|
| 233 |
+
| Label Matrix | Binary format, distribution, consistency |
|
| 234 |
+
| Train/Test Split | No leakage, stratification |
|
| 235 |
+
|
| 236 |
+
#### Deepchecks (24 Checks)
|
| 237 |
+
|
| 238 |
+
**Suites:**
|
| 239 |
+
- **Data Integrity Suite** (12 checks): Duplicates, nulls, correlations
|
| 240 |
+
- **Train-Test Validation Suite** (12 checks): Leakage, drift, distribution
|
| 241 |
+
|
| 242 |
+
**Status:** Production-ready (96% overall score)
|
| 243 |
+
|
| 244 |
+
---
|
| 245 |
+
|
| 246 |
+
## 4. API (Inference Service)
|
| 247 |
+
|
| 248 |
+
### FastAPI Implementation
|
| 249 |
+
|
| 250 |
+
**Design Decision:** Use FastAPI for async request handling, automatic OpenAPI generation, and native Pydantic validation.
|
| 251 |
+
|
| 252 |
+
**Key Features:**
|
| 253 |
+
- Async lifespan management for model loading
|
| 254 |
+
- Middleware for Prometheus metrics collection
|
| 255 |
+
- Structured exception handling
|
| 256 |
+
|
| 257 |
+
### RESTful Principles
|
| 258 |
+
|
| 259 |
+
**Design Decision:** Follow REST best practices for intuitive API design.
|
| 260 |
+
|
| 261 |
+
| Principle | Implementation |
|
| 262 |
+
|-----------|----------------|
|
| 263 |
+
| **Nouns, not verbs** | `/predictions` instead of `/getPrediction` |
|
| 264 |
+
| **Plural resources** | `/predictions`, `/issues` |
|
| 265 |
+
| **HTTP methods** | GET (retrieve), POST (create) |
|
| 266 |
+
| **Status codes** | 200 (OK), 201 (Created), 404 (Not Found), 500 (Error) |
|
| 267 |
+
|
| 268 |
+
**Endpoint Design:**
|
| 269 |
+
|
| 270 |
+
| Method | Endpoint | Action |
|
| 271 |
+
|--------|----------|--------|
|
| 272 |
+
| `POST` | `/predict` | Create new prediction |
|
| 273 |
+
| `POST` | `/predict/batch` | Create batch predictions |
|
| 274 |
+
| `GET` | `/predictions` | List predictions |
|
| 275 |
+
| `GET` | `/predictions/{run_id}` | Get specific prediction |
|
| 276 |
+
|
| 277 |
+
### OpenAPI/Swagger Documentation
|
| 278 |
+
|
| 279 |
+
**Auto-generated documentation at runtime:**
|
| 280 |
+
- Swagger UI: `/docs`
|
| 281 |
+
- ReDoc: `/redoc`
|
| 282 |
+
- OpenAPI JSON: `/openapi.json`
|
| 283 |
+
|
| 284 |
+
**Pydantic Models for Schema Enforcement:**
|
| 285 |
+
```python
|
| 286 |
+
class IssueInput(BaseModel):
|
| 287 |
+
issue_text: str
|
| 288 |
+
repo_name: Optional[str] = None
|
| 289 |
+
pr_number: Optional[int] = None
|
| 290 |
+
|
| 291 |
+
class PredictionResponse(BaseModel):
|
| 292 |
+
run_id: str
|
| 293 |
+
predictions: List[SkillPrediction]
|
| 294 |
+
model_version: str
|
| 295 |
+
```
|
| 296 |
+
|
| 297 |
+
---
|
| 298 |
+
|
| 299 |
+
## 5. Deployment (Containerization & CI/CD)
|
| 300 |
+
|
| 301 |
+
### Docker Containerization
|
| 302 |
+
|
| 303 |
+
**Design Decision:** Multi-stage Docker builds with security best practices.
|
| 304 |
+
|
| 305 |
+
**Dockerfile Features:**
|
| 306 |
+
- Python 3.10 slim base image (minimal footprint)
|
| 307 |
+
- Non-root user for security
|
| 308 |
+
- DVC integration for model pulling
|
| 309 |
+
- Health check endpoint configuration
|
| 310 |
+
|
| 311 |
+
**Multi-Service Architecture:**
|
| 312 |
+
|
| 313 |
+
```
|
| 314 |
+
docker-compose.yml
|
| 315 |
+
βββ hopcroft-api (FastAPI)
|
| 316 |
+
β βββ Port: 8080
|
| 317 |
+
β βββ Volumes: source code, logs
|
| 318 |
+
β βββ Health check: /health
|
| 319 |
+
β
|
| 320 |
+
βββ hopcroft-gui (Streamlit)
|
| 321 |
+
β βββ Port: 8501
|
| 322 |
+
β βββ Depends on: hopcroft-api
|
| 323 |
+
β βββ Environment: API_BASE_URL
|
| 324 |
+
β
|
| 325 |
+
βββ hopcroft-net (Bridge network)
|
| 326 |
+
```
|
| 327 |
+
|
| 328 |
+
**Design Rationale:**
|
| 329 |
+
- Separation of concerns (API vs GUI)
|
| 330 |
+
- Independent scaling
|
| 331 |
+
- Health-based dependency management
|
| 332 |
+
- Shared network for internal communication
|
| 333 |
+
|
| 334 |
+
### CI/CD Pipeline (GitHub Actions)
|
| 335 |
+
|
| 336 |
+
**Design Decision:** Implement Continuous Delivery for ML (CD4ML) with automated testing and image builds.
|
| 337 |
+
|
| 338 |
+
**Pipeline Stages:**
|
| 339 |
+
|
| 340 |
+
```yaml
|
| 341 |
+
Jobs:
|
| 342 |
+
unit-tests:
|
| 343 |
+
- Checkout code
|
| 344 |
+
- Setup Python 3.10
|
| 345 |
+
- Install dependencies
|
| 346 |
+
- Ruff linting
|
| 347 |
+
- Pytest unit tests
|
| 348 |
+
- Upload test report (on failure)
|
| 349 |
+
|
| 350 |
+
build-image:
|
| 351 |
+
- Needs: unit-tests
|
| 352 |
+
- Configure DVC credentials
|
| 353 |
+
- Pull models
|
| 354 |
+
- Build Docker image
|
| 355 |
+
```
|
| 356 |
+
|
| 357 |
+
**Triggers:**
|
| 358 |
+
- Push to `main`, `feature/*`
|
| 359 |
+
- Pull requests to `main`
|
| 360 |
+
|
| 361 |
+
**Secrets Management:**
|
| 362 |
+
- `DAGSHUB_USERNAME`: DagsHub authentication
|
| 363 |
+
- `DAGSHUB_TOKEN`: DagsHub access token
|
| 364 |
+
|
| 365 |
+
### Hugging Face Spaces Hosting
|
| 366 |
+
|
| 367 |
+
**Design Decision:** Deploy on HF Spaces for free GPU-enabled hosting with Docker SDK support.
|
| 368 |
+
|
| 369 |
+
**Configuration:**
|
| 370 |
+
```yaml
|
| 371 |
+
---
|
| 372 |
+
title: Hopcroft Skill Classification
|
| 373 |
+
sdk: docker
|
| 374 |
+
app_port: 7860
|
| 375 |
+
---
|
| 376 |
+
```
|
| 377 |
+
|
| 378 |
+
**Startup Flow:**
|
| 379 |
+
1. `start_space.sh` configures DVC credentials
|
| 380 |
+
2. Pull models from DagsHub
|
| 381 |
+
3. Start FastAPI (port 8000)
|
| 382 |
+
4. Start Streamlit (port 8501)
|
| 383 |
+
5. Start Nginx (port 7860) for routing
|
| 384 |
+
|
| 385 |
+
**Nginx Reverse Proxy:**
|
| 386 |
+
- `/` β Streamlit GUI
|
| 387 |
+
- `/docs`, `/predict`, `/predictions` β FastAPI
|
| 388 |
+
- `/prometheus` β Prometheus metrics
|
| 389 |
+
|
| 390 |
+
---
|
| 391 |
+
|
| 392 |
+
## 6. Monitoring
|
| 393 |
+
|
| 394 |
+
### Resource-Level Monitoring
|
| 395 |
+
|
| 396 |
+
**Design Decision:** Implement Prometheus metrics for real-time observability.
|
| 397 |
+
|
| 398 |
+
| Metric | Type | Purpose |
|
| 399 |
+
|--------|------|---------|
|
| 400 |
+
| `hopcroft_requests_total` | Counter | Request volume by endpoint |
|
| 401 |
+
| `hopcroft_request_duration_seconds` | Histogram | Latency distribution (P50, P90, P99) |
|
| 402 |
+
| `hopcroft_in_progress_requests` | Gauge | Concurrent request load |
|
| 403 |
+
| `hopcroft_prediction_processing_seconds` | Summary | Model inference time |
|
| 404 |
+
|
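The four metrics in the table might be declared with `prometheus_client` as follows. This is a sketch: metric names match the table, but the label sets are assumptions inferred from the middleware snippet below:

```python
from prometheus_client import Counter, Gauge, Histogram, Summary

# Metric names mirror the table above; label sets are illustrative.
REQUESTS_TOTAL = Counter(
    "hopcroft_requests_total",
    "Total HTTP requests",
    ["method", "endpoint", "status"],
)
REQUEST_LATENCY = Histogram(
    "hopcroft_request_duration_seconds",
    "Request latency in seconds",
    ["method", "endpoint"],
)
IN_PROGRESS = Gauge(
    "hopcroft_in_progress_requests",
    "Requests currently being handled",
)
PREDICTION_TIME = Summary(
    "hopcroft_prediction_processing_seconds",
    "Model inference time in seconds",
)
```

Counters only go up, gauges move both ways, and histograms bucket observations so Prometheus can derive the P50/P90/P99 percentiles mentioned above.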
| 405 |
+
**Middleware Implementation:**
|
| 406 |
+
```python
|
| 407 |
+
@app.middleware("http")
|
| 408 |
+
async def monitor_requests(request, call_next):
|
| 409 |
+
IN_PROGRESS.inc()
|
| 410 |
+
with REQUEST_LATENCY.labels(method, endpoint).time():
|
| 411 |
+
response = await call_next(request)
|
| 412 |
+
REQUESTS_TOTAL.labels(method, endpoint, status).inc()
|
| 413 |
+
IN_PROGRESS.dec()
|
| 414 |
+
return response
|
| 415 |
+
```
|
| 416 |
+
|
| 417 |
+
### Performance-Level Monitoring

**Model Staleness Indicators:**
- Prediction confidence trends over time
- Drift detection alerts
- Error rate monitoring

### Drift Detection Strategy

**Design Decision:** Implement statistical drift detection using the Kolmogorov-Smirnov test with Bonferroni correction.

| Component | Details |
|-----------|---------|
| **Algorithm** | KS Two-Sample Test |
| **Baseline** | 1000 samples from training data |
| **Threshold** | p-value < 0.05 (Bonferroni corrected) |
| **Execution** | Scheduled via cron or manual trigger |

**Drift Types Monitored:**

| Type | Definition | Detection Method |
|------|------------|------------------|
| **Data Drift** | Feature distribution shift | KS test on input features |
| **Target Drift** | Label distribution shift | Chi-square test on predictions |
| **Concept Drift** | Relationship change | Performance degradation monitoring |

**Metrics Published to Pushgateway:**
- `drift_detected`: Binary indicator (0/1)
- `drift_p_value`: Statistical significance
- `drift_distance`: KS distance metric
- `drift_check_timestamp`: Last check time

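To make the table concrete, here is a small pure-Python sketch of the two ingredients: the KS distance between empirical CDFs and the per-feature Bonferroni-corrected threshold. The function names (`ks_two_sample`, `drift_report`) are illustrative, not the project's actual module; the deployed job can obtain the same numbers from `scipy.stats.ks_2samp`.

```python
import math

def _kolmogorov_q(lam):
    """Asymptotic KS survival function Q(lam); returns 1.0 when the
    alternating series does not converge (lam is very small)."""
    if lam < 1e-8:
        return 1.0
    total, sign = 0.0, 1.0
    for k in range(1, 101):
        term = sign * 2.0 * math.exp(-2.0 * (k * lam) ** 2)
        total += term
        if abs(term) < 1e-12:
            return min(1.0, max(0.0, total))
        sign = -sign
    return 1.0  # no convergence: treat as "no evidence of drift"

def ks_two_sample(baseline, current):
    """KS distance between the empirical CDFs of two samples, plus p-value."""
    xs, ys = sorted(baseline), sorted(current)
    n, m = len(xs), len(ys)
    d = 0.0
    for v in xs + ys:
        cdf_x = sum(1 for x in xs if x <= v) / n
        cdf_y = sum(1 for y in ys if y <= v) / m
        d = max(d, abs(cdf_x - cdf_y))
    en = math.sqrt(n * m / (n + m))
    return d, _kolmogorov_q((en + 0.12 + 0.11 / en) * d)

def drift_report(baseline_features, current_features, alpha=0.05):
    """One KS test per feature, with a Bonferroni-corrected significance level."""
    corrected = alpha / len(baseline_features)
    report = {}
    for name, baseline in baseline_features.items():
        d, p = ks_two_sample(baseline, current_features[name])
        report[name] = {"drift_distance": d, "drift_p_value": p,
                        "drift_detected": int(p < corrected)}
    return report
```

The three values per feature map directly onto the `drift_distance`, `drift_p_value`, and `drift_detected` metrics pushed to the Pushgateway.
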
### Alerting Configuration

**Prometheus Alert Rules:**

| Alert | Condition | Severity |
|-------|-----------|----------|
| `ServiceDown` | Target down for 5m | Critical |
| `HighErrorRate` | 5xx rate > 10% | Warning |
| `SlowRequests` | P95 latency > 2s | Warning |
| `DriftDetected` | drift_detected = 1 | Warning |

**Alertmanager Integration:**
- Severity-based routing
- Email notifications
- Inhibition rules to prevent alert storms

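As an illustration of the `HighErrorRate` rule's semantics (in production this is a PromQL expression over `hopcroft_requests_total`, not application code), the check reduces to comparing the share of 5xx responses within the evaluation window against the 10% threshold:

```python
def error_rate(window):
    """window: list of (status_code, count_delta) samples scraped over 5 minutes."""
    total = sum(count for _, count in window)
    errors = sum(count for status, count in window if 500 <= status < 600)
    return errors / total if total else 0.0

def high_error_rate_firing(window, threshold=0.10):
    """True when the 5xx share of traffic exceeds the alert threshold."""
    return error_rate(window) > threshold
```
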
### Grafana Visualization

**Dashboard Panels:**

1. API Request Rate (time series)
2. API Latency Percentiles (heatmap)
3. Drift Detection Status (stat panel)
4. Drift P-Value Trend (time series)
5. Error Rate (gauge)

**Data Sources:**
- Prometheus: Real-time metrics
- Pushgateway: Batch job metrics (drift detection)

### HF Spaces Deployment

Both Prometheus and Grafana are deployed on Hugging Face Spaces behind an Nginx reverse proxy:

| Service | Production URL |
|---------|----------------|
| Prometheus | `https://dacrow13-hopcroft-skill-classification.hf.space/prometheus/` |
| Grafana | `https://dacrow13-hopcroft-skill-classification.hf.space/grafana/` |

This enables real-time monitoring of the production deployment without additional infrastructure.

docs/docs/getting-started.md DELETED

@@ -1,6 +0,0 @@

Getting started
===============

This is where you describe how to get set up on a clean install, including the
commands necessary to get the raw data (using the `sync_data_from_s3` command,
for example), and then how to make the cleaned, final data sets.

docs/docs/index.md DELETED

@@ -1,10 +0,0 @@

# Hopcroft_Skill-Classification-Tool-Competition documentation!

## Description

The task involves analyzing the relationship between issue characteristics and required skills, developing effective feature extraction methods that combine textual and code-context information, and implementing sophisticated multi-label classification approaches. Students may incorporate additional GitHub metadata to enhance model inputs, but must avoid using third-party classification engines or direct outputs from the provided database. The work requires careful attention to the multi-label nature of the problem, where each issue may require multiple different skills for resolution.

## Commands

The Makefile contains the central entry points for common tasks related to this project.

docs/milestone_summaries.md ADDED

@@ -0,0 +1,288 @@

# Milestone Summaries

This document provides a comprehensive overview of all six project milestones, documenting the evolution of the Hopcroft Skill Classification system from requirements engineering through production monitoring.

---

## Milestone 1: Requirements Engineering

**Objective:** Define the problem space, stakeholders, and success criteria using the Machine Learning Canvas framework.

### Key Deliverables

| Component | Description |
|-----------|-------------|
| **Prediction Task** | Multi-label classification of 217 technical skills from GitHub issue/PR text |
| **Stakeholders** | Project managers, team leads, developers |
| **Data Source** | SkillScope DB with 7,245 merged PRs from 11 Java repositories |
| **Success Metrics** | Micro-F1 score improvement over baseline, precision/recall balance |

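Since micro-F1 is the headline success metric, a minimal sketch of how it pools true/false positives across all (sample, label) slots before computing F1 (illustrative; the project presumably uses `sklearn.metrics.f1_score(average="micro")`):

```python
def micro_f1(y_true, y_pred):
    """Micro-averaged F1 for multi-label data.

    y_true, y_pred: lists of binary label vectors (one vector per sample).
    Counts are pooled over every (sample, label) slot before computing F1."""
    tp = fp = fn = 0
    for true_vec, pred_vec in zip(y_true, y_pred):
        for t, p in zip(true_vec, pred_vec):
            tp += 1 if (t and p) else 0
            fp += 1 if (p and not t) else 0
            fn += 1 if (t and not p) else 0
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return 2 * precision * recall / (precision + recall) if precision + recall else 0.0
```
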
### ML Canvas Framework

The complete ML Canvas is documented in [ML Canvas.md](./ML%20Canvas.md), covering:

- **Value Proposition**: Automated task assignment optimization
- **Decisions**: Resource allocation for issue resolution
- **Data Collection**: Automated labeling via API call detection
- **Impact Simulation**: Outperform SkillScope RF + TF-IDF baseline
- **Monitoring**: Continuous evaluation with drift detection

### Identified Risks & Mitigations

| Risk | Mitigation Strategy |
|------|---------------------|
| Label imbalance (217 classes) | SMOTE, MLSMOTE, ADASYN oversampling |
| Text noise (URLs, HTML, code) | Custom preprocessing pipeline |
| Multi-label complexity | MultiOutputClassifier with stratified splits |

---

## Milestone 2: Data Management & Experiment Tracking

**Objective:** Establish end-to-end infrastructure for reproducible ML experiments.

### Data Pipeline

```
data/raw/  →  dataset.py  →  data/processed/
(SkillScope SQLite)  (HuggingFace)  (Clean CSV)
         ↓
     features.py
         ↓
  data/processed/
(TF-IDF/Embeddings)
```

### Key Components

1. **Data Management**
   - DVC setup with DagsHub remote storage
   - Git-ignored data and model directories
   - Version-controlled `.dvc` files for reproducibility

2. **Data Ingestion**
   - `dataset.py`: Downloads SkillScope from Hugging Face
   - Extracts SQLite database with cleanup

3. **Feature Engineering**
   - `features.py`: Text cleaning pipeline
   - URL/HTML/Markdown removal
   - Normalization and Porter stemming
   - TF-IDF vectorization (uni+bi-grams)
   - Sentence embedding generation

4. **Configuration**
   - `config.py`: Centralized paths, hyperparameters, MLflow URI

5. **Experiment Tracking**
   - MLflow with DagsHub remote
   - Logged metrics: precision, recall, F1-score
   - Artifact storage: models, vectorizers, scalers

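The cleaning steps above can be sketched with stdlib regexes. This is an illustrative approximation, not `features.py` verbatim; the exact rules and their ordering in the project may differ:

```python
import re

def clean_issue_text(text: str) -> str:
    """Illustrative cleaning pass: strip code blocks, URLs, and HTML,
    then drop punctuation, normalize whitespace, and lowercase."""
    text = re.sub(r"`{3}[\s\S]*?`{3}", " ", text)   # fenced code blocks
    text = re.sub(r"`[^`]*`", " ", text)            # inline code
    text = re.sub(r"https?://\S+", " ", text)       # URLs
    text = re.sub(r"<[^>]+>", " ", text)            # HTML tags
    text = re.sub(r"[^a-zA-Z0-9\s]", " ", text)     # punctuation
    return re.sub(r"\s+", " ", text).strip().lower()
```
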
### Training Actions

| Action | Description |
|--------|-------------|
| `baseline` | Random Forest with TF-IDF |
| `mlsmote` | Multi-label SMOTE oversampling |
| `ros` | Random Oversampling |
| `adasyn-pca` | ADASYN + PCA dimensionality reduction |
| `lightgbm` | LightGBM classifier |

---

## Milestone 3: Quality Assurance

**Objective:** Implement a comprehensive testing and validation framework for data quality and model robustness.

### Data Cleaning Pipeline

| Metric | Before | After | Resolution |
|--------|--------|-------|------------|
| Total Samples | 7,154 | 6,673 | -481 duplicates |
| Duplicates | 481 | 0 | Exact match removal |
| Label Conflicts | 640 | 0 | Majority voting |
| Data Leakage | Present | 0 | Train/test separation |

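The majority-voting step used to resolve conflicting labels can be sketched as follows (illustrative names, not the project's actual code): group rows by identical text, then keep the most frequent label per group.

```python
from collections import Counter, defaultdict

def resolve_label_conflicts(samples):
    """samples: list of (text, label) pairs where the same text may appear
    with conflicting labels. Keeps one entry per text, choosing the majority
    label; ties fall back to first-seen order via Counter.most_common."""
    votes = defaultdict(list)
    for text, label in samples:
        votes[text].append(label)
    return {text: Counter(labels).most_common(1)[0][0]
            for text, labels in votes.items()}
```
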
### Validation Frameworks

#### Great Expectations (10 Tests)

| Test | Purpose | Status |
|------|---------|--------|
| Database Schema | Validate SQLite structure | ✅ Pass |
| TF-IDF Matrix | No NaN/Inf, sparsity checks | ✅ Pass |
| Binary Labels | Values in {0,1} | ✅ Pass |
| Feature-Label Alignment | Row count consistency | ✅ Pass |
| Label Distribution | Min 5 occurrences per label | ✅ Pass |
| SMOTE Compatibility | Min 10 non-zero features | ✅ Pass |
| Multi-Output Format | >50% multi-label samples | ✅ Pass |
| Duplicate Detection | No duplicate features | ✅ Pass |
| Train-Test Separation | Zero intersection | ✅ Pass |
| Label Consistency | Same features → same labels | ✅ Pass |

#### Deepchecks (24 Checks)

- **Data Integrity Suite**: 92% score (12 checks)
- **Train-Test Validation Suite**: 100% score (12 checks)
- **Overall Status**: Production-ready (96% combined)

#### Behavioral Testing (36 Tests)

| Category | Tests | Description |
|----------|-------|-------------|
| Invariance | 9 | Typo, case, punctuation robustness |
| Directional | 10 | Keyword addition effects |
| Minimum Functionality | 17 | Basic skill predictions |

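An invariance test from the behavioral suite boils down to asserting that predictions stay stable under benign perturbations. A self-contained sketch with a stand-in keyword model (hypothetical; the real suite calls the trained classifier):

```python
import string

def predict_skills(text):
    """Stand-in for the real model: keyword lookup on normalized text."""
    normalized = text.lower().translate(str.maketrans("", "", string.punctuation))
    keywords = {"authentication": "authentication", "database": "database", "ui": "ui"}
    return sorted({skill for word, skill in keywords.items() if word in normalized.split()})

def check_invariance(base_text, perturbed_texts):
    """Invariance check: predictions must not change under benign perturbations
    such as case changes or added punctuation."""
    expected = predict_skills(base_text)
    return all(predict_skills(t) == expected for t in perturbed_texts)
```
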
### Code Quality

- **Ruff Analysis**: 28 minor issues (100% fixable)
- **Standards**: PEP 8 compliant, Black compatible

Full details: [testing_and_validation.md](./testing_and_validation.md)

---

## Milestone 4: API Development

**Objective:** Implement a production-ready REST API for skill prediction with MLflow integration.

### Endpoints

| Method | Endpoint | Description |
|--------|----------|-------------|
| `POST` | `/predict` | Single issue prediction |
| `POST` | `/predict/batch` | Batch predictions (max 100) |
| `GET` | `/predictions/{run_id}` | Retrieve by MLflow Run ID |
| `GET` | `/predictions` | List recent predictions |
| `GET` | `/health` | Service health check |
| `GET` | `/metrics` | Prometheus metrics |

### Features

- **FastAPI Framework**: Async request handling, auto-generated OpenAPI docs
- **MLflow Integration**: All predictions logged with metadata
- **Pydantic Validation**: Request/response schema enforcement
- **Prometheus Metrics**: Request counters, latency histograms, gauges

### Documentation Access

- Swagger UI: `/docs`
- ReDoc: `/redoc`
- OpenAPI JSON: `/openapi.json`

---

## Milestone 5: Deployment & Containerization

**Objective:** Implement containerized deployment with a CI/CD pipeline for production delivery.

### Docker Architecture

```
docker/docker-compose.yml
├── hopcroft-api (FastAPI Backend)
│   ├── Port: 8080
│   ├── Health Check: /health
│   └── Volumes: source code, logs
│
├── hopcroft-gui (Streamlit Frontend)
│   ├── Port: 8501
│   └── Depends on: hopcroft-api
│
└── hopcroft-net (Bridge Network)
```

### Hugging Face Spaces Deployment

| Component | Configuration |
|-----------|---------------|
| SDK | Docker |
| Port | 7860 |
| Startup Script | `docker/scripts/start_space.sh` |
| Secrets | `DAGSHUB_USERNAME`, `DAGSHUB_TOKEN` |

**Startup Flow:**

1. Configure DVC with secrets
2. Pull models from DagsHub
3. Start FastAPI (port 8000)
4. Start Streamlit (port 8501)
5. Start Nginx reverse proxy (port 7860)

### CI/CD Pipeline (GitHub Actions)

```yaml
Triggers: push/PR to main, feature/*
Jobs:
  1. unit-tests
     - Ruff linting
     - Pytest unit tests
     - HTML report generation

  2. build-image (requires unit-tests)
     - DVC model pull
     - Docker image build
```

---

## Milestone 6: Monitoring & Observability

**Objective:** Implement comprehensive monitoring infrastructure with drift detection.

### Prometheus Metrics

| Metric | Type | Description |
|--------|------|-------------|
| `hopcroft_requests_total` | Counter | Total requests by method/endpoint |
| `hopcroft_request_duration_seconds` | Histogram | Request latency distribution |
| `hopcroft_in_progress_requests` | Gauge | Currently processing requests |
| `hopcroft_prediction_processing_seconds` | Summary | Model inference time |

### Grafana Dashboards

- **API Request Rate**: Real-time requests per second
- **API Latency**: P50, P90, P99 percentiles
- **Drift Detection Status**: Binary indicator (0/1)
- **Drift P-Value**: Statistical significance metric

### Data Drift Detection

| Component | Details |
|-----------|---------|
| Algorithm | Kolmogorov-Smirnov Two-Sample Test |
| Baseline | 1000 samples from training data |
| Threshold | p-value < 0.05 (Bonferroni corrected) |
| Metrics | `drift_detected`, `drift_p_value`, `drift_distance` |

### Alerting Rules

| Alert | Condition |
|-------|-----------|
| `ServiceDown` | Target unreachable for 5m |
| `HighErrorRate` | 5xx rate > 10% for 5m |
| `SlowRequests` | P95 latency > 2s |

### Load Testing (Locust)

| Task | Weight | Endpoint |
|------|--------|----------|
| Single Prediction | 60% | `POST /predict` |
| Batch Prediction | 20% | `POST /predict/batch` |
| Monitoring | 20% | `GET /health`, `/predictions` |

### HF Spaces Monitoring Access

Both Prometheus and Grafana are available on the production deployment:

| Service | URL |
|---------|-----|
| Prometheus | https://dacrow13-hopcroft-skill-classification.hf.space/prometheus/ |
| Grafana | https://dacrow13-hopcroft-skill-classification.hf.space/grafana/ |

### Uptime Monitoring (Better Stack)

- External monitoring from multiple locations
- Email notifications on failures
- Tracked endpoints: `/health`, `/openapi.json`, `/docs`

docs/user_guide.md ADDED

@@ -0,0 +1,497 @@

# User Guide

Complete operational guide for the Hopcroft Skill Classification system covering all components: API, GUI, load testing, and monitoring.

---

## Table of Contents

1. [System Setup](#1-system-setup)
2. [API Usage](#2-api-usage)
3. [GUI (Streamlit)](#3-gui-streamlit)
4. [Load Testing (Locust)](#4-load-testing-locust)
5. [Monitoring (Prometheus & Grafana)](#5-monitoring-prometheus--grafana)

---

## 1. System Setup

### Prerequisites

| Requirement | Version | Purpose |
|-------------|---------|---------|
| Python | 3.10+ | Runtime environment |
| Docker | 20.10+ | Containerization |
| Docker Compose | 2.0+ | Multi-service orchestration |
| Git | 2.30+ | Version control |

### Option A: Docker Setup (Recommended)

**1. Clone and Configure**

```bash
git clone https://github.com/se4ai2526-uniba/Hopcroft.git
cd Hopcroft

# Create environment file
cp .env.example .env
```

**2. Edit `.env` with Your Credentials**

```env
MLFLOW_TRACKING_URI=https://dagshub.com/se4ai2526-uniba/Hopcroft.mlflow
MLFLOW_TRACKING_USERNAME=your_dagshub_username
MLFLOW_TRACKING_PASSWORD=your_dagshub_token
```

> [!TIP]
> Get your DagsHub token at: https://dagshub.com/user/settings/tokens

**3. Start All Services**

```bash
docker compose -f docker/docker-compose.yml up -d --build
```

**4. Verify Services**

| Service | URL | Purpose |
|---------|-----|---------|
| API (Swagger) | http://localhost:8080/docs | Interactive API documentation |
| GUI (Streamlit) | http://localhost:8501 | Web interface |
| Health Check | http://localhost:8080/health | Service status |

### Option B: Virtual Environment Setup

**1. Create Virtual Environment**

```bash
python -m venv venv

# Windows
venv\Scripts\activate

# Linux/macOS
source venv/bin/activate
```

**2. Install Dependencies**

```bash
pip install -r requirements.txt
pip install -e .
```

**3. Configure DVC (for Model Access)**

```bash
dvc remote modify origin --local auth basic
dvc remote modify origin --local user YOUR_DAGSHUB_USERNAME
dvc remote modify origin --local password YOUR_DAGSHUB_TOKEN
dvc pull
```

**4. Start Services Manually**

```bash
# Terminal 1: Start API
make api-dev

# Terminal 2: Start Streamlit
streamlit run hopcroft_skill_classification_tool_competition/streamlit_app.py
```

### Docker Compose Commands Reference

| Command | Description |
|---------|-------------|
| `docker compose -f docker/docker-compose.yml up -d` | Start in background |
| `docker compose -f docker/docker-compose.yml down` | Stop all services |
| `docker compose -f docker/docker-compose.yml logs -f` | Stream logs |
| `docker compose -f docker/docker-compose.yml ps` | Check status |
| `docker compose -f docker/docker-compose.yml restart` | Restart services |

---

## 2. API Usage

### Base URLs

| Environment | URL |
|-------------|-----|
| Local (Docker) | http://localhost:8080 |
| Local (Dev) | http://localhost:8000 |
| Production (HF Spaces) | https://se4ai2526-uniba-hopcroft.hf.space |

### Endpoints Overview

| Method | Endpoint | Description |
|--------|----------|-------------|
| `POST` | `/predict` | Predict skills for single issue |
| `POST` | `/predict/batch` | Batch prediction (max 100) |
| `GET` | `/predictions` | List recent predictions |
| `GET` | `/predictions/{run_id}` | Get prediction by ID |
| `GET` | `/health` | Health check |
| `GET` | `/metrics` | Prometheus metrics |

### Interactive Documentation

Access Swagger UI for interactive testing:
- **Swagger**: http://localhost:8080/docs
- **ReDoc**: http://localhost:8080/redoc

### Example Requests

#### Single Prediction

```bash
curl -X POST "http://localhost:8080/predict" \
  -H "Content-Type: application/json" \
  -d '{
    "issue_text": "Fix authentication bug in OAuth2 login flow",
    "repo_name": "my-project",
    "pr_number": 42
  }'
```

**Response:**
```json
{
  "run_id": "abc123...",
  "predictions": [
    {"skill": "authentication", "confidence": 0.92},
    {"skill": "security", "confidence": 0.78},
    {"skill": "oauth", "confidence": 0.65}
  ],
  "model_version": "1.0.0",
  "timestamp": "2025-01-05T15:00:00Z"
}
```

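The same request can be issued from Python with the standard library. A sketch of building the payload and post-processing the response (helper names are illustrative; send the payload with `urllib.request` against the base URL above):

```python
import json

def build_predict_payload(issue_text, repo_name=None, pr_number=None):
    """Build the JSON body for POST /predict; optional fields are omitted when unset."""
    payload = {"issue_text": issue_text}
    if repo_name is not None:
        payload["repo_name"] = repo_name
    if pr_number is not None:
        payload["pr_number"] = pr_number
    return json.dumps(payload)

def skills_above(response_json, threshold=0.7):
    """Keep only predicted skills whose confidence meets the threshold."""
    data = json.loads(response_json)
    return [p["skill"] for p in data["predictions"] if p["confidence"] >= threshold]
```
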
#### Batch Prediction

```bash
curl -X POST "http://localhost:8080/predict/batch" \
  -H "Content-Type: application/json" \
  -d '{
    "issues": [
      {"issue_text": "Database connection timeout"},
      {"issue_text": "UI button not responding"}
    ]
  }'
```

#### List Predictions

```bash
curl "http://localhost:8080/predictions?limit=10&skip=0"
```

#### Health Check

```bash
curl "http://localhost:8080/health"
```

**Response:**
```json
{
  "status": "healthy",
  "model_loaded": true,
  "model_version": "1.0.0"
}
```

### Makefile Shortcuts

```bash
make test-api-health   # Test health endpoint
make test-api-predict  # Test prediction
make test-api-list     # List predictions
make test-api-all      # Run all API tests
```

---

## 3. GUI (Streamlit)

### Access Points

| Environment | URL |
|-------------|-----|
| Local (Docker) | http://localhost:8501 |
| Production | https://se4ai2526-uniba-hopcroft.hf.space |

### Features

- **Real-time Prediction**: Instant skill classification
- **Confidence Scores**: Probability for each predicted skill
- **Multiple Input Modes**: Quick input, detailed input, examples
- **API Health Indicator**: Connection status in sidebar

### User Interface

#### Main Dashboard

![Streamlit Home](figures/streamlit_home.png)

The sidebar displays:
- API connection status
- Confidence threshold slider
- Model information

#### Quick Input Mode

![Quick Input](figures/streamlit_quick_input.png)

1. Paste GitHub issue text
2. Click "Predict Skills"
3. View results instantly

#### Detailed Input Mode

![Detailed Input](figures/streamlit_detailed_input.png)

Optional metadata fields:
- Repository name
- PR number
- Extended description

#### Prediction Results

![Prediction Results](figures/streamlit_results.png)

Results display:
- Top-5 predicted skills with confidence bars
- Full predictions table with filtering
- Processing time metrics
- Raw JSON response (expandable)

#### Example Gallery

![Examples](figures/streamlit_examples.png)

Pre-loaded test cases:
- Authentication bugs
- ML feature requests
- Database issues
- UI enhancements

---

## 4. Load Testing (Locust)

### Installation

```bash
pip install locust
```

### Configuration

The Locust configuration is in `monitoring/locust/locustfile.py`:

| Task | Weight | Endpoint |
|------|--------|----------|
| Single Prediction | 60% (weight: 3) | `POST /predict` |
| Batch Prediction | 20% (weight: 1) | `POST /predict/batch` |
| Monitoring | 20% (weight: 1) | `GET /health`, `/predictions` |

### Running Load Tests

#### Web UI Mode

```bash
cd monitoring/locust
locust
```

Then open: http://localhost:8089

Configure in the Web UI:
- **Number of users**: Total concurrent users
- **Spawn rate**: Users per second to add
- **Host**: Target URL (e.g., `http://localhost:8080`)

#### Headless Mode

```bash
locust --headless \
  --users 50 \
  --spawn-rate 10 \
  --run-time 5m \
  --host http://localhost:8080 \
  --csv results
```

### Target URLs

| Environment | Host URL |
|-------------|----------|
| Local Docker | `http://localhost:8080` |
| Local Dev | `http://localhost:8000` |
| HF Spaces | `https://dacrow13-hopcroft-skill-classification.hf.space` |

### Interpreting Results

| Metric | Description | Target |
|--------|-------------|--------|
| RPS | Requests per second | Higher = better |
| Median Response Time | 50th percentile latency | < 500ms |
| 95th Percentile | Worst-case latency | < 2s |
| Failure Rate | Percentage of errors | < 1% |

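The median and 95th-percentile figures can be approximately reproduced from raw per-request latencies with the stdlib, which is handy when post-processing the `--csv` output (illustrative sketch; Locust itself uses its own percentile approximation):

```python
import statistics

def latency_summary(latencies_ms):
    """Return (median, p95) over per-request latencies in milliseconds,
    using inclusive quantiles over the observed sample."""
    cuts = statistics.quantiles(latencies_ms, n=100, method="inclusive")
    return statistics.median(latencies_ms), cuts[94]  # 95th percentile
```
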
|
| 345 |
+
---
|
| 346 |
+
|
| 347 |
+
## 5. Monitoring (Prometheus & Grafana)
|
| 348 |
+
|
| 349 |
+
### Access Points
|
| 350 |
+
|
| 351 |
+
**Local Development:**
|
| 352 |
+
|
| 353 |
+
| Service | URL |
|
| 354 |
+
|---------|-----|
|
| 355 |
+
| Prometheus | http://localhost:9090 |
|
| 356 |
+
| Grafana | http://localhost:3000 |
|
| 357 |
+
| Pushgateway | http://localhost:9091 |
|
| 358 |
+
|
| 359 |
+
**Hugging Face Spaces (Production):**
|
| 360 |
+
|
| 361 |
+
| Service | URL |
|
| 362 |
+
|---------|-----|
|
| 363 |
+
| Prometheus | https://dacrow13-hopcroft-skill-classification.hf.space/prometheus/ |
|
| 364 |
+
| Grafana | https://dacrow13-hopcroft-skill-classification.hf.space/grafana/ |
|
| 365 |
+
|
| 366 |
+
### Prometheus Metrics
|
| 367 |
+
|
| 368 |
+
Access the metrics endpoint: http://localhost:8080/metrics
|
| 369 |
+
|
| 370 |
+
#### Available Metrics
|
| 371 |
+
|
| 372 |
+
| Metric | Type | Description |
|
| 373 |
+
|--------|------|-------------|
|
| 374 |
+
| `hopcroft_requests_total` | Counter | Total requests by method/endpoint |
|
| 375 |
+
| `hopcroft_request_duration_seconds` | Histogram | Request latency distribution |
|
| 376 |
+
| `hopcroft_in_progress_requests` | Gauge | Currently processing requests |
|
| 377 |
+
| `hopcroft_prediction_processing_seconds` | Summary | Model inference time |
|
| 378 |
+
|
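
When scraped, each metric appears in Prometheus' text exposition format as one line per label combination. A stdlib sketch of picking values out of that payload (the sample payload below is fabricated; real label sets may differ):

```python
# Parse Prometheus text exposition format -- the payload GET /metrics returns.
# Skips HELP/TYPE comment lines; assumes samples carry no timestamps.
def parse_metrics(payload: str) -> dict:
    """Map 'name{labels}' -> float value, skipping comments and blanks."""
    samples = {}
    for line in payload.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        key, _, value = line.rpartition(" ")
        samples[key] = float(value)
    return samples


# Fabricated sample of what a scrape of the metrics above might return:
payload = """\
# HELP hopcroft_requests_total Total requests by method/endpoint
# TYPE hopcroft_requests_total counter
hopcroft_requests_total{method="POST",endpoint="/predict"} 42.0
hopcroft_in_progress_requests 3.0
"""
metrics = parse_metrics(payload)
```
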
#### Useful PromQL Queries

**Request Rate (per second)**
```promql
rate(hopcroft_requests_total[1m])
```

**Average Latency**
```promql
rate(hopcroft_request_duration_seconds_sum[5m]) / rate(hopcroft_request_duration_seconds_count[5m])
```

**In-Progress Requests**
```promql
hopcroft_in_progress_requests
```

**Model Prediction Time (P90)**
```promql
hopcroft_prediction_processing_seconds{quantile="0.9"}
```

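
Conceptually, `rate()` is just the per-second increase of a counter between scrapes, averaged over the window. In miniature (the sample numbers are made up):

```python
# What rate() computes, in plain Python: the per-second increase of a
# monotonically increasing counter between two scrapes.
def counter_rate(prev_value: float, prev_ts: float,
                 curr_value: float, curr_ts: float) -> float:
    """Per-second increase of a counter between two scrape timestamps."""
    return (curr_value - prev_value) / (curr_ts - prev_ts)


# e.g. hopcroft_requests_total going from 1200 to 1500 over a 60s window
# is a rate of 5 requests/second
assert counter_rate(1200, 0.0, 1500, 60.0) == 5.0
```
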
+
### Grafana Dashboards
|
| 402 |
+
|
| 403 |
+
The pre-configured dashboard includes:
|
| 404 |
+
|
| 405 |
+
| Panel | Description |
|
| 406 |
+
|-------|-------------|
|
| 407 |
+
| API Request Rate | Real-time requests per endpoint |
|
| 408 |
+
| API Latency | Response time distribution |
|
| 409 |
+
| Drift Detection Status | Binary indicator (0=No Drift, 1=Drift) |
|
| 410 |
+
| Drift P-Value | Statistical significance |
|
| 411 |
+
| Drift Distance | KS test distance metric |
|
| 412 |
+
|
| 413 |
+
### Data Drift Detection
|
| 414 |
+
|
| 415 |
+
#### Prepare Baseline (One-time)
|
| 416 |
+
|
| 417 |
+
```bash
|
| 418 |
+
cd monitoring/drift/scripts
|
| 419 |
+
python prepare_baseline.py
|
| 420 |
+
```
|
| 421 |
+
|
| 422 |
+
#### Run Drift Check
|
| 423 |
+
|
| 424 |
+
```bash
|
| 425 |
+
python run_drift_check.py
|
| 426 |
+
```
|
| 427 |
+
|
| 428 |
+
#### Verify Results
|
| 429 |
+
|
| 430 |
+
```bash
|
| 431 |
+
# Check Pushgateway
|
| 432 |
+
curl http://localhost:9091/metrics | grep drift
|
| 433 |
+
|
| 434 |
+
# PromQL queries
|
| 435 |
+
drift_detected
|
| 436 |
+
drift_p_value
|
| 437 |
+
drift_distance
|
| 438 |
+
```
|
| 439 |
+
|
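
The `drift_distance` metric is the two-sample Kolmogorov-Smirnov statistic: the largest gap between the empirical CDFs of the baseline and current feature distributions. `run_drift_check.py` presumably computes it with a stats library; the hand-rolled sketch below is only to illustrate the statistic itself.

```python
# Hand-rolled two-sample Kolmogorov-Smirnov distance: the statistic behind
# drift_distance. Illustrative only; real drift checks should use a stats
# library (e.g. scipy's ks_2samp).
def ks_distance(baseline, current) -> float:
    """Max gap between the two empirical CDFs over all observed points."""
    xs = sorted(set(baseline) | set(current))

    def ecdf(sample, x):
        return sum(v <= x for v in sample) / len(sample)

    return max(abs(ecdf(baseline, x) - ecdf(current, x)) for x in xs)


baseline = [0.1, 0.2, 0.2, 0.3, 0.4]
shifted = [0.6, 0.7, 0.7, 0.8, 0.9]
assert ks_distance(baseline, baseline) == 0.0
# fully separated samples give the maximum possible distance of 1.0
assert ks_distance(baseline, shifted) == 1.0
```
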
### Alerting Rules

Pre-configured alerts in `monitoring/prometheus/alert_rules.yml`:

| Alert | Condition | Severity |
|-------|-----------|----------|
| `ServiceDown` | Target down for 5m | Critical |
| `HighErrorRate` | 5xx > 10% for 5m | Warning |
| `SlowRequests` | P95 > 2s | Warning |

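
For reference, a Prometheus rule matching the `SlowRequests` row might be shaped like the sketch below; the thresholds come from the table, but the exact expression and group name are assumptions, not a copy of the repo's `alert_rules.yml`:

```yaml
groups:
  - name: hopcroft-api
    rules:
      - alert: SlowRequests
        # P95 latency over the last 5m, computed from the request histogram
        expr: histogram_quantile(0.95, rate(hopcroft_request_duration_seconds_bucket[5m])) > 2
        for: 5m
        labels:
          severity: warning
        annotations:
          summary: "95th percentile request latency above 2s"
```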
### Starting Monitoring Stack

```bash
# Start all monitoring services
docker compose up -d

# Verify containers
docker compose ps

# Check Prometheus targets
curl http://localhost:9090/targets
```

---

## Troubleshooting

### Common Issues

#### API Returns 500 Error

1. Check that the credentials in `.env` are correct
2. Restart services: `docker compose down && docker compose up -d`
3. Verify model files: `docker exec hopcroft-api ls -la /app/models/`

#### GUI Shows "API Unavailable"

1. Wait 30-60 seconds for API initialization
2. Check API health: `curl http://localhost:8080/health`
3. View logs: `docker compose logs hopcroft-api`

#### Port Already in Use

```bash
# Check port usage (Windows; on macOS/Linux use: lsof -i :8080)
netstat -ano | findstr :8080

# Stop conflicting containers
docker compose down
```

#### DVC Pull Fails

```bash
# Clean the cache and retry
rm -rf .dvc/cache
dvc pull
```