User Guide

Complete operational guide for the Hopcroft Skill Classification system covering all components: API, GUI, load testing, and monitoring.


Table of Contents

  1. System Setup
  2. API Usage
  3. GUI (Streamlit)
  4. Load Testing (Locust)
  5. Monitoring (Prometheus & Grafana)

1. System Setup (Local)

Prerequisites

| Requirement | Version | Purpose |
|---|---|---|
| Python | 3.10+ | Runtime environment |
| Docker | 20.10+ | Containerization |
| Docker Compose | 2.0+ | Multi-service orchestration |
| Git | 2.30+ | Version control |

Option A: Docker Setup

1. Clone and Configure

git clone https://github.com/se4ai2526-uniba/Hopcroft.git
cd Hopcroft

# Create environment file
cp .env.example .env

2. Edit .env with Your Credentials

MLFLOW_TRACKING_URI=https://dagshub.com/se4ai2526-uniba/Hopcroft.mlflow
MLFLOW_TRACKING_USERNAME=your_dagshub_username
MLFLOW_TRACKING_PASSWORD=your_dagshub_token

Get your DagsHub token at: https://dagshub.com/user/settings/tokens

3. Start All Services

docker compose -f docker/docker-compose.yml up -d --build

4. Verify Services

| Service | URL | Purpose |
|---|---|---|
| API (Swagger) | http://localhost:8080/docs | Interactive API documentation |
| GUI (Streamlit) | http://localhost:8501 | Web interface |
| Health Check | http://localhost:8080/health | Service status |

Option B: Virtual Environment Setup

1. Create Virtual Environment

python -m venv venv

# Windows
venv\Scripts\activate

# Linux/macOS
source venv/bin/activate

2. Install Dependencies

pip install -r requirements.txt
pip install -e .

3. Configure DVC (for Model Access)

dvc remote modify origin --local auth basic
dvc remote modify origin --local user YOUR_DAGSHUB_USERNAME
dvc remote modify origin --local password YOUR_DAGSHUB_TOKEN
dvc pull

4. Start Services Manually

# Terminal 1: Start API
make api-dev

# Terminal 2: Start Streamlit
streamlit run hopcroft_skill_classification_tool_competition/streamlit_app.py

Docker Compose Commands Reference

| Command | Description |
|---|---|
| docker compose -f docker/docker-compose.yml up -d | Start in background |
| docker compose -f docker/docker-compose.yml down | Stop all services |
| docker compose -f docker/docker-compose.yml logs -f | Stream logs |
| docker compose -f docker/docker-compose.yml ps | Check status |
| docker compose -f docker/docker-compose.yml restart | Restart services |

2. API Usage

Base URLs

| Environment | URL |
|---|---|
| Local (Docker) | http://localhost:8080 |
| Local (Dev) | http://localhost:8000 |
| Production (HF Spaces) | https://dacrow13-hopcroft-skill-classification.hf.space |

Endpoints Overview

| Method | Endpoint | Description |
|---|---|---|
| POST | /predict | Predict skills for a single issue |
| POST | /predict/batch | Batch prediction (max 100 issues) |
| GET | /predictions | List recent predictions |
| GET | /predictions/{run_id} | Get a prediction by ID |
| GET | /health | Health check |
| GET | /metrics | Prometheus metrics |

Interactive Documentation

Access the Swagger UI at http://localhost:8080/docs to explore and test every endpoint interactively.

Example Requests

Single Prediction

curl -X POST "http://localhost:8080/predict" \
  -H "Content-Type: application/json" \
  -d '{
    "issue_text": "Fix authentication bug in OAuth2 login flow",
    "repo_name": "my-project",
    "pr_number": 42
  }'

Response:

{
  "run_id": "abc123...",
  "predictions": [
    {"skill": "authentication", "confidence": 0.92},
    {"skill": "security", "confidence": 0.78},
    {"skill": "oauth", "confidence": 0.65}
  ],
  "model_version": "1.0.0",
  "timestamp": "2025-01-05T15:00:00Z"
}
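
The response is plain JSON and easy to post-process client-side. A minimal Python sketch, using the field names from the sample response above (the 0.7 threshold is an arbitrary example, not a system default):

```python
import json

def top_skills(response_json: str, threshold: float = 0.7):
    """Return (skill, confidence) pairs at or above the threshold,
    sorted by confidence, highest first."""
    data = json.loads(response_json)
    hits = [(p["skill"], p["confidence"])
            for p in data["predictions"] if p["confidence"] >= threshold]
    return sorted(hits, key=lambda pair: pair[1], reverse=True)

sample = '''{"run_id": "abc123",
             "predictions": [{"skill": "authentication", "confidence": 0.92},
                             {"skill": "security", "confidence": 0.78},
                             {"skill": "oauth", "confidence": 0.65}],
             "model_version": "1.0.0"}'''

print(top_skills(sample))  # [('authentication', 0.92), ('security', 0.78)]
```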

Batch Prediction

curl -X POST "http://localhost:8080/predict/batch" \
  -H "Content-Type: application/json" \
  -d '{
    "issues": [
      {"issue_text": "Database connection timeout"},
      {"issue_text": "UI button not responding"}
    ]
  }'
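
Because a single batch request is capped at 100 issues, larger workloads have to be split across requests. A small helper sketch (the payload shape is taken from the curl example above):

```python
def batch_payloads(issue_texts, max_batch=100):
    """Split a list of issue texts into /predict/batch payloads,
    respecting the 100-issue limit per request."""
    for start in range(0, len(issue_texts), max_batch):
        chunk = issue_texts[start:start + max_batch]
        yield {"issues": [{"issue_text": text} for text in chunk]}

payloads = list(batch_payloads([f"issue {i}" for i in range(250)]))
print(len(payloads))                # 3
print(len(payloads[0]["issues"]))   # 100
print(len(payloads[-1]["issues"]))  # 50
```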

List Predictions

curl "http://localhost:8080/predictions?limit=10&skip=0"

Health Check

curl "http://localhost:8080/health"

Response:

{
  "status": "healthy",
  "model_loaded": true,
  "model_version": "1.0.0"
}

Makefile Shortcuts

make test-api-health      # Test health endpoint
make test-api-predict     # Test prediction
make test-api-list        # List predictions
make test-api-all         # Run all API tests

3. GUI (Streamlit)

Access Points

Local (Docker): http://localhost:8501

Features

  • Real-time Prediction: Instant skill classification
  • Confidence Scores: Probability for each predicted skill
  • Multiple Input Modes: Quick input, detailed input, examples
  • API Health Indicator: Connection status in sidebar

User Interface

Main Dashboard

The sidebar displays:

  • API connection status
  • Confidence threshold slider
  • Model information

Quick Input Mode

  1. Paste GitHub issue text
  2. Click "Predict Skills"
  3. View results instantly

Detailed Input Mode

Optional metadata fields:

  • Repository name
  • PR number
  • Extended description

Prediction Results

Results display:

  • Top-5 predicted skills with confidence bars
  • Full predictions table with filtering
  • Processing time metrics
  • Raw JSON response (expandable)

Example Gallery

Pre-loaded test cases:

  • Authentication bugs
  • ML feature requests
  • Database issues
  • UI enhancements

4. Load Testing (Locust)

Installation

pip install locust

Configuration

The Locust configuration is in monitoring/locust/locustfile.py:

| Task | Weight | Endpoint |
|---|---|---|
| Single Prediction | 60% (weight: 3) | POST /predict |
| Batch Prediction | 20% (weight: 1) | POST /predict/batch |
| Monitoring | 20% (weight: 1) | GET /health, /predictions |

Running Load Tests

Web UI Mode

cd monitoring/locust
locust

Then open: http://localhost:8089

Configure in the Web UI:

  • Number of users: Total concurrent users
  • Spawn rate: Users per second to add
  • Host: Target URL (e.g., http://localhost:8080)

Headless Mode

locust --headless \
  --users 50 \
  --spawn-rate 10 \
  --run-time 5m \
  --host http://localhost:8080 \
  --csv results

Target URLs

| Environment | Host URL |
|---|---|
| Local Docker | http://localhost:8080 |
| Local Dev | http://localhost:8000 |
| HF Spaces | https://dacrow13-hopcroft-skill-classification.hf.space |

Interpreting Results

| Metric | Description | Target |
|---|---|---|
| RPS | Requests per second | Higher is better |
| Median Response Time | 50th-percentile latency | < 500 ms |
| 95th Percentile | Tail latency (p95) | < 2 s |
| Failure Rate | Percentage of failed requests | < 1% |
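
Locust reports these percentiles itself, but they can also be recomputed from raw latency samples (for example, from the --csv output). A quick sketch using the standard library:

```python
import statistics

def latency_summary(samples_ms):
    """Compute the median (p50) and 95th-percentile (p95) latency
    from raw per-request samples, in milliseconds."""
    cuts = statistics.quantiles(samples_ms, n=100, method="inclusive")
    return {"p50_ms": round(cuts[49], 3), "p95_ms": round(cuts[94], 3)}

# e.g. 100 requests whose latencies happened to be 1..100 ms
summary = latency_summary(list(range(1, 101)))
print(summary)  # {'p50_ms': 50.5, 'p95_ms': 95.05}
```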



5. Monitoring (Prometheus & Grafana)

Access Points

Local Development:

| Service | URL |
|---|---|
| Prometheus | http://localhost:9090 |
| Grafana | http://localhost:3000 |
| Pushgateway | http://localhost:9091 |

Hugging Face Spaces (Production):

The production deployment exposes its metrics at https://dacrow13-hopcroft-skill-classification.hf.space/metrics.

Prometheus Metrics

Access the metrics endpoint: http://localhost:8080/metrics

Available Metrics

| Metric | Type | Description |
|---|---|---|
| hopcroft_requests_total | Counter | Total requests by method/endpoint |
| hopcroft_request_duration_seconds | Histogram | Request latency distribution |
| hopcroft_in_progress_requests | Gauge | Requests currently being processed |
| hopcroft_prediction_processing_seconds | Summary | Model inference time |

Useful PromQL Queries

Request Rate (per second)

rate(hopcroft_requests_total[1m])

Average Latency

rate(hopcroft_request_duration_seconds_sum[5m]) / rate(hopcroft_request_duration_seconds_count[5m])

In-Progress Requests

hopcroft_in_progress_requests

Model Prediction Time (P90)

hopcroft_prediction_processing_seconds{quantile="0.9"}
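
The average-latency query above divides the rate of the histogram's _sum series by the rate of its _count series. The arithmetic behind it, sketched in Python with two hypothetical scrapes of those counters:

```python
def avg_latency(sum1, count1, sum2, count2):
    """Average request latency between two scrapes of a Prometheus
    histogram: delta of _sum divided by delta of _count."""
    if count2 == count1:
        return 0.0  # no requests completed in the window
    return (sum2 - sum1) / (count2 - count1)

# scrape 1: 120 requests took 30 s in total; scrape 2: 170 requests, 45 s
print(avg_latency(30.0, 120, 45.0, 170))  # 0.3  (i.e. 300 ms per request)
```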

Grafana Dashboards

The pre-configured dashboard includes:

| Panel | Description |
|---|---|
| Request Rate | Real-time requests per second |
| Request Latency (p50, p95) | Response-time percentiles |
| In-Progress Requests | Requests currently being processed |
| Error Rate (5xx) | Percentage of failed requests |
| Model Prediction Time | Average model inference latency |
| Requests by Endpoint | Traffic distribution per endpoint |

Data Drift Detection

Prepare Baseline (One-time)

cd monitoring/drift/scripts
python prepare_baseline.py

Run Drift Check

python run_drift_check.py

Verify Results

# Check Pushgateway
curl http://localhost:9091/metrics | grep drift

# PromQL queries
drift_detected
drift_p_value
drift_distance
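
The exact statistic computed by run_drift_check.py is not shown here; as an illustration of what a drift_distance-style metric typically measures, here is a plain-Python two-sample Kolmogorov-Smirnov distance (the data and the threshold value are made up for the example):

```python
import bisect

def ks_distance(baseline, current):
    """Two-sample Kolmogorov-Smirnov distance: the largest gap
    between the empirical CDFs of the two samples."""
    b_sorted, c_sorted = sorted(baseline), sorted(current)
    n, m = len(b_sorted), len(c_sorted)
    gap = 0.0
    for x in b_sorted + c_sorted:  # the max gap occurs at a sample point
        cdf_b = bisect.bisect_right(b_sorted, x) / n
        cdf_c = bisect.bisect_right(c_sorted, x) / m
        gap = max(gap, abs(cdf_b - cdf_c))
    return gap

DRIFT_THRESHOLD = 0.2  # assumed cut-off; the real check may use another value

baseline = [1, 2, 3, 4, 5, 6, 7, 8]
shifted  = [11, 12, 13, 14, 15, 16, 17, 18]

print(ks_distance(baseline, baseline))                   # 0.0
print(ks_distance(baseline, shifted) > DRIFT_THRESHOLD)  # True -> drift detected
```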

Alerting Rules

Pre-configured alerts in monitoring/prometheus/alert_rules.yml:

| Alert | Condition | Severity |
|---|---|---|
| ServiceDown | Target down for 5m | Critical |
| HighErrorRate | 5xx > 10% for 5m | Warning |
| SlowRequests | P95 > 2s | Warning |
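
The authoritative definitions live in monitoring/prometheus/alert_rules.yml. As a sketch of the shape such a rule takes, here is the ServiceDown row written as a Prometheus rule (the expression, group name, and annotation text are assumptions; the repo file may differ):

```yaml
groups:
  - name: hopcroft
    rules:
      - alert: ServiceDown
        expr: up == 0
        for: 5m
        labels:
          severity: critical
        annotations:
          summary: "Target {{ $labels.instance }} has been down for 5 minutes"
```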

Starting Monitoring Stack

# Start all monitoring services
docker compose up -d

# Verify containers
docker compose ps

# Check Prometheus targets
curl http://localhost:9090/targets

Troubleshooting

Common Issues

API Returns 500 Error

  1. Check .env credentials are correct
  2. Restart services: docker compose -f docker/docker-compose.yml down && docker compose -f docker/docker-compose.yml up -d
  3. Verify model files: docker exec hopcroft-api ls -la /app/models/

GUI Shows "API Unavailable"

  1. Wait 30-60 seconds for API initialization
  2. Check API health: curl http://localhost:8080/health
  3. View logs: docker compose -f docker/docker-compose.yml logs hopcroft-api

Port Already in Use

# Check port usage (Windows)
netstat -ano | findstr :8080

# Check port usage (Linux/macOS)
lsof -i :8080

# Stop conflicting containers
docker compose down

DVC Pull Fails

# Clean cache and retry
rm -rf .dvc/cache
dvc pull