jessehostetler committed
Commit 1d3b709 · 2 Parent(s): ebbcd26 7fb83c4

Merge branch 'dyff-824-refactor-codebase' into 'main'


DYFF-824: Finalize the example submission repository

See merge request ul-dsri/sandbox/sachin-sharma-in/ml-inference-service!3

.dockerignore ADDED
@@ -0,0 +1,38 @@
+ __pycache__
+ *.pyc
+ *.pyo
+ *.pyd
+ .Python
+ *.so
+ *.egg
+ *.egg-info
+ dist
+ build
+
+ .venv
+ venv
+ ENV
+ env
+
+ .git
+ .gitignore
+ .idea
+ .vscode
+ .claude
+
+ *.md
+ README.md
+ Dockerfile
+ .dockerignore
+
+ test_*.http
+ test_results
+ scripts/test_datasets
+
+ .pytest_cache
+ .coverage
+ htmlcov
+
+ *.log
+ .DS_Store
+ .python-version
.env.example ADDED
@@ -0,0 +1,11 @@
+ # App Configuration
+ APP_NAME="ML Inference Service"
+ APP_VERSION="0.1.0"
+ DEBUG=false
+
+ # Server
+ HOST="0.0.0.0"
+ PORT=8000
+
+ # Model
+ MODEL_NAME="microsoft/resnet-18"
Dockerfile ADDED
@@ -0,0 +1,26 @@
+ FROM python:3.12-slim as builder
+
+ WORKDIR /build
+
+ COPY requirements.txt .
+ RUN pip install --no-cache-dir --user -r requirements.txt
+
+ FROM python:3.12-slim
+
+ WORKDIR /app
+
+ RUN useradd -m -u 1000 appuser
+
+ COPY --from=builder --chown=appuser:appuser /root/.local /home/appuser/.local
+ COPY --chown=appuser:appuser app ./app
+ COPY --chown=appuser:appuser models ./models
+ COPY --chown=appuser:appuser main.py .
+
+ USER appuser
+
+ ENV PATH=/home/appuser/.local/bin:$PATH \
+     PYTHONUNBUFFERED=1
+
+ EXPOSE 8000
+
+ CMD ["uvicorn", "main:app", "--host", "0.0.0.0", "--port", "8000"]
README.md CHANGED
@@ -1,419 +1,323 @@
- # ML Inference Service (FastAPI)

- A production-ready **FastAPI** web service that serves **image classification** models.
- This repo ships with a working example using **ResNet-18** (downloaded from Hugging Face) under `models/resnet-18/` and exposes a simple **REST** endpoint.

- ---

- ## ✨ What you get
-
- - FastAPI application with clean layering (routes → controller → service)
- - Hot-loaded model on startup (single instance reused per request)
- - Hugging Face–compatible local model folder (`config.json`, weights, preprocessor, etc.)
- - Example endpoint: `POST /predict/resnet` that accepts a base64 image and returns:
-   - `prediction` (class label)
-   - `confidence` (softmax probability)
-   - `predicted_label` (class index)
-   - `model` (model id)
-   - `mediaType` (echoed)
-
- ---
-
- ## 🧭 Project Layout
-
- ```
- ml-inference-service/
- ├─ main.py
- ├─ app/
- │  ├─ __init__.py
- │  ├─ core/
- │  │  ├─ app.py              # App factory & router wiring
- │  │  ├─ config.py           # Settings (app name/version/debug)
- │  │  ├─ dependencies.py     # DI for model services
- │  │  ├─ lifespan.py         # Startup: load model & register service
- │  │  └─ logging.py          # Logger setup
- │  ├─ api/
- │  │  ├─ models.py           # Pydantic request/response
- │  │  ├─ controllers.py      # HTTP → service orchestration
- │  │  └─ routes/
- │  │     ├─ prediction.py    # `POST /predict/resnet`
- │  │     └─ resnet_service_manager.py (legacy, unused)
- │  └─ services/
- │     └─ inference.py        # ResNetInferenceService (load/predict)
- ├─ models/
- │  └─ resnet-18/             # Sample HF-style model folder
- ├─ scripts/
- │  ├─ model_download.bash    # One-liner to snapshot HF weights locally
- │  ├─ generate_test_datasets.py  # Generate PyArrow datasets for testing
- │  ├─ test_datasets.py       # Test generated datasets against API
- │  └─ test_datasets/         # Generated PyArrow test datasets (100 files)
- ├─ requirements.in / requirements.txt
- └─ test_main.http            # Example request you can run from IDEs
- ```
-
- ---
-
- ## 🚀 Quickstart
-
- ### 1) Install dependencies (Python 3.9+)
  ```bash
  python -m venv .venv
- source .venv/bin/activate   # Windows: .venv\Scripts\activate
  pip install -r requirements.txt
- ```

- ### 2) Download the sample model (ResNet‑18) locally
- ```bash
  bash scripts/model_download.bash
  ```
- This populates `models/resnet-18/` with Hugging Face artifacts (`config.json`, weights, `preprocessor_config.json`, etc.).

- ### 3) Run the server
  ```bash
- uvicorn main:app --reload
  ```
- Server listens on `http://127.0.0.1:8000`.

- ### 4) Call the API
- - Use `test_main.http` from your IDE (VSCode/IntelliJ) **or** curl:

  ```bash
- curl -X POST http://127.0.0.1:8000/predict/resnet -H "Content-Type: application/json" -d '{
-   "image": { "mediaType": "image/jpeg", "data": "<base64-encoded-bytes>" }
  }'
  ```

- **Response (example):**
  ```json
  {
    "prediction": "tiger cat",
-   "confidence": 0.9971,
    "predicted_label": 282,
    "model": "microsoft/resnet-18",
    "mediaType": "image/jpeg"
  }
  ```

- ---
-
- ## 🧩 Bring Your Own Model (BYOM)
-
- There are **two** ways to integrate your own model.
-
- ### Option A — *Drop-in replacement (zero code changes)*
-
- If your model is a **Hugging Face image classification** model that works with
- `AutoImageProcessor` and `ResNetForImageClassification` **or** a compatible
- `*ForImageClassification` class from `transformers`, you can simply place the
- model folder alongside `resnet-18` and point the service at it.
-
- 1. Put your HF-style folder under `models/<your-model-name>/` containing at least:
-    - `config.json`
-    - weights (e.g., `pytorch_model.bin` or `model.safetensors`)
-    - `preprocessor_config.json` / `image_processor` files
-
- 2. **Choose one** of these approaches:
-    - **Simplest**: Replace the contents of `models/resnet-18/` with your model files *but keep the folder name*. The existing `/predict/resnet` endpoint will now serve your model.
-    - **Preferred**: Change the model id used at startup:
-      - Open `app/core/lifespan.py` and modify the service initialization:
-        ```python
-        resnet_service = ResNetInferenceService(
-            model_name="your-org/your-model",  # used for local folder name
-            use_local_model=True               # loads from models/your-model/
-        )
-        ```
-      - Ensure your local folder is `models/your-model/`.
-
- > How folder naming works: when `use_local_model=True`, the service derives the
- > local directory as `models/<last-segment-of-model_name>`. For
- > `"microsoft/resnet-18"` that becomes `models/resnet-18`. For
- > `"your-org/awesome-vit-base"`, it becomes `models/awesome-vit-base`.
-
- That’s it. No code changes elsewhere if your model is a standard image classifier.
-
- ---
-
- ### Option B — *New task/model type (minimal code: new service + route)*
-
- If you are **not** serving a Hugging Face image classifier (e.g., object detection,
- segmentation, text models), implement a small service class and a route mirroring
- the `ResNetInferenceService` flow.
-
- 1. **Create your service** (copy and adapt `ResNetInferenceService`):
-    - File: `app/services/<your_model>_service.py`
-    - Responsibilities you must implement:
-      - `__init__(model_name: str, use_local_model: bool)` → set `self.model_path`
-      - `load_model()` → load weights & preprocessor
-      - `predict(image: PIL.Image.Image) -> Dict[str, Any]` → run inference and return a dict with:
-        ```python
-        {
-            "prediction": "<your label or structured result>",
-            "confidence": <float 0..1>,
-            "predicted_label": <int or meaningful code>,
-            "model": "<model id>"
-        }
-        ```
-    *Feel free to extend the payload; just update the API schema accordingly.*
-
- 2. **Wire the dependency**:
-    - Register your service at startup in `app/core/lifespan.py` similar to ResNet:
-      ```python
-      from app.core.dependencies import set_resnet_service  # or create your own set/get
-      from app.services.your_model_service import YourModelService
-
-      svc = YourModelService(model_name="your-org/your-model", use_local_model=True)
-      svc.load_model()
-      set_resnet_service(svc)  # or create set_your_model_service(...)
-      ```
-    - Optionally create **new getters/setters** in `app/core/dependencies.py` if you serve multiple models in parallel (one getter per model).
-
- 3. **Add a route**:
-    - Create `app/api/routes/your_model.py` analogous to `prediction.py`:
-      ```python
-      from fastapi import APIRouter, Depends
-      from app.api.controllers import PredictionController
-      from app.api.models import ImageRequest, PredictionResponse
-      from app.core.dependencies import get_resnet_service  # or your getter
-      from app.services.your_model_service import YourModelService
-
-      router = APIRouter()
-
-      @router.post("/predict/your-model", response_model=PredictionResponse)
-      async def predict_image(request: ImageRequest, service: YourModelService = Depends(get_resnet_service)):
-          controller = PredictionController(service)  # reuse the controller
-          return await controller.predict(request)
-      ```
-    - Register the router in `app/core/app.py`:
-      ```python
-      from app.api.routes import your_model as your_model_routes
-      app.include_router(your_model_routes.router)
-      ```
-
- 4. **Adjust schemas if needed**:
-    - The default `PredictionResponse` in `app/api/models.py` is for single-label classification. For other tasks, either extend it or define a new response model and use it in your route's `response_model=`.
-
- > **Tip**: Keep your controller thin and push all model-specific logic into your service class. The server glue (DI + routes) stays identical across models.
-
- ---
-
- ## 🧪 Validating your setup
-
- - **Startup logs** should include: `Initializing ResNet service with local model: models/<folder>` and `Model and processor loaded successfully`.
- - Hitting your endpoint should return a **200** with a JSON body like the example above.
- - If you see `Local model directory not found`, check your `models/<name>/` path and filenames.
-
- ---
-
- ## 🔌 Request & Response Shapes
-
- ### Request
- ```json
- {
-   "image": {
-     "mediaType": "image/jpeg",
-     "data": "<base64-encoded image bytes>"
-   }
- }
- ```

- ### Response
- ```json
- {
-   "prediction": "string label",
-   "confidence": 0.0,
-   "predicted_label": 0,
-   "model": "your-org/your-model",
-   "mediaType": "image/jpeg"
- }
  ```

- ---

- ## ⚙️ Configuration

- Basic settings live in `app/core/config.py`. Out of the box we keep it simple:
- - `app_name`, `app_version`, `debug`

- If you want to make the **model** configurable without touching code, extend `Settings` with a `model_name` env var and consume it in `lifespan.py` when creating your service instance.

- Example:
  ```python
- # app/core/config.py
- from pydantic_settings import BaseSettings
- from pydantic import Field
-
- class Settings(BaseSettings):
-     app_name: str = Field("ML Inference Service")
-     app_version: str = Field("0.1.0")
-     debug: bool = Field(False)
-     model_name: str = Field("microsoft/resnet-18", description="HF model id used at startup")
-
- settings = Settings()
-
- # app/core/lifespan.py
- from app.core.config import settings
- svc = ResNetInferenceService(model_name=settings.model_name, use_local_model=True)
  ```

- Then set `MODEL_NAME=your-org/your-model` in your environment (Pydantic will map `model_name` from `MODEL_NAME`).

- ---

- ## 📦 Packaging & Deployment

- - **Dev**: `uvicorn main:app --reload`
- - **Prod**: Use a process manager (e.g., `gunicorn -k uvicorn.workers.UvicornWorker`) and add health checks.
- - **Containerize**: Copy only `requirements.txt` and source, install wheels, and bake the `models/` folder into the image or mount it as a volume.
- - **CPU vs GPU**: This example uses CPU by default. If you have CUDA, install a CUDA-enabled PyTorch build and set device placement in your service.

- ---

- ## 🧪 PyArrow Test Datasets

- This project includes a comprehensive **PyArrow-based dataset generation system** designed specifically for academic challenges and ML model validation. The system generates **100 standardized test datasets** that allow participants to validate their models against consistent, reproducible test cases.

- ### 🏗️ Why Both? `.parquet` + `_metadata.json`
  ```
- standard_test_001.parquet        # Actual test data (images, requests, responses)
- standard_test_001_metadata.json  # Human-readable description and stats
  ```

- ### 📊 Dataset Categories (25 each = 100 total)

- #### 1. **Standard Test Cases** (`standard_test_*.parquet`)
- **Purpose**: Baseline functionality validation

- **Content**: Normal images with expected successful predictions

- - **Image Types**: Random patterns, geometric shapes, gradients, text overlays, solid colors
- - **Formats**: JPEG, PNG with proper MIME types
- - **Sizes**: 224x224, 256x256, 299x299, 384x384 (common ML input sizes)
- - **Expected Behavior**: HTTP 200 responses with valid prediction structure

- #### 2. **Edge Case Tests** (`edge_case_*.parquet`)
- **Purpose**: Robustness and error handling validation

- **Content**: Challenging scenarios that test model resilience

- - **Tiny Images**: 32x32, 1x1 pixels (tests preprocessing robustness)
- - **Huge Images**: 2048x2048 (tests memory management and resizing)
- - **Extreme Aspect Ratios**: 1000x50 (tests preprocessing assumptions)
- - **Corrupted Data**: Invalid base64, malformed requests (tests error handling)
- - **Expected Behavior**: Graceful degradation, proper error responses

- #### 3. **Performance Benchmarks** (`performance_test_*.parquet`)
- **Purpose**: Latency and throughput measurement

- **Content**: Varying batch sizes for performance profiling

- - **Batch Sizes**: 1, 5, 10, 25, 50, 100 images per test
- - **Latency Tracking**: Expected max response times based on batch size
- - **Throughput Metrics**: Requests per second under different loads
- - **Expected Behavior**: Consistent performance within acceptable bounds

- #### 4. **Model Comparison** (`model_comparison_*.parquet`)
- **Purpose**: Cross-model validation and benchmarking

- **Content**: Identical inputs tested across different model architectures

- - **Model Types**: ResNet-18/50, ViT, ConvNext, Swin Transformer
- - **Consistent Inputs**: Same 10 base images per dataset
- - **Comparative Analysis**: Enables direct performance comparison between models
- - **Expected Behavior**: Architecture-specific but structurally consistent responses

- ### 🛠️ Generation Process

- The dataset generation follows a **deterministic, reproducible approach**:

- #### Step 1: Synthetic Image Creation
- ```python
- # Why synthetic images instead of real photos?
- # 1. Copyright-free for academic distribution
- # 3. Programmatically generated edge cases
-
- def create_synthetic_image(width, height, image_type):
-     if image_type == "random":
-         # RGB noise - tests model noise robustness
-         array = np.random.randint(0, 256, (height, width, 3))
-     elif image_type == "geometric":
-         # Shapes and patterns - tests feature detection
-         # ... geometric pattern generation
-     # ... other synthetic types
- ```

- #### Step 2: API Request Structure Generation
- ```python
- # Matches exact API format for drop-in testing
  {
-   "image": {
-     "mediaType": "image/jpeg",        # Proper MIME types
-     "data": "<base64-encoded-image>"  # Standard encoding
-   }
  }
  ```

- #### Step 3: Expected Response Generation
- ```python
- # Realistic prediction responses with proper structure
  {
-   "prediction": "tiger_cat",       # ImageNet-style labels
-   "confidence": 0.8742,            # Realistic confidence scores
-   "predicted_label": 282,          # Numeric label indices
-   "model": "microsoft/resnet-18",  # Model identification
-   "mediaType": "image/jpeg"        # Echo input format
  }
  ```

- #### Step 4: PyArrow Table Creation
- ```python
- # Columnar storage for efficient querying
- table = pa.table({
-     "dataset_id": [...],         # Unique dataset identifier
-     "image_id": [...],           # Individual image identifier
-     "api_request": [...],        # JSON-serialized requests
-     "expected_response": [...],  # JSON-serialized expected responses
-     "test_category": [...],      # Category classification
-     "difficulty": [...],         # Complexity indicator
-     # ... additional metadata columns
- })
- ```

- ### 🚀 Usage Guide

- **1. Generate Test Datasets**
  ```bash
- # Create all 100 datasets (~2-5 minutes depending on hardware)
  python scripts/generate_test_datasets.py
-
- # What this creates:
- # - scripts/test_datasets/*.parquet (actual test data)
- # - scripts/test_datasets/*_metadata.json (human-readable info)
- # - scripts/test_datasets/datasets_summary.json (overview)
  ```

- **2. Validate API**
  ```bash
- # Start your ML service
  uvicorn main:app --reload

  # Quick test (5 samples per dataset)
  python scripts/test_datasets.py --quick

- # Full validation (all samples)
  python scripts/test_datasets.py

- # Category-specific testing
  python scripts/test_datasets.py --category edge_case
- python scripts/test_datasets.py --category performance
  ```

- ### 📈 Testing Output and Metrics

- The test runner provides comprehensive validation metrics:

  ```
- 🏁 DATASET TESTING SUMMARY
  ============================================================
  Datasets tested: 100
  Successful datasets: 95
@@ -425,7 +329,7 @@ Test duration: 45.2s
  Performance:
    Avg latency: 123.4ms
    Median latency: 98.7ms
-   Min latency: 45.2ms
    Max latency: 2,341.0ms
    Requests/sec: 27.6

@@ -434,6 +338,29 @@ Category breakdown:
    edge_case: 25 datasets, 76.8% avg success
    performance: 25 datasets, 91.1% avg success
    model_comparison: 25 datasets, 89.3% avg success

- Failed datasets: edge_case_023, edge_case_019, performance_012
  ```
+ # ML Inference Service
+
+ FastAPI service for serving ML models over HTTP. Comes with ResNet-18 for image classification out of the box, but you can swap in any model you want.
+
+ ## Quick Start
+
+ **Local development:**
  ```bash
+ # Install dependencies
  python -m venv .venv
+ source .venv/bin/activate
  pip install -r requirements.txt
+
+ # Download the example model
  bash scripts/model_download.bash
+
+ # Run it
+ uvicorn main:app --reload
  ```

+ Server runs on `http://127.0.0.1:8000`. Check `/docs` for the interactive API documentation.
+
+ **Docker:**
  ```bash
+ # Build
+ docker build -t ml-inference-service:test .
+
+ # Run
+ docker run -d --name ml-inference-test -p 8000:8000 ml-inference-service:test
+
+ # Check logs
+ docker logs -f ml-inference-test
+
+ # Stop
+ docker stop ml-inference-test && docker rm ml-inference-test
  ```

+ ## Testing the API

  ```bash
+ # Using curl
+ curl -X POST http://localhost:8000/predict \
+   -H "Content-Type: application/json" \
+   -d '{
+     "image": {
+       "mediaType": "image/jpeg",
+       "data": "<base64-encoded-image>"
+     }
  }'
  ```

+ Example response:
  ```json
  {
    "prediction": "tiger cat",
+   "confidence": 0.394,
    "predicted_label": 282,
    "model": "microsoft/resnet-18",
    "mediaType": "image/jpeg"
  }
  ```
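
To produce the `<base64-encoded-image>` payload, a minimal client sketch in Python (the `requests` dependency and the `cat.jpg` file name are illustrative assumptions, not part of the repo):

```python
# Hypothetical client: encode an image and call POST /predict.
import base64
import requests

with open("cat.jpg", "rb") as f:
    encoded = base64.b64encode(f.read()).decode("utf-8")

payload = {"image": {"mediaType": "image/jpeg", "data": encoded}}
resp = requests.post("http://127.0.0.1:8000/predict", json=payload)
print(resp.status_code, resp.json())
```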

+ ## Project Structure

+ ```
+ ml-inference-service/
+ ├── main.py                      # Entry point
+ ├── app/
+ │   ├── core/
+ │   │   ├── app.py               # App factory, config, DI, lifecycle
+ │   │   └── logging.py           # Logging setup
+ │   ├── api/
+ │   │   ├── models.py            # Request/response schemas
+ │   │   ├── controllers.py       # Business logic
+ │   │   └── routes/
+ │   │       └── prediction.py    # POST /predict
+ │   └── services/
+ │       ├── base.py              # Abstract InferenceService class
+ │       └── inference.py         # ResNet implementation
+ ├── models/
+ │   └── microsoft/
+ │       └── resnet-18/           # Model weights and config
+ ├── scripts/
+ │   ├── model_download.bash
+ │   ├── generate_test_datasets.py
+ │   └── test_datasets.py
+ ├── Dockerfile                   # Multi-stage build
+ ├── .env.example                 # Environment config template
+ └── requirements.txt
  ```

+ The key design decision here is that `app/core/app.py` consolidates everything: config, dependency injection, lifecycle, and the app factory. This avoids the mess of managing global state across multiple files.

+ ## How to Plug In Your Own Model

+ The whole service is built around one abstract base class: `InferenceService`. Implement it for your model, and everything else just works.

+ ### Step 1: Create Your Service Class

  ```python
+ # app/services/your_model_service.py
+ from app.services.base import InferenceService
+ from app.api.models import ImageRequest, PredictionResponse
+
+ class YourModelService(InferenceService[ImageRequest, PredictionResponse]):
+     def __init__(self, model_name: str):
+         self.model_name = model_name
+         self.model_path = f"models/{model_name}"
+         self.model = None
+         self._is_loaded = False
+
+     def load_model(self) -> None:
+         """Load your model here. Called once at startup."""
+         self.model = load_your_model(self.model_path)
+         self._is_loaded = True
+
+     def predict(self, request: ImageRequest) -> PredictionResponse:
+         """Actual inference happens here. The controller offloads this to a thread pool."""
+         image = decode_base64_image(request.image.data)
+         result = self.model(image)
+
+         return PredictionResponse(
+             prediction=result.label,
+             confidence=result.confidence,
+             predicted_label=result.class_id,
+             model=self.model_name,
+             mediaType=request.image.mediaType
+         )
+
+     @property
+     def is_loaded(self) -> bool:
+         return self._is_loaded
  ```

+ **Important:** Keep `load_model()` and `predict()` synchronous, as the base class declares them. The server already wraps both in `asyncio.to_thread()` (at startup and in the controller, respectively), which runs CPU-heavy inference in a background thread and keeps the server responsive while your model is working.

+ ### Step 2: Register Your Service

+ Open `app/core/app.py` and find the lifespan function:

+ ```python
+ # Change this line:
+ service = ResNetInferenceService(model_name="microsoft/resnet-18")

+ # To this:
+ service = YourModelService(model_name="your-org/your-model")
+ ```

+ That's it. The `/predict` endpoint now serves your model.

+ ### Model Files
+
+ Put your model files under `models/` with the full org/model structure:

  ```
+ models/
+ └── your-org/
+     └── your-model/
+         ├── config.json
+         ├── weights.bin
+         └── (other files)
  ```

+ No renaming, no dropping the org prefix: it just mirrors the Hugging Face structure.
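
If your model lives on the Hugging Face Hub, one way to populate that layout is a snapshot download (a sketch using `huggingface_hub`, roughly what `scripts/model_download.bash` does for ResNet-18; the model id here is hypothetical):

```python
# Sketch: fetch Hub artifacts into models/<org>/<name>/.
from huggingface_hub import snapshot_download

snapshot_download(
    repo_id="your-org/your-model",           # hypothetical model id
    local_dir="models/your-org/your-model",  # matches the layout above
)
```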
+
+ ## Configuration

+ Settings are managed via environment variables or a `.env` file. See `.env.example` for all available options.

+ **Default values:**
+ - `APP_NAME`: "ML Inference Service"
+ - `APP_VERSION`: "0.1.0"
+ - `DEBUG`: false
+ - `HOST`: "0.0.0.0"
+ - `PORT`: 8000
+ - `MODEL_NAME`: "microsoft/resnet-18"

+ **To customize:**
+ ```bash
+ # Copy the example
+ cp .env.example .env

+ # Edit values
+ vim .env
+ ```

+ Or set environment variables directly:
+ ```bash
+ export MODEL_NAME="google/vit-base-patch16-224"
+ uvicorn main:app --reload
+ ```

+ ## Deployment

+ **Development:**
+ ```bash
+ uvicorn main:app --reload
+ ```

+ **Production:**
+ ```bash
+ gunicorn main:app -w 4 -k uvicorn.workers.UvicornWorker --bind 0.0.0.0:8000
+ ```

+ The service runs on CPU by default. For GPU inference, install CUDA-enabled PyTorch and modify your service to move the model and tensors to the GPU device, as sketched below.
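
A rough sketch of that device handling (illustrative only; this helper is not part of the repo's code):

```python
# Sketch: prefer CUDA when available, fall back to CPU.
import torch

def pick_device() -> torch.device:
    return torch.device("cuda" if torch.cuda.is_available() else "cpu")

# In load_model():  self.model.to(pick_device())
# In predict():     inputs = {k: v.to(pick_device()) for k, v in inputs.items()}
```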
 
 
 

+ **Docker:**
+ - Multi-stage build keeps the image small
+ - Runs as non-root user (`appuser`)
+ - Python dependencies installed in user site-packages
+ - Model files baked into the image

+ ## What Happens When You Start the Server

+ ```
+ INFO: Starting ML Inference Service...
+ INFO: Initializing ResNet service: models/microsoft/resnet-18
+ INFO: Loading model from models/microsoft/resnet-18
+ INFO: Model loaded: 1000 classes
+ INFO: Startup completed successfully
+ INFO: Uvicorn running on http://0.0.0.0:8000
+ ```

+ If you see "Model not found", check that your model files exist at the expected path with the full org/model structure.

+ ## API Reference

+ **Endpoint:** `POST /predict`

+ **Request:**
+ ```json
  {
+   "image": {
+     "mediaType": "image/jpeg",        // or "image/png"
+     "data": "<base64-encoded-image>"
+   }
  }
  ```

+ **Response:**
+ ```json
  {
+   "prediction": "string",        // Human-readable label
+   "confidence": 0.0,             // Softmax probability
+   "predicted_label": 0,          // Numeric class index
+   "model": "org/model-name",     // Model identifier
+   "mediaType": "image/jpeg"      // Echoed from request
  }
  ```

+ **Docs:**
+ - Swagger UI: `http://localhost:8000/docs`
+ - ReDoc: `http://localhost:8000/redoc`
+ - OpenAPI JSON: `http://localhost:8000/openapi.json`
+
+ ## PyArrow Test Datasets

+ We've included a test dataset system for validating your model. It generates 100 standardized test datasets covering normal inputs, edge cases, performance benchmarks, and model comparisons.

+ ### Generate Datasets

  ```bash
  python scripts/generate_test_datasets.py
  ```

+ This creates:
+ - `scripts/test_datasets/*.parquet` - Test data (images, requests, expected responses)
+ - `scripts/test_datasets/*_metadata.json` - Human-readable descriptions
+ - `scripts/test_datasets/datasets_summary.json` - Overview of all datasets
+
+ ### Run Tests
+
  ```bash
+ # Start your service first
  uvicorn main:app --reload

  # Quick test (5 samples per dataset)
  python scripts/test_datasets.py --quick

+ # Full validation
  python scripts/test_datasets.py

+ # Test specific category
  python scripts/test_datasets.py --category edge_case
  ```
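
To inspect or replay a dataset by hand, a rough sketch (assumes the `pyarrow` and `requests` packages, and the `api_request`/`expected_response` columns from the generator's table schema):

```python
# Sketch: replay a few test cases from one parquet dataset against the API.
import json
import pyarrow.parquet as pq
import requests

table = pq.read_table("scripts/test_datasets/standard_test_001.parquet")
for row in table.to_pylist()[:5]:
    request_body = json.loads(row["api_request"])
    expected = json.loads(row["expected_response"])
    resp = requests.post("http://localhost:8000/predict", json=request_body)
    print(resp.status_code, resp.json().get("prediction"),
          "expected:", expected.get("prediction"))
```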

+ ### Dataset Categories (25 datasets each)
+
+ **1. Standard Tests** (`standard_test_*.parquet`)
+ - Normal images: random patterns, shapes, gradients
+ - Common sizes: 224x224, 256x256, 299x299, 384x384
+ - Formats: JPEG, PNG
+ - Purpose: Baseline validation
+
+ **2. Edge Cases** (`edge_case_*.parquet`)
+ - Tiny images (32x32, 1x1)
+ - Huge images (2048x2048)
+ - Extreme aspect ratios (1000x50)
+ - Corrupted data, malformed requests
+ - Purpose: Test error handling

+ **3. Performance Benchmarks** (`performance_test_*.parquet`)
+ - Batch sizes: 1, 5, 10, 25, 50, 100 images
+ - Latency and throughput tracking
+ - Purpose: Performance profiling
+
+ **4. Model Comparisons** (`model_comparison_*.parquet`)
+ - Same inputs across different architectures
+ - Models: ResNet-18/50, ViT, ConvNext, Swin
+ - Purpose: Cross-model benchmarking
+
+ ### Test Output

  ```
+ DATASET TESTING SUMMARY
  ============================================================
  Datasets tested: 100
  Successful datasets: 95

  Performance:
    Avg latency: 123.4ms
    Median latency: 98.7ms
+   p95 latency: 342.1ms
    Max latency: 2,341.0ms
    Requests/sec: 27.6

    edge_case: 25 datasets, 76.8% avg success
    performance: 25 datasets, 91.1% avg success
    model_comparison: 25 datasets, 89.3% avg success
+ ```
+
+ ## Common Issues
+
+ **Port 8000 already in use:**
+ ```bash
+ # Find what's using it
+ lsof -i :8000

+ # Or just use a different port
+ uvicorn main:app --port 8080
  ```
+
+ **Model not loading:**
+ - Check the path: models should be in `models/<org>/<model-name>/`
+ - Make sure you ran `bash scripts/model_download.bash`
+ - Check logs for the exact error
+
+ **Slow inference:**
+ - Inference runs on CPU by default
+ - For GPU: install CUDA PyTorch and modify service to use GPU device
+ - Consider using smaller models or quantization
+
+ ## License
+
+ MIT
app/api/controllers.py CHANGED
@@ -1,75 +1,36 @@
- """
- Controllers for handling API business logic.
- """
- import base64
- import io
  from fastapi import HTTPException
- from PIL import Image
  from app.core.logging import logger
- from app.services.inference import ResNetInferenceService
  from app.api.models import ImageRequest, PredictionResponse


  class PredictionController:
-     """Controller for ML prediction endpoints."""

      @staticmethod
-     async def predict_resnet(
-         request: ImageRequest,
-         resnet_service: ResNetInferenceService
      ) -> PredictionResponse:
-         """
-         Classify an image using ResNet-18 from base64 encoded data.
-         """
          try:
-             # Validate service availability
-             if not resnet_service:
-                 raise HTTPException(
-                     status_code=503,
-                     detail="Service not initialized"
-                 )

-             # Validate media type
              if not request.image.mediaType.startswith('image/'):
-                 raise HTTPException(
-                     status_code=400,
-                     detail=f"Invalid media type: {request.image.mediaType}"
-                 )

-             # Decode base64 image data
-             try:
-                 image_data = base64.b64decode(request.image.data)
-             except Exception as decode_error:
-                 raise HTTPException(
-                     status_code=400,
-                     detail=f"Invalid base64 data: {str(decode_error)}"
-                 )
-
-             # Load and validate image
-             try:
-                 image = Image.open(io.BytesIO(image_data))
-             except Exception as img_error:
-                 raise HTTPException(
-                     status_code=400,
-                     detail=f"Invalid image file: {str(img_error)}"
-                 )
-
-             # Perform prediction
-             result = resnet_service.predict(image)
-
-             # Return structured response
-             return PredictionResponse(
-                 prediction=result["prediction"],
-                 confidence=result["confidence"],
-                 model=result["model"],
-                 predicted_label=result["predicted_label"],
-                 mediaType=request.image.mediaType
-             )

          except HTTPException:
              raise
          except Exception as e:
              logger.error(f"Prediction failed: {e}")
-             raise HTTPException(status_code=500, detail=str(e))
+ """API controllers for request handling and validation."""
+
+ import asyncio
  from fastapi import HTTPException

  from app.core.logging import logger
+ from app.services.base import InferenceService
  from app.api.models import ImageRequest, PredictionResponse


  class PredictionController:
+     """Controller for prediction endpoints."""

      @staticmethod
+     async def predict(
+         request: ImageRequest,
+         service: InferenceService
      ) -> PredictionResponse:
+         """Run inference using the configured service."""
          try:
+             if not service or not service.is_loaded:
+                 raise HTTPException(503, "Service not available")

              if not request.image.mediaType.startswith('image/'):
+                 raise HTTPException(400, f"Invalid media type: {request.image.mediaType}")

+             return await asyncio.to_thread(service.predict, request)

          except HTTPException:
              raise
+         except ValueError as e:
+             logger.error(f"Invalid input: {e}")
+             raise HTTPException(400, str(e))
          except Exception as e:
              logger.error(f"Prediction failed: {e}")
+             raise HTTPException(500, "Internal server error")
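
A quick way to exercise that validation path end to end is FastAPI's `TestClient` (a sketch; assumes the local model files are present so lifespan startup succeeds):

```python
# Sketch: an invalid media type should yield HTTP 400 from the controller.
from fastapi.testclient import TestClient
from app.core.app import create_app

with TestClient(create_app()) as client:  # context manager runs lifespan startup
    resp = client.post("/predict", json={"image": {"mediaType": "text/plain", "data": ""}})
    assert resp.status_code == 400
```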
app/api/routes/prediction.py CHANGED
@@ -1,20 +1,23 @@
- """
- ML Prediction routes.
- """
  from fastapi import APIRouter, Depends

  from app.api.controllers import PredictionController
  from app.api.models import ImageRequest, PredictionResponse
- from app.core.dependencies import get_resnet_service
- from app.services.inference import ResNetInferenceService

  router = APIRouter()


- @router.post("/predict/resnet", response_model=PredictionResponse)
- async def predict_image(
      request: ImageRequest,
-     resnet_service: ResNetInferenceService = Depends(get_resnet_service)
  ):
-     """Classify an image using ResNet-18 from base64 encoded data."""
-     return await PredictionController.predict_resnet(request, resnet_service)

+ """Prediction API routes."""
+
  from fastapi import APIRouter, Depends

  from app.api.controllers import PredictionController
  from app.api.models import ImageRequest, PredictionResponse
+ from app.core.dependencies import get_inference_service
+ from app.services.base import InferenceService

  router = APIRouter()


+ @router.post("/predict", response_model=PredictionResponse)
+ async def predict(
      request: ImageRequest,
+     service: InferenceService = Depends(get_inference_service)
  ):
+     """
+     Run inference on base64-encoded image.
+
+     Returns prediction, confidence, predicted label, model name, and media type.
+     """
+     return await PredictionController.predict(request, service)
app/api/routes/resnet_service_manager.py DELETED
@@ -1,19 +0,0 @@
- # """
- # Dependency injection for FastAPI.
- # """
- # from typing import Optional
- # from app.services.inference import ResNetInferenceService
- #
- # # Global service instance
- # _resnet_service: Optional[ResNetInferenceService] = None
- #
- #
- # def get_resnet_service() -> Optional[ResNetInferenceService]:
- #     """Get the ResNet service instance."""
- #     return _resnet_service
- #
- #
- # def set_resnet_service(service: ResNetInferenceService) -> None:
- #     """Set the global ResNet service instance."""
- #     global _resnet_service
- #     _resnet_service = service
app/core/app.py CHANGED
@@ -1,16 +1,63 @@
- """
- FastAPI application factory.
- """
  from fastapi import FastAPI

- from app.core.config import settings
- from app.core.lifespan import lifespan
  from app.api.routes import prediction


- def create_app() -> FastAPI:
-     """Application factory."""

      app = FastAPI(
          title=settings.app_name,
          description="ML inference service for image classification",
@@ -19,7 +66,6 @@ def create_app() -> FastAPI:
          lifespan=lifespan
      )

-     # Include only prediction router
      app.include_router(prediction.router)

-     return app

+ """FastAPI application factory and core infrastructure."""
+
+ import asyncio
+ import warnings
+ from contextlib import asynccontextmanager
+ from typing import AsyncGenerator, Optional
+
  from fastapi import FastAPI
+ from pydantic import Field
+ from pydantic_settings import BaseSettings

+ from app.core.logging import logger
+ from app.core.dependencies import set_inference_service
+ from app.services.inference import ResNetInferenceService
  from app.api.routes import prediction


+ class Settings(BaseSettings):
+     """Application settings. Override via environment variables or .env file."""
+
+     app_name: str = Field(default="ML Inference Service")
+     app_version: str = Field(default="0.1.0")
+     debug: bool = Field(default=False)
+     host: str = Field(default="0.0.0.0")
+     port: int = Field(default=8000)
+
+     class Config:
+         env_file = ".env"
+
+
+ settings = Settings()
+
+
+ @asynccontextmanager
+ async def lifespan(app: FastAPI) -> AsyncGenerator[None, None]:
+     """Application lifecycle: startup/shutdown."""
+     logger.info("Starting ML Inference Service...")

+     try:
+         with warnings.catch_warnings():
+             warnings.filterwarnings("ignore", category=FutureWarning)
+
+             # Replace ResNetInferenceService with your own implementation
+             service = ResNetInferenceService(model_name="microsoft/resnet-18")
+             await asyncio.to_thread(service.load_model)
+             set_inference_service(service)
+
+         logger.info("Startup completed successfully")
+
+     except Exception as e:
+         logger.error(f"Startup failed: {e}")
+         raise
+
+     yield
+
+     logger.info("Shutting down...")
+
+
+ def create_app() -> FastAPI:
+     """Create and configure FastAPI application."""
      app = FastAPI(
          title=settings.app_name,
          description="ML inference service for image classification",

          lifespan=lifespan
      )

      app.include_router(prediction.router)

+     return app
app/core/config.py DELETED
@@ -1,29 +0,0 @@
- """
- Basic configuration management.
-
- Starting simple - just app settings. We'll expand as needed.
- """
-
- from pydantic import Field
- from pydantic_settings import BaseSettings  # Changed import
-
-
- class Settings(BaseSettings):
-     """Application settings with environment variable support."""
-
-     # Basic app settings
-     app_name: str = Field(default="ML Inference Service", description="Application name")
-     app_version: str = Field(default="0.1.0", description="Application version")
-     debug: bool = Field(default=False, description="Debug mode")
-
-     # Server settings
-     host: str = Field(default="0.0.0.0", description="Server host")
-     port: int = Field(default=8000, description="Server port")
-
-     class Config:
-         """Load from .env file if it exists."""
-         env_file = ".env"
-
-
- # Global settings instance
- settings = Settings()
app/core/dependencies.py CHANGED
@@ -1,19 +1,17 @@
- """
- Dependency injection for FastAPI.
- """
  from typing import Optional
- from app.services.inference import ResNetInferenceService

- # Global service instance
- _resnet_service: Optional[ResNetInferenceService] = None


- def get_resnet_service() -> Optional[ResNetInferenceService]:
-     """Get the ResNet service instance."""
-     return _resnet_service


- def set_resnet_service(service: ResNetInferenceService) -> None:
-     """Set the global ResNet service instance."""
-     global _resnet_service
-     _resnet_service = service

+ """Dependency injection for services."""
+
  from typing import Optional
+ from app.services.base import InferenceService

+ _inference_service: Optional[InferenceService] = None


+ def get_inference_service() -> Optional[InferenceService]:
+     """Get inference service for dependency injection."""
+     return _inference_service


+ def set_inference_service(service: InferenceService) -> None:
+     """Set inference service. Called internally during startup."""
+     global _inference_service
+     _inference_service = service
app/core/lifespan.py DELETED
@@ -1,43 +0,0 @@
- """
- Application lifespan management.
- """
- import warnings
- from contextlib import asynccontextmanager
- from typing import AsyncGenerator
-
- from fastapi import FastAPI
-
- from app.core.logging import logger
- from app.core.dependencies import set_resnet_service
- from app.services.inference import ResNetInferenceService
-
-
- @asynccontextmanager
- async def lifespan(app: FastAPI) -> AsyncGenerator[None, None]:
-     """Application lifespan manager."""
-
-     # Startup
-     logger.info("Starting ML Inference Service...")
-
-     try:
-         with warnings.catch_warnings():
-             warnings.filterwarnings("ignore", category=FutureWarning)
-
-             # Initialize and load ResNet service
-             resnet_service = ResNetInferenceService(
-                 model_name="microsoft/resnet-18",
-                 use_local_model=True
-             )
-             resnet_service.load_model()
-             set_resnet_service(resnet_service)
-
-         logger.info("Startup completed successfully")
-
-     except Exception as e:
-         logger.error(f"Startup failed: {e}")
-         raise
-
-     yield  # App runs here
-
-     # Shutdown
-     logger.info("Shutting down...")
app/core/logging.py CHANGED
@@ -1,49 +1,26 @@
- """
- Logging configuration for the application.
- """
  import logging
  import sys
- from typing import Optional

- from app.core.config import settings


- class LoggerSetup:
-     """Logger setup utility class."""
-
-     @staticmethod
-     def setup_logging(
-         logger_name: Optional[str] = None,
-         level: Optional[str] = None,
-         format_string: Optional[str] = None
-     ) -> logging.Logger:
-         """Set up and configure a logger."""
-         logger = logging.getLogger(logger_name or settings.app_name)
-
-         # Avoid duplicate handlers
-         if logger.handlers:
-             return logger
-
-         # Set level
-         log_level = getattr(logging, (level or "INFO").upper())
-         logger.setLevel(log_level)
-
-         # Create console handler
-         handler = logging.StreamHandler(sys.stdout)
-         handler.setLevel(log_level)
-
-         # Create formatter
-         formatter = logging.Formatter(
-             format_string or "%(asctime)s - %(name)s - %(levelname)s - %(message)s"
-         )
-         handler.setFormatter(formatter)

-         # Add handler to logger
-         logger.addHandler(handler)

-         return logger


- # Create application logger
- logger = LoggerSetup.setup_logging()

+ """Logging configuration."""

  import logging
  import sys


+ def setup_logging(logger_name: str = "ML Inference Service") -> logging.Logger:
+     """Setup and configure logger."""
+     logger = logging.getLogger(logger_name)

+     if logger.handlers:
+         return logger

+     logger.setLevel(logging.INFO)
+     handler = logging.StreamHandler(sys.stdout)
+     handler.setLevel(logging.INFO)
+     formatter = logging.Formatter(
+         "%(asctime)s - %(name)s - %(levelname)s - %(message)s"
+     )
+     handler.setFormatter(formatter)
+     logger.addHandler(handler)

+     return logger


+ logger = setup_logging()
app/services/base.py ADDED
@@ -0,0 +1,30 @@
+ """Abstract base class for ML inference services."""
+
+ from abc import ABC, abstractmethod
+ from typing import Generic, TypeVar
+ from pydantic import BaseModel
+
+ TRequest = TypeVar('TRequest', bound=BaseModel)
+ TResponse = TypeVar('TResponse', bound=BaseModel)
+
+
+ class InferenceService(ABC, Generic[TRequest, TResponse]):
+     """
+     Base class for inference services. Subclass this to integrate your model.
+     """
+
+     @abstractmethod
+     def load_model(self) -> None:
+         """Load model weights and processors. Called once at startup."""
+         pass
+
+     @abstractmethod
+     def predict(self, request: TRequest) -> TResponse:
+         """Run inference and return typed response."""
+         pass
+
+     @property
+     @abstractmethod
+     def is_loaded(self) -> bool:
+         """Check if model is loaded and ready."""
+         pass
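
For reference, a minimal stub that satisfies this interface (hypothetical, e.g. for unit-testing the wiring without loading any weights):

```python
# Hypothetical stub implementation of InferenceService.
from app.api.models import ImageRequest, PredictionResponse
from app.services.base import InferenceService

class EchoService(InferenceService[ImageRequest, PredictionResponse]):
    def load_model(self) -> None:
        self._loaded = True

    def predict(self, request: ImageRequest) -> PredictionResponse:
        return PredictionResponse(
            prediction="stub",
            confidence=1.0,
            predicted_label=0,
            model="stub/echo",
            mediaType=request.image.mediaType,
        )

    @property
    def is_loaded(self) -> bool:
        return getattr(self, "_loaded", False)
```

In tests, such a stub can replace the real service via FastAPI's `app.dependency_overrides[get_inference_service]`.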
app/services/inference.py CHANGED
@@ -1,152 +1,79 @@
- """
- Inference service for machine learning models.
-
- This service handles the business logic for ML inference,
- following the Single Responsibility Principle.
- """
  import os
- from typing import Dict, Any
  import torch
  from PIL import Image
  from transformers import AutoImageProcessor, ResNetForImageClassification

  from app.core.logging import logger


- class ResNetInferenceService:
-     """
-     ResNet inference service.
-
-     Handles loading and inference for ResNet models.
-     Follows the Singleton pattern - loads model once.
-     """
-
-     def __init__(self, model_name: str = "microsoft/resnet-18", use_local_model: bool = True):
-         """
-         Initialize the ResNet service.
-
-         Args:
-             model_name: HuggingFace model identifier
-         """
          self.model_name = model_name
-         self.use_local_model = use_local_model
          self.model = None
          self.processor = None
          self._is_loaded = False
-
-         if use_local_model:
-             self.model_path = os.path.join("models", model_name.split("/")[-1])
-             logger.info(f"Initializing ResNet service with local model: {self.model_path}")
-         else:
-             self.model_path = model_name
-             logger.info(f"Initializing ResNet service with remote model: {model_name}")

      def load_model(self) -> None:
-         """
-         Load the ResNet model and processor.
-
-         This method loads the model once and reuses it for all requests.
-         """
          if self._is_loaded:
-             logger.debug("Model already loaded, skipping...")
              return

-         try:
-             if self.use_local_model:
-                 if not os.path.exists(self.model_path):
-                     raise FileNotFoundError(f"Local model directory not found: {self.model_path}")
-
-                 config_path = os.path.join(self.model_path, "config.json")
-                 if not os.path.exists(config_path):
-                     raise FileNotFoundError(f"Model config not found: {config_path}")
-
-                 logger.info(f"Loading ResNet model from local directory: {self.model_path}")
-             else:
-                 logger.info(f"Loading ResNet model from HuggingFace Hub: {self.model_name}")
-
-             # Suppress warnings during model loading
-             import warnings
-             with warnings.catch_warnings():
-                 warnings.filterwarnings("ignore", category=FutureWarning)
-                 warnings.filterwarnings("ignore", message="Could not find image processor class")
-
-                 # Load processor and model from local directory or remote
-                 self.processor = AutoImageProcessor.from_pretrained(
-                     self.model_path,
-                     local_files_only=self.use_local_model
-                 )
-                 self.model = ResNetForImageClassification.from_pretrained(
-                     self.model_path,
-                     local_files_only=self.use_local_model
-                 )
-
-             self._is_loaded = True
-             logger.info("ResNet model loaded successfully")
-             logger.info(f"Model architecture: {self.model.config.architectures}")
-             logger.info(f"Model has {len(self.model.config.id2label)} classes")
-
-         except Exception as e:
-             logger.error(f"Failed to load ResNet model: {e}")
-             if self.use_local_model:
-                 logger.error("Hint: Make sure the model was downloaded correctly with dwl.bash")
-             raise
-
-
-     def predict(self, image: Image.Image) -> Dict[str, Any]:
-         """
-         Perform inference on an image.
-
-         Args:
-             image: PIL Image to classify
-
-         Returns:
-             Dictionary containing prediction results
-
-         Raises:
-             RuntimeError: If model is not loaded
-             ValueError: If image processing fails
-         """
-         if not self._is_loaded:
-             logger.info("Model not loaded, loading now...")
-             self.load_model()
-
-         try:
-             logger.debug("Starting ResNet inference")
-
-             if image.mode != 'RGB':
-                 image = image.convert('RGB')
-                 logger.debug(f"Converted image from {image.mode} to RGB")
-
-             inputs = self.processor(image, return_tensors="pt")
-
-             # Perform inference
-             with torch.no_grad():
-                 logits = self.model(**inputs).logits
-
-             # Get prediction
-             predicted_label = logits.argmax(-1).item()
-             predicted_class = self.model.config.id2label[predicted_label]
-
-             # Calculate confidence score
-             probabilities = torch.nn.functional.softmax(logits, dim=-1)
-             confidence = probabilities[0][predicted_label].item()
-
-             result = {
-                 "prediction": predicted_class,
-                 "confidence": round(confidence, 4),
-                 "model": self.model_name,
-                 "predicted_label": predicted_label
-             }
-
-             logger.debug(f"Inference completed: {predicted_class} (confidence: {confidence:.4f})")
-             return result
-
-         except Exception as e:
-             logger.error(f"Inference failed: {e}")
-             raise ValueError(f"Failed to process image: {str(e)}")

      @property
      def is_loaded(self) -> bool:
-         """Check if model is loaded."""
          return self._is_loaded

+ """ResNet inference service implementation."""

  import os
+ import base64
+ from io import BytesIO
  import torch
  from PIL import Image
  from transformers import AutoImageProcessor, ResNetForImageClassification

  from app.core.logging import logger
+ from app.services.base import InferenceService
+ from app.api.models import ImageRequest, PredictionResponse


+ class ResNetInferenceService(InferenceService[ImageRequest, PredictionResponse]):
+     """ResNet-18 inference service for image classification."""

+     def __init__(self, model_name: str = "microsoft/resnet-18"):
          self.model_name = model_name
          self.model = None
          self.processor = None
          self._is_loaded = False
+         self.model_path = os.path.join("models", model_name)
+         logger.info(f"Initializing ResNet service: {self.model_path}")

      def load_model(self) -> None:
          if self._is_loaded:
              return

+         if not os.path.exists(self.model_path):
+             raise FileNotFoundError(f"Model not found: {self.model_path}")
+
+         config_path = os.path.join(self.model_path, "config.json")
+         if not os.path.exists(config_path):
+             raise FileNotFoundError(f"Config not found: {config_path}")
+
+         logger.info(f"Loading model from {self.model_path}")
+
+         import warnings
+         with warnings.catch_warnings():
+             warnings.filterwarnings("ignore", category=FutureWarning)
+             self.processor = AutoImageProcessor.from_pretrained(
+                 self.model_path, local_files_only=True
+             )
+             self.model = ResNetForImageClassification.from_pretrained(
+                 self.model_path, local_files_only=True
+             )
+
+         self._is_loaded = True
+         logger.info(f"Model loaded: {len(self.model.config.id2label)} classes")
+
+     def predict(self, request: ImageRequest) -> PredictionResponse:
+         image_data = base64.b64decode(request.image.data)
+         image = Image.open(BytesIO(image_data))
+
+         if image.mode != 'RGB':
+             image = image.convert('RGB')
+
+         inputs = self.processor(image, return_tensors="pt")
+
+         with torch.no_grad():
+             logits = self.model(**inputs).logits
+
+         predicted_label = logits.argmax(-1).item()
+         predicted_class = self.model.config.id2label[predicted_label]
+         probabilities = torch.nn.functional.softmax(logits, dim=-1)
+         confidence = probabilities[0][predicted_label].item()
+
+         return PredictionResponse(
+             prediction=predicted_class,
+             confidence=round(confidence, 4),
+             model=self.model_name,
+             predicted_label=predicted_label,
+             mediaType=request.image.mediaType
+         )

      @property
      def is_loaded(self) -> bool:
          return self._is_loaded
test_main.http CHANGED
@@ -1,6 +1,7 @@
- # Test ResNet Prediction Endpoint

- POST http://127.0.0.1:8000/predict/resnet
  Content-Type: application/json

  {

+ # Test Prediction Endpoint
+ # Works with any model configured at startup (default: ResNet-18)

+ POST http://127.0.0.1:8000/predict
  Content-Type: application/json

  {