Space: hxia7/s4fifo-api


hxia7 committed 5 days ago

Commit 2767c41 (verified)

1 Parent(s): 1806a1b

Deploy S4-FIFO FastAPI artifact


Files changed (9)

.gitattributes +2 -35
Dockerfile +16 -0
README.md +54 -6
cost_matrix.npy +3 -0
main.py +62 -0
model_metadata.json +214 -0
models/ensemble_models.joblib +3 -0
predictor.py +106 -0
requirements.txt +6 -0

.gitattributes CHANGED

@@ -1,35 +1,2 @@
-*.7z filter=lfs diff=lfs merge=lfs -text
-*.arrow filter=lfs diff=lfs merge=lfs -text
-*.bin filter=lfs diff=lfs merge=lfs -text
-*.bz2 filter=lfs diff=lfs merge=lfs -text
-*.ckpt filter=lfs diff=lfs merge=lfs -text
-*.ftz filter=lfs diff=lfs merge=lfs -text
-*.gz filter=lfs diff=lfs merge=lfs -text
-*.h5 filter=lfs diff=lfs merge=lfs -text
-*.joblib filter=lfs diff=lfs merge=lfs -text
-*.lfs.* filter=lfs diff=lfs merge=lfs -text
-*.mlmodel filter=lfs diff=lfs merge=lfs -text
-*.model filter=lfs diff=lfs merge=lfs -text
-*.msgpack filter=lfs diff=lfs merge=lfs -text
-*.npy filter=lfs diff=lfs merge=lfs -text
-*.npz filter=lfs diff=lfs merge=lfs -text
-*.onnx filter=lfs diff=lfs merge=lfs -text
-*.ot filter=lfs diff=lfs merge=lfs -text
-*.parquet filter=lfs diff=lfs merge=lfs -text
-*.pb filter=lfs diff=lfs merge=lfs -text
-*.pickle filter=lfs diff=lfs merge=lfs -text
-*.pkl filter=lfs diff=lfs merge=lfs -text
-*.pt filter=lfs diff=lfs merge=lfs -text
-*.pth filter=lfs diff=lfs merge=lfs -text
-*.rar filter=lfs diff=lfs merge=lfs -text
-*.safetensors filter=lfs diff=lfs merge=lfs -text
-saved_model/**/* filter=lfs diff=lfs merge=lfs -text
-*.tar.* filter=lfs diff=lfs merge=lfs -text
-*.tar filter=lfs diff=lfs merge=lfs -text
-*.tflite filter=lfs diff=lfs merge=lfs -text
-*.tgz filter=lfs diff=lfs merge=lfs -text
-*.wasm filter=lfs diff=lfs merge=lfs -text
-*.xz filter=lfs diff=lfs merge=lfs -text
-*.zip filter=lfs diff=lfs merge=lfs -text
-*.zst filter=lfs diff=lfs merge=lfs -text
-*tfevents* filter=lfs diff=lfs merge=lfs -text


+models/ensemble_models.joblib filter=lfs diff=lfs merge=lfs -text
+cost_matrix.npy filter=lfs diff=lfs merge=lfs -text

Dockerfile ADDED

	@@ -0,0 +1,16 @@

+FROM python:3.11-slim
+WORKDIR /app
+RUN apt-get update \
+    && apt-get install -y --no-install-recommends libgomp1 \
+    && rm -rf /var/lib/apt/lists/*
+COPY requirements.txt .
+RUN pip install --no-cache-dir -r requirements.txt
+COPY . .
+EXPOSE 7860
+CMD ["uvicorn", "main:app", "--host", "0.0.0.0", "--port", "7860"]

README.md CHANGED

@@ -1,10 +1,58 @@
 ---
-title: S4fifo Api
-emoji: 🏢
-colorFrom: yellow
-colorTo: gray
+title: S4-FIFO Parameter Prediction API
 sdk: docker
-pinned: false
+app_port: 7860
 ---
-Check out the configuration reference at https://huggingface.co/docs/hub/spaces-config-reference
+# S4-FIFO Parameter Prediction API
+This Docker Space exposes the S4-FIFO control-plane inference artifact as a FastAPI service.
+The service accepts one 73-dimensional cache-level feature vector and returns:
+- the risk-minimizing S4-FIFO class and parameter set
+- the top candidates by model probability
+- the top candidates by expected risk
+## Endpoints
+- `GET /health`
+- `GET /metadata`
+- `POST /predict`
+- `GET /docs`
+## Request Example
+```bash
+curl -X POST "https://<username>-<space-name>.hf.space/predict" \
+  -H "Content-Type: application/json" \
+  -d '{
+    "features": [0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0],
+    "top_k": 3
+  }'
+```
+## Artifact Notes
+This Space uses the full 20-model LightGBM ensemble from `analysis/xgb_18class_rerun_local/ensemble_models.pkl`, stored as a compressed joblib artifact under `models/ensemble_models.joblib`.
+The service performs data-driven risk-minimizing inference with `cost_matrix.npy`, matching the training-side RMI logic:
+```text
+expected_risk[predicted_class] = cost_matrix[predicted_class] @ class_probabilities
+```
+The compressed model artifact is large (about 372 MB), so the first request after a cold start is slow while the model loads. A smaller, dependency-free m2cgen artifact would require training and exporting a lite 73-feature model; the existing header-only lite export in `CacheLib/cachelib/allocator/s4fifo_model` uses a 75-feature model and is therefore not wired into this 73-feature API.
+## Deploy to Hugging Face Spaces
+Create a Docker Space named `s4fifo-api`, then upload this directory as the Space root:
+```bash
+cd s4fifo-api
+python -m pip install -U huggingface_hub
+huggingface-cli login
+huggingface-cli upload <username>/s4fifo-api . --repo-type space
+```
+For non-interactive upload, set `HF_TOKEN` in your shell instead of committing it to the repository.
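
The curl request in the README can also be issued from Python. The sketch below uses only the standard library. `build_predict_request` and the `example-space` host are illustrative placeholders, not part of the Space, and the all-zero vector is only a shape-correct dummy input.

```python
import json
import urllib.request

N_FEATURES = 73  # matches n_features in model_metadata.json


def build_predict_request(
    base_url: str, features: list[float], top_k: int = 3
) -> urllib.request.Request:
    """Build (but do not send) a POST /predict request."""
    if len(features) != N_FEATURES:
        raise ValueError(f"expected {N_FEATURES} features, got {len(features)}")
    body = json.dumps({"features": features, "top_k": top_k}).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url.rstrip('/')}/predict",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Placeholder host: substitute the real <username>-<space-name> pair.
request = build_predict_request("https://example-space.hf.space", [0.0] * N_FEATURES)

# To actually call the service (requires network access):
# with urllib.request.urlopen(request, timeout=120) as resp:
#     result = json.load(resp)
#     print(result["selected_class"], result["selected_params"])
```

A generous timeout is probably sensible here, since the README notes that the first request after a cold start has to load the full ensemble.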

cost_matrix.npy ADDED

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:73b7e9b16ec77e4de4e72718b98fb1cdc819da5442de8a362f6479d28a8a3644
+size 2720

main.py ADDED

	@@ -0,0 +1,62 @@

+from typing import Any
+from fastapi import FastAPI, HTTPException
+from pydantic import BaseModel, Field
+from predictor import N_FEATURES, get_metadata, predict_from_features
+app = FastAPI(
+    title="S4-FIFO Parameter Prediction API",
+    version="0.1.0",
+    description="Online control-plane inference artifact for S4-FIFO parameter selection.",
+)
+class PredictRequest(BaseModel):
+    features: list[float] = Field(
+        ...,
+        description="73-dimensional cache-level feature vector in the training feature order.",
+        min_length=N_FEATURES,
+        max_length=N_FEATURES,
+    )
+    top_k: int = Field(
+        default=3,
+        ge=1,
+        le=18,
+        description="Number of probability/risk-ranked candidate configurations to return.",
+    )
+@app.get("/")
+def root() -> dict[str, Any]:
+    return {
+        "service": "S4-FIFO Parameter Prediction API",
+        "version": app.version,
+        "endpoints": {
+            "health": "/health",
+            "metadata": "/metadata",
+            "predict": "POST /predict",
+            "docs": "/docs",
+        },
+    }
+@app.get("/health")
+def health() -> dict[str, str]:
+    return {"status": "ok"}
+@app.get("/metadata")
+def metadata() -> dict[str, Any]:
+    return get_metadata()
+@app.post("/predict")
+def predict(req: PredictRequest) -> dict[str, Any]:
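+    # Defensive re-check: with Pydantic v2 the Field length bounds above should
+    # already reject wrong-length vectors with a 422 before this handler runs.
+    if len(req.features) != N_FEATURES: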
+        raise HTTPException(
+            status_code=400,
+            detail=f"Expected {N_FEATURES} features, got {len(req.features)}",
+        )
+    return predict_from_features(req.features, top_k=req.top_k)
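
`PredictRequest` accepts only a `features` list of exactly 73 values and a `top_k` between 1 and 18. A client can mirror those bounds before sending anything. The following is a minimal standard-library sketch: `check_payload` is a hypothetical helper, and the constants are copied from the diff rather than imported from the service.

```python
import math

N_FEATURES = 73  # from model_metadata.json
MAX_TOP_K = 18   # upper bound on top_k in PredictRequest


def check_payload(features: list[float], top_k: int = 3) -> list[str]:
    """Return a list of problems; an empty list means the payload looks valid."""
    problems = []
    if len(features) != N_FEATURES:
        problems.append(f"expected {N_FEATURES} features, got {len(features)}")
    if not 1 <= top_k <= MAX_TOP_K:
        problems.append(f"top_k must be in [1, {MAX_TOP_K}], got {top_k}")
    # NaN and infinity are not valid strict-JSON numbers, so flag them early.
    bad = [i for i, v in enumerate(features) if not math.isfinite(v)]
    if bad:
        problems.append(f"non-finite values at indices {bad}")
    return problems


print(check_payload([0.0] * 73))           # []
print(check_payload([0.0] * 72, top_k=0))  # two problems reported
```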

model_metadata.json ADDED

	@@ -0,0 +1,214 @@

+{
+  "model_source": "analysis/xgb_18class_rerun_local/ensemble_models.pkl",
+  "model_type": "LightGBM multiclass ensemble",
+  "n_models": 20,
+  "n_features": 73,
+  "n_classes": 18,
+  "feature_columns": [
+    "H_g",
+    "H_m",
+    "H_s",
+    "decay_rate_small",
+    "entropy_gap",
+    "ghost_pressure",
+    "hist_ghost_0",
+    "hist_ghost_1",
+    "hist_ghost_10",
+    "hist_ghost_11",
+    "hist_ghost_12",
+    "hist_ghost_13",
+    "hist_ghost_14",
+    "hist_ghost_15",
+    "hist_ghost_16",
+    "hist_ghost_17",
+    "hist_ghost_18",
+    "hist_ghost_19",
+    "hist_ghost_2",
+    "hist_ghost_3",
+    "hist_ghost_4",
+    "hist_ghost_5",
+    "hist_ghost_6",
+    "hist_ghost_7",
+    "hist_ghost_8",
+    "hist_ghost_9",
+    "hist_main_0",
+    "hist_main_1",
+    "hist_main_10",
+    "hist_main_11",
+    "hist_main_12",
+    "hist_main_13",
+    "hist_main_14",
+    "hist_main_15",
+    "hist_main_16",
+    "hist_main_17",
+    "hist_main_18",
+    "hist_main_19",
+    "hist_main_2",
+    "hist_main_3",
+    "hist_main_4",
+    "hist_main_5",
+    "hist_main_6",
+    "hist_main_7",
+    "hist_main_8",
+    "hist_main_9",
+    "hist_small_0",
+    "hist_small_1",
+    "hist_small_10",
+    "hist_small_11",
+    "hist_small_12",
+    "hist_small_13",
+    "hist_small_14",
+    "hist_small_15",
+    "hist_small_16",
+    "hist_small_17",
+    "hist_small_18",
+    "hist_small_19",
+    "hist_small_2",
+    "hist_small_3",
+    "hist_small_4",
+    "hist_small_5",
+    "hist_small_6",
+    "hist_small_7",
+    "hist_small_8",
+    "hist_small_9",
+    "probation_efficiency",
+    "rho_onehit",
+    "rho_unique",
+    "scan_intensity",
+    "tail_heaviness",
+    "thrashing_risk",
+    "total_reqs"
+  ],
+  "parameter_sets": [
+    {
+      "class": 0,
+      "rho_s": 0.2,
+      "tau_s": 1,
+      "tau_g": 0,
+      "rho_g": 3.0
+    },
+    {
+      "class": 1,
+      "rho_s": 0.05,
+      "tau_s": 1,
+      "tau_g": 0,
+      "rho_g": 0.9
+    },
+    {
+      "class": 2,
+      "rho_s": 0.5,
+      "tau_s": 1,
+      "tau_g": 0,
+      "rho_g": 0.9
+    },
+    {
+      "class": 3,
+      "rho_s": 0.2,
+      "tau_s": 1,
+      "tau_g": 0,
+      "rho_g": 0.9
+    },
+    {
+      "class": 4,
+      "rho_s": 0.05,
+      "tau_s": 2,
+      "tau_g": 0,
+      "rho_g": 6.0
+    },
+    {
+      "class": 5,
+      "rho_s": 0.1,
+      "tau_s": 2,
+      "tau_g": 1,
+      "rho_g": 3.0
+    },
+    {
+      "class": 6,
+      "rho_s": 0.3,
+      "tau_s": 2,
+      "tau_g": 0,
+      "rho_g": 3.0
+    },
+    {
+      "class": 7,
+      "rho_s": 0.05,
+      "tau_s": 2,
+      "tau_g": 0,
+      "rho_g": 3.0
+    },
+    {
+      "class": 8,
+      "rho_s": 0.1,
+      "tau_s": 2,
+      "tau_g": 0,
+      "rho_g": 0.9
+    },
+    {
+      "class": 9,
+      "rho_s": 0.7,
+      "tau_s": 1,
+      "tau_g": 1,
+      "rho_g": 0.9
+    },
+    {
+      "class": 10,
+      "rho_s": 0.2,
+      "tau_s": 1,
+      "tau_g": 1,
+      "rho_g": 0.9
+    },
+    {
+      "class": 11,
+      "rho_s": 0.05,
+      "tau_s": 1,
+      "tau_g": 1,
+      "rho_g": 0.9
+    },
+    {
+      "class": 12,
+      "rho_s": 0.3,
+      "tau_s": 1,
+      "tau_g": 0,
+      "rho_g": 6.0
+    },
+    {
+      "class": 13,
+      "rho_s": 0.2,
+      "tau_s": 2,
+      "tau_g": 0,
+      "rho_g": 0.9
+    },
+    {
+      "class": 14,
+      "rho_s": 0.9,
+      "tau_s": 2,
+      "tau_g": 0,
+      "rho_g": 3.0
+    },
+    {
+      "class": 15,
+      "rho_s": 0.1,
+      "tau_s": 2,
+      "tau_g": 0,
+      "rho_g": 6.0
+    },
+    {
+      "class": 16,
+      "rho_s": 0.3,
+      "tau_s": 2,
+      "tau_g": 1,
+      "rho_g": 3.0
+    },
+    {
+      "class": 17,
+      "rho_s": 0.05,
+      "tau_s": 2,
+      "tau_g": 0,
+      "rho_g": 0.9
+    }
+  ],
+  "cost_matrix_shape": [
+    18,
+    18
+  ]
+}
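
The `feature_columns` list is ordered as plain strings, so `hist_ghost_10` comes before `hist_ghost_2`. A caller that holds features as a name-to-value mapping has to emit them in exactly this order. The sketch below rebuilds the 73 names and relies on Python's default string sort, which reproduces the order listed in the metadata. `feature_names` and `to_vector` are hypothetical helpers. A real client would more safely read `feature_columns` from `GET /metadata` than regenerate it.

```python
def feature_names() -> list[str]:
    """Rebuild the 73 feature names that appear in model_metadata.json."""
    scalars = [
        "H_g", "H_m", "H_s", "decay_rate_small", "entropy_gap", "ghost_pressure",
        "probation_efficiency", "rho_onehit", "rho_unique", "scan_intensity",
        "tail_heaviness", "thrashing_risk", "total_reqs",
    ]
    histograms = [
        f"hist_{group}_{bucket}"
        for group in ("ghost", "main", "small")
        for bucket in range(20)
    ]
    # Default string sort: uppercase before lowercase, "10" before "2".
    return sorted(scalars + histograms)


def to_vector(values: dict[str, float], columns: list[str]) -> list[float]:
    """Order a name->value mapping by `columns`; fail loudly on missing names."""
    missing = [name for name in columns if name not in values]
    if missing:
        raise KeyError(f"missing features: {missing}")
    return [float(values[name]) for name in columns]


columns = feature_names()
sample = {name: 0.0 for name in columns}
sample["hist_ghost_10"] = 1.0
vector = to_vector(sample, columns)
print(len(vector), vector.index(1.0))  # 73 8
```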

models/ensemble_models.joblib ADDED

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:73bc403ead3e52a4462a2ff1732ec93f3ecd064e8b4a7e99c11089fdd568e8ed
+size 372092377

predictor.py ADDED

	@@ -0,0 +1,106 @@

+from __future__ import annotations
+import json
+import warnings
+from functools import lru_cache
+from pathlib import Path
+from typing import Any
+import joblib
+import numpy as np
+APP_DIR = Path(__file__).resolve().parent
+MODEL_PATH = APP_DIR / "models" / "ensemble_models.joblib"
+COST_MATRIX_PATH = APP_DIR / "cost_matrix.npy"
+METADATA_PATH = APP_DIR / "model_metadata.json"
+with METADATA_PATH.open() as f:
+    _METADATA = json.load(f)
+N_FEATURES = int(_METADATA["n_features"])
+N_CLASSES = int(_METADATA["n_classes"])
+PARAMETER_SETS = _METADATA["parameter_sets"]
+@lru_cache(maxsize=1)
+def _load_models() -> list[Any]:
+    return joblib.load(MODEL_PATH)
+@lru_cache(maxsize=1)
+def _load_cost_matrix() -> np.ndarray:
+    costs = np.load(COST_MATRIX_PATH)
+    if costs.shape != (N_CLASSES, N_CLASSES):
+        raise ValueError(f"Expected cost matrix {(N_CLASSES, N_CLASSES)}, got {costs.shape}")
+    return costs.astype(np.float64, copy=False)
+def get_metadata() -> dict[str, Any]:
+    return {
+        "model_type": _METADATA["model_type"],
+        "model_source": _METADATA["model_source"],
+        "n_models": _METADATA["n_models"],
+        "n_features": N_FEATURES,
+        "n_classes": N_CLASSES,
+        "feature_columns": _METADATA["feature_columns"],
+        "parameter_sets": PARAMETER_SETS,
+        "cost_matrix_shape": _METADATA["cost_matrix_shape"],
+    }
+def _predict_probabilities(features: list[float]) -> np.ndarray:
+    x = np.asarray(features, dtype=np.float64).reshape(1, -1)
+    probs = np.zeros(N_CLASSES, dtype=np.float64)
+    for model in _load_models():
+        with warnings.catch_warnings():
+            warnings.filterwarnings("ignore", message="X does not have valid feature names")
+            model_probs = np.asarray(model.predict_proba(x)[0], dtype=np.float64)
+        if model_probs.shape[0] == N_CLASSES:
+            probs += model_probs
+            continue
+        full_probs = np.zeros(N_CLASSES, dtype=np.float64)
+        classes = getattr(model, "classes_", [])
+        for src_idx, class_id in enumerate(classes):
+            full_probs[int(class_id)] = model_probs[src_idx]
+        probs += full_probs
+    probs /= len(_load_models())
+    total = probs.sum()
+    if total > 0:
+        probs /= total
+    return probs
+def _ranked_entries(indices: np.ndarray, probs: np.ndarray, risks: np.ndarray) -> list[dict[str, Any]]:
+    return [
+        {
+            "class": int(i),
+            "probability": float(probs[i]),
+            "expected_risk": float(risks[i]),
+            "params": PARAMETER_SETS[int(i)],
+        }
+        for i in indices
+    ]
+def predict_from_features(features: list[float], top_k: int = 3) -> dict[str, Any]:
+    probs = _predict_probabilities([float(v) for v in features])
+    risks = _load_cost_matrix() @ probs
+    selected_idx = int(np.argmin(risks))
+    probability_idx = np.argsort(probs)[::-1][:top_k]
+    risk_idx = np.argsort(risks)[:top_k]
+    probability_argmax = int(np.argmax(probs))
+    return {
+        "selected_class": selected_idx,
+        "selected_params": PARAMETER_SETS[selected_idx],
+        "selection_method": "minimum_expected_risk",
+        "probability_argmax_class": probability_argmax,
+        "probability_argmax_params": PARAMETER_SETS[probability_argmax],
+        "top_by_probability": _ranked_entries(probability_idx, probs, risks),
+        "top_by_expected_risk": _ranked_entries(risk_idx, probs, risks),
+    }
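
`predict_from_features` selects the class with the smallest expected risk rather than the largest probability, and the two can disagree. The toy example below uses a made-up 3-class cost matrix and probability vector (the real artifact is 18 by 18) and plain Python lists so the arithmetic is visible. None of the numbers come from `cost_matrix.npy`, and the row/column reading in the comment is an interpretation of the README formula.

```python
def expected_risks(cost_matrix: list[list[float]], probs: list[float]) -> list[float]:
    """risk[j] = sum_i cost_matrix[j][i] * probs[i], i.e. cost_matrix @ probs."""
    return [sum(c * p for c, p in zip(row, probs)) for row in cost_matrix]


# Hypothetical costs: row = class we choose, column = class that was actually best.
cost_matrix = [
    [0.0, 1.0, 10.0],
    [1.0, 0.0, 1.0],
    [10.0, 1.0, 0.0],
]
probs = [0.45, 0.20, 0.35]

risks = expected_risks(cost_matrix, probs)
by_probability = max(range(len(probs)), key=probs.__getitem__)
by_risk = min(range(len(risks)), key=risks.__getitem__)

print([round(r, 2) for r in risks])  # [3.7, 0.8, 4.7]
print(by_probability, by_risk)       # 0 1
```

Class 0 is the most probable, but choosing it is costly whenever class 2 turns out to be right, so the middle class wins on expected risk. This is the situation in which `selected_class` and `probability_argmax_class` differ in the response.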

requirements.txt ADDED

	@@ -0,0 +1,6 @@

+fastapi==0.115.6
+uvicorn[standard]==0.34.0
+numpy==2.4.5
+joblib==1.5.3
+lightgbm==4.6.0
+scikit-learn==1.8.0