---
title: S4-FIFO Parameter Prediction API
sdk: docker
app_port: 7860
---

# S4-FIFO Parameter Prediction API

This Docker Space exposes the S4-FIFO control-plane inference artifact as a FastAPI service.

The service accepts one 73-dimensional cache-level feature vector and returns:

- the risk-minimizing S4-FIFO class and parameter set
- the top candidates by model probability
- the top candidates by expected risk

## Endpoints

- `GET /health`
- `GET /metadata`
- `POST /predict`
- `GET /docs`

## Request Example

```bash
curl -X POST "https://<username>-<space-name>.hf.space/predict" \
  -H "Content-Type: application/json" \
  -d '{
    "features": [0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0],
    "top_k": 3
  }'
```
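The same request can be sent from Python. The following is a minimal sketch using only the standard library; the helper names `build_predict_payload` and `predict` are hypothetical, and the response is returned as parsed JSON without assuming any particular field names.

```python
import json
import urllib.request

N_FEATURES = 73  # feature-vector length stated in this README


def build_predict_payload(features, top_k=3):
    """Validate the feature vector and build the JSON body for POST /predict."""
    values = [float(x) for x in features]
    if len(values) != N_FEATURES:
        raise ValueError(f"expected {N_FEATURES} features, got {len(values)}")
    return {"features": values, "top_k": int(top_k)}


def predict(base_url, features, top_k=3, timeout=120.0):
    """POST the payload to <base_url>/predict and return the decoded JSON.

    The generous default timeout is an assumption, chosen because the first
    request after a cold start may be slow.
    """
    body = json.dumps(build_predict_payload(features, top_k)).encode("utf-8")
    request = urllib.request.Request(
        base_url.rstrip("/") + "/predict",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


# Usage (replace the placeholders with your own Space URL):
# result = predict("https://<username>-<space-name>.hf.space", [0.0] * 73, top_k=3)
```

Validating the vector length on the client side gives a clear error before any network round trip.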

## Artifact Notes

This Space uses the full 20-model LightGBM ensemble from `analysis/xgb_18class_rerun_local/ensemble_models.pkl`, stored as a compressed joblib artifact under `models/ensemble_models.joblib`.

The service performs data-driven risk-minimizing inference (RMI) with `cost_matrix.npy`, matching the training-side RMI logic:

```text
expected_risk[predicted_class] = cost_matrix[predicted_class] @ class_probabilities
```
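The formula can be illustrated with NumPy. This is a sketch rather than the service's actual code: the function name `risk_minimizing_choice` and the 3-class toy cost matrix are made up for illustration, and it assumes rows of the cost matrix index the predicted class while columns index the true class.

```python
import numpy as np


def risk_minimizing_choice(cost_matrix, class_probabilities, top_k=3):
    """Rank classes by expected risk and by model probability."""
    cost = np.asarray(cost_matrix, dtype=float)
    probs = np.asarray(class_probabilities, dtype=float)
    # expected_risk[i] = sum_j cost[i, j] * probs[j]
    expected_risk = cost @ probs
    by_risk = np.argsort(expected_risk, kind="stable")[:top_k]  # lowest risk first
    by_prob = np.argsort(-probs, kind="stable")[:top_k]  # highest probability first
    return int(by_risk[0]), by_risk.tolist(), by_prob.tolist(), expected_risk


# Toy example (hypothetical numbers): confusing class 0 with class 2 is expensive.
toy_cost = [[0.0, 1.0, 10.0],
            [1.0, 0.0, 1.0],
            [10.0, 1.0, 0.0]]
toy_probs = [0.5, 0.1, 0.4]
best, top_by_risk, top_by_prob, risk = risk_minimizing_choice(toy_cost, toy_probs)
```

In this toy case the most probable class is 0, but class 1 has the lowest expected risk, which is why the service reports candidates ranked both ways.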

The compressed model artifact is large, so the first request after a cold start can be slow while the model loads. A smaller, dependency-free m2cgen artifact would require training and exporting a lite 73-feature model. The existing header-only lite export in `CacheLib/cachelib/allocator/s4fifo_model` uses a 75-feature model, so it is not wired into this 73-feature API.
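A client can tolerate cold starts by polling `GET /health` before sending real traffic. The sketch below assumes that endpoint answers with HTTP 200 once the container is serving; this README does not document its behavior, and a healthy response does not guarantee that the model is already loaded, so the first `/predict` call may still need a long timeout. The function name and its injectable `opener`, `sleep`, and `clock` parameters are hypothetical.

```python
import time
import urllib.request


def wait_until_healthy(base_url, timeout_s=300.0, interval_s=5.0,
                       opener=urllib.request.urlopen,
                       sleep=time.sleep, clock=time.monotonic):
    """Poll <base_url>/health until it returns HTTP 200 or the deadline passes."""
    url = base_url.rstrip("/") + "/health"
    deadline = clock() + timeout_s
    while True:
        try:
            with opener(url, timeout=10) as response:
                if response.status == 200:
                    return True
        except OSError:
            # URLError and HTTPError are OSError subclasses; treat as "not ready yet".
            pass
        if clock() >= deadline:
            return False
        sleep(interval_s)


# Usage (replace the placeholders with your own Space URL):
# ready = wait_until_healthy("https://<username>-<space-name>.hf.space")
```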

## Deploy to Hugging Face Spaces

Create a Docker Space named `s4fifo-api`, then upload this directory as the Space root:

```bash
cd s4fifo-api
python -m pip install -U huggingface_hub
huggingface-cli login
huggingface-cli upload <username>/s4fifo-api . --repo-type space
```

For non-interactive upload, set `HF_TOKEN` in your shell instead of committing it to the repository.
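One possible non-interactive upload uses the `huggingface_hub` Python API, reading the token from the environment. This is a sketch under stated assumptions: `read_hf_token` and `upload_space` are hypothetical helper names, and the repository ID placeholder must be replaced with your own.

```python
import os


def read_hf_token(env=None):
    """Return HF_TOKEN from the environment, failing loudly if it is missing."""
    env = os.environ if env is None else env
    token = env.get("HF_TOKEN", "").strip()
    if not token:
        raise RuntimeError("HF_TOKEN is not set")
    return token


def upload_space(repo_id, folder_path="."):
    """Upload folder_path as the root of the Space identified by repo_id."""
    # Imported lazily so the token helper works without huggingface_hub installed.
    from huggingface_hub import HfApi

    api = HfApi(token=read_hf_token())
    return api.upload_folder(
        repo_id=repo_id,
        folder_path=folder_path,
        repo_type="space",
    )


# Usage, run from the s4fifo-api directory:
# upload_space("<username>/s4fifo-api")
```

Reading the token at call time keeps it out of the repository and out of shell history.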