title: S4-FIFO Parameter Prediction API
sdk: docker
app_port: 7860
S4-FIFO Parameter Prediction API
This Docker Space exposes the S4-FIFO control-plane inference artifact as a FastAPI service.
The service accepts one 73-dimensional cache-level feature vector and returns:
- the risk-minimizing S4-FIFO class and parameter set
- the top candidates by model probability
- the top candidates by expected risk
Endpoints
- GET /health
- GET /metadata
- POST /predict
- GET /docs
Request Example
curl -X POST "https://<username>-<space-name>.hf.space/predict" \
-H "Content-Type: application/json" \
-d '{
"features": [0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0],
"top_k": 3
}'
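The same request can be sent from Python. The following is a minimal sketch using only the standard library; the helper names `build_payload` and `predict` are illustrative and not part of the service, and the response is returned as parsed JSON without assuming any particular field names.

```python
import json
import urllib.request

N_FEATURES = 73  # feature-vector length stated in this README


def build_payload(features, top_k=3):
    """Build the JSON body for POST /predict (hypothetical helper)."""
    if len(features) != N_FEATURES:
        raise ValueError(f"expected {N_FEATURES} features, got {len(features)}")
    return {"features": [float(x) for x in features], "top_k": int(top_k)}


def predict(base_url, features, top_k=3, timeout=120):
    """POST one feature vector to <base_url>/predict and return the parsed JSON."""
    body = json.dumps(build_payload(features, top_k)).encode("utf-8")
    request = urllib.request.Request(
        base_url.rstrip("/") + "/predict",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    # A generous timeout leaves room for model loading after a cold start.
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

Usage would look like `predict("https://<username>-<space-name>.hf.space", [0.0] * 73)`, with the placeholder replaced by the real Space URL.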
Artifact Notes
This Space uses the full 20-model LightGBM ensemble from analysis/xgb_18class_rerun_local/ensemble_models.pkl, stored as a compressed joblib artifact at models/ensemble_models.joblib.
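The service performs data-driven risk-minimizing inference (RMI) with cost_matrix.npy, matching the training-side RMI logic. For each candidate class, the expected risk is the corresponding cost-matrix row weighted by the predicted class probabilities, and the class with the lowest expected risk is returned: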
expected_risk[predicted_class] = cost_matrix[predicted_class] @ class_probabilities
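To make the formula above concrete, here is a small pure-Python sketch of the same matrix-vector product followed by an argmin. It assumes `cost_matrix[i][j]` is the cost of predicting class `i` when the true class is `j`, which is one reading of the formula; the function names and the 3-class toy numbers are illustrative and do not come from the real `cost_matrix.npy`.

```python
def expected_risk(cost_matrix, class_probabilities):
    """Return one expected-risk value per candidate class (row . probabilities)."""
    return [
        sum(cost * prob for cost, prob in zip(row, class_probabilities))
        for row in cost_matrix
    ]


def top_k_by_risk(cost_matrix, class_probabilities, top_k=3):
    """Return (class_index, risk) pairs sorted by ascending expected risk."""
    risks = expected_risk(cost_matrix, class_probabilities)
    ranked = sorted(enumerate(risks), key=lambda pair: pair[1])
    return ranked[:top_k]


# Toy example: class 0 is the most probable, but confusing it with class 2 is costly.
toy_costs = [
    [0.0, 1.0, 4.0],
    [1.0, 0.0, 1.0],
    [4.0, 1.0, 0.0],
]
toy_probs = [0.5, 0.1, 0.4]

best_class, best_risk = top_k_by_risk(toy_costs, toy_probs, top_k=1)[0]
most_probable = max(range(len(toy_probs)), key=toy_probs.__getitem__)
```

In this toy case the most probable class is 0, yet the risk-minimizing choice is class 1, which illustrates why the ranking by model probability and the ranking by expected risk can differ.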
The compressed model artifact is large, so the first request after a cold start can be slow while the model loads. A smaller, dependency-free m2cgen artifact would require training and exporting a lite 73-feature model. The existing header-only lite export in CacheLib/cachelib/allocator/s4fifo_model uses a 75-feature model, so it is not wired into this 73-feature API.
Deploy to Hugging Face Spaces
Create a Docker Space named s4fifo-api, then upload this directory as the Space root:
cd s4fifo-api
python -m pip install -U huggingface_hub
huggingface-cli login
huggingface-cli upload <username>/s4fifo-api . --repo-type space
For non-interactive upload, set HF_TOKEN in your shell instead of committing it to the repository.
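A non-interactive upload can also be scripted with the `huggingface_hub` Python API. The sketch below assumes `HF_TOKEN` is already exported in the environment; `get_hf_token` and `upload_space` are hypothetical helper names, while `HfApi.upload_folder` is the library call that performs the upload.

```python
import os


def get_hf_token(env=os.environ):
    """Return the token from HF_TOKEN, or raise if it is missing or blank."""
    token = env.get("HF_TOKEN", "").strip()
    if not token:
        raise RuntimeError("HF_TOKEN is not set; export it in your shell first")
    return token


def upload_space(repo_id, folder="."):
    """Upload the contents of `folder` as the root of the Space `repo_id`."""
    # Imported lazily so the helper above works without huggingface_hub installed.
    from huggingface_hub import HfApi

    api = HfApi(token=get_hf_token())
    return api.upload_folder(folder_path=folder, repo_id=repo_id, repo_type="space")
```

Run from inside the `s4fifo-api` directory, a call such as `upload_space("<username>/s4fifo-api")` would mirror the CLI command above, with the token read from the environment rather than stored in any file.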