---
model_name: AIDE-Chip-Surrogates
license: cc-by-nc-sa-4.0
library_name: xgboost
pipeline_tag: tabular-regression
tags:
- computer-architecture
- gem5
- cache
- surrogate-model
- explainable-ai
- shap
- monotonic-constraints
- systems-ml
datasets:
- uralstech/AIDE-Chip-15K-gem5-Sims
---
# AIDE Chip Surrogates
This is a collection of physics-aware, monotonicity-constrained XGBoost models that replace expensive gem5 cache simulations during design-space exploration.
Each model predicts either IPC or L2 miss rate for a specific workload, using only cache configuration parameters as input. The models are interpretable via SHAP and enforce microarchitectural monotonicity where physically justified.
This model release accompanies the paper:
> Udayshankar Ravikumar. Fast, Explainable Surrogate Models for gem5 Cache Design Space Exploration. Authorea. January 14, 2026.
> <https://doi.org/10.22541/au.176843174.46109183/v1>
## Model Architecture
* Algorithm: XGBoost Regressor
* Targets:
  * IPC
  * L2 miss rate
* Features:
  * Log₂ cache sizes & associativities
  * Set-count proxies
  * Cache hierarchy ratios
* Constraints:
  * Monotonic constraints encoding cache physics
  * Selective relaxation for latency-sensitive workloads
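
The feature families and constraints listed above can be illustrated with a minimal sketch. Everything specific here is an assumption for illustration, not the released pipeline: the feature names, the 64-byte line size, the choice of which features are constrained, and the helper names `cache_features` and `monotone_tuple` are all hypothetical. The only XGBoost detail relied on is the documented `monotone_constraints` parameter, which accepts one sign per feature (`1` non-decreasing, `-1` non-increasing, `0` unconstrained).

```python
import math

LINE_BYTES = 64  # assumed cache line size; not stated in this card


def cache_features(l1_kib, l1_assoc, l2_kib, l2_assoc):
    """Derive log2, set-count and hierarchy-ratio features from one cache config."""
    l1_sets = l1_kib * 1024 // (l1_assoc * LINE_BYTES)
    l2_sets = l2_kib * 1024 // (l2_assoc * LINE_BYTES)
    return {
        "log2_l1_size": math.log2(l1_kib),
        "log2_l1_assoc": math.log2(l1_assoc),
        "log2_l2_size": math.log2(l2_kib),
        "log2_l2_assoc": math.log2(l2_assoc),
        "log2_l1_sets": math.log2(l1_sets),   # set-count proxy
        "log2_l2_sets": math.log2(l2_sets),   # set-count proxy
        "l2_to_l1_ratio": l2_kib / l1_kib,    # hierarchy ratio
    }


# Hypothetical constraint choice for an IPC model: larger caches should not
# lower the predicted IPC. Features not listed stay unconstrained (0).
IPC_CONSTRAINTS = {"log2_l1_size": 1, "log2_l2_size": 1}


def monotone_tuple(feature_names, constraints):
    """Format per-feature signs as the "(1,0,...)" string XGBoost accepts."""
    signs = (str(constraints.get(name, 0)) for name in feature_names)
    return "(" + ",".join(signs) + ")"


features = cache_features(l1_kib=32, l1_assoc=4, l2_kib=512, l2_assoc=8)
names = list(features)
params = {"monotone_constraints": monotone_tuple(names, IPC_CONSTRAINTS)}
```

The resulting `params` dictionary could then be passed to `xgboost.XGBRegressor(**params)`. Relaxing a constraint for a latency-sensitive workload would amount to setting that feature's sign back to `0`.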
## Available Models
| Workload | IPC Model | L2 Miss Model |
| ---------- | --------- | ------------- |
| crc32 | ✔ | ✔ |
| dijkstra | ✔ | ✔ |
| fft | ✔ | ✔ |
| matrix_mul | ✔ | ✔ |
| qsort | ✔ | ✔ |
| sha | ✔ | ✔ |
Total models: **12**
## Performance
* Test set accuracy: **R² ≈ 0.999**
* OOD validation:
  * 26 unseen cache configurations
  * **~817× critical-path speedup**
  * Low absolute error even when R² is unstable
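
Why R² can be unstable while absolute error stays low follows from its definition, R² = 1 − SS_res / SS_tot: when the true values barely vary across a small evaluation set, SS_tot is close to zero and tiny residuals are enough to push R² far below zero. The sketch below shows this with made-up IPC values; the numbers are illustrative only and are not taken from the paper or the dataset.

```python
def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def mean_abs_error(y_true, y_pred):
    """Average absolute difference between targets and predictions."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


# Made-up IPC values that hardly change across configurations.
ipc_true = [1.000, 1.001, 1.002, 1.001]
ipc_pred = [1.003, 0.999, 1.004, 0.998]

r2 = r2_score(ipc_true, ipc_pred)         # strongly negative
mae = mean_abs_error(ipc_true, ipc_pred)  # under 0.003 IPC
```

Here every prediction is within 0.3% of the true IPC, yet R² is negative because the targets span only 0.002 IPC. Reporting absolute error alongside R² avoids misreading such cases.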
## Explainability
Each model is uploaded with its SHAP summary plot. The plots confirm:
* Cache sizes dominate IPC & miss behavior
* Associativity effects are workload-dependent
* Learned relationships align with microarchitectural intuition
## Intended Use
* Architecture research
* Design-space exploration
* Educational use
* Explainable systems ML
**Not for commercial deployment** without separate licensing.
## Limitations
* Single-core, single-thread models
* Cache hierarchy only (no pipeline, prefetcher, or multicore effects)
* Accuracy depends on training coverage; extreme OOD configs are flagged
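
This card does not say how out-of-distribution configurations are flagged. One simple possibility, shown purely as an assumption, is a per-parameter range check against the training data. The helper names `training_bounds` and `ood_features`, the parameter names, and the sample values are all hypothetical.

```python
def training_bounds(train_rows):
    """Record the (min, max) seen in training for every parameter."""
    keys = train_rows[0].keys()
    return {
        k: (min(r[k] for r in train_rows), max(r[k] for r in train_rows))
        for k in keys
    }


def ood_features(config, bounds):
    """Return the parameters of `config` that fall outside the training range."""
    return [k for k, (lo, hi) in bounds.items() if not lo <= config[k] <= hi]


# Hypothetical training coverage (sizes in KiB).
train = [
    {"l1_kib": 16, "l2_kib": 256},
    {"l1_kib": 64, "l2_kib": 1024},
]
bounds = training_bounds(train)

in_range = ood_features({"l1_kib": 32, "l2_kib": 512}, bounds)
flagged = ood_features({"l1_kib": 32, "l2_kib": 4096}, bounds)
```

A configuration with a non-empty result would be treated with caution, since the surrogate is extrapolating beyond what it was trained on. A range check ignores unusual combinations of in-range values, so it is only a first-pass filter.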
## Patent Notice
The models uploaded here implement techniques described in an accompanying research paper.
The author has filed a pending patent application that may cover broader
design-space exploration workflows beyond these specific model implementations.
The open-source license (CC BY-NC-SA 4.0) governs use of these models.
This notice is informational only.