---
model_name: AIDE-Chip-Surrogates
license: cc-by-nc-sa-4.0
library_name: xgboost
pipeline_tag: tabular-regression
tags:
- computer-architecture
- gem5
- cache
- surrogate-model
- explainable-ai
- shap
- monotonic-constraints
- systems-ml
datasets:
- uralstech/AIDE-Chip-15K-gem5-Sims
---

# AIDE Chip Surrogates

This is a collection of physics-aware, monotonicity-constrained XGBoost models that replace expensive gem5 cache simulations during design-space exploration.

Each model predicts either IPC or L2 miss rate for a specific workload, using only cache configuration parameters as input. The models are interpretable via SHAP and enforce microarchitectural monotonicity where physically justified.

This model release accompanies the paper:

> Udayshankar Ravikumar. Fast, Explainable Surrogate Models for gem5 Cache Design Space Exploration. Authorea. January 14, 2026.
> <https://doi.org/10.22541/au.176843174.46109183/v1>

## Model Architecture

* Algorithm: XGBoost Regressor
* Targets:
  * IPC
  * L2 miss rate
* Features:
  * Log₂ cache sizes & associativities
  * Set-count proxies
  * Cache hierarchy ratios
* Constraints:
  * Monotonic constraints encoding cache physics
  * Selective relaxation for latency-sensitive workloads
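
The feature families and constraints listed above can be sketched roughly as follows. This is a minimal illustration, not the released training code: the feature names, the 64-byte line size, the choice of L1D and L2 as inputs, and the constraint signs are all assumptions made for the example. The only XGBoost-specific detail is the `monotone_constraints` parameter, which accepts a string such as `"(1,0,1)"` with one sign per feature.

```python
import math

LINE_BYTES = 64  # assumed cache line size; not stated in this card


def make_features(l1d_kib, l1d_assoc, l2_kib, l2_assoc):
    """Derive illustrative features from a raw cache configuration."""
    # Set count = capacity / (line size * ways); used here as the set-count proxy.
    l1d_sets = l1d_kib * 1024 // (LINE_BYTES * l1d_assoc)
    l2_sets = l2_kib * 1024 // (LINE_BYTES * l2_assoc)
    return {
        "log2_l1d_size": math.log2(l1d_kib),
        "log2_l1d_assoc": math.log2(l1d_assoc),
        "log2_l2_size": math.log2(l2_kib),
        "log2_l2_assoc": math.log2(l2_assoc),
        "log2_l1d_sets": math.log2(l1d_sets),
        "log2_l2_sets": math.log2(l2_sets),
        "l2_to_l1d_ratio": l2_kib / l1d_kib,
    }


# Hypothetical signs for an IPC target: +1 means "non-decreasing in this
# feature", and any feature not listed is left unconstrained (0).
IPC_CONSTRAINTS = {"log2_l1d_size": 1, "log2_l2_size": 1}


def monotone_constraints(feature_names, signs):
    """Format per-feature signs in the string form XGBoost accepts."""
    return "(" + ",".join(str(signs.get(name, 0)) for name in feature_names) + ")"


features = make_features(l1d_kib=32, l1d_assoc=4, l2_kib=512, l2_assoc=8)
params = {
    "objective": "reg:squarederror",
    "monotone_constraints": monotone_constraints(list(features), IPC_CONSTRAINTS),
}
print(params["monotone_constraints"])
```

A dictionary like `params` could then be passed to `xgboost.XGBRegressor(**params)`. Under this scheme, relaxing a constraint for a latency-sensitive workload would amount to dropping the corresponding entry from the signs dictionary so that the feature falls back to 0.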

## Available Models

| Workload   | IPC Model | L2 Miss Model |
| ---------- | --------- | ------------- |
| crc32      | ✔         | ✔             |
| dijkstra   | ✔         | ✔             |
| fft        | ✔         | ✔             |
| matrix_mul | ✔         | ✔             |
| qsort      | ✔         | ✔             |
| sha        | ✔         | ✔             |

Total models: **12**

## Performance

* Test set accuracy: **R² ≈ 0.999**
* OOD validation:
  * 26 unseen cache configurations
  * **~817× critical-path speedup**
  * Low absolute error even when R² is unstable
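
The last point can look contradictory, so here is a small worked example of how it can arise. R² divides the squared error by the variance of the true values, so when a target barely changes across a small set of configurations, tiny errors produce a large negative score. The numbers below are invented for illustration and are not taken from the paper's OOD set.

```python
def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def mean_abs_error(y_true, y_pred):
    """Average absolute difference between truth and prediction."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


# Hypothetical IPC values that are nearly flat across four configurations.
y_true = [1.000, 1.001, 1.002, 1.001]
y_pred = [1.003, 0.999, 1.004, 0.998]

r2 = r2_score(y_true, y_pred)
mae = mean_abs_error(y_true, y_pred)
print(f"R2 = {r2:.2f}, MAE = {mae:.4f}")  # R2 is about -12, MAE is about 0.0025
```

Every prediction here is within 0.3% of the true value, yet R² is strongly negative because the true values span only 0.002. For a small, low-variance OOD set, absolute error is therefore the more stable summary.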

## Explainability

Each model is uploaded with its SHAP summary plot. The plots confirm:

* Cache sizes dominate IPC & miss behavior
* Associativity effects are workload-dependent
* Learned relationships align with microarchitectural intuition

## Intended Use

* Architecture research
* Design-space exploration
* Educational use
* Explainable systems ML

**Not for commercial deployment** without separate licensing.

## Limitations

* Single-core, single-thread models
* Cache hierarchy only (no pipeline, prefetcher, or multicore effects)
* Accuracy depends on training coverage; extreme OOD configs are flagged
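
This card does not describe how OOD configurations are flagged. One simple way to approximate the idea is a per-parameter range check against the training set, sketched below. The parameter names and values are hypothetical, and the actual flagging logic used with these models may differ.

```python
def fit_ranges(training_rows):
    """Record the minimum and maximum seen for each parameter during training."""
    ranges = {}
    for row in training_rows:
        for name, value in row.items():
            lo, hi = ranges.get(name, (value, value))
            ranges[name] = (min(lo, value), max(hi, value))
    return ranges


def out_of_range(config, ranges):
    """Return the parameters of `config` that fall outside the training ranges."""
    return sorted(
        name
        for name, value in config.items()
        if not ranges[name][0] <= value <= ranges[name][1]
    )


# Hypothetical training coverage (sizes in KiB).
training_rows = [
    {"l1d_kib": 16, "l2_kib": 256},
    {"l1d_kib": 32, "l2_kib": 512},
    {"l1d_kib": 64, "l2_kib": 1024},
]
ranges = fit_ranges(training_rows)

candidate = {"l1d_kib": 32, "l2_kib": 4096}
flagged = out_of_range(candidate, ranges)
print(flagged)  # ['l2_kib']
```

A range check only catches extrapolation along a single parameter. A configuration can sit inside every individual range and still be an unseen combination, so a non-empty result is best read as a warning rather than a complete OOD test.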

## Patent Notice

The models uploaded here implement techniques described in an accompanying research paper.
The author has filed a pending patent application that may cover broader
design-space exploration workflows beyond these specific model implementations.

The open-source license (CC BY-NC-SA 4.0) governs use of these models.
This notice is informational only.