amarck committed
Commit aa74b30 · 1 Parent(s): 22374d1

Quarantine UAF detection: 5/5 CVE patterns, 0 false positives


V2 harness now delays free() and fills with canary (0xFD). Detects
UAF writes by checking canary integrity on every subsequent operation.
Closes the UAF gap (test 3). Also adds goal.md with roadmap.

Files changed (2)
  1. goal.md +84 -0
  2. heaptrm/harness/heapgrid_v2.c +84 -10
goal.md ADDED
@@ -0,0 +1,84 @@
+ # HeapTRM — Goals & Roadmap
+
+ ## Current State (March 2026)
+
+ 304K-param TRM classifier detecting 90-97% of heap exploit techniques across glibc 2.27-2.39. V2 harness catches metadata corruption and double-frees with zero false positives on CVE pattern tests. Pwntools integration and CLI built. Action agent achieves 74% UAF success on real binary (hybrid: TRM for strategy, rules for trigger).
+
+ ## Immediate (next session)
+
+ ### 1. Quarantine UAF detection (#4)
+ - Implement quarantine zone in v2 harness: delay real free(), fill with canary, detect overwrites
+ - Closes the UAF gap (test 3 in CVE sims)
+ - Lightweight ASAN-in-LD_PRELOAD for unmodified binaries
+
+ ### 2. Retrain classifier on v2 harness data (#2)
+ - V2 dumps include `is_corrupted`, corruption types, metadata change info
+ - Train TRM to predict corruption BEFORE it happens (pre-overflow heap layouts)
+ - Use cross-glibc Docker data with v2 harness
+
+ ## Short-term (1-2 weeks)
+
+ ### 3. AFL/libFuzzer oracle integration (#1)
+ - AFL custom mutator that scores heap states via `heaptrm scan --json`
+ - Fitness = code coverage + exploit-reachability score
+ - Directed fuzzing toward exploitable heap states
+
+ ### 4. Real-world CVE benchmark (#5)
+ - CI pipeline: download CVE PoC binaries, Docker with matching glibc, run heaptrm, compare ground truth
+ - Start with 10 CVEs: glibc syslog (CVE-2023-6246), iconv (CVE-2024-2961), sudo heap overflow, polkit pkexec, etc.
+ - Publish as a benchmark for heap exploit detection
+
+ ### 5. Temporal sequence model (#3)
+ - Replace per-state classification with sequence model (window of K states)
+ - LSTM or Transformer over TRM state embeddings
+ - Detect multi-step patterns: spray → free → realloc
+
+ ## Medium-term (1-2 months)
+
+ ### 6. Java deserialization extension
+ - Instrument JVM ObjectInputStream
+ - Encode object graphs as grids
+ - Detect gadget chains (ysoserial patterns)
+ - Far bigger attack surface than pickle
+
+ ### 7. Windows heap support
+ - API hooking via Detours or ETW tracing
+ - NT heap / segment heap / LFH metadata encoding
+ - Test against Windows exploit techniques
+ - Same universal grid, different harness
+
+ ### 8. PyPI package release
+ - `pip install heaptrm`
+ - Pre-trained weights bundled
+ - Auto-compile harness on first use
+ - Documentation + examples
+
+ ## Long-term (research direction)
+
+ ### 9. TRM as fuzzer-in-the-loop
+ - Not just scoring states, but guiding input generation
+ - TRM predicts which mutations are most likely to trigger corruption
+ - Combine with symbolic execution for targeted constraint solving
+
+ ### 10. Allocator-agnostic generalization
+ - Test on jemalloc (Firefox, FreeBSD), tcmalloc (Chrome), mimalloc
+ - The universal grid should transfer — validate this claim
+ - Per-allocator fine-tuning vs zero-shot
+
+ ### 11. Pre-corruption prediction
+ - Train on temporal sequences ending in corruption
+ - Predict "this heap layout is N steps from exploitable" before overflow happens
+ - Runtime defense: alert/kill process before corruption materializes
+
+ ## Open Questions
+
+ - Is the TRM architecture actually necessary, or would a simple MLP on the summary row work just as well? (Ablation showed 3% overall contribution from chunk data, but 24% on trigger timing)
+ - Can the model generalize to binaries with 100K+ heap objects? Current grid is 32 rows.
+ - Is metadata corruption detection sufficient without UAF? Real-world exploits often chain UAF → tcache poison → arbitrary write.
+ - Should we pursue kernel heap exploitation (SLUB/SLAB)? Different allocator but same grid concept.
+
+ ## Non-goals
+
+ - Replacing ASAN/MSAN for development — those require recompilation but are more thorough
+ - Kernel exploit detection — different domain, different harness needed
+ - Closed-source binary analysis without execution — we need runtime instrumentation
heaptrm/harness/heapgrid_v2.c CHANGED
@@ -29,6 +29,10 @@
 #define DUMP_BUF_SIZE (1024 * 128)
 #define CANARY_VALUE 0xDEADBEEFCAFEBABEULL
 
+/* Quarantine: delay real free() to detect UAF writes */
+#define QUARANTINE_SIZE 32
+#define CANARY_BYTE 0xFD
+
 typedef struct {
     void *user_ptr;
     size_t req_size;
@@ -70,6 +74,72 @@ static void *(*real_realloc)(void *, size_t) = NULL;
 static char early_buf[4096];
 static int early_buf_used = 0;
 
+/* Forward declaration */
+static void add_corruption(const char *type, int chunk_idx, const char *detail);
+
+/* --- Quarantine zone for UAF detection --- */
+typedef struct {
+    void *ptr;
+    size_t size;
+    int chunk_idx;  /* index in g_chunks */
+} quarantine_entry_t;
+
+static quarantine_entry_t g_quarantine[QUARANTINE_SIZE];
+static int g_quarantine_head = 0;
+static int g_quarantine_count = 0;
+
+static void quarantine_check_canary(int q_idx) {
+    quarantine_entry_t *qe = &g_quarantine[q_idx];
+    if (!qe->ptr) return;
+    const uint8_t *p = (const uint8_t *)qe->ptr;
+    size_t check_len = qe->size < 128 ? qe->size : 128;
+    for (size_t i = 0; i < check_len; i++) {
+        if (p[i] != CANARY_BYTE) {
+            char detail[256];
+            snprintf(detail, sizeof(detail),
+                     "quarantine canary overwritten at offset %zu (0x%02x != 0xFD) — UAF write",
+                     i, p[i]);
+            add_corruption("uaf_write", qe->chunk_idx, detail);
+            /* Re-fill canary to detect further writes */
+            memset(qe->ptr, CANARY_BYTE, qe->size < 128 ? qe->size : 128);
+            return;
+        }
+    }
+}
+
+static void quarantine_check_all(void) {
+    for (int i = 0; i < g_quarantine_count; i++) {
+        int idx = (g_quarantine_head - g_quarantine_count + i + QUARANTINE_SIZE) % QUARANTINE_SIZE;
+        quarantine_check_canary(idx);
+    }
+}
+
+static void quarantine_add(void *ptr, size_t size, int chunk_idx) {
+    /* If quarantine is full, drain oldest entry (actually free it) */
+    if (g_quarantine_count >= QUARANTINE_SIZE) {
+        int oldest = (g_quarantine_head - g_quarantine_count + QUARANTINE_SIZE) % QUARANTINE_SIZE;
+        quarantine_entry_t *old = &g_quarantine[oldest];
+        if (old->ptr) {
+            /* Final canary check before real free */
+            quarantine_check_canary(oldest);
+            real_free(old->ptr);
+            old->ptr = NULL;
+        }
+        g_quarantine_count--;
+    }
+
+    /* Fill with canary pattern */
+    size_t fill_len = size < 128 ? size : 128;
+    memset(ptr, CANARY_BYTE, fill_len);
+
+    /* Add to quarantine */
+    g_quarantine[g_quarantine_head].ptr = ptr;
+    g_quarantine[g_quarantine_head].size = size;
+    g_quarantine[g_quarantine_head].chunk_idx = chunk_idx;
+    g_quarantine_head = (g_quarantine_head + 1) % QUARANTINE_SIZE;
+    g_quarantine_count++;
+}
+
 /* --- Simple hash for data-change detection --- */
 static uint64_t hash_bytes(const void *data, size_t len) {
     const uint8_t *p = (const uint8_t *)data;
@@ -181,15 +251,16 @@ static void validate_all_chunks(void) {
         g_chunks[i].saved_prev_size = cur_prev_size;
     }
 
-        /* Check 3: UAF detection is NOT done here.
-         * Reliable UAF detection requires compile-time instrumentation (ASAN)
-         * or hardware watchpoints. LD_PRELOAD cannot distinguish glibc's
-         * legitimate writes to freed chunks from attacker UAF writes.
-         * We rely on metadata_corrupt and double_free checks instead. */
+        /* Check 3: UAF detection moved to quarantine canary system (Check 5).
+         * Quarantine delays real free(), fills with canary pattern (0xFD),
+         * and checks for overwrite on every subsequent operation. */
     }
 
     /* Check 4: Double-free detection — only flag in free() handler,
      * not here, to avoid re-reporting on every validation pass. */
+
+    /* Check 5: Quarantine canary verification (UAF write detection) */
+    quarantine_check_all();
 }
 
 /* --- Dump state --- */
@@ -331,13 +402,16 @@ void free(void *ptr) {
     if (idx >= 0) {
         g_chunks[idx].state = 2;
         g_chunks[idx].free_order = ++g_free_seq;
-        g_chunks[idx].hash_stable = 0; /* will record post-free hash on next validate */
+        g_chunks[idx].hash_stable = 0;
+
+        /* Quarantine: don't actually free yet, fill with canary */
+        quarantine_add(ptr, g_chunks[idx].req_size, idx);
+        dump_state("free", ptr, 0);
     } else {
         /* Might be double-free — detect BEFORE calling real_free (glibc may abort) */
         int any = find_chunk(ptr);
         if (any >= 0 && g_chunks[any].state == 2) {
            add_corruption("double_free", any, "freed already-freed chunk");
-            /* Dump state with corruption BEFORE glibc aborts */
            dump_state("free_double", ptr, 0);
            if (g_chunk_count < MAX_CHUNKS) {
                int slot = g_chunk_count++;
@@ -347,10 +421,10 @@ void free(void *ptr) {
                g_chunks[slot].hash_stable = 0;
            }
        }
+        /* Still actually free for double-free (glibc will handle/abort) */
+        real_free(ptr);
+        dump_state("free", ptr, 0);
    }
-
-    real_free(ptr);
-    dump_state("free", ptr, 0);
    g_in_hook = 0;
 }
 