phanerozoic committed on
Commit 8e667db · verified · 1 Parent(s): 68aa2ea

Add defensive half: Carbon-8B resists the composition-matched adversary (0.53 -> 1.00 at order 5+)

Files changed (1)
  1. ADVERSARIAL.md +20 -0
ADVERSARIAL.md CHANGED
@@ -27,6 +27,26 @@ breaks at m=1, k=4 at m=3, and k=6 at m=5. The staircase is the sufficient-stati
  made visible. The hexamer detector this model uses is blind to an adversary who matches the
  order-5 composition of human DNA (AUROC 0.53 at m=5).
 
+ ## The neural model is not evaded
+
+ Scoring the same order-m-matched synthetic human sequences with Carbon-8B (zero-shot per-base likelihood)
+ separates them from real human DNA at every tested order, including the orders where composition fails:
+
+ | adversary order m | closed-form k=6 (AUROC) | Carbon-8B (AUROC) |
+ |---|---|---|
+ | 2 | 0.95 | 1.00 |
+ | 3 | 0.77 | 1.00 |
+ | 4 | 0.68 | 1.00 |
+ | 5 | 0.53 | 1.00 |
+ | 6 | 0.52 | 1.00 |
+ | 7 | 0.52 | 1.00 |
+
+ The order-5-matched construct is invisible to the hexamer detector (0.53) and obvious to the model
+ (1.00). Even an order-7 match, which reproduces every 8-mer frequency of human DNA, is caught at 0.997
+ (1.00 in the table after rounding), because the model reads features that no fixed-order composition
+ encodes: long-range structure, codon-pair grammar, gene organization, and motif context. The model's
+ value here is adversarial robustness against exactly the evasion that composition cannot resist.
+
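To make the adversary concrete, the following is a minimal Python sketch of how an order-m composition-matched sequence could be sampled and how a detector's AUROC could be computed. It is not the code behind the table above. The names `fit_markov`, `sample_markov`, and `auroc` are hypothetical, and treating the reference as circular (so every context has a successor) is a simplifying assumption made only for this illustration.

```python
import random
from collections import Counter, defaultdict


def fit_markov(ref: str, m: int):
    """Count order-m transitions on `ref` (hypothetical helper).

    Assumption: the reference is treated as circular, so every
    length-m context has at least one observed successor.
    """
    ext = ref + ref[:m]
    table = defaultdict(Counter)
    for i in range(len(ref)):
        table[ext[i:i + m]][ext[i + m]] += 1
    return table


def sample_markov(table, m: int, n: int, rng: random.Random) -> str:
    """Sample n bases whose (m+1)-mers all occur in the reference."""
    ctx = rng.choice(sorted(table))      # start from an observed context
    out = list(ctx)
    while len(out) < n:
        bases, weights = zip(*sorted(table[ctx].items()))
        nxt = rng.choices(bases, weights=weights)[0]
        out.append(nxt)
        ctx = (ctx + nxt)[1:]            # slide the context window by one base
    return "".join(out[:n])


def auroc(pos, neg) -> float:
    """Probability that a random positive score exceeds a random negative
    score, counting ties as one half (the Mann-Whitney form of AUROC)."""
    wins = sum((p > q) + 0.5 * (p == q) for p in pos for q in neg)
    return wins / (len(pos) * len(neg))
```

Because the sampler only ever emits transitions observed in the reference, a detector that scores (m+1)-mer composition has little to separate on, which is the regime where the k=6 column above approaches 0.5 for m of 5 or more. Any detector's scores on real and synthetic sequences can be passed to `auroc` as `pos` and `neg`.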
  ## Implication for biosecurity screening

  Homology-free, composition-based screening, the family that includes k-mer engineered-DNA