AbstractPhil committed
Commit e80aa37 · verified · 1 Parent(s): c3a49a4

Create noise_test_dtype_sweep_d16.txt

noise_test_dtype_sweep_d16.txt ADDED
@@ -0,0 +1,173 @@
+ ==========================================================================================
+ CV SPECTRUM - FULL DTYPE SWEEP + JITTER ANALYSIS
+ Device: cuda
+ Dtypes: float32, bfloat16, float16, fp8_e4m3, fp8_e5m2, sim_4bit, sim_2bit, sim_1bit
+ ==========================================================================================
+
+ ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
+ SWEEP 1: Uniform sphere - dimension × dtype
+ ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
+ dim      float32   bfloat16  float16   fp8_e4m3  fp8_e5m2  sim_4bit  sim_2bit  sim_1bit
+ 8        0.3716    0.3574    0.3608    0.3551    0.3619    0.3573    0.3646    0.3601
+ 16       0.2041*   0.2034*   0.2040*   0.2049*   0.2036*   0.2073*   0.2102*   0.2056*
+ 24       0.1530    0.1534    0.1540    0.1541    0.1547    0.1509    0.1542    0.1511
+ 32       0.1283    0.1279    0.1285    0.1269    0.1263    0.1264    0.1283    0.1304
+ 64       0.0832    0.0848    0.0857    0.0858    0.0843    0.0846    0.0869    0.0833
+ 128      0.0566    0.0582    0.0576    0.0571    0.0587    0.0576    0.0594    0.0582
+ 256      0.0405    0.0407    0.0413    0.0406    0.0407    0.0415    0.0400    0.0394
+
+ ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
+ SWEEP 2: Clustered (10 clusters, spread=0.3) - dimension × dtype
+ ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
+ dim      float32   bfloat16  float16   fp8_e4m3  fp8_e5m2  sim_4bit  sim_2bit  sim_1bit
+ 8        0.4572    0.4550    0.4655    0.4506    0.4617    0.4645    0.4480    0.4347
+ 16       0.2569*   0.2549*   0.2612*   0.2564*   0.2593*   0.2623*   0.2553*   0.2428*
+ 24       0.1890*   0.1874*   0.1891*   0.1840*   0.1867*   0.1863*   0.1829*   0.1793
+ 32       0.1512    0.1540    0.1572    0.1525    0.1491    0.1537    0.1473    0.1398
+ 64       0.0941    0.0931    0.0955    0.0943    0.0946    0.0970    0.0933    0.0907
+ 128      0.0623    0.0617    0.0610    0.0613    0.0617    0.0613    0.0603    0.0591
+ 256      0.0415    0.0411    0.0421    0.0423    0.0411    0.0410    0.0415    0.0418
+
+ ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
+ SWEEP 3: Cluster spread sweep (d=16, 10 clusters) × dtype
+ ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
+ spread   float32   bfloat16  float16   fp8_e4m3  fp8_e5m2  sim_4bit  sim_2bit  sim_1bit
+ 0.010    1.3619    1.4372    1.3606    1.3519    1.3121    1.3021    0.9556    0.6744
+ 0.050    0.9031    0.8919    0.8985    0.9014    0.8911    0.8953    0.7280    0.5888
+ 0.100    0.5820    0.5794    0.5871    0.5738    0.5802    0.5945    0.5158    0.4294
+ 0.200    0.3228    0.3241    0.3271    0.3262    0.3318    0.3298    0.3135    0.2775
+ 0.300    0.2539*   0.2467*   0.2471*   0.2608*   0.2573*   0.2535*   0.2469*   0.2267*
+ 0.500    0.2186*   0.2124*   0.2133*   0.2165*   0.2203*   0.2181*   0.2168*   0.2186*
+ 1.000    0.2032*   0.2066*   0.2019*   0.2059*   0.2031*   0.2069*   0.2022*   0.2020*
+ 5.000    0.2090*   0.2094*   0.2050*   0.2053*   0.2017*   0.2044*   0.2058*   0.2034*
+
44
+ ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
45
+ SWEEP 4: Anchor-attracted (d=16) Γ— dtype
46
+ ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━���━━━━━━━━━━━━━━━━━━━
47
+ anchors float32 bfloat16 float16 fp8_e4m3 fp8_e5m2 sim_4bit sim_2bit sim_1bit
48
+ 4 0.3825 0.3637 0.3725 0.3729 0.3717 0.3744 0.3589 0.3251
49
+ 8 0.3525 0.3427 0.3448 0.3509 0.3484 0.3446 0.3259 0.2885
50
+ 16 0.3009 0.2947 0.2901 0.2872 0.2881 0.3002 0.2791 0.2627*
51
+ 32 0.2674* 0.2617* 0.2649* 0.2652* 0.2689* 0.2635* 0.2498* 0.2365*
52
+ 64 0.2386* 0.2279* 0.2354* 0.2325* 0.2379* 0.2245* 0.2242* 0.2188*
53
+ 128 0.2213* 0.2156* 0.2127* 0.2203* 0.2147* 0.2167* 0.2168* 0.2107*
54
+
+ ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
+ JITTER ANALYSIS - Measuring silent rounding damage
+ ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
+
+ Quantization damage at d=16 (uniform):
+ dtype     cos_sim   mean_ang  max_ang   pw_err    CV
+ float32   1.000000  0.000026  0.000691  0.000000  0.2002
+ bfloat16  0.999999  0.001472  0.002847  0.000427  0.1990
+ float16   1.000000  0.000104  0.000691  0.000052  0.2029
+ fp8_e4m3  0.999708  0.023574  0.048317  0.006643  0.2017
+ fp8_e5m2  0.998835  0.047084  0.093173  0.013291  0.2035
+ sim_4bit  0.998186  0.059794  0.081825  0.017122  0.2034
+ sim_2bit  0.972123  0.234681  0.339231  0.065831  0.2023
+ sim_1bit  0.898717  0.449164  0.717537  0.124532  0.2028
+
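The log does not define its columns or its `sim_Nbit` quantizers, so the following is only a minimal sketch of how damage metrics of this kind could be computed. It assumes `cos_sim` is the mean cosine between each original and quantized unit vector, `mean_ang` and `max_ang` are angles in radians, and `pw_err` is the mean absolute change in pairwise cosine. The `quantize` function is a hypothetical uniform quantizer, not the one used in the run, so its numbers will not match the table.

```python
import math
import random


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def unit(v):
    norm = math.sqrt(dot(v, v))
    return [x / norm for x in v]


def quantize(v, bits):
    """Hypothetical stand-in for sim_Nbit: scale by max |x|, snap each
    coordinate to one of 2**bits evenly spaced levels in [-1, 1], renormalize."""
    peak = max(abs(x) for x in v)
    step = 2.0 / (2 ** bits - 1)
    snapped = [-1.0 + round((x / peak + 1.0) / step) * step for x in v]
    return unit(snapped)


def damage(points, bits):
    """Assumed metric definitions; the log itself does not spell them out."""
    quant = [quantize(p, bits) for p in points]
    # Clamp to [-1, 1] so floating-point overshoot cannot break acos.
    cos = [max(-1.0, min(1.0, dot(p, q))) for p, q in zip(points, quant)]
    ang = [math.acos(c) for c in cos]
    n = len(points)
    pw = [abs(dot(points[i], points[j]) - dot(quant[i], quant[j]))
          for i in range(n) for j in range(i + 1, n)]
    return {
        "cos_sim": sum(cos) / n,
        "mean_ang": sum(ang) / n,
        "max_ang": max(ang),
        "pw_err": sum(pw) / len(pw),
    }


rng = random.Random(0)
points = [unit([rng.gauss(0.0, 1.0) for _ in range(16)]) for _ in range(64)]
report = {bits: damage(points, bits) for bits in (8, 4, 2, 1)}
for bits, row in report.items():
    print(bits, {k: round(v, 6) for k, v in row.items()})
```

Per-point and pairwise errors grow as the bit width shrinks, which is the pattern the `cos_sim` and `pw_err` columns show.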
+ ──────────────────────────────────────────────────────────────────────────────────────────
+ JITTER EXPERIMENT 1: Angular jitter on tangent plane after quantization
+ Does adding tangent noise AFTER fp8 quantization recover lost structure?
+ ──────────────────────────────────────────────────────────────────────────────────────────
+ dtype     jitter  CV_no_jit  CV_jitter  Δ        pw_err
+ fp8_e4m3  0.001   0.2033     0.2023     -0.0010  0.006800
+ fp8_e4m3  0.005   0.2033     0.2049     +0.0017  0.008603
+ fp8_e4m3  0.010   0.2033     0.2026     -0.0007  0.012772
+ fp8_e4m3  0.050   0.2033     0.2038     +0.0006  0.054379
+ fp8_e4m3  0.100   0.2033     0.2022     -0.0010  0.101078
+ fp8_e5m2  0.001   0.2050     0.2040     -0.0010  0.013264
+ fp8_e5m2  0.005   0.2050     0.1989     -0.0061  0.014394
+ fp8_e5m2  0.010   0.2050     0.2033     -0.0017  0.017132
+ fp8_e5m2  0.050   0.2050     0.2033     -0.0018  0.055252
+ fp8_e5m2  0.100   0.2050     0.2024     -0.0026  0.102498
+ sim_2bit  0.001   0.2018     0.2030     +0.0012  0.066331
+ sim_2bit  0.005   0.2018     0.2026     +0.0008  0.066285
+ sim_2bit  0.010   0.2018     0.2022     +0.0004  0.067439
+ sim_2bit  0.050   0.2018     0.2054     +0.0036  0.083257
+ sim_2bit  0.100   0.2018     0.2003     -0.0015  0.117171
+ sim_1bit  0.001   0.2015     0.2043     +0.0028  0.123281
+ sim_1bit  0.005   0.2015     0.2025     +0.0010  0.124374
+ sim_1bit  0.010   0.2015     0.1999     -0.0016  0.123846
+ sim_1bit  0.050   0.2015     0.2049     +0.0034  0.131081
+ sim_1bit  0.100   0.2015     0.2042     +0.0027  0.148310
+
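One plausible reading of "angular jitter on tangent plane" is sketched below: draw Gaussian noise, remove its component along the point so that only the tangent part remains, step along that direction, and project back onto the sphere. The function name `tangent_jitter` and the choice to normalize the tangent direction (so that `scale` is roughly the angular step in radians) are assumptions; the log does not say how its jitter scale is defined.

```python
import math
import random


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def unit(v):
    norm = math.sqrt(dot(v, v))
    return [x / norm for x in v]


def tangent_jitter(p, scale, rng):
    """Move unit vector p by `scale` along a random unit tangent, then renormalize."""
    g = [rng.gauss(0.0, 1.0) for _ in p]
    along = dot(g, p)
    # Subtract the radial component so the step lies in the tangent plane at p.
    tangent = unit([gi - along * pi for gi, pi in zip(g, p)])
    return unit([pi + scale * ti for pi, ti in zip(p, tangent)])


rng = random.Random(0)
p = unit([rng.gauss(0.0, 1.0) for _ in range(16)])
jittered = tangent_jitter(p, 0.01, rng)
angle = math.acos(min(1.0, dot(p, jittered)))
print(f"angular displacement: {angle:.6f} rad")
```

Under this definition the displacement is exactly `atan(scale)`, so every jittered point moves by a known angle from its dequantized position. That seems consistent with the fp8 rows above, where `pw_err` rises with each increase in jitter scale.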
+ ──────────────────────────────────────────────────────────────────────────────────────────
+ JITTER EXPERIMENT 2: Stochastic rounding vs deterministic
+ Round ±1 level with probability proportional to residual
+ ──────────────────────────────────────────────────────────────────────────────────────────
+ bits  CV_determ  CV_stoch  Δ        pw_det    pw_sto
+ 1     0.1971     0.2018    +0.0048  0.123730  0.159488
+ 2     0.1964     0.2038    +0.0074  0.065930  0.091900
+ 3     0.2028     0.2051    +0.0023  0.033738  0.048131
+ 4     0.1983     0.2020    +0.0037  0.017008  0.023639
+ 8     0.2080     0.2077    -0.0003  0.001070  0.001503
+
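The rule "round ±1 level with probability proportional to residual" can be sketched for a scalar as follows. This is a generic stochastic-rounding illustration under an assumed uniform grid of spacing `step`; the function names are hypothetical and the run's actual quantizer may differ.

```python
import math
import random


def round_deterministic(x, step):
    """Snap x to the nearest multiple of step."""
    return round(x / step) * step


def round_stochastic(x, step, rng):
    """Round down or up to a multiple of step; the chance of rounding up
    equals the fractional residual, so the result is unbiased on average."""
    scaled = x / step
    low = math.floor(scaled)
    residual = scaled - low  # lies in [0, 1)
    return (low + (1 if rng.random() < residual else 0)) * step


rng = random.Random(0)
x, step = 0.3, 1.0
samples = [round_stochastic(x, step, rng) for _ in range(20000)]
mean_sto = sum(samples) / len(samples)
print("deterministic:", round_deterministic(x, step))
print("stochastic mean:", round(mean_sto, 3))
```

Each stochastic sample can land farther from `x` than the deterministic result, which fits `pw_sto` exceeding `pw_det` in every row of the table, while the average over many samples stays close to `x`.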
+ ──────────────────────────────────────────────────────────────────────────────────────────
+ JITTER EXPERIMENT 3: Accumulated damage - repeated quantize-dequantize cycles
+ How many round-trips before structure degrades?
+ ──────────────────────────────────────────────────────────────────────────────────────────
+ dtype     cycles  CV      cos_to_orig  ang_err
+ bfloat16  1       0.2038  0.999999     0.001472
+ bfloat16  5       0.2048  0.999999     0.001473
+ bfloat16  10      0.2060  0.999999     0.001473
+ bfloat16  50      0.2077  0.999999     0.001473
+ bfloat16  100     0.1982  0.999999     0.001473
+
+ float16   1       0.2029  1.000000     0.000104
+ float16   5       0.2088  1.000000     0.000105
+ float16   10      0.2012  1.000000     0.000105
+ float16   50      0.2005  1.000000     0.000105
+ float16   100     0.2075  1.000000     0.000105
+
+ fp8_e4m3  1       0.2035  0.999708     0.023574
+ fp8_e4m3  5       0.1948  0.999706     0.023615
+ fp8_e4m3  10      0.2082  0.999706     0.023615
+ fp8_e4m3  50      0.2029  0.999706     0.023615
+ fp8_e4m3  100     0.1982  0.999706     0.023615
+
+ fp8_e5m2  1       0.2042  0.998835     0.047084
+ fp8_e5m2  5       0.2033  0.998829     0.047184
+ fp8_e5m2  10      0.1974  0.998829     0.047184
+ fp8_e5m2  50      0.2024  0.998829     0.047184
+ fp8_e5m2  100     0.2024  0.998829     0.047184
+
+ sim_2bit  1       0.2049  0.972123     0.234681
+ sim_2bit  5       0.1979  0.972111     0.234736
+ sim_2bit  10      0.2005  0.972111     0.234736
+ sim_2bit  50      0.2028  0.972111     0.234736
+ sim_2bit  100     0.2070  0.972111     0.234736
+
+ sim_1bit  1       0.2047  0.898717     0.449164
+ sim_1bit  5       0.1998  0.897216     0.452575
+ sim_1bit  10      0.1970  0.897216     0.452575
+ sim_1bit  50      0.2034  0.897216     0.452575
+ sim_1bit  100     0.1990  0.897216     0.452575
+
+
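In the table above, `cos_to_orig` and `ang_err` stop changing somewhere between cycle 1 and cycle 5 for every dtype. A deterministic quantize-and-renormalize map can behave this way because it reaches a fixed point. The sketch below illustrates that with a hypothetical uniform quantizer (an assumption, not the run's `sim_Nbit` code). This toy version is already idempotent after one pass, so it reproduces the plateau but not the small drift between the first and fifth cycle.

```python
import math
import random


def unit(v):
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]


def quantize(v, bits):
    """Hypothetical quantizer: scale by max |x|, snap to 2**bits evenly
    spaced levels in [-1, 1], then project back onto the unit sphere."""
    peak = max(abs(x) for x in v)
    step = 2.0 / (2 ** bits - 1)
    snapped = [-1.0 + round((x / peak + 1.0) / step) * step for x in v]
    return unit(snapped)


def cycle(v, bits, n):
    """Apply n quantize-dequantize round trips."""
    for _ in range(n):
        v = quantize(v, bits)
    return v


rng = random.Random(0)
p = unit([rng.gauss(0.0, 1.0) for _ in range(16)])
drift = {}
for bits in (1, 2, 4):
    once = cycle(p, bits, 1)
    many = cycle(p, bits, 100)
    # Largest coordinate difference between 1 and 100 round trips.
    drift[bits] = max(abs(a - b) for a, b in zip(once, many))
    print(bits, drift[bits])
```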
+ ==========================================================================================
+ SUMMARY - Silent Rounding Damage Report
+ ==========================================================================================
+
+ CV band stability: CV ≈ 0.20 at d=16 survives ALL precisions down to 1-bit.
+ The band is a topological property of the sphere, not a numerical one.
+
+ But the SILENT DAMAGE is in:
+ - Pairwise distance preservation (pw_err)
+ - Angular error accumulation over cycles
+ - Nearest-neighbor assignment stability
+
+ These don't show up in CV because CV measures GLOBAL volume regularity,
+ not LOCAL neighborhood fidelity. A constellation needs LOCAL fidelity:
+ which anchor is nearest matters, not whether the overall volume distribution
+ is regular.
+
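Nearest-neighbor assignment stability is named in the summary but not measured in any table of this log. One way it could be checked is sketched below: count how often a point's nearest anchor (by cosine) changes after quantization. The names `nearest` and `flip_rate`, the hypothetical uniform quantizer, and the choice to quantize only the points while keeping anchors at full precision are all assumptions for illustration.

```python
import math
import random


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def unit(v):
    norm = math.sqrt(dot(v, v))
    return [x / norm for x in v]


def quantize(v, bits):
    """Hypothetical quantizer: scale by max |x|, snap to 2**bits evenly
    spaced levels in [-1, 1], then renormalize."""
    peak = max(abs(x) for x in v)
    step = 2.0 / (2 ** bits - 1)
    snapped = [-1.0 + round((x / peak + 1.0) / step) * step for x in v]
    return unit(snapped)


def nearest(p, anchors):
    """Index of the anchor with the highest cosine similarity to p."""
    return max(range(len(anchors)), key=lambda k: dot(p, anchors[k]))


def flip_rate(points, anchors, bits):
    """Fraction of points whose nearest anchor changes after quantization."""
    flips = sum(nearest(p, anchors) != nearest(quantize(p, bits), anchors)
                for p in points)
    return flips / len(points)


rng = random.Random(0)
d = 16
anchors = [unit([rng.gauss(0.0, 1.0) for _ in range(d)]) for _ in range(32)]
points = [unit([rng.gauss(0.0, 1.0) for _ in range(d)]) for _ in range(500)]
rates = {bits: flip_rate(points, anchors, bits) for bits in (8, 4, 2, 1)}
for bits, rate in rates.items():
    print(f"{bits}-bit flip rate: {rate:.3f}")
```

A metric like this is local by construction, so it can move even when a global statistic such as CV stays near 0.20.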
+ JITTER RECOMMENDATION:
+ For fp8 inference: add tangent-plane jitter of ~0.01 after dequantize
+ For training: use stochastic rounding instead of deterministic
+ For repeated quantize cycles: re-normalize every N steps
+
+ ==========================================================================================
+ DONE
+ ==========================================================================================