Commit 41471bf · AbstractPhil · verified · 1 Parent(s): 13eba2f

Create README.md

Files changed (1)
  1. README.md +334 -3
README.md CHANGED
@@ -1,3 +1,334 @@
- ---
- license: apache-2.0
- ---
+ #@title Generate MobiusNet HuggingFace Model Card
+
+ readme_content = '''---
+ license: apache-2.0
+ language:
+ - en
+ library_name: pytorch
+ tags:
+ - image-classification
+ - geometric-deep-learning
+ - clip
+ - distillation
+ - wave-interference
+ - mobius
+ datasets:
+ - AbstractPhil/imagenet-clip-features-orderly
+ metrics:
+ - accuracy
+ pipeline_tag: image-classification
+ model-index:
+ - name: mobiusnet-distillations
+   results:
+   - task:
+       type: image-classification
+     dataset:
+       name: ImageNet-1K (CLIP-ViT-L14 features)
+       type: AbstractPhil/imagenet-clip-features-orderly
+       config: clip_vit_l14
+     metrics:
+     - name: Top-1 Accuracy
+       type: accuracy
+       value: 80.8
+ ---
+
+ # MobiusNet
+
+ A geometric deep learning architecture using **Möbius wave interference lenses** for efficient image classification.
+
+ ## Model Description
+
+ MobiusNet learns frequency-selective sparse coding through three drifting wave functions (L, M, R) combined via learnable XOR/AND logic. The architecture progressively sharpens selectivity through depth, culminating in near-binary winner-take-all gating at the final block.
+
+ ### Wave Interference Mechanism
+
+ Each Möbius Lens computes:
+ ```
+ L = exp(-α · sin²(ω · s · (x + drift_L · t)))  # Left wave (drift=+1)
+ M = exp(-α · sin²(ω · s · (x + drift_M · t)))  # Middle wave (drift=0)
+ R = exp(-α · sin²(ω · s · (x + drift_R · t)))  # Right wave (drift=-1)
+
+ XOR = |L + R - 2·L·R|
+ AND = L · R
+ gate = σ(LayerNorm(w·[L,M,R] × (0.5 + 0.5·(xor_w·XOR + (1-xor_w)·AND))))
+ ```
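
The formulas above can be traced by hand for a single scalar activation. The following is a minimal sketch, not the model's implementation: `wave` and `interference` are hypothetical helper names, the defaults (`omega = pi`, `alpha = 1.5`, one scale `s = 1.0`, `xor_w = 0.7` used directly as the mixing weight) are illustrative assumptions, and the LayerNorm, sigmoid, and phase terms of the full lens are left out.

```python
import math


def wave(x, t, drift, omega=math.pi, scale=1.0, alpha=1.5):
    """Envelope exp(-alpha * sin^2(omega * scale * (x + drift * t))), in (0, 1]."""
    pos = omega * scale * (x + drift * t)
    return math.exp(-alpha * math.sin(pos) ** 2)


def interference(x, t, xor_w=0.7):
    """Return (L, M, R, XOR, AND, mix) for one scalar activation x."""
    left = wave(x, t, drift=+1.0)   # L: shifted by +t
    mid = wave(x, t, drift=0.0)     # M: unshifted
    right = wave(x, t, drift=-1.0)  # R: shifted by -t
    xor = abs(left + right - 2.0 * left * right)  # soft XOR of L and R
    and_ = left * right                           # soft AND of L and R
    mix = xor_w * xor + (1.0 - xor_w) * and_      # blended logic term
    return left, mid, right, xor, and_, mix


left, mid, right, xor, and_, mix = interference(x=0.3, t=0.2)
print(round(left, 3), round(mid, 3), round(right, 3), round(mix, 3))
```

With `t = 0` the three waves coincide, so the soft XOR reduces to `2·L·(1 - L)`; with `t > 0` the XOR term additionally responds to L and R falling out of step.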
+
+ ### Learned Progression
+
+ | Block | ω | α | XOR weight | L/M/R means | Behavior |
+ |-------|---|---|------------|-------------|----------|
+ | S0B0 | 1.55 | 0.64 | 0.40 | 0.80/0.92/0.71 | Broad overlapping |
+ | S0B1 | 3.01 | 0.22 | 0.69 | 0.82/0.80/0.83 | Nearly all passes |
+ | S1B0 | 0.93 | 2.00 | 0.79 | 0.86/0.87/0.81 | Sharpening |
+ | S1B1 | 1.63 | 0.50 | 0.41 | 0.86/0.48/0.55 | M/R diverge |
+ | S2B0 | 1.64 | 2.09 | 0.58 | 0.12/0.08/0.20 | Sparse |
+ | S2B1 | 2.68 | **5.22** | **0.99** | 0.02/0.02/0.05 | **Winner-take-all** |
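
To make the effect of α in the table concrete, the sketch below computes two properties of a single envelope `exp(-α · sin²(θ))`: its minimum value `exp(-α)` and the fraction of phases θ where it stays at or above a threshold. It assumes the tabulated α is the effective sharpness (the code adds 0.1 to `|alpha|`, and the table does not say which value it reports); the 0.5 threshold and the helper names are arbitrary choices for illustration.

```python
import math


def envelope_floor(alpha):
    """Smallest value of exp(-alpha * sin^2(theta)), reached where sin^2(theta) = 1."""
    return math.exp(-alpha)


def pass_fraction(alpha, threshold=0.5):
    """Fraction of phases theta where the envelope is >= threshold."""
    c = -math.log(threshold) / alpha  # condition: sin^2(theta) <= c
    if c >= 1.0:
        return 1.0  # the envelope never drops below the threshold
    return 2.0 * math.asin(math.sqrt(c)) / math.pi


for block, alpha in [("S0B1", 0.22), ("S1B0", 2.00), ("S2B1", 5.22)]:
    print(block, round(envelope_floor(alpha), 3), round(pass_fraction(alpha), 3))
```

A small α keeps the envelope above the threshold at every phase, while a large α leaves only a narrow band near the resonant phase, which is consistent with the "Nearly all passes" and "Winner-take-all" labels.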
+
+ ## Usage
+
+ ### Installation
+ ```bash
+ pip install torch safetensors huggingface_hub
+ ```
+
+ ### Inference
+ ```python
+ import torch
+ import torch.nn as nn
+ import torch.nn.functional as F
+ from huggingface_hub import hf_hub_download
+ from safetensors.torch import load_file
+ import math
+
+ # ============================================================================
+ # ARCHITECTURE
+ # ============================================================================
+
+ class MobiusLens(nn.Module):
+     def __init__(self, dim, layer_idx, total_layers, scale_range=(0.5, 2.5)):
+         super().__init__()
+         # Normalized depth in [0, 1]; sets the initial twist angles and wave scales.
+         self.t = layer_idx / max(total_layers - 1, 1)
+         scale_span = scale_range[1] - scale_range[0]
+         step = scale_span / max(total_layers, 1)
+         self.register_buffer('scales', torch.tensor([
+             scale_range[0] + self.t * scale_span,
+             scale_range[0] + self.t * scale_span + step
+         ]))
+         self.twist_in_angle = nn.Parameter(torch.tensor(self.t * math.pi))
+         self.twist_in_proj = nn.Linear(dim, dim, bias=False)
+         self.omega = nn.Parameter(torch.tensor(math.pi))
+         self.alpha = nn.Parameter(torch.tensor(1.5))
+         # Per-scale phase and drift for the left, middle, and right waves.
+         self.phase_l = nn.Parameter(torch.zeros(2))
+         self.drift_l = nn.Parameter(torch.ones(2))
+         self.phase_m = nn.Parameter(torch.zeros(2))
+         self.drift_m = nn.Parameter(torch.zeros(2))
+         self.phase_r = nn.Parameter(torch.zeros(2))
+         self.drift_r = nn.Parameter(-torch.ones(2))
+         self.accum_weights = nn.Parameter(torch.tensor([0.4, 0.2, 0.4]))
+         self.xor_weight = nn.Parameter(torch.tensor(0.7))
+         self.gate_norm = nn.LayerNorm(dim)
+         self.twist_out_angle = nn.Parameter(torch.tensor(-self.t * math.pi))
+         self.twist_out_proj = nn.Linear(dim, dim, bias=False)
+
+     def forward(self, x):
+         # x is channels-last: [..., dim]
+         # Twist in
+         cos_t, sin_t = torch.cos(self.twist_in_angle), torch.sin(self.twist_in_angle)
+         x = x * cos_t + self.twist_in_proj(x) * sin_t
+
+         # Wave interference
+         x_norm = torch.tanh(x)
+         t = x_norm.abs().mean(dim=-1, keepdim=True).unsqueeze(-2)  # [..., 1, 1]
+         x_exp = x_norm.unsqueeze(-2)                               # [..., 1, dim]
+         s = self.scales.view(-1, 1)                                # [2, 1]
+         a = self.alpha.abs() + 0.1                                 # effective sharpness, >= 0.1
+
+         def wave(phase, drift):
+             pos = s * self.omega * (x_exp + drift.view(-1, 1) * t) + phase.view(-1, 1)
+             # Product over the two scales -> [..., dim]
+             return torch.exp(-a * torch.sin(pos).pow(2)).prod(dim=-2)
+
+         L, M, R = wave(self.phase_l, self.drift_l), wave(self.phase_m, self.drift_m), wave(self.phase_r, self.drift_r)
+
+         # XOR/AND combination
+         w = torch.softmax(self.accum_weights, dim=0)
+         xor_w = torch.sigmoid(self.xor_weight)
+         lr = xor_w * (L + R - 2*L*R).abs() + (1 - xor_w) * L * R
+         gate = torch.sigmoid(self.gate_norm((w[0]*L + w[1]*M + w[2]*R) * (0.5 + 0.5*lr)))
+         x = x * gate
+
+         # Twist out
+         cos_t, sin_t = torch.cos(self.twist_out_angle), torch.sin(self.twist_out_angle)
+         return x * cos_t + self.twist_out_proj(x) * sin_t
+
+
+ class MobiusConvBlock(nn.Module):
+     def __init__(self, channels, layer_idx, total_layers, scale_range=(0.5, 2.5), reduction=0.5):
+         super().__init__()
+         self.conv = nn.Sequential(
+             nn.Conv2d(channels, channels, 3, padding=1, groups=channels, bias=False),
+             nn.Conv2d(channels, channels, 1, bias=False),
+             nn.BatchNorm2d(channels),
+         )
+         self.lens = MobiusLens(channels, layer_idx, total_layers, scale_range)
+         third = channels // 3
+         which_third = layer_idx % 3
+         mask = torch.ones(channels)
+         mask[which_third*third : which_third*third + third + (channels % 3 if which_third == 2 else 0)] = reduction
+         self.register_buffer('thirds_mask', mask.view(1, -1, 1, 1))
+         self.residual_weight = nn.Parameter(torch.tensor(0.9))
+
+     def forward(self, x):
+         identity = x
+         h = self.conv(x).permute(0, 2, 3, 1)
+         h = self.lens(h).permute(0, 3, 1, 2) * self.thirds_mask
+         rw = torch.sigmoid(self.residual_weight)
+         return rw * identity + (1 - rw) * h
+
+
+ class MobiusNet(nn.Module):
+     def __init__(self, in_chans=1, num_classes=1000, channels=(64, 128, 256),
+                  depths=(2, 2, 2), scale_range=(0.5, 2.5), use_integrator=True):
+         super().__init__()
+         total_layers = sum(depths)
+         channels = list(channels)
+
+         self.stem = nn.Sequential(
+             nn.Conv2d(in_chans, channels[0], 3, padding=1, bias=False),
+             nn.BatchNorm2d(channels[0]),
+         )
+
+         self.stages = nn.ModuleList()
+         self.downsamples = nn.ModuleList()
+         layer_idx = 0
+
+         for si, d in enumerate(depths):
+             stage = nn.ModuleList([
+                 MobiusConvBlock(channels[si], layer_idx + i, total_layers, scale_range)
+                 for i in range(d)
+             ])
+             layer_idx += d
+             self.stages.append(stage)
+
+             if si < len(depths) - 1:
+                 self.downsamples.append(nn.Sequential(
+                     nn.Conv2d(channels[si], channels[si + 1], 3, stride=2, padding=1, bias=False),
+                     nn.BatchNorm2d(channels[si + 1]),
+                 ))
+
+         self.integrator = nn.Sequential(
+             nn.Conv2d(channels[-1], channels[-1], 3, padding=1, bias=False),
+             nn.BatchNorm2d(channels[-1]),
+             nn.GELU(),
+         ) if use_integrator else nn.Identity()
+
+         self.pool = nn.AdaptiveAvgPool2d(1)
+         self.head = nn.Linear(channels[-1], num_classes)
+
+     def forward(self, x):
+         x = self.stem(x)
+         for i, stage in enumerate(self.stages):
+             for block in stage:
+                 x = block(x)
+             if i < len(self.downsamples):
+                 x = self.downsamples[i](x)
+         x = self.integrator(x)
+         return self.head(self.pool(x).flatten(1))
+
+
+ # ============================================================================
+ # LOAD AND RUN
+ # ============================================================================
+
+ device = "cuda" if torch.cuda.is_available() else "cpu"
+
+ # Load model
+ model = MobiusNet(
+     in_chans=1,
+     num_classes=1000,
+     channels=(64, 128, 256),
+     depths=(2, 2, 2),
+     scale_range=(0.5, 2.5),
+     use_integrator=True,
+ ).to(device)
+
+ weights_path = hf_hub_download(
+     repo_id="AbstractPhil/mobiusnet-distillations",
+     filename="checkpoints/mobius_tiny_s_imagenet_clip_vit_l14/20260111_000512/checkpoints/best_model.safetensors",
+ )
+ model.load_state_dict(load_file(weights_path))
+ model.eval()
+
+ # Inference on CLIP features
+ # Input: CLIP-ViT-L14 image features reshaped to [B, 1, 24, 32]
+ clip_features = torch.randn(1, 768)  # Replace with actual CLIP features
+ x = clip_features.view(1, 1, 24, 32).to(device)
+
+ with torch.no_grad():
+     logits = model(x)
+     pred = logits.argmax(dim=-1)
+     probs = F.softmax(logits, dim=-1)
+
+ print(f"Predicted class: {pred.item()}, confidence: {probs[0, pred].item():.2%}")
+ ```
+
+ ### With Real CLIP Features
+ ```python
+ # Requires: pip install transformers pillow
+ # Reuses `torch`, `model`, and `device` from the Inference example above.
+ from transformers import CLIPModel, CLIPProcessor
+ from PIL import Image
+
+ # Load CLIP
+ clip_model = CLIPModel.from_pretrained("openai/clip-vit-large-patch14").to(device).eval()
+ clip_processor = CLIPProcessor.from_pretrained("openai/clip-vit-large-patch14")
+
+ # Extract features
+ image = Image.open("your_image.jpg").convert("RGB")
+ inputs = clip_processor(images=image, return_tensors="pt").to(device)
+
+ with torch.no_grad():
+     vision_out = clip_model.vision_model(**inputs)
+     clip_features = clip_model.visual_projection(vision_out.pooler_output)
+
+ # Note: The model was trained on pre-extracted features with σ≈0.036
+ # You may need to match that distribution for optimal results
+ x = clip_features.view(1, 1, 24, 32)
+
+ with torch.no_grad():
+     logits = model(x)
+     pred = logits.argmax(dim=-1)
+ ```
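
The note about σ≈0.036 does not say how the distributions should be matched. One possible reading, sketched below in plain Python, is to rescale each feature vector around its own mean so that its standard deviation hits the target. This is an assumption, not the documented preprocessing: `match_std` is a hypothetical helper, and whether σ refers to a per-vector or a dataset-wide statistic is not stated in this card.

```python
import statistics


def match_std(features, target_std=0.036):
    """Rescale values around their mean so the population std equals target_std."""
    mean = statistics.fmean(features)
    std = statistics.pstdev(features)
    if std == 0.0:
        return list(features)  # constant input: nothing to rescale
    k = target_std / std
    return [mean + (v - mean) * k for v in features]


example = [0.0, 1.0, 2.0, 3.0]  # stand-in for a 768-dim feature vector
rescaled = match_std(example)
print(round(statistics.pstdev(rescaled), 4))
```

With tensors, the same idea is a one-line rescale of `clip_features` before the `view(1, 1, 24, 32)` call; comparing accuracy with and without it on a few labeled images would show whether it helps.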
+
+ ## Training Details
+
+ - **Dataset**: ImageNet-1K via pre-extracted CLIP-ViT-L14 features
+ - **Input**: 768-dim CLIP features reshaped to [1, 24, 32]
+ - **Epochs**: 50
+ - **Optimizer**: AdamW (lr=1e-3, weight_decay=0.05)
+ - **Scheduler**: CosineAnnealingLR
+ - **Batch Size**: 256
+ - **Parameters**: 1.74M
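
For reference, the learning-rate curve implied by these settings can be sketched with the closed form of cosine annealing. The base rate `1e-3` comes from the list above; `t_max = 50` (one step per epoch over all 50 epochs) and `eta_min = 0.0` are assumptions, since the card does not state the scheduler arguments.

```python
import math


def cosine_lr(epoch, base_lr=1e-3, t_max=50, eta_min=0.0):
    """Closed-form cosine annealing: eta_min + (base - eta_min) * (1 + cos(pi * t / T)) / 2."""
    return eta_min + 0.5 * (base_lr - eta_min) * (1.0 + math.cos(math.pi * epoch / t_max))


schedule = [cosine_lr(epoch) for epoch in range(51)]
print(schedule[0], schedule[25], schedule[50])
```

Under these assumptions the rate starts at `1e-3`, passes through roughly half that value at epoch 25, and reaches zero at epoch 50.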
+
+ ## Architecture Details
+ ```
+ Input: [1, 24, 32] (768 = 24 × 32)
+ ├── Stem: Conv2d(1→64) + BN
+ ├── Stage 0: 2× MobiusConvBlock(64) → [64, 24, 32]
+ ├── Downsample: Conv2d(64→128, stride=2)
+ ├── Stage 1: 2× MobiusConvBlock(128) → [128, 12, 16]
+ ├── Downsample: Conv2d(128→256, stride=2)
+ ├── Stage 2: 2× MobiusConvBlock(256) → [256, 6, 8]
+ ├── Integrator: Conv2d + BN + GELU
+ ├── AdaptiveAvgPool2d(1)
+ └── Linear(256→1000)
+ ```
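
The stage shapes in the diagram follow from the standard convolution output-size formula, `floor((size + 2·padding - kernel) / stride) + 1`, with the 3×3 kernels and `padding=1` used throughout the code above. The sketch below re-derives them; `conv_out` and `trace_shapes` are hypothetical helper names.

```python
def conv_out(size, kernel=3, stride=1, padding=1):
    """Output length of a convolution along one spatial axis."""
    return (size + 2 * padding - kernel) // stride + 1


def trace_shapes(height=24, width=32, channels=(64, 128, 256)):
    """Return the [C, H, W] shape after each stage of the network."""
    shapes = []
    h, w = conv_out(height), conv_out(width)  # stem: 3x3, stride 1, size preserved
    for i, c in enumerate(channels):
        shapes.append((c, h, w))  # the blocks inside a stage keep H and W
        if i < len(channels) - 1:
            # downsample between stages: 3x3, stride 2
            h, w = conv_out(h, stride=2), conv_out(w, stride=2)
    return shapes


print(trace_shapes())  # [(64, 24, 32), (128, 12, 16), (256, 6, 8)]
```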
+
+ ## Key Insights
+
+ 1. **Progressive Sharpening**: α rises overall with depth, though not monotonically (from a low of 0.22 at S0B1 to 5.22 at S2B1), creating increasingly selective filters
+ 2. **XOR Logic Emergence**: Final block learns xor_weight=0.99, implementing near-pure XOR gating
+ 3. **LayerNorm Amplification**: Tiny wave differences (σ≈0.02) get rescaled to meaningful gate distributions
+ 4. **Sparse Resonance**: High α creates winner-take-all dynamics where only resonant channels activate
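
The LayerNorm amplification in point 3 can be illustrated with a few made-up numbers. The sketch below normalizes a vector whose spread is about 0.02 and then applies a sigmoid, as the gate does. The input values are invented for illustration, the learnable scale and bias of `nn.LayerNorm` are omitted, and `eps = 1e-5` is assumed to match the PyTorch default.

```python
import math
import statistics


def layer_norm(values, eps=1e-5):
    """Normalize to zero mean and (near) unit variance, without learnable affine terms."""
    mean = statistics.fmean(values)
    var = statistics.pvariance(values)
    return [(v - mean) / math.sqrt(var + eps) for v in values]


raw = [0.50, 0.52, 0.48, 0.51, 0.49, 0.53, 0.47, 0.50]  # invented, spread of about 0.02
normed = layer_norm(raw)
gates = [1.0 / (1.0 + math.exp(-v)) for v in normed]

print(round(statistics.pstdev(raw), 4), round(statistics.pstdev(normed), 4))
print(round(min(gates), 3), round(max(gates), 3))
```

Differences that are nearly invisible before normalization end up spanning a wide range of gate values after it, which is the effect the insight describes.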
+
+ ## Citation
+ ```bibtex
+ @misc{mobiusnet2026,
+   author = {AbstractPhil},
+   title = {MobiusNet: Wave Interference Lenses for Geometric Deep Learning},
+   year = {2026},
+   publisher = {HuggingFace},
+   url = {https://huggingface.co/AbstractPhil/mobiusnet-distillations}
+ }
+ ```
+
+ ## License
+
+ Apache 2.0
+ '''
+
+ # Save to file (UTF-8, since the card contains non-ASCII symbols)
+ with open("README.md", "w", encoding="utf-8") as f:
+     f.write(readme_content)
+
+ print("README.md created!")
+ print(f"\nLength: {len(readme_content)} chars")
+ print("\nPreview (first 2000 chars):")
+ print("="*60)
+ print(readme_content[:2000])