Update beatrix-trainC-chaos-native (Epoch 0, Acc: 0.0253) - chaos_native

Files changed (3)

README.md +5 -73
weights/beatrix-trainC-chaos-native/20251008_163456_chaos_native/config.json +81 -0
weights/beatrix-trainC-chaos-native/20251008_163456_chaos_native/model.safetensors +3 -0

README.md CHANGED

@@ -10,34 +10,10 @@ license: mit
 # ViT-Beatrix Dual-Stream Family
-This repository contains the **Beatrix family** of dual-stream vision transformers with preserved geometric features.
-## Current Experiment: beatrix-trainB-workshop
-**Model Path**: `weights/beatrix-trainB-workshop/20251008_152906_div2_gentle_nomixup/`
-## Training Lineage
-- **Origin Checkpoint**: `20251008_131339`
-- **Origin Epoch**: 25
-- **Divergence Point**: div2_gentle_nomixup
-- **Experiment Name**: beatrix-trainB-workshop
-- **Training Philosophy**: Gentle Guidance (5% threshold, 5-epoch cooldown, no Mixup)
-This model was branched from a previous training run to explore different augmentation strategies.
-## Key Innovation: Dual Processing Streams + Geometric Compatibility
-Unlike standard ViTs that destroy geometric features after injection, this architecture maintains **two parallel processing streams**:
-1. **Visual Stream** (512D): Processes patch tokens
-2. **Geometric Stream** (256D): Evolves 8 geometric tokens
-The streams cross-communicate via attention without homogenizing features.
-**Important:** This model uses discrete geometric simplex structures and is **incompatible with Mixup augmentation** (label interpolation). CutMix is supported (spatial mixing with discrete labels).
 ## Architecture
@@ -47,56 +23,12 @@ The streams cross-communicate via attention without homogenizing features.
 - **Dual Blocks**: 8 layers
 - **k-simplex**: 4
-## Training Configuration
-- **Experiment**: beatrix-trainB-workshop
-- **Overfit Threshold**: 5.0%
-- **Augmentation Cooldown**: 5 epochs
-- **Min Accuracy for Augmentation**: 45.0%
-- **Mixup**: Disabled (geometric incompatibility)
 ## Performance
-- **Best Accuracy**: 0.5139
-- **Current Epoch**: 52
 - **Dataset**: CIFAR-100
-## Usage
-```python
-from geovocab2.train.model.vit_beatrix_dualstream import DualStreamGeometricClassifier
-from safetensors.torch import load_file
-from huggingface_hub import hf_hub_download
-# Download specific experiment
-model_path = hf_hub_download(
-    repo_id="AbstractPhil/vit-beatrix-dualstream",
-    filename="weights/beatrix-trainB-workshop/20251008_152906_div2_gentle_nomixup/model.safetensors"
-)
-# Load model
-model = DualStreamGeometricClassifier(
-    num_classes=100,
-    visual_dim=512,
-    geom_dim=256,
-    num_geom_tokens=8
-)
-state_dict = load_file(model_path)
-model.load_state_dict(state_dict)
-```
-## Citation
-```bibtex
-@misc{vit-beatrix-dualstream,
-  author = {AbstractPhil},
-  title = {ViT-Beatrix Dual-Stream: Preserved Geometric Features},
-  year = {2025},
-  note = {Experiment: beatrix-trainB-workshop}
-}
-```
 ---
-*Last updated: Epoch 52 | Best Accuracy: 0.5139*

 # ViT-Beatrix Dual-Stream Family
+## Current Experiment: beatrix-trainC-chaos-native
+**Model Path**: `weights/beatrix-trainC-chaos-native/20251008_163456_chaos_native/`
 ## Architecture
 - **Dual Blocks**: 8 layers
 - **k-simplex**: 4
 ## Performance
+- **Best Accuracy**: 0.0253
+- **Current Epoch**: 0
 - **Dataset**: CIFAR-100
 ---
+*Last updated: Epoch 0 | Best Accuracy: 0.0253*
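
The usage snippet removed from the README pointed at the trainB run. A minimal sketch of the same loading steps for the trainC run added here might look like the following. The helper names `run_file`, `model_kwargs`, and `load_run` are hypothetical, and the assumption that the four constructor arguments from the removed snippet are still sufficient has not been checked against the model code.

```python
REPO_ID = "AbstractPhil/vit-beatrix-dualstream"
RUN_DIR = "weights/beatrix-trainC-chaos-native/20251008_163456_chaos_native"

# Constructor arguments used in the README usage snippet removed by this commit.
# Assumption: the model class still accepts exactly these keyword arguments.
MODEL_KEYS = ("num_classes", "visual_dim", "geom_dim", "num_geom_tokens")


def run_file(name, run_dir=RUN_DIR):
    """Build the repo-relative path of a file inside a run directory."""
    return f"{run_dir}/{name}"


def model_kwargs(config):
    """Pick the constructor arguments out of a full training config dict."""
    return {key: config[key] for key in MODEL_KEYS}


def load_run(repo_id=REPO_ID, run_dir=RUN_DIR):
    """Download config.json and model.safetensors, return (kwargs, state_dict).

    Imports are local so the path helpers above work without these packages.
    """
    import json

    from huggingface_hub import hf_hub_download
    from safetensors.torch import load_file

    config_path = hf_hub_download(repo_id=repo_id, filename=run_file("config.json", run_dir))
    weights_path = hf_hub_download(repo_id=repo_id, filename=run_file("model.safetensors", run_dir))
    with open(config_path) as fh:
        config = json.load(fh)
    return model_kwargs(config), load_file(weights_path)
```

As in the removed snippet, the returned keyword arguments would be passed to `DualStreamGeometricClassifier` from `geovocab2.train.model.vit_beatrix_dualstream`, followed by `model.load_state_dict(state_dict)`. Note that this checkpoint is from epoch 0 (accuracy 0.0253), so it is useful mainly for checking that loading works.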

weights/beatrix-trainC-chaos-native/20251008_163456_chaos_native/config.json ADDED

@@ -0,0 +1,81 @@

+{
+  "num_classes": 100,
+  "img_size": 32,
+  "patch_size": 4,
+  "visual_dim": 512,
+  "geom_dim": 256,
+  "k_simplex": 4,
+  "depth": 8,
+  "num_heads": 8,
+  "mlp_ratio": 4.0,
+  "dropout": 0.0,
+  "num_geom_tokens": 8,
+  "pe_levels": 12,
+  "pe_features_per_level": 2,
+  "pe_smooth_tau": 0.25,
+  "simplex_init_method": "regular",
+  "simplex_init_scale": 1.0,
+  "batch_size": 512,
+  "num_epochs": 150,
+  "learning_rate": 0.0001,
+  "weight_decay": 0.005,
+  "warmup_epochs": 10,
+  "task_loss_weight": 0.5,
+  "flow_loss_weight": 1.5,
+  "coherence_loss_weight": 0.5,
+  "multiscale_loss_weight": 0.3,
+  "use_adaptive_augmentation": false,
+  "overfit_threshold": 0.05,
+  "augmentation_cooldown_epochs": 5,
+  "min_accuracy_for_augmentation": 0.45,
+  "mixup_alpha": 0.2,
+  "cutmix_alpha": 1.0,
+  "use_cutmix_schedule": true,
+  "cutmix_schedule": [
+    [
+      0,
+      0.2
+    ],
+    [
+      20,
+      0.5
+    ],
+    [
+      40,
+      1.0
+    ],
+    [
+      60,
+      1.2
+    ],
+    [
+      80,
+      1.5
+    ],
+    [
+      100,
+      1.8
+    ],
+    [
+      120,
+      2.0
+    ]
+  ],
+  "device": "cuda",
+  "num_workers": 4,
+  "pin_memory": true,
+  "save_dir": "./checkpoints_dualstream",
+  "save_every": 10,
+  "use_safetensors": true,
+  "timestamp_dirs": true,
+  "push_to_hub": true,
+  "hub_model_id": "AbstractPhil/vit-beatrix-dualstream",
+  "hub_model_name": "beatrix-trainC-chaos-native",
+  "hub_upload_best_only": true,
+  "hub_upload_every_n_epochs": 10,
+  "use_tensorboard": true,
+  "log_dir": "./logs_dualstream",
+  "log_every": 50,
+  "monitor_stream_health": true,
+  "log_stream_norms": true
+}
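
The `cutmix_schedule` entries above read naturally as `[start_epoch, alpha]` pairs. The training loop is not part of this commit, so how the schedule is consumed is unknown. The sketch below assumes step-wise semantics (the alpha of the latest entry whose start epoch has been reached); the real code may interpolate instead. `cutmix_alpha_at` is a hypothetical helper name.

```python
from bisect import bisect_right

# Copied from config.json in this commit: [start_epoch, alpha] pairs.
CUTMIX_SCHEDULE = [
    [0, 0.2], [20, 0.5], [40, 1.0], [60, 1.2],
    [80, 1.5], [100, 1.8], [120, 2.0],
]


def cutmix_alpha_at(epoch, schedule=CUTMIX_SCHEDULE):
    """Return the alpha of the last entry whose start epoch is <= epoch.

    Assumption: step-wise lookup, schedule sorted by start epoch.
    """
    starts = [start for start, _ in schedule]
    idx = bisect_right(starts, epoch) - 1
    if idx < 0:
        raise ValueError(f"epoch {epoch} precedes the first schedule entry")
    return schedule[idx][1]


if __name__ == "__main__":
    for epoch in (0, 19, 20, 75, 149):
        print(epoch, cutmix_alpha_at(epoch))
```

Under this reading, alpha stays at 0.2 for the first 20 epochs and reaches 2.0 from epoch 120 until the run ends at `num_epochs` = 150.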

weights/beatrix-trainC-chaos-native/20251008_163456_chaos_native/model.safetensors ADDED

@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:b0fcd146ef0fc7663306054dd537a49eb360642bf20ae85ad0e48e4cf5777049
+size 164567960
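
The three lines above are a Git LFS pointer, not the weights themselves: they record the SHA-256 digest and byte size of the real `model.safetensors`. As a rough sketch, a downloaded file could be checked against the pointer as follows. `parse_lfs_pointer` and `matches_pointer` are hypothetical helper names, and the local file path is whatever the download step produced.

```python
import hashlib
import os

# Pointer contents copied from this commit.
POINTER_TEXT = """version https://git-lfs.github.com/spec/v1
oid sha256:b0fcd146ef0fc7663306054dd537a49eb360642bf20ae85ad0e48e4cf5777049
size 164567960
"""


def parse_lfs_pointer(text):
    """Split 'key value' lines of an LFS pointer into typed fields."""
    fields = dict(line.split(" ", 1) for line in text.strip().splitlines())
    algo, digest = fields["oid"].split(":", 1)
    return {
        "version": fields["version"],
        "hash_algo": algo,
        "digest": digest,
        "size_bytes": int(fields["size"]),
    }


def matches_pointer(path, pointer, chunk_size=1 << 20):
    """Return True if the file at `path` has the pointer's size and digest."""
    if os.path.getsize(path) != pointer["size_bytes"]:
        return False
    hasher = hashlib.new(pointer["hash_algo"])
    with open(path, "rb") as fh:
        # Hash in chunks so a file of this size is not read into memory at once.
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            hasher.update(chunk)
    return hasher.hexdigest() == pointer["digest"]


pointer = parse_lfs_pointer(POINTER_TEXT)
size_mib = pointer["size_bytes"] / 2**20  # about 156.9 MiB
```

If every tensor were stored as 32-bit floats (an assumption; the dtype is not stated in this commit), the file size would bound the model at roughly 41 million parameters, slightly fewer once the safetensors header is accounted for.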