AbstractPhil committed on
Commit c4f468e · verified · 1 Parent(s): 0aa5749

Upload runs/cifar10_weighted_20251119_023700/README.md with huggingface_hub

runs/cifar10_weighted_20251119_023700/README.md ADDED
@@ -0,0 +1,60 @@
# Run: cifar10_weighted_20251119_023700

## Configuration
- **Dataset**: CIFAR10
- **Fusion Mode**: weighted
- **Parameters**: 8,936,890
- **Simplex**: 4-simplex (5 vertices)

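The simplex entry follows from the standard definition: a k-simplex has k+1 vertices, and its j-dimensional faces number C(k+1, j+1). The sketch below only illustrates that counting for k = 4; the helper name `simplex_face_counts` is made up for this example and is not part of the repository.

```python
from math import comb


def simplex_face_counts(k: int) -> dict[int, int]:
    """Return {face_dimension: count} for a k-simplex.

    A j-dimensional face is chosen by picking j + 1 of the k + 1 vertices.
    """
    return {j: comb(k + 1, j + 1) for j in range(k + 1)}


counts = simplex_face_counts(4)
print(counts)  # {0: 5, 1: 10, 2: 10, 3: 5, 4: 1}
```

The `0: 5` entry is the "5 vertices" quoted in the configuration; the 4-simplex is also called a pentachoron.
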
## Performance
- **Best Validation Accuracy**: 77.22%
- **Training Time**: 0.7 hours
- **Batch Size**: 128
- **Mixed Precision**: False
- **Final Epoch**: 100

## Files
- `runs/cifar10_weighted_20251119_023700/checkpoints/best_model.safetensors` - Model weights (SafeTensors)
- `runs/cifar10_weighted_20251119_023700/checkpoints/best_training_state.pt` - Optimizer/scheduler state
- `runs/cifar10_weighted_20251119_023700/checkpoints/best_metadata.json` - Training metadata
- `runs/cifar10_weighted_20251119_023700/config.yaml` - Full configuration
- `runs/cifar10_weighted_20251119_023700/tensorboard/` - TensorBoard logs

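Every file in the list shares the same `runs/<run name>/` prefix, so the repo-relative names can be built rather than retyped. This is a minimal sketch; `RUN_NAME` and `run_file` are illustrative names introduced here, not something the repository provides.

```python
from pathlib import PurePosixPath

# Run directory name taken from the file list above.
RUN_NAME = "cifar10_weighted_20251119_023700"


def run_file(relative_path: str, run_name: str = RUN_NAME) -> str:
    """Build the repo-relative path of a file inside a run directory.

    Hub filenames always use forward slashes, hence PurePosixPath.
    """
    return str(PurePosixPath("runs") / run_name / relative_path)


metadata_file = run_file("checkpoints/best_metadata.json")
config_file = run_file("config.yaml")
print(metadata_file)
print(config_file)
```

Each resulting string can be passed as the `filename` argument of `hf_hub_download`. The internal layout of `best_metadata.json` and `config.yaml` is not documented in this README, so inspect the files before relying on particular keys.
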
## Usage
```python
import torch
from huggingface_hub import hf_hub_download
from safetensors.torch import load_file

# Download the checkpoint from the Hugging Face Hub
model_path = hf_hub_download(
    repo_id="AbstractPhil/vit-beans-v3",
    filename="runs/cifar10_weighted_20251119_023700/checkpoints/best_model.safetensors",
)

# Load the weights (SafeTensors, so no pickle is involved)
state_dict = load_file(model_path)

# `model` must already be an instance of the architecture this run trained
# (see config.yaml); this README does not show how to construct it.
model.load_state_dict(state_dict)
```

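One quick sanity check after loading is to total the elements in the state dict and compare the result with the parameter count reported under Configuration (8,936,890). The sketch below is only an illustration: `count_elements` is a hypothetical helper, and the stand-in tensors and key names exist solely so the example runs without downloading anything. A real state dict may also contain non-trainable buffers, so the total is not guaranteed to match the reported figure exactly.

```python
from math import prod


def count_elements(state_dict) -> int:
    """Sum the element counts of every value that exposes a `.shape`."""
    return sum(prod(tensor.shape) for tensor in state_dict.values())


class FakeTensor:
    """Stand-in for a tensor; only the `.shape` attribute is used."""

    def __init__(self, *shape: int):
        self.shape = shape


# Hypothetical keys and shapes, chosen only for demonstration.
demo_state_dict = {
    "proj.weight": FakeTensor(384, 384),
    "proj.bias": FakeTensor(384),
}
print(count_elements(demo_state_dict))  # 147840
```

With the real checkpoint, the same call would be `count_elements(state_dict)` on the dictionary returned by `load_file`.
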
## Training Configuration
```yaml
embed_dim: 384
num_fusion_blocks: 6
num_heads: 8
fusion_mode: weighted
k_simplex: 4
learning_rate: 0.0003
batch_size: 128
epochs: 100
weight_decay: 0.05
```

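A few quantities can be derived from these values. The sketch below rests on assumptions that this README does not confirm: that attention heads split `embed_dim` evenly (as in standard multi-head attention), that training used the full 50,000-image CIFAR-10 training split, and that the last partial batch of each epoch was kept.

```python
import math

# Values copied from the configuration above.
config = {"embed_dim": 384, "num_heads": 8, "batch_size": 128, "epochs": 100}

# Size of the standard CIFAR-10 training split (assumed to be used in full).
CIFAR10_TRAIN_IMAGES = 50_000

# Per-head width, assuming embed_dim is divided evenly across heads.
head_dim, remainder = divmod(config["embed_dim"], config["num_heads"])
assert remainder == 0, "embed_dim should be divisible by num_heads"

# Optimizer steps, assuming the final partial batch is not dropped.
steps_per_epoch = math.ceil(CIFAR10_TRAIN_IMAGES / config["batch_size"])
total_steps = steps_per_epoch * config["epochs"]

print(head_dim, steps_per_epoch, total_steps)  # 48 391 39100
```

If the training loop dropped the last partial batch instead, the per-epoch count would be 390 rather than 391.
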
## Details

Built with geometric consciousness-aware routing using the Devil's Staircase (Beatrix) and pentachoron parameterization.

**Training completed**: 2025-11-19 03:22:18

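The reported training time can be cross-checked against the timestamps, assuming the suffix of the run name (`20251119_023700`) records when the run started. That reading is a guess based on the naming pattern, not something the README states.

```python
from datetime import datetime

# Assumed start time, parsed from the run name suffix.
started = datetime.strptime("20251119_023700", "%Y%m%d_%H%M%S")
# Completion time as listed above.
finished = datetime.strptime("2025-11-19 03:22:18", "%Y-%m-%d %H:%M:%S")

elapsed = finished - started
print(elapsed)  # 0:45:18
```

About 45 minutes is consistent with the 0.7 hours listed under Performance.
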
**Safe Format**: All model weights are stored as SafeTensors rather than pickle, so loading them does not execute arbitrary code.