|
|
--- |
|
|
license: mit |
|
|
tags: |
|
|
- geometric-deep-learning |
|
|
- diffusion |
|
|
- stable-diffusion |
|
|
- projective-geometry |
|
|
- multi-expert |
|
|
- classification |
|
|
library_name: pytorch |
|
|
--- |
|
|
|
|
|
# GeoDavidCollective Enhanced - ProjectiveHead Architecture |
|
|
|
|
|
**Another training run of the same GeoFractalDavid, with more condensed dimensions.**
|
|
|
|
|
Roughly 600,000 samples over the first 20 epochs: 10k prompts per epoch for epochs 0-10 at complexity 1-5, then 50k synthetic prompts per epoch for epochs 11-20 at reduced complexity 1-4. After that, 50,000 prompts per epoch for an additional 20 epochs.
|
|
|
|
|
In total, approximately 1.6 million samples, each containing a massive set of features extracted from the entire structure of SD1.5: roughly 2.7 million features per sample, according to the extraction formulas.
|
|
|
|
|
The training curves suggest she has probably peaked, leaving this experiment to be prodded and poked at now.
|
|
|
|
|
So this essentially means the model accumulated knowledge of roughly 4,320,000,000,000 SD1.5 features overall (1.6M samples x 2.7M features each). The bulk samples saved suggest that this is most likely true, but it sounds wild when the numbers are lined up.
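That headline figure is just the product of the two estimates above, which is easy to sanity-check:

```python
samples = 1_600_000              # total samples seen across all epochs
features_per_sample = 2_700_000  # approx. features extracted from SD1.5 per sample

total = samples * features_per_sample
print(f"{total:,}")  # 4,320,000,000,000
```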
|
|
|
|
|
Additionally, it retained enough knowledge to keep its accuracy score above zero, and even to produce cohesive head results with accuracy above 25%.
|
|
|
|
|
I can safely say that this model can definitely see a piece of the whole diffusion system that SD1.5 is responsible for, but not the whole picture.
|
|
|
|
|
|
|
|
## 🔬 Training Details
|
|
|
|
|
- **Optimizer**: AdamW (lr=1e-3, weight_decay=0.001) |
|
|
- **Batch Size**: 16 |
|
|
- **Data**: Symbolic prompt synthesis (complexity 1-5) |
|
|
- **Feature Extraction**: SD1.5 UNet blocks (spatial, not pooled) |
|
|
- **Pool Mode**: Mean spatial pooling |
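The mean spatial pooling mode above can be sketched as a simple average over the spatial axes of a UNet feature map; this is a minimal illustration, not the exact extraction code:

```python
import torch

def mean_spatial_pool(feature_map: torch.Tensor) -> torch.Tensor:
    """Collapse a (B, C, H, W) UNet feature map to (B, C) by averaging
    over the two spatial dimensions."""
    return feature_map.mean(dim=(2, 3))

# Example: a down_0-sized feature map (320 channels, illustrative 64x64 resolution)
features = torch.randn(2, 320, 64, 64)
pooled = mean_spatial_pool(features)
print(pooled.shape)  # torch.Size([2, 320])
```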
|
|
|
|
|
## 📊 Training Metrics
|
|
|
|
|
Final metrics from epoch 40: |
|
|
- Cayley Loss: 0.1018 |
|
|
- Timestep Accuracy: 39.08% |
|
|
- Pattern Accuracy: 44.25% |
|
|
- Full Accuracy: 26.57% |
|
|
|
|
|
## 🎯 Model Overview
|
|
|
|
|
GeoDavidCollective Enhanced is a sophisticated multi-expert geometric classification system that learns from Stable Diffusion 1.5's internal representations. Using ProjectiveHead architecture with Cayley-Menger geometry, it achieves efficient pattern recognition across timestep and semantic spaces. |
|
|
|
|
|
### Key Features |
|
|
|
|
|
- **ProjectiveHead Multi-Expert Architecture**: Auto-configured expert systems per block |
|
|
- **Geometric Loss Functions**: Rose, Cayley-Menger, and Cantor coherence losses |
|
|
- **9-Block Processing**: Full SD1.5 UNet feature extraction (down, mid, up) |
|
|
- **Compact Yet Powerful**: 690,925,542 parameters |
|
|
- **100 Timestep Bins** x **10 Patterns** = 1000 semantic-temporal classes |
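One plausible encoding of the 100 x 10 label space flattens (timestep bin, pattern) into a single class index in [0, 999]; the helper names below are hypothetical, not taken from the repository:

```python
NUM_TIMESTEP_BINS = 100
NUM_PATTERNS_PER_BIN = 10  # 100 bins x 10 patterns = 1000 classes

def to_full_class(timestep_bin: int, pattern: int) -> int:
    # Flatten (bin, pattern) into a single class index.
    return timestep_bin * NUM_PATTERNS_PER_BIN + pattern

def from_full_class(full_class: int) -> tuple:
    # Recover (bin, pattern) from the flat index.
    return divmod(full_class, NUM_PATTERNS_PER_BIN)

print(to_full_class(42, 7))    # 427
print(from_full_class(427))    # (42, 7)
```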
|
|
|
|
|
## 📈 Model Statistics
|
|
|
|
|
- **Parameters**: 690,925,542 |
|
|
- **Trained Epochs**: 40
|
|
- **Base Model**: Stable Diffusion 1.5 |
|
|
- **Dataset Size**: 700,000 synthetic prompts |
|
|
- **Training Date**: 2025-10-28 |
|
|
|
|
|
## 🏗️ Architecture Details
|
|
|
|
|
### Block Configuration |
|
|
``` |
|
|
Down Blocks:
- down_0: 320 → 64 (3 experts, 3 gates)
- down_1: 640 → 96 (3 experts, 3 gates)
- down_2: 1280 → 128 (3 experts, 3 gates)
- down_3: 1280 → 128 (3 experts, 3 gates)

Mid Block (Highest Capacity):
- mid: 1280 → 256 (4 experts, 4 gates)

Up Blocks:
- up_0: 1280 → 128 (3 experts, 3 gates)
- up_1: 1280 → 128 (3 experts, 3 gates)
- up_2: 640 → 96 (3 experts, 3 gates)
- up_3: 320 → 64 (3 experts, 3 gates)
|
|
``` |
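The block table above could be expressed as a `block_configs` dict along these lines; the field names here are an assumption for illustration, and the exact schema expected by `GeoDavidCollective` should be taken from `config.json`:

```python
# Hypothetical block_configs mirroring the architecture table; field names
# are assumed, not confirmed against the repository's config.json.
block_configs = {
    "down_0": {"in_dim": 320,  "out_dim": 64,  "num_experts": 3, "num_gates": 3},
    "down_1": {"in_dim": 640,  "out_dim": 96,  "num_experts": 3, "num_gates": 3},
    "down_2": {"in_dim": 1280, "out_dim": 128, "num_experts": 3, "num_gates": 3},
    "down_3": {"in_dim": 1280, "out_dim": 128, "num_experts": 3, "num_gates": 3},
    "mid":    {"in_dim": 1280, "out_dim": 256, "num_experts": 4, "num_gates": 4},
    "up_0":   {"in_dim": 1280, "out_dim": 128, "num_experts": 3, "num_gates": 3},
    "up_1":   {"in_dim": 1280, "out_dim": 128, "num_experts": 3, "num_gates": 3},
    "up_2":   {"in_dim": 640,  "out_dim": 96,  "num_experts": 3, "num_gates": 3},
    "up_3":   {"in_dim": 320,  "out_dim": 64,  "num_experts": 3, "num_gates": 3},
}

print(len(block_configs))  # 9
```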
|
|
|
|
|
### Loss Components |
|
|
|
|
|
| Component | Weight | Purpose | |
|
|
|-----------|--------|---------| |
|
|
| Feature Similarity | 0.50 | Alignment with SD1.5 features | |
|
|
| Rose Loss | 0.25 | Geometric pattern emergence | |
|
|
| Cross-Entropy | 0.15 | Classification accuracy | |
|
|
| Cayley-Menger | 0.10 | 5D geometric structure | |
|
|
| Pattern Diversity | 0.05 | Prevent mode collapse | |
|
|
| Cantor Coherence | 0.05 | Temporal consistency | |
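The weighted combination implied by the table can be sketched as follows. The individual loss terms are placeholders here; the real Rose, Cayley-Menger, and Cantor coherence implementations live in the training code:

```python
# Weights copied from the loss-components table (they sum to 1.10).
LOSS_WEIGHTS = {
    "feature_similarity": 0.50,
    "rose": 0.25,
    "cross_entropy": 0.15,
    "cayley_menger": 0.10,
    "pattern_diversity": 0.05,
    "cantor_coherence": 0.05,
}

def total_loss(components: dict) -> float:
    # Weighted sum of the individual loss terms.
    return sum(LOSS_WEIGHTS[name] * value for name, value in components.items())

# Example with dummy per-component values of 1.0 each:
dummy = {name: 1.0 for name in LOSS_WEIGHTS}
print(round(total_loss(dummy), 2))  # 1.1
```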
|
|
|
|
|
## 💻 Usage
|
|
```python |
|
|
from geovocab2.train.model.core.geo_david_collective import GeoDavidCollective |
|
|
from safetensors.torch import load_file |
|
|
import torch |
|
|
|
|
|
# Load model |
|
|
state_dict = load_file("model.safetensors") |
|
|
collective = GeoDavidCollective( |
|
|
block_configs={...}, # See config.json |
|
|
num_timestep_bins=100, |
|
|
num_patterns_per_bin=10 |
|
|
) |
|
|
collective.load_state_dict(state_dict) |
|
|
collective.eval() |
|
|
|
|
|
# Extract features from SD1.5 and classify |
|
|
with torch.no_grad(): |
|
|
results = collective(features_dict, timesteps) |
|
|
predictions = results['predictions'] # Timestep + pattern class |
|
|
``` |
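The snippet above assumes a `features_dict` keyed by block name. A minimal sketch of its expected shape, using dummy tensors with the channel counts from the architecture table (spatial sizes are illustrative; real features would come from SD1.5 UNet forward hooks):

```python
import torch

# Channel dims per block, per the architecture table; 8x8 spatial size is
# illustrative only (real sizes depend on the latent resolution).
BLOCK_CHANNELS = {
    "down_0": 320, "down_1": 640, "down_2": 1280, "down_3": 1280,
    "mid": 1280,
    "up_0": 1280, "up_1": 1280, "up_2": 640, "up_3": 320,
}

batch = 2
features_dict = {
    name: torch.randn(batch, channels, 8, 8)
    for name, channels in BLOCK_CHANNELS.items()
}
timesteps = torch.randint(0, 1000, (batch,))

print(sorted(features_dict))       # nine block keys
print(features_dict["mid"].shape)  # torch.Size([2, 1280, 8, 8])
```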
|
|
|
|
|
|
|
|
## 📚 Research Context
|
|
|
|
|
This model is part of the geometric deep learning research exploring: |
|
|
- 5D simplex-based neural representations (pentachora) |
|
|
- Geometric alternatives to traditional transformers |
|
|
- Consciousness-informed AI architectures |
|
|
- Universal mathematical principles in neural networks |
|
|
|
|
|
## 📦 Files Included
|
|
|
|
|
- `model.safetensors` - Model weights (3.3GB) |
|
|
- `config.json` - Complete architecture configuration |
|
|
- `training_history.json` - Full training metrics |
|
|
- `prompts_enhanced.jsonl` - All training prompts with metadata |
|
|
- `tensorboard/` - TensorBoard logs (optional) |
|
|
|
|
|
## 🔗 Related Work
|
|
|
|
|
- [Geometric Vocabulary System](https://huggingface.co/datasets/AbstractPhil/geometric-vocab-frozen-v1) |
|
|
- [PentachoraViT](https://huggingface.co/AbstractPhil/pentachora-vit-cifar100) |
|
|
- [Crystal-Beeper Language Models](https://huggingface.co/AbstractPhil) |
|
|
|
|
|
## 📄 License
|
|
|
|
|
MIT License - Free for research and commercial use |
|
|
|
|
|
## 🙏 Acknowledgments
|
|
|
|
|
Built with: |
|
|
- PyTorch & Diffusers |
|
|
- Stable Diffusion 1.5 (Runway ML) |
|
|
- 19th-century geometric algebra principles
|
|
- Dream-inspired mathematical insights |
|
|
|
|
|
## 👤 Author
|
|
|
|
|
**AbstractPhil** - AI Researcher specializing in geometric deep learning |
|
|
|
|
|
*"Working with universal mathematical principles, not against them"* |
|
|
|
|
|
--- |
|
|
|
|
|
For questions, issues, or collaborations: [GitHub](https://github.com/AbstractEyes) | [HuggingFace](https://huggingface.co/AbstractPhil) |
|
|
|