---
license: cc-by-nc-4.0
---
|
|
|
|
|
# ComplexityDiT - Diffusion Transformer with INL Dynamics
|
|
|
|
|
A Diffusion Transformer (DiT) enhanced with PID-style dynamics control for smoother denoising.
|
|
|
|
|
## Architecture
|
|
|
|
|
```
Input -> [Attention -> MLP -> Dynamics] x 12 -> Output
```
|
|
|
|
|
**Core equations:**

- Attention: `softmax(QK^T / sqrt(d)) * V`
- MLP: `W2 * GELU(W1 * x)`
- Dynamics: `h += dt * gate * (alpha*v - beta*(h - mu))`
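As a rough sketch, the three block components can be written in NumPy. Toy sizes, random weights, and the `h`/`v` wiring here are illustrative assumptions, not the actual ComplexityDiT code:

```python
import numpy as np

rng = np.random.default_rng(0)
d = 8                             # toy hidden dim
x = rng.standard_normal((4, d))   # 4 tokens

def softmax(z, axis=-1):
    z = z - z.max(axis=axis, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

def gelu(z):
    # tanh approximation of GELU
    return 0.5 * z * (1 + np.tanh(np.sqrt(2 / np.pi) * (z + 0.044715 * z**3)))

# Attention: softmax(QK^T / sqrt(d)) * V  (single head for brevity)
Wq, Wk, Wv = (rng.standard_normal((d, d)) * 0.1 for _ in range(3))
Q, K, V = x @ Wq, x @ Wk, x @ Wv
attn = softmax(Q @ K.T / np.sqrt(d)) @ V

# MLP: W2 * GELU(W1 * x)
W1 = rng.standard_normal((d, 4 * d)) * 0.1
W2 = rng.standard_normal((4 * d, d)) * 0.1
mlp = gelu(attn @ W1) @ W2

# Dynamics: h += dt * gate * (alpha*v - beta*(h - mu))
h, v = attn, mlp                  # treat the MLP output as the drive term v
dt, gate, alpha, beta, mu = 0.1, 1.0, 0.9, 0.5, 0.0
h = h + dt * gate * (alpha * v - beta * (h - mu))
print(h.shape)
```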
|
|
|
|
|
## Model Details

| Parameter | Value |
|-----------|-------|
| Architecture | ComplexityDiT-S |
| Parameters | 114M |
| Layers | 12 |
| Hidden dim | 384 |
| Heads | 6 |
| Experts | 4 |
| Dynamics | Enabled |
|
|
|
|
|
## Training

- Dataset: huggan/wikiart
- Steps: 20,000
- Batch size: 16
- Mixed precision: FP16
|
|
|
|
|
## Usage

```python
from safetensors.torch import load_file

from complexity_diffusion import ComplexityDiT

# Load model
model = ComplexityDiT.from_config('S', context_dim=768)
state_dict = load_file('model.safetensors')
model.load_state_dict(state_dict)
```
|
|
|
|
|
## INL Dynamics

The dynamics layer applies a control-theoretic update to stabilize the denoising trajectory. Its parameters are:

- `mu` - learnable equilibrium (target position)
- `alpha` - inertia (momentum)
- `beta` - correction strength (spring constant)
- `gate` - amplitude control

Together these act like a PID controller, producing smooth, stable trajectories that guide the model toward clean images.
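A toy scalar example illustrates this behavior. The constants below are chosen for illustration only; in the model, `mu`, `alpha`, `beta`, and `gate` are learned per layer:

```python
# Repeatedly apply the dynamics update to a 1-D state with no drive signal
# (v = 0); the spring term -beta*(h - mu) pulls h toward the equilibrium mu.
dt, gate, alpha, beta = 0.1, 1.0, 0.9, 0.5
mu = 2.0          # equilibrium the state is pulled toward
h, v = 10.0, 0.0  # start far from equilibrium

for _ in range(200):
    h += dt * gate * (alpha * v - beta * (h - mu))

print(h)  # h has decayed toward mu
```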
|
|
|
|
|
## Links

- [GitHub](https://github.com/Complexity-ML/complexity-framework)
- [PyPI](https://pypi.org/project/complexity-framework/)