
STANNO: What It Is, What It Isn't

STANNO trains networks using direct weight modification, not backpropagation. It targets a few tasks where that approach is useful (anomaly detection, online learning, interpretability) and is not a replacement for PyTorch or TensorFlow.


STANNO Works Well For

1. Anomaly Detection & Filtering

Train on normal data, then score new inputs by reconstruction error. Works reliably in production.

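from stanno.integration.filter import STANNOFilter

# `stanno` is an already-constructed STANNO model; train it to reconstruct normal data
stanno.fit(normal_embeddings, normal_embeddings, epochs=50)
anomaly_filter = STANNOFilter(stanno)  # renamed to avoid shadowing the built-in `filter`
score = anomaly_filter.score(new_embedding)  # score in [0, 1]: 0 = normal, 1 = anomaly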
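
To make the scoring idea concrete, here is a minimal NumPy sketch of reconstruction-error anomaly scoring. It does not use STANNO: a PCA projection stands in for the trained network, and the helper names (`fit_pca`, `reconstruction_error`, `anomaly_score`) and the way the error is squashed into [0, 1] are assumptions made for illustration, not a description of how `STANNOFilter` works internally.

```python
import numpy as np

def fit_pca(normal, k):
    """Fit a rank-k linear 'autoencoder' (PCA) on normal data."""
    mean = normal.mean(axis=0)
    # Rows of vt are principal directions, ordered by explained variance.
    _, _, vt = np.linalg.svd(normal - mean, full_matrices=False)
    return mean, vt[:k]

def reconstruction_error(x, mean, components):
    """Mean squared error between x and its projection onto the subspace."""
    centered = x - mean
    recon = centered @ components.T @ components
    return float(np.mean((centered - recon) ** 2))

def anomaly_score(x, mean, components, scale):
    """Squash the error into [0, 1]: 0 = normal, values near 1 = anomalous."""
    err = reconstruction_error(x, mean, components)
    return 1.0 - float(np.exp(-err / scale))

rng = np.random.default_rng(0)
# Hypothetical "normal" data: 16-dim points lying near a 2-dim subspace.
basis = rng.normal(size=(2, 16))
normal = rng.normal(size=(500, 2)) @ basis + 0.01 * rng.normal(size=(500, 16))

mean, components = fit_pca(normal, k=2)
# Use a multiple of the typical training error as the squashing scale.
scale = 10 * np.mean([reconstruction_error(x, mean, components) for x in normal])

typical = normal[0]
outlier = rng.normal(size=16) * 3.0   # does not lie near the subspace
print(anomaly_score(typical, mean, components, scale))
print(anomaly_score(outlier, mean, components, scale))
```

The choice of `scale` controls how quickly the score saturates; in practice it would be calibrated on held-out normal data.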

2. Online / Continual Learning

Update weights one sample at a time with no batch accumulation. Fast and interpretable.

from stanno.integration.continual import ContinualSTANNO

# `stanno` is an existing STANNO model; `stream` yields (input, target) pairs
cont = ContinualSTANNO(stanno)
for x_i, y_i in stream:
    loss = cont.observe(x_i, y_i)  # single-sample update
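
A single-sample update loop of this kind can be sketched as follows. The sketch uses the classic least-mean-squares (delta) rule on a linear model. `OnlineLinear` and its `observe` method are hypothetical stand-ins that mirror the call shape above; they are not the actual `ContinualSTANNO` update rule.

```python
import numpy as np

class OnlineLinear:
    """Linear model updated one sample at a time with the LMS (delta) rule."""

    def __init__(self, n_in, lr=0.05):
        self.w = np.zeros(n_in)
        self.b = 0.0
        self.lr = lr

    def observe(self, x, y):
        """Update on a single (x, y) pair; return the pre-update squared error."""
        err = y - (self.w @ x + self.b)
        self.w += self.lr * err * x   # no batch accumulation: update immediately
        self.b += self.lr * err
        return float(err ** 2)

rng = np.random.default_rng(0)
true_w = np.array([2.0, -1.0, 0.5])
model = OnlineLinear(n_in=3)

losses = []
for _ in range(2000):
    x_i = rng.normal(size=3)
    y_i = true_w @ x_i + 1.0          # noiseless target with bias 1.0
    losses.append(model.observe(x_i, y_i))

# Early losses are large; late losses shrink as the weights settle.
print(np.mean(losses[:20]), np.mean(losses[-20:]))
```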

3. Interpretable Weight Modification

See exactly what the trainer does at each synapse: the weight deltas are explicit, not hidden inside autodiff.

dW, db = trainer.compute_updates(state)  # explicit weight changes
print(dW)  # actual numbers, not gradients
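
As a rough illustration of what "explicit weight changes" means, the sketch below computes `dW` and `db` by hand for a single tanh layer and inspects them before applying them. The function `delta_rule_updates` is hypothetical, and the rule is a textbook delta rule chosen for illustration; the formula used by STANNO's trainer is not given in this document.

```python
import numpy as np

def delta_rule_updates(W, b, x, target, lr=0.1):
    """Return explicit (dW, db) for one tanh layer using a local delta rule."""
    out = np.tanh(W @ x + b)
    # Output error scaled by the tanh slope at each output unit.
    delta = (target - out) * (1.0 - out ** 2)
    dW = lr * np.outer(delta, x)      # one entry per synapse (output i, input j)
    db = lr * delta
    return dW, db

rng = np.random.default_rng(1)
W = 0.1 * rng.normal(size=(2, 4))
b = np.zeros(2)
x = np.array([1.0, 0.0, -1.0, 0.5])
target = np.array([0.5, -0.5])

before = np.sum((target - np.tanh(W @ x + b)) ** 2)
dW, db = delta_rule_updates(W, b, x, target)
print(dW)                              # a plain 2x4 array of weight changes
i, j = np.unravel_index(np.argmax(np.abs(dW)), dW.shape)
print(f"largest change at synapse ({i}, {j})")

W, b = W + dW, b + db                  # apply the update explicitly
after = np.sum((target - np.tanh(W @ x + b)) ** 2)
print(before, after)
```

Because the second input is zero, the second column of `dW` is zero as well: synapses that carried no signal are left untouched, which is the kind of thing that is easy to read off an explicit delta.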

4. Multi-Stage Cascades

Chain multiple STANNOs into encoder-decoder pipelines or progressive compression networks, then train end-to-end with gradient flow across stage boundaries.

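# Assumes STANNO and STANNOConfig are exported from the top-level package alongside CascadeSTANNO
from stanno import STANNO, STANNOConfig, CascadeSTANNO

enc = STANNO(STANNOConfig(layers=[768, 256, 64]))  # encoder: 768 -> 64
dec = STANNO(STANNOConfig(layers=[64, 256, 768]))  # decoder: 64 -> 768

ae = CascadeSTANNO([enc, dec])
ae.fit(embeddings, embeddings, epochs=200)  # trains both stages end-to-end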

STANNO Does NOT Work Well For

Regression (General Function Fitting)

STANNO is not optimized for regression. If you train on sin(x), you'll get MAE ≈ 0.4–0.5. A standard neural network with Adam easily reaches MAE < 0.01.

Why? The fixed 4-module trainer applies the same update formula at every step. This works well for the tasks above, but not for learning arbitrary functions.

Better choice: Use PyTorch, TensorFlow, or scikit-learn.

Replacement for PyTorch/TensorFlow

STANNO intentionally avoids autodiff. If you need GPU acceleration, backpropagation, or access to a model zoo, use a standard framework.

# Bad idea
stanno = STANNO(...)  # slow NumPy, no GPU

# Good idea
torch.nn.Sequential(...)  # fast, GPU, backprop, pretrained weights

Standalone Image Generation

Alone, STANNO is just a small neural network. For image workflows, use the ComfyUI nodes, which integrate with Stable Diffusion and provide the full pipeline.

# Incomplete
stanno = STANNO(STANNOConfig(layers=[768, 512, 768]))  # just a network

# Complete (in ComfyUI)
# STANNOLoad → STANNODreamCond → KSampler → STANNOScoreImages

Training Divergence (Why It Happens, How We Guard Against It)

Direct weight modification can diverge if training runs too long without safeguards. The weights keep changing, accumulate errors, and blow up.

How we prevent it:

  • Divergence detection: Stop if loss > 100
  • Early stopping: Stop if no improvement for N epochs (default: patience=20)
  • Default epochs: 300 (enough to converge without risking divergence)

If training stops with a divergence warning, reduce epochs or batch size.
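
The two guards above can be sketched as a small wrapper around any training loop. This is a minimal illustration using the thresholds quoted in the list (loss > 100, patience of 20, 300 epochs); `train_with_guards` and `step_fn` are hypothetical names, not part of STANNO's API.

```python
def train_with_guards(step_fn, max_epochs=300, patience=20, divergence_threshold=100.0):
    """Run step_fn(epoch) -> loss until a guard fires or max_epochs is reached.

    Returns (status, epochs_run, best_loss).
    """
    best_loss = float("inf")
    epochs_since_best = 0
    for epoch in range(1, max_epochs + 1):
        loss = step_fn(epoch)
        # Divergence detection: bail out on a large or NaN loss.
        if loss != loss or loss > divergence_threshold:
            return "diverged", epoch, best_loss
        if loss < best_loss:
            best_loss, epochs_since_best = loss, 0
        else:
            epochs_since_best += 1
        # Early stopping: no improvement for `patience` consecutive epochs.
        if epochs_since_best >= patience:
            return "early_stop", epoch, best_loss
    return "completed", max_epochs, best_loss

# Toy loss curves standing in for real training runs.
plateau = lambda epoch: max(1.0 / epoch, 0.1)   # stops improving at epoch 10
blowup = lambda epoch: 1.5 ** epoch             # grows without bound

print(train_with_guards(plateau))   # ('early_stop', 30, 0.1)
print(train_with_guards(blowup))    # ('diverged', 12, 1.5)
```

Returning the best loss alongside the status makes it easy to tell a run that converged and stopped early from one that blew up before learning anything.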


Realistic Performance Expectations

| Task | Realistic Performance | Notes |
|---|---|---|
| Anomaly detection | > 90% accuracy | ✓ Achievable, used in production |
| Online learning | < 100 steps to converge | ✓ Fast adaptation |
| Cascades (end-to-end) | Stable training, gradient flow | ✓ Works well |
| Sin regression (MAE) | ≈ 0.4–0.5 | ✗ Not the right tool; use PyTorch |
| Image reconstruction | Depends on model size | ✓ Fine-tuning with ComfyUI nodes |
| General regression | Baseline only | ✗ Not optimized |

When to Use STANNO (Decision Tree)

Do you want to:

  • Detect anomalies in a stream? → Use STANNO + filter ✓
  • Learn from one sample at a time? → Use ContinualSTANNO ✓
  • Train an encoder-decoder pipeline? → Use CascadeSTANNO ✓
  • Fit sin(x) accurately? → Use PyTorch ✗
  • Fine-tune a large pretrained model? → Use PyTorch ✗
  • Generate images from scratch? → Use Stable Diffusion directly ✗
  • Compose STANNO with image generation? → Use ComfyUI nodes ✓

FAQ

Q: Why doesn't STANNO fit sin(x) well?

A: It's not designed for regression. The fixed 4-module trainer works great for anomaly detection and online learning, but arbitrary function fitting needs backpropagation or evolution. Use PyTorch for that.

Q: Will longer training improve accuracy?

A: Usually not. Training has built-in early stopping (the patience parameter), so it halts once the loss stops improving. Raising epochs past that point adds no accuracy and increases the risk of overfitting and divergence.

Q: Which trainer should I use: Fixed, LocalRule, or Evolutionary?

A: Start with Fixed, which is stable and interpretable. LocalRule learns per-synapse rules, which can be powerful but also unstable. Evolutionary uses evolutionary strategies and is slower but novel. Experiment to see which suits your problem.

Q: Is STANNO production-ready?

A: For anomaly detection and online learning: yes. For regression or general purpose training: no. For ComfyUI image workflows: yes, use the nodes.


Bottom Line

STANNO is specialized for anomaly detection, online learning, cascading, and ComfyUI workflows. It's not a general-purpose neural network and not a replacement for PyTorch or TensorFlow. Use it where the strengths match your problem.