Weight-Space Autoencoder (Transformer)

This model is a weight-space autoencoder trained on neural network weights/signatures. It includes both an encoder, which compresses weights into latent representations, and a decoder, which reconstructs weights from those latent codes.

Model Description

  • Architecture: Transformer encoder-decoder
  • Training Dataset: maximuspowers/muat-fourier-5
  • Input Mode: signature
  • Latent Dimension: 256
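
The encode/decode interface implied above can be sketched as a toy example. The sketch below uses plain linear maps in place of the actual Transformer, purely to illustrate the 256-dimensional latent bottleneck; all names and shapes other than the latent dimension are hypothetical:

```python
import numpy as np

rng = np.random.default_rng(0)
latent_dim, token_dim = 256, 64  # latent size from the card; token width from the tokenizer

# Toy linear encoder/decoder standing in for the Transformer encoder-decoder.
W_enc = rng.standard_normal((token_dim, latent_dim)) * 0.01
W_dec = rng.standard_normal((latent_dim, token_dim)) * 0.01

weight_chunk = rng.standard_normal(token_dim)  # one 64-value chunk of weights
z = weight_chunk @ W_enc                       # latent code, shape (256,)
recon = z @ W_dec                              # reconstruction, shape (64,)
```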

Tokenization

  • Chunk Size: 64 weight values per token
  • Max Tokens: 512
  • Metadata: True
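
The chunked tokenization above can be sketched as follows. This is a minimal numpy illustration; the zero-padding and truncation behavior is an assumption, since the card does not specify how partial chunks or overlong inputs are handled:

```python
import numpy as np

def tokenize_weights(weights, chunk_size=64, max_tokens=512):
    """Split a flat weight vector into fixed-size chunks ("tokens"),
    zero-padding the final chunk and truncating to max_tokens."""
    w = np.asarray(weights, dtype=np.float32).ravel()
    n_chunks = int(np.ceil(len(w) / chunk_size))
    padded = np.zeros(n_chunks * chunk_size, dtype=np.float32)
    padded[: len(w)] = w
    tokens = padded.reshape(n_chunks, chunk_size)
    return tokens[:max_tokens]

# 200 weight values -> 4 tokens of 64 values each (last token zero-padded)
tokens = tokenize_weights(np.arange(200.0))
```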

Training Config

  • Loss Function: cosine
  • Optimizer: adam
  • Learning Rate: 0.0001
  • Batch Size: 16

Performance Metrics (Test Set)

  • MSE: 0.299696
  • MAE: 0.303521
  • RMSE: 0.547445
  • Cosine Similarity: 0.8642
  • R² Score: 0.0638
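
The metrics above can be computed from a reconstruction with standard definitions; a minimal numpy sketch (function name illustrative):

```python
import numpy as np

def reconstruction_metrics(y_true, y_pred):
    """MSE, MAE, RMSE, cosine similarity, and R^2 between weight vectors."""
    y_true, y_pred = np.ravel(y_true), np.ravel(y_pred)
    err = y_pred - y_true
    mse = float(np.mean(err ** 2))
    cos = float(np.dot(y_true, y_pred)
                / (np.linalg.norm(y_true) * np.linalg.norm(y_pred)))
    ss_res = float(np.sum(err ** 2))
    ss_tot = float(np.sum((y_true - y_true.mean()) ** 2))
    return {
        "mse": mse,
        "mae": float(np.mean(np.abs(err))),
        "rmse": float(np.sqrt(mse)),
        "cosine": cos,
        "r2": 1.0 - ss_res / ss_tot,
    }

m = reconstruction_metrics([1.0, 2.0, 3.0], [1.0, 2.0, 3.0])
```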