---
license: apache-2.0
tags:
- test-fixtures
- training
- autograd
- optimizer
- dataloader
- pytorch
---

# ferrotorch / training-trajectory-v1

Multi-epoch training-trajectory parity fixtures for ferrotorch's
full training stack: autograd + loss + optimizer + DataLoader.
Phase E of real-artifact-driven development (#1161).

Generated by running `torch.optim.Adam` on a fixed 3-layer MLP
against a fixed deterministic regression dataset for 5 epochs of
sequential iteration (`batch_size=4`, `drop_last=False`,
no shuffling), snapshotting the state_dict after each epoch:
125 optimizer steps in total (25 batches × 5 epochs).
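
A minimal sketch of that pin loop under the settings above. The
real logic lives in `scripts/pin_pretrained_training_trajectory.py`;
here `model`, `X`, `y`, and `out_dir` are stand-ins, and `save_file`
comes from the `safetensors` package:

```python
import torch
import torch.nn.functional as F
from safetensors.torch import save_file

def pin_trajectory(model, X, y, out_dir, epochs=5, batch=4):
    # Adam with the exact hyperparameters listed under "Training"
    opt = torch.optim.Adam(model.parameters(), lr=1e-3,
                           betas=(0.9, 0.999), eps=1e-8)
    # epoch_0 = the untouched initial state
    save_file(model.state_dict(), f"{out_dir}/epoch_0_state.safetensors")
    for epoch in range(1, epochs + 1):
        for i in range(0, X.shape[0], batch):        # sequential, no shuffling
            xb, yb = X[i:i + batch], y[i:i + batch]  # drop_last=False semantics
            loss = F.mse_loss(model(xb), yb, reduction="mean")
            opt.zero_grad()
            loss.backward()
            opt.step()
        # one snapshot per epoch, matching the repo layout below
        save_file(model.state_dict(),
                  f"{out_dir}/epoch_{epoch}_state.safetensors")
```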

Companion to:

* `scripts/pin_pretrained_training_trajectory.py` (this pin)
* `scripts/verify_training_trajectory.py` (the harness)
* `ferrotorch-train/examples/multi_epoch_train_dump.rs`
* `ferrotorch-train/tests/conformance_multi_epoch_training.rs`

## Why live autograd

Phase C.2 (#1155) verified the *optimizer step math* with
**frozen** gradients (snapshotted from torch, re-applied on the
ferrotorch side) to isolate one suspect at a time. This pin
verifies the *full training loop* with **live** autograd: the
ferrotorch side has to re-derive the gradients itself. If
anything in the stack diverges (linear backward, relu backward,
mse backward, Adam state, sequential dataloader iteration order),
the harness will catch it as per-epoch state_dict drift.
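
To make the contrast concrete, a hedged sketch of both modes on
the torch side (the function names and the `frozen_grads` argument
are stand-ins for illustration, not identifiers from the pin
scripts):

```python
import torch
import torch.nn.functional as F

def step_frozen(model, opt, frozen_grads):
    # Phase C.2 style: re-apply snapshotted gradients verbatim, so
    # only the Adam update math itself is under test.
    for p, g in zip(model.parameters(), frozen_grads):
        p.grad = g.clone()
    opt.step()

def step_live(model, opt, xb, yb):
    # This pin: gradients are re-derived by autograd, so linear,
    # relu, and mse backward are all exercised before the very same
    # Adam update runs.
    loss = F.mse_loss(model(xb), yb, reduction="mean")
    opt.zero_grad()
    loss.backward()
    opt.step()
    return loss
```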

## Architecture

```
MLP(
  Linear(64 -> 32) -> ReLU
  Linear(32 -> 16) -> ReLU
  Linear(16 -> 8)
)
```
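
A minimal PyTorch definition consistent with these shapes and with
the `fc1`/`fc2`/`fc3` state-dict keys listed under Layout (the
attribute names are what produce those keys; the class name is
illustrative):

```python
import torch
from torch import nn
import torch.nn.functional as F

class MLP(nn.Module):
    def __init__(self):
        super().__init__()
        # attribute names fc1/fc2/fc3 yield the state-dict keys
        # fc1.weight, fc1.bias, ... used by the fixtures
        self.fc1 = nn.Linear(64, 32)
        self.fc2 = nn.Linear(32, 16)
        self.fc3 = nn.Linear(16, 8)

    def forward(self, x):
        x = F.relu(self.fc1(x))
        x = F.relu(self.fc2(x))
        return self.fc3(x)
```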

## Dataset

* `X_full.bin` - `torch.randn(100, 64)` with seed 42
* `y_full.bin` - `torch.randn(100, 8)` with seed 42
* Loss target: `F.mse_loss(pred, y, reduction='mean')`
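
The exact byte layout of the `.bin` files is defined by the pin
script; the sketch below assumes raw row-major little-endian f32
with no header (an assumption, not documented here). Since both
tensors are described as "with seed 42", the generator is assumed
to be re-seeded before each draw:

```python
import torch

# Assumed: re-seed before each tensor so both are "with seed 42".
torch.manual_seed(42)
X = torch.randn(100, 64)
torch.manual_seed(42)
y = torch.randn(100, 8)

# Assumed byte layout: raw row-major f32, native (little) endian.
X.numpy().tofile("X_full.bin")
y.numpy().tofile("y_full.bin")
```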

## Training

* Optimizer: `Adam(lr=0.001, betas=(0.9, 0.999), eps=1e-8)`
* Batch size: `4`
* Iteration: sequential (`for i in range(0, N, BATCH)`,
  equivalent to `DataLoader(shuffle=False, drop_last=False)`;
  see the sketch after this list)
* Epochs: `5`
* Per-epoch losses (mean over 25 batches):
  * epoch 1: `1.099803`
  * epoch 2: `1.059872`
  * epoch 3: `1.033610`
  * epoch 4: `1.009329`
  * epoch 5: `0.982952`
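
As referenced above, a sketch showing that plain sequential
slicing matches an unshuffled `DataLoader` batch-for-batch
(`TensorDataset` and `DataLoader` are the real `torch.utils.data`
APIs; the variable names are illustrative):

```python
import torch
from torch.utils.data import DataLoader, TensorDataset

torch.manual_seed(42)
X = torch.randn(100, 64)
torch.manual_seed(42)
y = torch.randn(100, 8)

N, BATCH = X.shape[0], 4
manual = [(X[i:i + BATCH], y[i:i + BATCH]) for i in range(0, N, BATCH)]

loader = DataLoader(TensorDataset(X, y), batch_size=BATCH,
                    shuffle=False, drop_last=False)
for (xm, ym), (xd, yd) in zip(manual, loader):
    assert torch.equal(xm, xd) and torch.equal(ym, yd)
print(len(manual))  # 25 batches per epoch, 125 steps over 5 epochs
```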

## Layout

```
epoch_0_state.safetensors    # initial state (alias: initial_state.safetensors)
epoch_1_state.safetensors    # after epoch 1
epoch_2_state.safetensors    # after epoch 2
epoch_3_state.safetensors    # after epoch 3
epoch_4_state.safetensors    # after epoch 4
epoch_5_state.safetensors    # after epoch 5
X_full.bin                   # full dataset features
y_full.bin                   # full dataset targets
meta.json                    # hyperparameters + per-epoch losses
bundle.tar                   # convenience archive (registry pin checksum)
```

State-dict keys: `fc1.weight`, `fc1.bias`, `fc2.weight`,
`fc2.bias`, `fc3.weight`, `fc3.bias`.
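
A quick way to load one snapshot and confirm the key set
(`safetensors.torch.load_file` returns a plain dict of tensors):

```python
from safetensors.torch import load_file

state = load_file("epoch_5_state.safetensors")
expected = {"fc1.weight", "fc1.bias", "fc2.weight",
            "fc2.bias", "fc3.weight", "fc3.bias"}
assert set(state) == expected
print({k: tuple(v.shape) for k, v in state.items()})
```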
    
## Tolerance

The harness gate is `max_abs <= 1e-4` and `cosine_sim >= 0.9999`
per tensor for every epoch: a noise budget for 125 steps of
accumulated f32 rounding differences across two independent
runtimes.
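
A sketch of the two gate metrics as plain tensor math (the actual
harness is `scripts/verify_training_trajectory.py`; this is only
the formula the bounds above refer to):

```python
import torch

def gate(reference: torch.Tensor, candidate: torch.Tensor) -> bool:
    # elementwise worst-case drift between the two runtimes
    max_abs = (reference - candidate).abs().max().item()
    # direction agreement of the flattened tensors
    cos = torch.nn.functional.cosine_similarity(
        reference.flatten(), candidate.flatten(), dim=0).item()
    return max_abs <= 1e-4 and cos >= 0.9999
```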
    
## License

Apache 2.0. Synthetic fixtures generated by this repo's pin
script; no upstream weights / data.