---
license: apache-2.0
tags:
- test-fixtures
- optimizer
- pytorch
---
# ferrotorch / optimizer-trajectories-v1
Frozen-gradient parity fixtures for ferrotorch's `Optimizer::step()`
implementations, generated by running `torch.optim` against a
small fixed MLP for 10 steps and snapshotting initial
params, per-step gradients, and final params.
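The shape of that loop, sketched below with a toy one-layer model rather than the real MLP or the actual pin script:
```python
import torch

# Toy stand-in for the pin loop: snapshot params before step 0, every
# per-step gradient, and the params left after the final step.
torch.manual_seed(0)
model = torch.nn.Linear(4, 2)
x, y = torch.randn(8, 4), torch.randn(8, 2)
opt = torch.optim.SGD(model.parameters(), lr=0.01)

initial = [p.detach().clone() for p in model.parameters()]
grads_by_step = []
for _ in range(10):
    opt.zero_grad()
    torch.nn.functional.mse_loss(model(x), y).backward()
    grads_by_step.append([p.grad.detach().clone() for p in model.parameters()])
    opt.step()
final = [p.detach().clone() for p in model.parameters()]
```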
Phase C.2 of real-artifact-driven development (#1155). Companion
to:
* `scripts/pin_pretrained_optimizer_trajectories.py` (this pin)
* `scripts/verify_optimizer_inference.py` (the harness)
* `ferrotorch-optim/examples/optimizer_trajectory_dump.rs`
* `ferrotorch-optim/tests/conformance_optimizer_trajectories.rs`
## Why frozen gradients
The test is "does ferrotorch's optimizer match torch's optimizer
math", not "does ferrotorch's autograd match torch's autograd".
Live autograd would fold linear+MSELoss backward bugs into every
verdict; the matmul-grad path is already covered by the
causal-LM and BERT real-artifact harnesses. By snapshotting
gradients on the PyTorch side and re-applying them verbatim on
the ferrotorch side, a failure in this harness fingers an
optimizer (one of SGD / Adam / AdamW / RMSprop / Adagrad), not
autograd.
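A minimal sketch of that verbatim re-application, with random stand-in tensors in place of the snapshotted fixtures (nothing below is the actual harness):
```python
import torch

# Stand-ins for the frozen fixtures: what matters is that gradients are
# injected verbatim, so opt.step() is the only math being exercised.
torch.manual_seed(0)
params = [torch.randn(4, 3, requires_grad=True),
          torch.randn(4, requires_grad=True)]
frozen = [[torch.randn_like(p) for p in params] for _ in range(10)]

opt = torch.optim.SGD(params, lr=0.01)
for step_grads in frozen:
    for p, g in zip(params, step_grads):
        p.grad = g   # no backward(): the gradient is supplied, not computed
    opt.step()
```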
## MLP
```
Linear(64 -> 32) -> ReLU -> Linear(32 -> 16) -> ReLU -> Linear(16 -> 8)
```
* Seed: `torch.manual_seed(42)` before construction
* Input batch: `torch.randn(8, 64)`
* Target batch: `torch.randn(8, 8)`
* Loss: `MSELoss(reduction='mean')`
* 6 parameters per model in canonical order:
`layer{0,1,2}.{weight,bias}`
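A plausible PyTorch reconstruction of that model (the class name is invented; the shapes, seed, and parameter names follow this card):
```python
import torch
from torch import nn

class FixtureMLP(nn.Module):
    def __init__(self):
        super().__init__()
        self.layer0 = nn.Linear(64, 32)
        self.layer1 = nn.Linear(32, 16)
        self.layer2 = nn.Linear(16, 8)

    def forward(self, x):
        x = torch.relu(self.layer0(x))
        x = torch.relu(self.layer1(x))
        return self.layer2(x)

torch.manual_seed(42)  # seeded before construction, as above
model = FixtureMLP()
# canonical order: layer0.weight, layer0.bias, ..., layer2.weight, layer2.bias
print([name for name, _ in model.named_parameters()])
```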
## Configurations
* `sgd_plain` → `SGD(lr=0.01)`
* `sgd_momentum` → `SGD(lr=0.01, momentum=0.9)`
* `sgd_nesterov` → `SGD(lr=0.01, momentum=0.9, nesterov=True)`
* `adam_default` → `Adam(lr=0.001)`
* `adam_explicit` → `Adam(lr=0.001, betas=(0.9, 0.999), eps=1e-08)`
* `adamw_decoupled` → `AdamW(lr=0.001, weight_decay=0.01)`
* `rmsprop_default` → `RMSprop(lr=0.001)`
* `rmsprop_momentum` → `RMSprop(lr=0.001, momentum=0.9, alpha=0.99)`
* `adagrad_default` → `Adagrad(lr=0.01)`
* `adagrad_explicit` → `Adagrad(lr=0.01, lr_decay=0.1, eps=1e-10)`
## Layout
One subfolder per configuration:
```
<config_name>/
  meta.json
  initial_params.bin     # params before step 0
  gradients_step_0.bin   # gradient at step 0
  gradients_step_1.bin   # gradient at step 1
  gradients_step_2.bin   # gradient at step 2
  gradients_step_3.bin   # gradient at step 3
  gradients_step_4.bin   # gradient at step 4
  gradients_step_5.bin   # gradient at step 5
  gradients_step_6.bin   # gradient at step 6
  gradients_step_7.bin   # gradient at step 7
  gradients_step_8.bin   # gradient at step 8
  gradients_step_9.bin   # gradient at step 9
  final_params.bin       # params after all 10 steps
```
## Binary format
All `.bin` files use the same little-endian multi-tensor layout:
```
[u32 num_tensors]
per tensor:
  [u32 ndim]
  [u32 * ndim shape]
  [f32 * prod(shape)]
```
Tensor *order* (not name) is the contract; see `param_names` in `meta.json`.
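A minimal Python reader for this layout might look like the following (the function name is illustrative, not something the repo ships):
```python
import struct
import numpy as np

def read_multi_tensor_bin(path):
    """Parse one .bin file in the little-endian multi-tensor layout above."""
    tensors = []
    with open(path, "rb") as f:
        (num_tensors,) = struct.unpack("<I", f.read(4))
        for _ in range(num_tensors):
            (ndim,) = struct.unpack("<I", f.read(4))
            shape = struct.unpack(f"<{ndim}I", f.read(4 * ndim))
            count = int(np.prod(shape, dtype=np.int64))
            data = np.frombuffer(f.read(4 * count), dtype="<f4")
            tensors.append(data.reshape(shape))
    return tensors

# e.g. the six arrays of initial_params.bin, in param_names order:
# read_multi_tensor_bin("sgd_plain/initial_params.bin")
```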
## License
Apache 2.0. Synthetic fixtures generated by this repo's pin script; no upstream weights / data.