---
license: apache-2.0
tags:
- test-fixtures
- optimizer
- pytorch
---
# ferrotorch / optimizer-trajectories-v1
Frozen-gradient parity fixtures for ferrotorch's `Optimizer::step()`
implementations, generated by running `torch.optim` against a
small fixed MLP for 10 steps and snapshotting initial
params, per-step gradients, and final params.
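The shape of that loop, sketched below with a toy one-layer model rather than the real MLP or the actual pin script:
```python
import torch

# Toy stand-in for the pin loop: snapshot params before step 0, every
# per-step gradient, and the params left after the final step.
torch.manual_seed(0)
model = torch.nn.Linear(4, 2)
x, y = torch.randn(8, 4), torch.randn(8, 2)
opt = torch.optim.SGD(model.parameters(), lr=0.01)

initial = [p.detach().clone() for p in model.parameters()]
grads_by_step = []
for _ in range(10):
    opt.zero_grad()
    torch.nn.functional.mse_loss(model(x), y).backward()
    grads_by_step.append([p.grad.detach().clone() for p in model.parameters()])
    opt.step()
final = [p.detach().clone() for p in model.parameters()]
```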
Phase C.2 of real-artifact-driven development (#1155). Companion
to:
* `scripts/pin_pretrained_optimizer_trajectories.py` (this pin)
* `scripts/verify_optimizer_inference.py` (the harness)
* `ferrotorch-optim/examples/optimizer_trajectory_dump.rs`
* `ferrotorch-optim/tests/conformance_optimizer_trajectories.rs`
## Why frozen gradients
The test is "does ferrotorch's optimizer match torch's optimizer
math", not "does ferrotorch's autograd match torch's autograd".
Live autograd would fold linear+MSELoss backward bugs into every
verdict; the matmul-grad path is already covered by the
causal-LM and BERT real-artifact harnesses. By snapshotting
gradients on the PyTorch side and re-applying them verbatim on
the ferrotorch side, a failure in this harness fingers an
optimizer (one of SGD / Adam / AdamW / RMSprop / Adagrad), not
autograd.
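A minimal sketch of that verbatim re-application, with random stand-in tensors in place of the snapshotted fixtures (nothing below is the actual harness):
```python
import torch

# Stand-ins for the frozen fixtures: what matters is that gradients are
# injected verbatim, so opt.step() is the only math being exercised.
torch.manual_seed(0)
params = [torch.randn(4, 3, requires_grad=True),
          torch.randn(4, requires_grad=True)]
frozen = [[torch.randn_like(p) for p in params] for _ in range(10)]

opt = torch.optim.SGD(params, lr=0.01)
for step_grads in frozen:
    for p, g in zip(params, step_grads):
        p.grad = g   # no backward(): the gradient is supplied, not computed
    opt.step()
```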
## MLP
```
Linear(64 -> 32) -> ReLU -> Linear(32 -> 16) -> ReLU -> Linear(16 -> 8)
```
* Seed: `torch.manual_seed(42)` before construction
* Input batch: `torch.randn(8, 64)`
* Target batch: `torch.randn(8, 8)`
* Loss: `MSELoss(reduction='mean')`
* 6 parameters per model in canonical order:
`layer{0,1,2}.{weight,bias}`
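A plausible PyTorch reconstruction of that model (the class name is invented; the shapes, seed, and parameter names follow this card):
```python
import torch
from torch import nn

class FixtureMLP(nn.Module):
    def __init__(self):
        super().__init__()
        self.layer0 = nn.Linear(64, 32)
        self.layer1 = nn.Linear(32, 16)
        self.layer2 = nn.Linear(16, 8)

    def forward(self, x):
        x = torch.relu(self.layer0(x))
        x = torch.relu(self.layer1(x))
        return self.layer2(x)

torch.manual_seed(42)  # seeded before construction, as above
model = FixtureMLP()
# canonical order: layer0.weight, layer0.bias, ..., layer2.weight, layer2.bias
print([name for name, _ in model.named_parameters()])
```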
## Configurations
* `sgd_plain` → `SGD(lr=0.01)`
* `sgd_momentum` → `SGD(lr=0.01, momentum=0.9)`
* `sgd_nesterov` → `SGD(lr=0.01, momentum=0.9, nesterov=True)`
* `adam_default` → `Adam(lr=0.001)`
* `adam_explicit` → `Adam(lr=0.001, betas=(0.9, 0.999), eps=1e-08)`
* `adamw_decoupled` → `AdamW(lr=0.001, weight_decay=0.01)`
* `rmsprop_default` → `RMSprop(lr=0.001)`
* `rmsprop_momentum` → `RMSprop(lr=0.001, momentum=0.9, alpha=0.99)`
* `adagrad_default` → `Adagrad(lr=0.01)`
* `adagrad_explicit` → `Adagrad(lr=0.01, lr_decay=0.1, eps=1e-10)`
## Layout
One subfolder per configuration:
```
<config_name>/
  meta.json
  initial_params.bin     # params before step 0
  gradients_step_0.bin   # gradient at step 0
  gradients_step_1.bin   # gradient at step 1
  gradients_step_2.bin   # gradient at step 2
  gradients_step_3.bin   # gradient at step 3
  gradients_step_4.bin   # gradient at step 4
  gradients_step_5.bin   # gradient at step 5
  gradients_step_6.bin   # gradient at step 6
  gradients_step_7.bin   # gradient at step 7
  gradients_step_8.bin   # gradient at step 8
  gradients_step_9.bin   # gradient at step 9
  final_params.bin       # params after all 10 steps
```
## Binary format
All `.bin` files use the same little-endian multi-tensor layout:
```
[u32 num_tensors]
per tensor:
  [u32 ndim]
  [u32 * ndim shape]
  [f32 * prod(shape)]
```
Tensor *order* (not name) is the contract; see `param_names` in `meta.json`.
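A minimal Python reader for this layout might look like the following (the function name is illustrative, not something the repo ships):
```python
import struct
import numpy as np

def read_multi_tensor_bin(path):
    """Parse one .bin file in the little-endian multi-tensor layout above."""
    tensors = []
    with open(path, "rb") as f:
        (num_tensors,) = struct.unpack("<I", f.read(4))
        for _ in range(num_tensors):
            (ndim,) = struct.unpack("<I", f.read(4))
            shape = struct.unpack(f"<{ndim}I", f.read(4 * ndim))
            count = int(np.prod(shape, dtype=np.int64))
            data = np.frombuffer(f.read(4 * count), dtype="<f4")
            tensors.append(data.reshape(shape))
    return tensors

# e.g. the six arrays of initial_params.bin, in param_names order:
# read_multi_tensor_bin("sgd_plain/initial_params.bin")
```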
## License
Apache 2.0. Synthetic fixtures generated by this repo's pin script; no upstream weights / data.