---
license: apache-2.0
tags:
- test-fixtures
- optimizer
- pytorch
---

# ferrotorch / optimizer-trajectories-v1

Frozen-gradient parity fixtures for ferrotorch's `Optimizer::step()`
implementations, generated by running `torch.optim` against a small
fixed MLP for 10 steps and snapshotting initial params, per-step
gradients, and final params.

Phase C.2 of real-artifact-driven development (#1155). Companion to:

* `scripts/pin_pretrained_optimizer_trajectories.py` (this pin)
* `scripts/verify_optimizer_inference.py` (the harness)
* `ferrotorch-optim/examples/optimizer_trajectory_dump.rs`
* `ferrotorch-optim/tests/conformance_optimizer_trajectories.rs`

## Why frozen gradients

The test is "does ferrotorch's optimizer match torch's optimizer
math", not "does ferrotorch's autograd match torch's autograd".
Live autograd would fold linear+MSELoss backward bugs into every
verdict; the matmul-grad path is already covered by the causal-LM
and BERT real-artifact harnesses. By snapshotting gradients on the
PyTorch side and re-applying them verbatim on the ferrotorch side,
a failure in this harness fingers an optimizer (one of SGD / Adam /
AdamW / RMSprop / Adagrad), not autograd.
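
In `torch.optim` terms, the replay half of that contract looks
roughly like this (a minimal sketch, not the actual harness; the
`replay` helper is illustrative, and the ferrotorch side does the
equivalent in Rust):

```python
import torch

def replay(params, grad_snapshots, optimizer):
    """Re-apply snapshotted gradients verbatim: no forward, no backward,
    so a mismatch against final_params.bin can only be optimizer math."""
    for step_grads in grad_snapshots:          # one tensor list per step
        for p, g in zip(params, step_grads):
            p.grad = g.clone()                 # bypass autograd entirely
        optimizer.step()
    return [p.detach().clone() for p in params]
```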

## MLP

```
Linear(64 -> 32) -> ReLU -> Linear(32 -> 16) -> ReLU -> Linear(16 -> 8)
```

* Seed: `torch.manual_seed(42)` before construction
* Input batch: `torch.randn(8, 64)`
* Target batch: `torch.randn(8, 8)`
* Loss: `MSELoss(reduction='mean')`
* 6 parameters per model in canonical order:
  `layer{0,1,2}.{weight,bias}`
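
Put together, the generation loop is ordinary PyTorch. A sketch of
its shape (the pin script is authoritative; `nn.Sequential` is an
assumption here, since the real model presumably names its layers
`layer0`..`layer2`):

```python
import torch
import torch.nn as nn

torch.manual_seed(42)                        # before construction
model = nn.Sequential(
    nn.Linear(64, 32), nn.ReLU(),
    nn.Linear(32, 16), nn.ReLU(),
    nn.Linear(16, 8),
)
x, y = torch.randn(8, 64), torch.randn(8, 8)  # input / target batches
loss_fn = nn.MSELoss(reduction='mean')
opt = torch.optim.SGD(model.parameters(), lr=0.01)   # one of ten configs

initial = [p.detach().clone() for p in model.parameters()]
grads = []                                   # grads[k] -> gradients_step_k.bin
for _ in range(10):
    opt.zero_grad(set_to_none=True)
    loss_fn(model(x), y).backward()
    grads.append([p.grad.detach().clone() for p in model.parameters()])
    opt.step()
final = [p.detach().clone() for p in model.parameters()]
```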

## Configurations

* `sgd_plain` – `SGD(lr=0.01)`
* `sgd_momentum` – `SGD(lr=0.01, momentum=0.9)`
* `sgd_nesterov` – `SGD(lr=0.01, momentum=0.9, nesterov=True)`
* `adam_default` – `Adam(lr=0.001)`
* `adam_explicit` – `Adam(lr=0.001, betas=(0.9, 0.999), eps=1e-08)`
* `adamw_decoupled` – `AdamW(lr=0.001, weight_decay=0.01)`
* `rmsprop_default` – `RMSprop(lr=0.001)`
* `rmsprop_momentum` – `RMSprop(lr=0.001, momentum=0.9, alpha=0.99)`
* `adagrad_default` – `Adagrad(lr=0.01)`
* `adagrad_explicit` – `Adagrad(lr=0.01, lr_decay=0.1, eps=1e-10)`
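
Expressed as `torch.optim` constructors, the ten configurations
correspond to the following (a sketch mirroring the list above;
nothing beyond the listed kwargs is implied):

```python
import torch.optim as optim

CONFIGS = {
    "sgd_plain":        lambda ps: optim.SGD(ps, lr=0.01),
    "sgd_momentum":     lambda ps: optim.SGD(ps, lr=0.01, momentum=0.9),
    "sgd_nesterov":     lambda ps: optim.SGD(ps, lr=0.01, momentum=0.9,
                                             nesterov=True),
    "adam_default":     lambda ps: optim.Adam(ps, lr=0.001),
    "adam_explicit":    lambda ps: optim.Adam(ps, lr=0.001,
                                              betas=(0.9, 0.999), eps=1e-08),
    "adamw_decoupled":  lambda ps: optim.AdamW(ps, lr=0.001,
                                               weight_decay=0.01),
    "rmsprop_default":  lambda ps: optim.RMSprop(ps, lr=0.001),
    "rmsprop_momentum": lambda ps: optim.RMSprop(ps, lr=0.001,
                                                 momentum=0.9, alpha=0.99),
    "adagrad_default":  lambda ps: optim.Adagrad(ps, lr=0.01),
    "adagrad_explicit": lambda ps: optim.Adagrad(ps, lr=0.01,
                                                 lr_decay=0.1, eps=1e-10),
}
```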

## Layout

One subfolder per configuration:

```
<config_name>/
  meta.json
  initial_params.bin       # params before step 0
  gradients_step_0.bin     # gradient at step 0
  gradients_step_1.bin     # gradient at step 1
  gradients_step_2.bin     # gradient at step 2
  gradients_step_3.bin     # gradient at step 3
  gradients_step_4.bin     # gradient at step 4
  gradients_step_5.bin     # gradient at step 5
  gradients_step_6.bin     # gradient at step 6
  gradients_step_7.bin     # gradient at step 7
  gradients_step_8.bin     # gradient at step 8
  gradients_step_9.bin     # gradient at step 9
  final_params.bin         # params after all 10 steps
```
    
## Binary format

All `.bin` files use the same little-endian multi-tensor layout:

```
[u32 num_tensors]
per tensor:
  [u32 ndim] [u32 * ndim shape] [f32 * prod(shape)]
```

Tensor *order* (not name) is the contract – see
`param_names` in `meta.json`.
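
A minimal reader for that layout (a sketch; `numpy` is a convenience
here, the format itself needs only the stdlib):

```python
import struct
import numpy as np

def read_tensors(path):
    """Parse one .bin file: little-endian u32 headers, f32 payloads."""
    with open(path, "rb") as f:
        (num_tensors,) = struct.unpack("<I", f.read(4))
        tensors = []
        for _ in range(num_tensors):
            (ndim,) = struct.unpack("<I", f.read(4))
            shape = struct.unpack(f"<{ndim}I", f.read(4 * ndim))
            count = int(np.prod(shape, dtype=np.int64))
            data = np.frombuffer(f.read(4 * count), dtype="<f4")
            tensors.append(data.reshape(shape))
        return tensors
```

Index `i` of the returned list corresponds to `param_names[i]` in
that configuration's `meta.json`, so one trajectory is
`initial_params.bin`, the ten `gradients_step_*.bin` files in step
order, and `final_params.bin`.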
    
## License

Apache 2.0. Synthetic fixtures generated by this repo's pin
script; no upstream weights / data.
    