Gradient Checkpointing

by amadalincostea2 - opened Feb 3, 2024

Does the model support gradient checkpointing?

The OLMo codebase supports activation checkpointing (another name for gradient checkpointing).
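For anyone unfamiliar with the technique, here is a minimal conceptual sketch in plain Python. It is not OLMo's implementation: the scalar `LAYERS` chain and the function names `grad_full` and `grad_checkpointed` are made up for illustration. The idea is to store only some layer inputs during the forward pass and recompute the rest during the backward pass, trading extra compute for lower memory.

```python
import math

# Hypothetical toy "network": each layer is a (function, derivative) pair
# acting on a single scalar. Real models use tensors, but the idea is the same.
LAYERS = [
    (math.tanh, lambda x: 1.0 - math.tanh(x) ** 2),
    (lambda x: x * x, lambda x: 2.0 * x),
    (math.sin, math.cos),
    (lambda x: 3.0 * x, lambda x: 3.0),
]


def grad_full(x):
    """Ordinary backprop: store the input of every layer."""
    inputs = []
    for f, _ in LAYERS:
        inputs.append(x)
        x = f(x)
    grad = 1.0
    # Chain rule, walking the layers from last to first.
    for (_, df), inp in zip(reversed(LAYERS), reversed(inputs)):
        grad *= df(inp)
    return grad, len(inputs)


def grad_checkpointed(x, segment=2):
    """Store only the input of every `segment`-th layer; recompute the rest."""
    checkpoints = {}
    for i, (f, _) in enumerate(LAYERS):
        if i % segment == 0:
            checkpoints[i] = x  # keep this activation
        x = f(x)                # all other activations are discarded
    grad = 1.0
    # Backward pass, one segment at a time, from last segment to first.
    for start in sorted(checkpoints, reverse=True):
        block = LAYERS[start:start + segment]
        # Re-run the forward pass inside this segment from its checkpoint.
        inp = checkpoints[start]
        inputs = []
        for f, _ in block:
            inputs.append(inp)
            inp = f(inp)
        for (_, df), v in zip(reversed(block), reversed(inputs)):
            grad *= df(v)
    # Count of activations kept across the whole forward pass. The recomputed
    # values of one segment are also held briefly during the backward pass.
    return grad, len(checkpoints)


g_full, stored_full = grad_full(0.5)
g_ckpt, stored_ckpt = grad_checkpointed(0.5, segment=2)
print(g_full, stored_full)
print(g_ckpt, stored_ckpt)
```

Both functions return the same gradient; the checkpointed version keeps 2 activations from the forward pass instead of 4, at the cost of running each segment's forward computation a second time. In PyTorch the same trade-off is available through `torch.utils.checkpoint.checkpoint`.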

But since you're here on Hugging Face, and not on GitHub, you probably want to know whether the Hugging Face version of OLMo supports it.

Same person, different account. Yes, I meant the Hugging Face version.

@akshitab, do we have to do anything special to make activation checkpointing work?

dirkgr changed discussion status to closed Oct 16, 2024
