| In such a case you can use the |
| detect_overflow helper function to inject the detector where you want it, for example: |
| thon |
| from debug_utils import detect_overflow |
| class T5LayerFF(nn.Module): |
| [] |
| def forward(self, hidden_states): |
| forwarded_states = self.layer_norm(hidden_states) |
| detect_overflow(forwarded_states, "after layer_norm") |
| forwarded_states = self.DenseReluDense(forwarded_states) |
| detect_overflow(forwarded_states, "after DenseReluDense") |
| return hidden_states + self.dropout(forwarded_states) |
|
|
| You can see that we added 2 of these and now we track if inf or nan for forwarded_states was detected |
| somewhere in between. |
| Actually, the detector already reports these because each of the calls in the example above is a nn.Module, but |
| let's say if you had some local direct calculations this is how you'd do that. |
| Additionally, if you're instantiating the debugger in your own code, you can adjust the number of frames printed from |
| its default, e.g.: |
| thon |
| from transformers.debug_utils import DebugUnderflowOverflow |
| debug_overflow = DebugUnderflowOverflow(model, max_frames_to_save=100) |
|
|
| Specific batch absolute min and max value tracing |
| The same debugging class can be used for per-batch tracing with the underflow/overflow detection feature turned off. |
| Let's say you want to watch the absolute min and max values for all the ingredients of each forward call of a given |
| batch, and only do that for batches 1 and 3. |