NeMo Lightning
The NeMo Lightning directory provides custom objects compatible with PyTorch Lightning (PTL) for seamlessly training NeMo 2.0 models. NeMo 2.0 models are implemented using Megatron Core, and NeMo Lightning provides the bridge between the higher-level, object-oriented PTL APIs and the lower-level Megatron APIs. For detailed tutorials and documentation on NeMo 2.0, refer to the docs.
Some of the helpful classes provided here include:
- `Trainer`: A lightweight wrapper around PTL's `Trainer` object which provides additional support for capturing the arguments used to initialize the trainer. More information on NeMo 2.0's serialization mechanisms is available here.
- `MegatronStrategy`: A PTL strategy that enables training of Megatron models on NVIDIA GPUs.
- `MegatronParallel`: A class that sets up and manages Megatron's distributed model parallelism.
- `MegatronMixedPrecision`: A specialized precision plugin for training Megatron-based models in PTL.
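To see how these pieces fit together, here is a minimal sketch of constructing a trainer with the classes above. The class names come from `nemo.lightning`; the device count, step count, parallelism sizes, and precision setting are illustrative placeholders, not a recommended configuration.

```python
# A minimal sketch of wiring the classes above together. The parallelism
# sizes, device count, and precision below are illustrative placeholders.
from nemo import lightning as nl

trainer = nl.Trainer(
    devices=2,                           # GPUs per node (placeholder)
    num_nodes=1,
    max_steps=100,
    accelerator="gpu",
    strategy=nl.MegatronStrategy(
        tensor_model_parallel_size=2,    # shard each layer across 2 GPUs
        pipeline_model_parallel_size=1,  # no pipeline parallelism
    ),
    plugins=nl.MegatronMixedPrecision(precision="bf16-mixed"),
)
```

Note that `MegatronParallel` is not instantiated directly here; the strategy sets it up internally to manage model-parallel execution.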
More information on `MegatronStrategy`, `MegatronParallel`, and `MegatronMixedPrecision` can be found in this document.