| .. _parallelisms: | |
| Parallelisms | |
| ------------ | |
| NeMo Megatron supports four types of parallelism, which can be combined arbitrarily: | |
| Distributed Data parallelism | |
| ^^^^^^^^^^^^^^^^^^^^^^^^^^^^ | |
| .. image:: images/ddp.gif | |
| :align: center | |
| :width: 800px | |
| :alt: Distributed Data Parallel | |
| Tensor Parallelism | |
| ^^^^^^^^^^^^^^^^^^ | |
| .. image:: images/tp.gif | |
| :align: center | |
| :width: 800px | |
| :alt: Tensor Parallel | |
| Pipeline Parallelism | |
| ^^^^^^^^^^^^^^^^^^^^ | |
| .. image:: images/pp.gif | |
| :align: center | |
| :width: 800px | |
| :alt: Pipeline Parallel | |
| Sequence Parallelism | |
| ^^^^^^^^^^^^^^^^^^^^ | |
| .. image:: images/sp.gif | |
| :align: center | |
| :width: 800px | |
| :alt: Sequence Parallel | |
| Parallelism nomenclature | |
| ^^^^^^^^^^^^^^^^^^^^^^^^ | |
| When reading or modifying NeMo Megatron code, you will encounter the following terms. | |
| .. image:: images/pnom.gif | |
| :align: center | |
| :width: 800px | |
| :alt: Parallelism nomenclature | |
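To illustrate how the parallelism types combine, here is a minimal sketch of the usual sizing relationship in Megatron-style training: the GPUs left over after tensor and pipeline parallelism are used for data parallelism. The function name ``data_parallel_size`` and its parameter names are hypothetical and chosen for illustration; they are not NeMo Megatron API. The sketch also assumes that sequence parallelism reuses the tensor-parallel group and therefore adds no extra factor.

```python
def data_parallel_size(world_size: int,
                       tensor_parallel_size: int,
                       pipeline_parallel_size: int) -> int:
    """Return the number of data-parallel replicas (illustrative helper).

    Assumes world_size = data_parallel * tensor_parallel * pipeline_parallel.
    """
    # Each model replica spans tensor_parallel * pipeline_parallel GPUs.
    model_parallel_size = tensor_parallel_size * pipeline_parallel_size
    if world_size % model_parallel_size != 0:
        raise ValueError(
            f"world_size ({world_size}) is not divisible by "
            f"tensor_parallel_size * pipeline_parallel_size ({model_parallel_size})"
        )
    # The remaining factor is the number of replicas that see different data.
    return world_size // model_parallel_size


if __name__ == "__main__":
    # 16 GPUs, tensor parallel 2, pipeline parallel 4 -> 2 data-parallel replicas
    print(data_parallel_size(16, 2, 4))
```

With 16 GPUs, a tensor-parallel size of 2, and a pipeline-parallel size of 4, each model replica occupies 8 GPUs, leaving 2 data-parallel replicas.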