Update README.md
README.md
CHANGED

```diff
@@ -104,7 +104,7 @@ FLM-101B is trained on a cluster of 24 DGX-A800 GPU (8×80G) servers for less th

 #### Software

-FLM-101B was trained with a codebase adapted from Megatron-LM.
+FLM-101B was trained with a codebase adapted from Megatron-LM, and will be open-sourced soon.
 It uses a 3D(DP+TP+PP) parallelism approach and distributed optimizer.
```
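The changed section describes 3D (DP+TP+PP) parallel training on the 24 × 8-GPU A800 cluster mentioned in the hunk header. A minimal sketch of how such a layout factors the cluster into data-, tensor-, and pipeline-parallel groups; the TP and PP degrees below are assumptions for illustration, not values disclosed in the commit:

```python
# Hypothetical sketch: in a 3D (DP x TP x PP) layout, the data-parallel
# degree is whatever remains after tensor- and pipeline-parallel degrees
# divide the world size. TP=8 / PP=4 are assumed, not FLM-101B's settings.
def data_parallel_size(world_size: int, tp: int, pp: int) -> int:
    """Data-parallel degree implied by tensor- and pipeline-parallel degrees."""
    assert world_size % (tp * pp) == 0, "TP * PP must divide the world size"
    return world_size // (tp * pp)

WORLD_SIZE = 24 * 8  # 24 DGX-A800 servers, 8 GPUs each
dp = data_parallel_size(WORLD_SIZE, tp=8, pp=4)  # assumed degrees
print(dp)  # -> 6
```

With these assumed degrees, each model replica spans 32 GPUs (8 tensor-parallel ranks × 4 pipeline stages), giving 6 data-parallel replicas; a distributed optimizer then shards optimizer state across the data-parallel group instead of duplicating it on every rank.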