Update README.md
English data is sampled from [RedPajama-Data](https://github.com/togethercomputer/RedPajama-Data/tree/rp_v1).
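
As a point of reference, here is a minimal sketch of how such document-level sampling could look with the Hugging Face `datasets` library. The dataset id, sampling rate, and output path are illustrative assumptions, not the pipeline actually used for this model:

```python
# Hypothetical sketch of sampling English text from RedPajama-Data.
# The dataset id, 10% keep rate, and output file are assumptions made
# for illustration; they are not the actual data pipeline.
import json
import random

from datasets import load_dataset

random.seed(0)

# Stream the corpus so nothing has to be downloaded up front.
ds = load_dataset(
    "togethercomputer/RedPajama-Data-1T-Sample",  # assumed dataset id
    split="train",
    streaming=True,
)

# Keep roughly 10% of documents (rate chosen only for the example).
with open("english_sample.jsonl", "w", encoding="utf-8") as f:
    for example in ds:
        if random.random() < 0.1:
            f.write(json.dumps({"text": example["text"]}) + "\n")
```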
## Training

- GPU: 48 nodes of a3 instances (8×H100 GPUs per node)
- Training duration: about 7 weeks
- Container: [PyTorch NGC Container](https://catalog.ngc.nvidia.com/orgs/nvidia/containers/pytorch)
- Library: [Megatron-LM](https://github.com/NVIDIA/Megatron-LM)
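
For a rough sense of scale, the figures above imply the following back-of-the-envelope compute budget. This is a sketch only; "about 7 weeks" is approximate, so treat the result as an order-of-magnitude estimate:

```python
# Back-of-the-envelope compute estimate from the training setup above.
# The duration is approximate, so the result is an order-of-magnitude
# figure, not an exact accounting.
nodes = 48
gpus_per_node = 8          # H100 GPUs per a3 node
weeks = 7                  # "about 7 weeks"
hours_per_week = 7 * 24    # 168

total_gpus = nodes * gpus_per_node             # 384 GPUs
gpu_hours = total_gpus * weeks * hours_per_week

print(f"Total GPUs: {total_gpus}")             # 384
print(f"Approx. GPU-hours: {gpu_hours:,}")     # ~451,584
```

In other words, the run used 384 H100s for the full duration, on the order of 450k GPU-hours in total.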
## License

[MIT](https://opensource.org/licenses/MIT)
## Developed by

[Stockmark Inc.](https://stockmark.co.jp/)