Update README.md
Like 42dot-PLM, the model is built upon a Transformer decoder architecture.
(\* unit: tokens)
### Supervised Fine-tuning
Fine-tuning took about 112 GPU hours on NVIDIA A100 GPUs. For the training dataset, we manually constructed (question or instruction) and response pairs, which can be either single- or multi-turn.
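To make the data shape concrete, here is a minimal sketch of what a multi-turn (question/instruction, response) pair might look like. The README does not specify the actual dataset schema, so the field names (`turns`, `role`, `content`) are illustrative assumptions, not the 42dot format.

```python
import json

# Hypothetical multi-turn SFT example; field names are assumptions,
# since the actual 42dot dataset schema is not given in the README.
example = {
    "turns": [
        {"role": "user", "content": "What is a Transformer decoder?"},
        {"role": "assistant", "content": "A stack of masked self-attention layers..."},
        {"role": "user", "content": "Why is the attention masked?"},
        {"role": "assistant", "content": "So each token attends only to earlier tokens..."},
    ]
}

# A single-turn pair is just the two-element special case.
single_turn = {"turns": example["turns"][:2]}

print(json.dumps(example, indent=2))
```

Under this layout, single- and multi-turn examples share one format, which simplifies batching during fine-tuning.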
### Evaluation
Inspired by recent attempts such as [Vicuna](https://lmsys.org/blog/2023-03-30-vicuna/#how-good-is-vicuna), we evaluate 42dot-PLM against other proprietary and open-source chatbots, using GPT-4 to assess various aspects of their responses. The evaluation dataset consists of 121 prompts across 10 categories. A sample of the evaluation dataset and the prompt template can be downloaded from our [GitHub repo](https://github.com/42dot/42dot_LLM).
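The GPT-4-as-judge setup above can be sketched as follows. This is a minimal illustration in the Vicuna style, with the judge call stubbed out; the prompt wording and the 1–10 scale are assumptions for illustration, not the exact 42dot template (which is available in the GitHub repo).

```python
import re

# Illustrative judge prompt; the actual 42dot template differs and is
# distributed in the project's GitHub repo.
JUDGE_TEMPLATE = (
    "Question: {question}\n\n"
    "Assistant A: {answer_a}\n\n"
    "Assistant B: {answer_b}\n\n"
    "Rate each assistant from 1 to 10 for helpfulness and accuracy.\n"
    "Reply with two numbers only, e.g. '8 6'."
)

def build_prompt(question: str, answer_a: str, answer_b: str) -> str:
    """Fill the judge template with one evaluation prompt and two responses."""
    return JUDGE_TEMPLATE.format(
        question=question, answer_a=answer_a, answer_b=answer_b
    )

def parse_scores(judge_reply: str) -> tuple[int, int]:
    """Extract the two numeric scores from the judge model's reply."""
    nums = re.findall(r"\d+", judge_reply)
    return int(nums[0]), int(nums[1])
```

In practice, `build_prompt` would be sent to GPT-4 once per prompt/model pair, and the parsed scores aggregated per category over the 121 evaluation prompts.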