---
license: bigscience-openrail-m
datasets:
- iamplus/Instruction_Tuning
---
Instruction-tuned GPT-NeoXT-20B model, trained on the Stanford Alpaca-2 Instruction Tuning dataset (52k examples, with outputs from ChatGPT) using ***Colossal AI***.

**Base Model:** togethercomputer/GPT-NeoXT-Chat-Base-20B (not fine-tuned on feedback data)
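
A minimal generation sketch with 🤗 Transformers is shown below. The repo id is a placeholder for this model's actual Hub id, and the Alpaca-style prompt template is an assumption about the format used during fine-tuning.

```python
# Minimal inference sketch (assumptions: placeholder repo id, Alpaca-style prompt template).
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "iamplus/<this-model-repo>"  # placeholder: replace with this model's actual Hub id

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype=torch.float16, device_map="auto"  # 20B params: needs multiple GPUs or offloading
)

# Assumed Alpaca-style prompt; adjust if the training data used a different template.
prompt = (
    "Below is an instruction that describes a task. "
    "Write a response that appropriately completes the request.\n\n"
    "### Instruction:\nExplain instruction tuning in one sentence.\n\n"
    "### Response:\n"
)

inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
output = model.generate(**inputs, max_new_tokens=128, do_sample=True, temperature=0.7)
print(tokenizer.decode(output[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True))
```
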
**Training Details:** (restated as a config sketch after this list)
* Epochs: 5
* Batch Size: 16 per device x 1 gradient accumulation step x 8 GPUs = 128 (global)
* Max Length: 1024
* Weight Decay: 0
* Learning Rate: 2e-5
* Learning Rate Scheduler Type: Cosine
* Number of warmup steps: 30
* Machine: 8x A100 80GB
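
For readability, the hyperparameters above can be restated as a Hugging Face `TrainingArguments` object. Note that the actual run used Colossal AI rather than the HF Trainer, so this is only an illustrative summary; the `output_dir` and `bf16` values are assumptions.

```python
# Illustrative restatement of the hyperparameters above (the real run used Colossal AI, not the HF Trainer).
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="gpt-neoxt-20b-alpaca2-it",   # placeholder path
    num_train_epochs=5,
    per_device_train_batch_size=16,          # 16 x 1 grad-accum step x 8 GPUs = 128 global batch
    gradient_accumulation_steps=1,
    learning_rate=2e-5,
    weight_decay=0.0,
    lr_scheduler_type="cosine",
    warmup_steps=30,
    bf16=True,                               # assumed mixed precision on 8x A100 80GB
)

MAX_LENGTH = 1024  # sequences tokenized/truncated to 1024 tokens
```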
**Dataset Details:**

Dataset: iamplus/Instruction_Tuning

Files (see the loading sketch below):
* stanford_alpaca_it_v2.csv
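
A loading sketch with 🤗 Datasets follows. The column names in the comments assume the usual Alpaca schema (`instruction`, `input`, `output`); check the CSV header for the actual fields.

```python
# Sketch: load the training CSV directly from the Hub dataset repo.
from datasets import load_dataset

ds = load_dataset(
    "iamplus/Instruction_Tuning",
    data_files="stanford_alpaca_it_v2.csv",
    split="train",
)

print(ds)     # expected: roughly 52k rows
print(ds[0])  # assumed Alpaca-style fields, e.g. instruction / input / output
```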