Update README.md
README.md
CHANGED
@@ -8,72 +8,6 @@ model-index:
   results: []
 ---
 
-
-
-
-# gpt2coder-8epochs
-
-This model is a fine-tuned version of [Aravindan/gpt2out](https://huggingface.co/Aravindan/gpt2out) on an unknown dataset.
-It achieves the following results on the evaluation set:
-- Loss: 0.6964
-
-## Model description
-
-More information needed
-
-## Intended uses & limitations
-
-More information needed
-
-## Training and evaluation data
-
-More information needed
-
-## Training procedure
-
-### Training hyperparameters
-
-The following hyperparameters were used during training:
-- learning_rate: 5e-05
-- train_batch_size: 8
-- eval_batch_size: 8
-- seed: 42
-- gradient_accumulation_steps: 16
-- total_train_batch_size: 128
-- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
-- lr_scheduler_type: linear
-- lr_scheduler_warmup_ratio: 0.1
-- num_epochs: 20
-
-### Training results
-
-| Training Loss | Epoch   | Step | Validation Loss |
-|:-------------:|:-------:|:----:|:---------------:|
-| No log        | 0.9962  | 132  | 1.5428          |
-| No log        | 2.0     | 265  | 1.4204          |
-| No log        | 2.9962  | 397  | 1.2888          |
-| 1.5725        | 4.0     | 530  | 1.1900          |
-| 1.5725        | 4.9962  | 662  | 1.1045          |
-| 1.5725        | 6.0     | 795  | 1.0314          |
-| 1.5725        | 6.9962  | 927  | 0.9723          |
-| 1.217         | 8.0     | 1060 | 0.9139          |
-| 1.217         | 8.9962  | 1192 | 0.8689          |
-| 1.217         | 10.0    | 1325 | 0.8274          |
-| 1.217         | 10.9962 | 1457 | 0.7910          |
-| 1.0164        | 12.0    | 1590 | 0.7555          |
-| 1.0164        | 12.9962 | 1722 | 0.7266          |
-| 1.0164        | 14.0    | 1855 | 0.7014          |
-| 1.0164        | 14.9962 | 1987 | 0.6777          |
-| 0.8885        | 16.0    | 2120 | 0.6597          |
-| 0.8885        | 16.9962 | 2252 | 0.6440          |
-| 0.8885        | 18.0    | 2385 | 0.6327          |
-| 0.8106        | 18.9962 | 2517 | 0.6239          |
-| 0.8106        | 19.9962 | 2640 | 0.6964          |
-
-
-### Framework versions
-
-- Transformers 4.41.1
-- Pytorch 2.1.2
-- Datasets 2.19.1
-- Tokenizers 0.19.1
+## GPT-OUT
+* It's a coding foundation model trained on various Python datasets, with the aim of making it a Python-only model.
+* Pre-training is still in progress and is expected to be completed before the end of June.