library_name: transformers
license: other
base_model: meta-llama/Meta-Llama-3-8B-Instruct
tags:
  - llama-factory
  - full
  - generated_from_trainer
model-index:
  - name: no_explain
    results: []

no_explain

This model is a fine-tuned version of meta-llama/Meta-Llama-3-8B-Instruct on fifteen datasets, chess_explain_noexplain_00 through chess_explain_noexplain_14. It achieves the following results on the evaluation set:

Loss: 0.0932

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

learning_rate: 5e-06
train_batch_size: 64
eval_batch_size: 64
seed: 42
distributed_type: multi-GPU
num_devices: 8
gradient_accumulation_steps: 2
total_train_batch_size: 1024
total_eval_batch_size: 512
optimizer: adamw_torch with betas=(0.9, 0.999) and epsilon=1e-08 (no additional optimizer arguments)
lr_scheduler_type: cosine
lr_scheduler_warmup_ratio: 0.1
num_epochs: 10.0
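
The totals in the list above follow from the per-device values, and the cosine schedule with a 0.1 warmup ratio can be sketched in a few lines. This is a minimal illustration, not the training code itself: the function names are hypothetical, and `total_steps` is an assumption that must be supplied by the caller, since the card does not state the exact number of optimizer steps. The schedule uses the common shape of linear warmup followed by cosine decay to zero.

```python
import math

# Values copied from the hyperparameter list above.
LEARNING_RATE = 5e-06
TRAIN_BATCH_SIZE = 64   # per device
EVAL_BATCH_SIZE = 64    # per device
NUM_DEVICES = 8
GRAD_ACCUM_STEPS = 2
WARMUP_RATIO = 0.1


def effective_batch_sizes():
    """Return (total_train_batch_size, total_eval_batch_size)."""
    # Training: per-device batch x devices x gradient accumulation steps.
    train = TRAIN_BATCH_SIZE * NUM_DEVICES * GRAD_ACCUM_STEPS
    # Evaluation does not accumulate gradients, so only devices multiply.
    evaluation = EVAL_BATCH_SIZE * NUM_DEVICES
    return train, evaluation


def cosine_lr(step, total_steps, base_lr=LEARNING_RATE, warmup_ratio=WARMUP_RATIO):
    """Learning rate at `step`: linear warmup, then cosine decay to zero.

    `total_steps` is an assumption provided by the caller.
    """
    warmup_steps = math.ceil(total_steps * warmup_ratio)
    if step < warmup_steps:
        # Linear ramp from 0 up to base_lr.
        return base_lr * step / max(1, warmup_steps)
    # Fraction of the post-warmup phase that has elapsed, in [0, 1].
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return base_lr * 0.5 * (1.0 + math.cos(math.pi * progress))


print(effective_batch_sizes())  # (1024, 512)
```

With a hypothetical `total_steps` of 1000, `cosine_lr(100, 1000)` is the peak rate of 5e-06, and the rate decays toward zero as `step` approaches 1000.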

Training results

Training Loss	Epoch	Step	Validation Loss
0.0429	0.8010	1000	0.0422
0.0329	1.6015	2000	0.0336
0.0275	2.4021	3000	0.0297
0.0202	3.2026	4000	0.0292
0.0194	4.0032	5000	0.0294
0.0119	4.8042	6000	0.0311
0.0048	5.6047	7000	0.0439
0.0013	6.4053	8000	0.0538
0.0004	7.2058	9000	0.0670
0.0003	8.0064	10000	0.0698
0.0	8.8074	11000	0.0894
0.0	9.6079	12000	0.0931

Framework versions

Transformers 4.48.2
Pytorch 2.6.0+cu124
Datasets 2.21.0
Tokenizers 0.21.0
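
To check whether a local environment matches the versions listed above, a small comparison using the standard library can help. This is a sketch under the assumption that the packages are installed under their usual distribution names (`transformers`, `torch`, `datasets`, `tokenizers`); the `compare_versions` helper is hypothetical.

```python
from importlib.metadata import PackageNotFoundError, version

# Versions copied from the list above, keyed by distribution name.
EXPECTED = {
    "transformers": "4.48.2",
    "torch": "2.6.0+cu124",
    "datasets": "2.21.0",
    "tokenizers": "0.21.0",
}


def compare_versions(expected=EXPECTED):
    """Map each package to (expected, installed, matches).

    `installed` is None when the package is not found in the environment.
    """
    report = {}
    for package, wanted in expected.items():
        try:
            installed = version(package)
        except PackageNotFoundError:
            installed = None
        report[package] = (wanted, installed, installed == wanted)
    return report


for package, (wanted, installed, matches) in compare_versions().items():
    status = "ok" if matches else "differs"
    print(f"{package}: expected {wanted}, installed {installed} ({status})")
```

A mismatch does not necessarily prevent the model from loading; the comparison only flags differences from the environment recorded at training time.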