---
library_name: peft
tags:
- generated_from_trainer
datasets:
- /workspace/axolotl/datasets/chemistry_data.csv
base_model: /workspace/axolotl/llama-8B
model-index:
- name: root/outputs/fine_tuned_model
  results: []
---

<!-- This model card has been generated automatically according to the information the Trainer had access to. You
should probably proofread and complete it, then remove this comment. -->

[<img src="https://raw.githubusercontent.com/axolotl-ai-cloud/axolotl/main/image/axolotl-badge-web.png" alt="Built with Axolotl" width="200" height="32"/>](https://github.com/axolotl-ai-cloud/axolotl)
<details><summary>See axolotl config</summary>

axolotl version: `0.6.0`
```yaml
base_model: /workspace/axolotl/llama-8B
model_type: AutoModelForCausalLM
tokenizer_type: AutoTokenizer
load_in_8bit: false
load_in_4bit: true
strict: false
datasets:
  - path: /workspace/axolotl/datasets/chemistry_data.csv
    type: alpaca
    format: csv
    prompt_template: '### Instruction: {instruction}

      ### Input: {input}

      ### Response: {output}'
dataset_prepared_path: null
val_set_size: 0.1
output_dir: /root/outputs/fine_tuned_model
adapter: qlora
lora_model_dir: null
sequence_len: 2048
sample_packing: true
eval_sample_packing: false
pad_to_sequence_len: true
lora_r: 16
lora_alpha: 8
lora_dropout: 0.05
lora_target_modules: null
lora_target_linear: true
lora_fan_in_fan_out: null
wandb_project: null
wandb_entity: null
wandb_watch: null
wandb_name: null
wandb_log_model: null
gradient_accumulation_steps: 4
micro_batch_size: 1
num_epochs: 10
max_steps: 10000000
optimizer: paged_adamw_32bit
lr_scheduler: cosine
learning_rate: 0.0002
train_on_inputs: false
group_by_length: false
bf16: auto
fp16: null
tf32: false
gradient_checkpointing: true
early_stopping_patience: 3
save_strategy: steps
save_steps: 20
evaluation_strategy: steps
eval_steps: 20
load_best_model_at_end: true
save_total_limit: 3
metric_for_best_model: loss
greater_is_better: false
resume_from_checkpoint: null
local_rank: null
logging_steps: 1
xformers_attention: null
flash_attention: true
warmup_steps: 10
debug: null
deepspeed: null
weight_decay: 0.0
fsdp: null
fsdp_config: null
special_tokens:
  pad_token: <|end_of_text|>
mlflow_tracking_uri: https://mlflow-dev.qpiai-pro.tech
mlflow_experiment_name: llama-8B-chemistry
hf_mlflow_log_artifacts: 'true'
local_files_only: true
```
</details><br>

# root/outputs/fine_tuned_model
This model is a QLoRA fine-tuned version of /workspace/axolotl/llama-8B, trained on the /workspace/axolotl/datasets/chemistry_data.csv dataset.
It achieves the following results on the evaluation set:
- Loss: 1.9859
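A minimal usage sketch, assuming local access to the base weights and the adapter at the paths from the config above; the example question is hypothetical, and the prompt follows the template the model was trained on:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig
from peft import PeftModel

base_id = "/workspace/axolotl/llama-8B"         # base_model from the config
adapter_dir = "/root/outputs/fine_tuned_model"  # output_dir from the config

# Load the base model in 4-bit, matching `load_in_4bit: true` used at train time.
bnb = BitsAndBytesConfig(load_in_4bit=True)
tokenizer = AutoTokenizer.from_pretrained(base_id)
base = AutoModelForCausalLM.from_pretrained(
    base_id, quantization_config=bnb, device_map="auto"
)
model = PeftModel.from_pretrained(base, adapter_dir).eval()

# Hypothetical query, rendered with the alpaca-style template from the config.
prompt = "### Instruction: Name the functional group in ethanol.\n\n### Input: \n\n### Response: "
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
with torch.no_grad():
    output = model.generate(**inputs, max_new_tokens=128)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```

Generation quality depends on matching the training prompt format, including the trailing `### Response: ` marker.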
## Model description

This is a QLoRA adapter (r=16, alpha=8, dropout 0.05, targeting all linear layers) for the base model /workspace/axolotl/llama-8B, trained with the base weights loaded in 4-bit. The adapter is applied on top of the unchanged base weights to reproduce the fine-tuned behavior.
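For deployment without a peft dependency at inference time, the adapter can be folded into the base weights. A minimal sketch, assuming the paths above; the merged output directory name is hypothetical, and the base model is loaded in fp16 because merging into 4-bit quantized weights is not supported:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

base_id = "/workspace/axolotl/llama-8B"
adapter_dir = "/root/outputs/fine_tuned_model"

# Load the base model in half precision; merging requires unquantized weights.
base = AutoModelForCausalLM.from_pretrained(base_id, torch_dtype=torch.float16)
merged = PeftModel.from_pretrained(base, adapter_dir).merge_and_unload()

merged.save_pretrained("llama-8B-chemistry-merged")  # hypothetical output path
AutoTokenizer.from_pretrained(base_id).save_pretrained("llama-8B-chemistry-merged")
```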
## Intended uses & limitations

More information needed
## Training and evaluation data

The model was trained on /workspace/axolotl/datasets/chemistry_data.csv in alpaca format; 10% of the rows were held out as the evaluation split (`val_set_size: 0.1`). Each row is rendered with the `### Instruction / ### Input / ### Response` template from the config, as sketched below.
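A minimal sketch of that rendering, assuming the standard alpaca column names (`instruction`, `input`, `output`); the example row is hypothetical:

```python
def to_prompt(row: dict) -> str:
    """Render one alpaca-style CSV row with the template from the config."""
    return (
        f"### Instruction: {row['instruction']}\n\n"
        f"### Input: {row['input']}\n\n"
        f"### Response: {row['output']}"
    )

# Hypothetical row, for illustration only.
example = {
    "instruction": "Balance the equation.",
    "input": "H2 + O2 -> H2O",
    "output": "2H2 + O2 -> 2H2O",
}
print(to_prompt(example))
```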
## Training procedure

### Training hyperparameters
The following hyperparameters were used during training (a rough `TrainingArguments` equivalent follows the list):
- learning_rate: 0.0002
- train_batch_size: 1
- eval_batch_size: 1
- seed: 42
- gradient_accumulation_steps: 4
- total_train_batch_size: 4
- optimizer: paged_adamw_32bit with betas=(0.9, 0.999) and epsilon=1e-08 (no additional optimizer arguments)
- lr_scheduler_type: cosine
- lr_scheduler_warmup_steps: 10
- training_steps: 9890
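For readers more familiar with the transformers API, this is approximately the configuration below; a sketch under the assumption that axolotl forwards these values unchanged (the real object is assembled internally):

```python
from transformers import TrainingArguments

args = TrainingArguments(
    output_dir="/root/outputs/fine_tuned_model",
    per_device_train_batch_size=1,   # micro_batch_size
    per_device_eval_batch_size=1,
    gradient_accumulation_steps=4,   # effective batch size: 1 * 4 = 4
    learning_rate=2e-4,
    lr_scheduler_type="cosine",
    warmup_steps=10,
    optim="paged_adamw_32bit",
    seed=42,
    logging_steps=1,
    eval_strategy="steps",
    eval_steps=20,
    save_strategy="steps",
    save_steps=20,
    save_total_limit=3,
    load_best_model_at_end=True,
    metric_for_best_model="loss",
    greater_is_better=False,
    gradient_checkpointing=True,
    weight_decay=0.0,
)
```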
### Training results
| Training Loss | Epoch | Step | Validation Loss |
|:-------------:|:------:|:----:|:---------------:|
| 2.2045 | 0.0010 | 1 | 2.3167 |
| 2.1303 | 0.0202 | 20 | 2.0805 |
| 1.9063 | 0.0404 | 40 | 2.0458 |
| 2.0275 | 0.0606 | 60 | 2.0337 |
| 2.1621 | 0.0807 | 80 | 2.0254 |
| 1.8073 | 0.1009 | 100 | 2.0203 |
| 2.1245 | 0.1211 | 120 | 2.0177 |
| 1.9644 | 0.1413 | 140 | 2.0137 |
| 1.9735 | 0.1615 | 160 | 2.0123 |
| 2.2691 | 0.1817 | 180 | 2.0095 |
| 1.9491 | 0.2019 | 200 | 2.0075 |
| 2.0258 | 0.2221 | 220 | 2.0057 |
| 1.7861 | 0.2422 | 240 | 2.0050 |
| 1.9007 | 0.2624 | 260 | 2.0006 |
| 1.9219 | 0.2826 | 280 | 2.0009 |
| 2.0698 | 0.3028 | 300 | 1.9978 |
| 1.6277 | 0.3230 | 320 | 1.9976 |
| 1.7718 | 0.3432 | 340 | 1.9964 |
| 1.8223 | 0.3634 | 360 | 1.9958 |
| 2.1197 | 0.3835 | 380 | 1.9953 |
| 2.1519 | 0.4037 | 400 | 1.9969 |
| 2.0659 | 0.4239 | 420 | 1.9952 |
| 1.7126 | 0.4441 | 440 | 1.9947 |
| 2.1095 | 0.4643 | 460 | 1.9924 |
| 1.6791 | 0.4845 | 480 | 1.9918 |
| 1.9868 | 0.5047 | 500 | 1.9908 |
| 1.9909 | 0.5249 | 520 | 1.9899 |
| 2.2069 | 0.5450 | 540 | 1.9917 |
| 2.0763 | 0.5652 | 560 | 1.9895 |
| 1.9251 | 0.5854 | 580 | 1.9891 |
| 1.982 | 0.6056 | 600 | 1.9879 |
| 2.054 | 0.6258 | 620 | 1.9875 |
| 1.7292 | 0.6460 | 640 | 1.9875 |
| 1.7901 | 0.6662 | 660 | 1.9891 |
| 1.9179 | 0.6863 | 680 | 1.9868 |
| 1.6178 | 0.7065 | 700 | 1.9874 |
| 1.7637 | 0.7267 | 720 | 1.9859 |
| 1.6946 | 0.7469 | 740 | 1.9868 |
| 1.8821 | 0.7671 | 760 | 1.9862 |
| 2.1346 | 0.7873 | 780 | 1.9859 |
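Training stopped at step 780 of the 9890 configured, consistent with `early_stopping_patience: 3` together with `load_best_model_at_end: true` in the config. For intuition, the best validation loss converts to perplexity via exp(loss):

```python
import math

best_eval_loss = 1.9859  # best validation loss from the table above
print(f"eval perplexity ~= {math.exp(best_eval_loss):.2f}")  # ~= 7.29
```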
### Framework versions
- PEFT 0.14.0
- Transformers 4.47.0
- Pytorch 2.3.1+cu121
- Datasets 3.1.0
- Tokenizers 0.21.0