classifier-7b-v9 / README.md

Max Zabarka

Mon Oct 23 18:38:13 UTC 2023

	---
	base_model: NousResearch/Llama-2-7b-hf
	tags:
	- generated_from_trainer
	model-index:
	- name: classifier-7b-v9
	results: []
	---

	<!-- This model card has been generated automatically according to the information the Trainer had access to. You
	should probably proofread and complete it, then remove this comment. -->

	[<img src="https://raw.githubusercontent.com/OpenAccess-AI-Collective/axolotl/main/image/axolotl-badge-web.png" alt="Built with Axolotl" width="200" height="32"/>](https://github.com/OpenAccess-AI-Collective/axolotl)
	# classifier-7b-v9

	This model is a fine-tuned version of [NousResearch/Llama-2-7b-hf](https://huggingface.co/NousResearch/Llama-2-7b-hf). The training dataset was not recorded by the Trainer.
	It achieves the following results on the evaluation set:
	- Loss: 1.8197
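
The reported loss can be converted to a perplexity, which is often easier to compare across runs. The sketch below assumes the loss is a mean per-token cross-entropy in nats; this is the usual convention for causal language model fine-tuning, but the card does not state the training objective, so treat it as an assumption. The names `EVAL_LOSS` and `perplexity` are illustrative only.

```python
import math

# Final evaluation loss reported in this card.
EVAL_LOSS = 1.8197


def perplexity(mean_cross_entropy: float) -> float:
    """Convert a mean per-token cross-entropy (in nats) to perplexity.

    Assumption: the loss is a natural-log cross-entropy averaged over tokens.
    """
    return math.exp(mean_cross_entropy)


if __name__ == "__main__":
    print(f"perplexity ~ {perplexity(EVAL_LOSS):.2f}")
```

Under that assumption, a loss of 1.8197 corresponds to a perplexity of roughly 6.17.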

	## Model description

	More information needed

	## Intended uses & limitations

	More information needed

	## Training and evaluation data

	More information needed

	## Training procedure

	### Training hyperparameters

	The following hyperparameters were used during training:
	- learning_rate: 0.0002
	- train_batch_size: 2
	- eval_batch_size: 2
	- seed: 42
	- gradient_accumulation_steps: 4
	- total_train_batch_size: 8
	- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
	- lr_scheduler_type: cosine
	- lr_scheduler_warmup_steps: 10
	- num_epochs: 1
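
Two of the values above follow from the others, and a short sketch may make the relationship clearer. The total batch size is the per-device batch size times the gradient accumulation steps times the number of devices. The learning rate follows a linear warmup and then a half-cosine decay, the shape that `lr_scheduler_type: cosine` normally selects in Transformers. `NUM_DEVICES = 1` and `TOTAL_STEPS = 1050` are assumptions, not values from the card: the first is the only device count consistent with 2 x 4 = 8, and the second is a rough estimate from the last logged step (1040 at epoch 0.99).

```python
import math

# Values taken from the hyperparameter list above.
LEARNING_RATE = 2e-4
TRAIN_BATCH_SIZE = 2
GRAD_ACCUM_STEPS = 4
WARMUP_STEPS = 10

# Assumptions (not stated in the card).
NUM_DEVICES = 1      # inferred: 2 * 4 * 1 matches total_train_batch_size = 8
TOTAL_STEPS = 1050   # rough estimate of optimizer steps in the single epoch


def effective_batch_size(per_device: int, accum_steps: int, devices: int) -> int:
    """Number of sequences contributing to each optimizer step."""
    return per_device * accum_steps * devices


def cosine_lr(step: int,
              base_lr: float = LEARNING_RATE,
              warmup: int = WARMUP_STEPS,
              total: int = TOTAL_STEPS) -> float:
    """Learning rate at `step`: linear warmup, then half-cosine decay to zero."""
    if step < warmup:
        return base_lr * step / max(1, warmup)
    progress = min(1.0, (step - warmup) / max(1, total - warmup))
    return base_lr * 0.5 * (1.0 + math.cos(math.pi * progress))


if __name__ == "__main__":
    print(effective_batch_size(TRAIN_BATCH_SIZE, GRAD_ACCUM_STEPS, NUM_DEVICES))
    for s in (0, 5, 10, 530, 1050):
        print(s, f"{cosine_lr(s):.6f}")
```

With these assumptions the rate climbs from 0 to 0.0002 over the first 10 steps, passes 0.0001 at the midpoint of the decay, and reaches 0 at the final step.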

	### Training results

	| Training Loss | Epoch | Step | Validation Loss |
	|:-------------:|:-----:|:----:|:---------------:|
	| 2.171 | 0.02 | 20 | 2.1160 |
	| 1.881 | 0.04 | 40 | 1.9814 |
	| 2.0141 | 0.06 | 60 | 1.9357 |
	| 1.9386 | 0.08 | 80 | 1.9156 |
	| 1.9899 | 0.1 | 100 | 1.9032 |
	| 1.9022 | 0.11 | 120 | 1.8964 |
	| 1.9176 | 0.13 | 140 | 1.8880 |
	| 1.9431 | 0.15 | 160 | 1.8827 |
	| 1.8847 | 0.17 | 180 | 1.8772 |
	| 1.8158 | 0.19 | 200 | 1.8740 |
	| 1.851 | 0.21 | 220 | 1.8711 |
	| 1.8173 | 0.23 | 240 | 1.8678 |
	| 1.7902 | 0.25 | 260 | 1.8639 |
	| 1.8507 | 0.27 | 280 | 1.8600 |
	| 1.8749 | 0.29 | 300 | 1.8582 |
	| 1.9203 | 0.3 | 320 | 1.8543 |
	| 1.8876 | 0.32 | 340 | 1.8518 |
	| 1.8918 | 0.34 | 360 | 1.8510 |
	| 1.9568 | 0.36 | 380 | 1.8482 |
	| 1.7887 | 0.38 | 400 | 1.8489 |
	| 1.9188 | 0.4 | 420 | 1.8451 |
	| 1.855 | 0.42 | 440 | 1.8434 |
	| 1.94 | 0.44 | 460 | 1.8421 |
	| 1.7969 | 0.46 | 480 | 1.8399 |
	| 1.875 | 0.48 | 500 | 1.8384 |
	| 1.8493 | 0.5 | 520 | 1.8383 |
	| 1.8048 | 0.51 | 540 | 1.8370 |
	| 1.9077 | 0.53 | 560 | 1.8352 |
	| 1.804 | 0.55 | 580 | 1.8327 |
	| 1.8623 | 0.57 | 600 | 1.8315 |
	| 1.8156 | 0.59 | 620 | 1.8312 |
	| 1.8639 | 0.61 | 640 | 1.8306 |
	| 1.909 | 0.63 | 660 | 1.8292 |
	| 1.8636 | 0.65 | 680 | 1.8290 |
	| 1.7888 | 0.67 | 700 | 1.8270 |
	| 1.7797 | 0.69 | 720 | 1.8259 |
	| 1.8014 | 0.7 | 740 | 1.8248 |
	| 1.7313 | 0.72 | 760 | 1.8240 |
	| 1.8429 | 0.74 | 780 | 1.8235 |
	| 1.814 | 0.76 | 800 | 1.8235 |
	| 1.7861 | 0.78 | 820 | 1.8221 |
	| 1.8515 | 0.8 | 840 | 1.8212 |
	| 1.8432 | 0.82 | 860 | 1.8209 |
	| 1.8018 | 0.84 | 880 | 1.8204 |
	| 1.864 | 0.86 | 900 | 1.8203 |
	| 1.7234 | 0.88 | 920 | 1.8201 |
	| 1.84 | 0.89 | 940 | 1.8198 |
	| 1.8721 | 0.91 | 960 | 1.8199 |
	| 1.7822 | 0.93 | 980 | 1.8198 |
	| 1.8464 | 0.95 | 1000 | 1.8197 |
	| 1.7454 | 0.97 | 1020 | 1.8197 |
	| 1.7434 | 0.99 | 1040 | 1.8197 |
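
The table can also be used to estimate the size of one epoch, which the card does not report directly. The sketch below uses three rows copied from the table (first, middle, last) and divides the step count by the epoch fraction. Because the epoch column is rounded to two decimals, the result is only approximate. The conversion to training sequences assumes a total batch size of 8 per optimizer step, as listed in the hyperparameters; the helper names are illustrative.

```python
# (epoch, step, validation_loss) rows copied from the training results table.
ROWS = [
    (0.02, 20, 2.1160),
    (0.50, 520, 1.8383),
    (0.99, 1040, 1.8197),
]

# From the hyperparameter list: sequences consumed per optimizer step.
TOTAL_TRAIN_BATCH_SIZE = 8


def steps_per_epoch(epoch: float, step: int) -> float:
    """Estimate optimizer steps in a full epoch from one logged row."""
    return step / epoch


def total_improvement(rows) -> float:
    """Drop in validation loss between the first and last rows given."""
    return rows[0][2] - rows[-1][2]


if __name__ == "__main__":
    last_epoch, last_step, _ = ROWS[-1]
    est_steps = steps_per_epoch(last_epoch, last_step)
    print(f"steps per epoch ~ {est_steps:.0f}")
    print(f"sequences per epoch ~ {est_steps * TOTAL_TRAIN_BATCH_SIZE:.0f}")
    print(f"validation loss drop: {total_improvement(ROWS):.4f}")
```

The last row gives the tightest estimate, about 1,050 steps per epoch, or on the order of 8,400 training sequences. Validation loss falls by about 0.296 over the run and is flat at 1.8197 for the last three evaluations.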


	### Framework versions

	- Transformers 4.34.1
	- PyTorch 2.0.1+cu118
	- Datasets 2.14.5
	- Tokenizers 0.14.1
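
To check whether a local environment matches the versions listed above, a small standard-library sketch such as the following may help. The distribution names (`transformers`, `torch`, `datasets`, `tokenizers`) are the usual PyPI names and are an assumption here, as is the idea that an exact string match is the right test; a `+cu118` suffix in particular depends on how PyTorch was installed.

```python
from importlib import metadata

# Versions listed in this card, keyed by the assumed PyPI distribution name.
PINNED = {
    "transformers": "4.34.1",
    "torch": "2.0.1+cu118",
    "datasets": "2.14.5",
    "tokenizers": "0.14.1",
}


def compare_versions(pinned=PINNED):
    """Return {package: (wanted, installed_or_None, exact_match)}."""
    report = {}
    for package, wanted in pinned.items():
        try:
            installed = metadata.version(package)
        except metadata.PackageNotFoundError:
            installed = None  # package is not installed in this environment
        report[package] = (wanted, installed, installed == wanted)
    return report


if __name__ == "__main__":
    for package, (wanted, installed, ok) in compare_versions().items():
        status = "ok" if ok else "differs"
        print(f"{package}: wanted {wanted}, found {installed} ({status})")
```

A mismatch does not necessarily mean the model will fail to load; it only flags a difference from the training environment.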
