---
license: apache-2.0
library_name: peft
tags:
- generated_from_trainer
- text-generation
inference: true
base_model: mistralai/Mistral-7B-Instruct-v0.1
model-index:
- name: tmp/helix/results/e9624262-34ea-4818-a31f-84692d26fc66
  results: []
pipeline_tag: text-generation
widget:
- messages:
  - role: user
    content: What is Vyvanse used for?
---

<!-- This model card has been generated automatically according to the information the Trainer had access to. You
should probably proofread and complete it, then remove this comment. -->

[<img src="https://raw.githubusercontent.com/OpenAccess-AI-Collective/axolotl/main/image/axolotl-badge-web.png" alt="Built with Axolotl" width="200" height="32"/>](https://github.com/OpenAccess-AI-Collective/axolotl)
# tmp/helix/results/e9624262-34ea-4818-a31f-84692d26fc66

This model is a fine-tuned version of [mistralai/Mistral-7B-Instruct-v0.1](https://huggingface.co/mistralai/Mistral-7B-Instruct-v0.1) on a custom dataset.
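
A minimal usage sketch, assuming the adapter is published under the path above (treat it as a placeholder) and loaded with `peft`'s `AutoPeftModelForCausalLM`:

```python
# Sketch: load the LoRA adapter on top of the base model and run one chat turn.
import torch
from peft import AutoPeftModelForCausalLM
from transformers import AutoTokenizer

# Placeholder repo/path taken from the card name; adjust to where the weights live.
adapter_id = "tmp/helix/results/e9624262-34ea-4818-a31f-84692d26fc66"

tokenizer = AutoTokenizer.from_pretrained("mistralai/Mistral-7B-Instruct-v0.1")
model = AutoPeftModelForCausalLM.from_pretrained(
    adapter_id,
    torch_dtype=torch.bfloat16,  # matches the bnb_4bit_compute_dtype used in training
    device_map="auto",
)

messages = [{"role": "user", "content": "What is Vyvanse used for?"}]
input_ids = tokenizer.apply_chat_template(messages, return_tensors="pt").to(model.device)
output = model.generate(input_ids, max_new_tokens=256)
print(tokenizer.decode(output[0][input_ids.shape[-1]:], skip_special_tokens=True))
```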

## Model description

More information needed

## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training:
- learning_rate: 0.0002
- train_batch_size: 6
- eval_batch_size: 1
- seed: 42
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: cosine
- num_epochs: 20
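
The run was built with Axolotl, so the original configuration was a YAML file; as a rough equivalent, here is a hedged sketch of how these values would map onto `transformers.TrainingArguments` (the `output_dir` is the placeholder name from the card):

```python
# Sketch: the hyperparameters above expressed as TrainingArguments.
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="tmp/helix/results/e9624262-34ea-4818-a31f-84692d26fc66",  # placeholder
    learning_rate=2e-4,
    per_device_train_batch_size=6,
    per_device_eval_batch_size=1,
    seed=42,
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="cosine",
    num_train_epochs=20,
)
```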

### Training results



### Quantization configuration

The following `bitsandbytes` quantization config was used during training:
- quant_method: bitsandbytes
- load_in_8bit: False
- load_in_4bit: True
- llm_int8_threshold: 6.0
- llm_int8_skip_modules: None
- llm_int8_enable_fp32_cpu_offload: False
- llm_int8_has_fp16_weight: False
- bnb_4bit_quant_type: nf4
- bnb_4bit_use_double_quant: True
- bnb_4bit_compute_dtype: bfloat16
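
The same settings map directly onto `transformers.BitsAndBytesConfig`, which is what you would pass when reloading the base model in 4-bit. A minimal sketch, assuming the values listed above:

```python
# Sketch: reconstruct the 4-bit quantization config listed above.
import torch
from transformers import BitsAndBytesConfig

bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_use_double_quant=True,
    bnb_4bit_compute_dtype=torch.bfloat16,
    llm_int8_threshold=6.0,  # only relevant for 8-bit paths; kept for completeness
)
```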

### Framework versions

- PEFT 0.6.0
- Transformers 4.36.0.dev0
- Datasets 2.15.0
- Tokenizers 0.15.0