flytech committed on
Commit c49ab87 · 1 Parent(s): 3c9314b

Update README.md

Files changed (1)
  1. README.md +18 -10
README.md CHANGED
@@ -12,36 +12,44 @@ should probably proofread and complete it, then remove this comment. -->

  # Ruckus-PyAssi-13b

- This model is a fine-tuned version of [meta-llama/Llama-2-13b-hf](https://huggingface.co/meta-llama/Llama-2-13b-hf) on an unknown dataset.

  ## Model description

- More information needed

  ## Intended uses & limitations

- More information needed
-
- ## Training and evaluation data
-
- More information needed

  ## Training procedure

  ### Training hyperparameters

  The following hyperparameters were used during training:
  - learning_rate: 0.0002
  - train_batch_size: 32
- - eval_batch_size: 32
  - seed: 42
  - optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  - lr_scheduler_type: constant
  - num_epochs: 5

- ### Training results
-

  ### Framework versions

  # Ruckus-PyAssi-13b

+ This model is a fine-tuned version of [meta-llama/Llama-2-13b-hf](https://huggingface.co/meta-llama/Llama-2-13b-hf)
+ on 10,000 examples from the flytech/llama-python-codes-30k dataset.

  ## Model description

+ The model was trained with 4-bit quantization using SFT (Supervised Fine-Tuning) and LoRA (Low-Rank Adaptation);
+ further fine-tuning is possible.
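
That setup corresponds to the usual QLoRA-style recipe; below is a minimal sketch assuming the transformers/peft/trl stack, with the LoRA rank, target modules, and dataset text field as illustrative assumptions rather than values taken from this card:

```python
# Minimal sketch of a 4-bit SFT + LoRA run on the named dataset; LoRA rank,
# target modules, and the dataset text field are assumptions, not card values.
import torch
from datasets import load_dataset
from peft import LoraConfig
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig
from trl import SFTTrainer

base = "meta-llama/Llama-2-13b-hf"

# Load the base model quantized to 4 bits (QLoRA-style).
model = AutoModelForCausalLM.from_pretrained(
    base,
    quantization_config=BitsAndBytesConfig(
        load_in_4bit=True,
        bnb_4bit_quant_type="nf4",
        bnb_4bit_compute_dtype=torch.bfloat16,
    ),
    device_map="auto",
)
tokenizer = AutoTokenizer.from_pretrained(base)
tokenizer.pad_token = tokenizer.eos_token

# 10,000-example subset of the instruction dataset named in the card.
dataset = load_dataset("flytech/llama-python-codes-30k", split="train[:10000]")

trainer = SFTTrainer(
    model=model,
    train_dataset=dataset,
    peft_config=LoraConfig(  # low-rank adapters on the attention projections
        r=16, lora_alpha=32, lora_dropout=0.05,
        target_modules=["q_proj", "v_proj"], task_type="CAUSAL_LM",
    ),
    dataset_text_field="text",  # assumed field name in the dataset
    max_seq_length=1024,
    tokenizer=tokenizer,
)
trainer.train()
```

Because only the LoRA adapter is trained, it can later be loaded on top of the base weights again, which is what keeps further fine-tuning from this checkpoint straightforward.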

  ## Intended uses & limitations

+ Code generation. Like all Ruckus models, it is:
+ - Created to serve as an executional layer
+ - Rich in Python code and instructional tasks
+ - Specially formatted for chat (see Inference)

  ## Training procedure

+ The model was trained for 13 hours on a single A6000 GPU with 48 GB of VRAM.
+
  ### Training hyperparameters

  The following hyperparameters were used during training:
  - learning_rate: 0.0002
  - train_batch_size: 32
+ - eval_batch_size: 32 * 2
  - seed: 42
  - optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  - lr_scheduler_type: constant
  - num_epochs: 5
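
For reference, the listed settings map roughly onto `transformers.TrainingArguments` as sketched below; `output_dir` and any argument not listed above are assumptions:

```python
# Rough mapping of the listed hyperparameters onto TrainingArguments;
# output_dir is a hypothetical path, and unlisted arguments are assumptions.
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="ruckus-pyassi-13b",   # hypothetical output path
    learning_rate=2e-4,
    per_device_train_batch_size=32,
    per_device_eval_batch_size=32,    # the card lists "32 * 2"
    seed=42,
    optim="adamw_torch",              # the Adam betas/epsilon above are the defaults
    lr_scheduler_type="constant",
    num_train_epochs=5,
)
```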

+ ## Inference

+ - Make sure to format your prompt:
+   - `<s>[INST]This is my prompt[/INST]`
+   - `<s>[INST]Ruckus, open google[/INST]`
+
+ **Note that `<s>` is not closed; this is because
+ `</s>` is used to mark the end of the AI's answer.**
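
A minimal generation sketch using that prompt format (the repository id, `max_new_tokens`, and other generation settings are assumptions):

```python
# Minimal generation sketch for the prompt format above; the repo id and
# generation settings are assumptions, not values from the card.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "flytech/Ruckus-PyAssi-13b"  # assumed repository id
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")

# The prompt already carries the opening <s>, so skip the tokenizer's own BOS.
prompt = "<s>[INST]Ruckus, open google[/INST]"
inputs = tokenizer(prompt, return_tensors="pt", add_special_tokens=False).to(model.device)

# Generation stops once the model emits </s>, which closes its answer.
output = model.generate(**inputs, max_new_tokens=256)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```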

  ### Framework versions