Update README.md

<!-- This section is meant to convey both technical and sociotechnical limitations. -->

This model has not been trained with RLHF, so no safety alignment has been applied.

## How to Get Started with the Model

```python
outputs = model.generate(**inputs, tokenizer=tokenizer, max_new_tokens=256, do_s...
print(tokenizer.decode(outputs[0]))
```
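The snippet above leaves `model`, `tokenizer`, and `inputs` undefined, and the `generate` call is truncated in the diff. Since OpenHermes-style models commonly use the ChatML prompt format, here is a hedged sketch of how such a prompt could be built by hand (the format is an assumption — check the tokenizer's `chat_template` for the authoritative version):

```python
# Hedged sketch: hand-build a ChatML-style prompt. The ChatML format is
# an assumption based on the OpenHermes lineage, not stated on this card.
def build_chatml_prompt(messages):
    parts = []
    for msg in messages:
        parts.append(f"<|im_start|>{msg['role']}\n{msg['content']}<|im_end|>")
    parts.append("<|im_start|>assistant\n")  # open the assistant turn for generation
    return "\n".join(parts)

prompt = build_chatml_prompt([
    {"role": "user", "content": "Xin chào! Bạn là ai?"},
])
print(prompt)
```

With `transformers`, the same string is normally produced by `tokenizer.apply_chat_template(messages, add_generation_prompt=True)`, which reads the template shipped with the checkpoint.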

## Training Details

<!-- This should link to a Dataset Card, perhaps with a short stub of information on what the training data is all about as well as documentation related to data pre-processing or additional filtering. -->

Trained on the OpenHermes dataset (translated to Vietnamese), with more than 600k samples.

#### Training Hyperparameters

<!--fp32, fp16 mixed precision, bf16 mixed precision, bf16 non-mixed precision, fp16 non-mixed precision, fp8 mixed precision -->

- **target_modules:** q_proj, k_proj, v_proj, o_proj, up_proj, down_proj, gate_proj
- **batch_size:** 2048
- **epochs:** 1