Upload README.md with huggingface_hub
README.md
CHANGED
|
@@ -1,12 +1,16 @@
|
|
| 1 |
# 🧠 Myanmar LLM Training
|
| 2 |
|
| 3 |
-
Fine-tune **
|
| 4 |
|
| 5 |
## Requirements
|
| 6 |
|
| 7 |
- Python 3.8+
|
| 8 |
-
- GPU with
|
| 9 |
-
- HuggingFace Account
|
| 10 |
|
| 11 |
## Quick Start
|
| 12 |
|
|
@@ -18,11 +22,8 @@ pip install -r requirements.txt
|
|
| 18 |
### 2. Login to HuggingFace
|
| 19 |
```bash
|
| 20 |
huggingface-cli login
|
| 21 |
-
# Enter your token
|
| 22 |
```
|
| 23 |
|
| 24 |
-
**Note:** Llama requires accepting the license at https://huggingface.co/meta-llama/Llama-3.1-8B-Instruct
|
| 25 |
-
|
| 26 |
### 3. Run training
|
| 27 |
```bash
|
| 28 |
python train.py
|
|
@@ -32,45 +33,47 @@ python train.py
|
|
| 32 |
|
| 33 |
| Parameter | Default | Description |
|
| 34 |
|-----------|---------|-------------|
|
| 35 |
-
| MODEL_NAME |
|
| 36 |
| num_train_epochs | 3 | Number of passes over the training set |
|
| 37 |
-
| per_device_train_batch_size |
|
| 38 |
-
| gradient_accumulation_steps |
|
| 39 |
-
| learning_rate |
|
| 40 |
|
| 41 |
## Features
|
| 42 |
|
| 43 |
-
- ✅
|
| 44 |
- ✅ Gradient checkpointing - saves memory.
|
| 45 |
- ✅ Test/Validation evaluation - tests on both splits.
|
| 46 |
-
- ✅ BF16 mixed precision - more precise training.
|
| 47 |
|
| 48 |
## Training Data
|
| 49 |
|
| 50 |
-
Dataset: [amkyawdev/
|
| 51 |
|
| 52 |
| Split | Samples |
|
| 53 |
|-------|---------|
|
| 54 |
-
| Train |
|
| 55 |
-
| Validation |
|
| 56 |
-
| Test |
|
| 57 |
|
| 58 |
## 💾 Output
|
| 59 |
|
| 60 |
-
Trained model saved to `./myanmar-
|
| 61 |
|
| 62 |
## 📤 Upload to HuggingFace
|
| 63 |
|
| 64 |
```bash
|
| 65 |
-
cd myanmar-
|
| 66 |
-
huggingface-cli upload amkyawdev/my-myanmar-
|
| 67 |
```
|
| 68 |
|
| 69 |
## 🖥️ Google Colab
|
| 70 |
|
| 71 |
```python
|
| 72 |
# Install
|
| 73 |
-
!pip install transformers datasets torch
|
| 74 |
|
| 75 |
# Login
|
| 76 |
from huggingface_hub import login
|
|
@@ -80,10 +83,5 @@ login("YOUR_TOKEN")
|
|
| 80 |
%run train.py
|
| 81 |
```
|
| 82 |
|
| 83 |
-
## ⚠️ Important
|
| 84 |
-
|
| 85 |
-
1. A Llama license is required. Accept it at https://huggingface.co/meta-llama/Llama-3.1-8B-Instruct
|
| 86 |
-
2. Your token must have Llama access.
|
| 87 |
-
|
| 88 |
---
|
| 89 |
Built by amkyawdev
|
|
|
|
| 1 |
# 🧠 Myanmar LLM Training
|
| 2 |
|
| 3 |
+
Fine-tune **Qwen2.5-0.5B-Instruct** on a Myanmar-language dataset.
|
| 4 |
+
|
| 5 |
+
## ⚡ No License Required!
|
| 6 |
+
|
| 7 |
+
This model is fully open: unlike Llama, it needs no license acceptance before you download it.
|
| 8 |
|
| 9 |
## Requirements
|
| 10 |
|
| 11 |
- Python 3.8+
|
| 12 |
+
- GPU with 6GB+ VRAM
|
| 13 |
+
- HuggingFace Account
|
| 14 |
|
| 15 |
## Quick Start
|
| 16 |
|
|
|
|
| 22 |
### 2. Login to HuggingFace
|
| 23 |
```bash
|
| 24 |
huggingface-cli login
|
|
|
|
| 25 |
```
|
| 26 |
|
| 27 |
### 3. Run training
|
| 28 |
```bash
|
| 29 |
python train.py
|
|
|
|
| 33 |
|
| 34 |
| Parameter | Default | Description |
|
| 35 |
|-----------|---------|-------------|
|
| 36 |
+
| MODEL_NAME | Qwen/Qwen2.5-0.5B-Instruct | Base model (fully open!) |
|
| 37 |
| num_train_epochs | 3 | Number of passes over the training set |
|
| 38 |
+
| per_device_train_batch_size | 4 | Batch size per device |
|
| 39 |
+
| gradient_accumulation_steps | 4 | Effective batch = 4 × 4 = 16 (single GPU) |
|
| 40 |
+
| learning_rate | 2e-5 | Learning rate |
|
| 41 |
|
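The batch arithmetic in the table can be checked with a few lines of Python. This is a minimal sketch, not code from `train.py`: the `TrainConfig` class and the `num_devices` field are hypothetical names introduced for illustration, a single GPU is assumed, and the sample count is the approximate training-split size listed in the Training Data section. The step count is an estimate; the trainer may round the last partial batch differently.

```python
import math
from dataclasses import dataclass


@dataclass
class TrainConfig:
    # Defaults copied from the configuration table above.
    per_device_train_batch_size: int = 4
    gradient_accumulation_steps: int = 4
    num_train_epochs: int = 3
    num_devices: int = 1  # assumption: training on a single GPU

    @property
    def effective_batch_size(self) -> int:
        # Samples that contribute to each optimizer update.
        return (
            self.per_device_train_batch_size
            * self.gradient_accumulation_steps
            * self.num_devices
        )

    def approx_optimizer_steps(self, num_samples: int) -> int:
        # Rough estimate: optimizer updates per epoch times the number of epochs.
        steps_per_epoch = math.ceil(num_samples / self.effective_batch_size)
        return steps_per_epoch * self.num_train_epochs


config = TrainConfig()
print("Effective batch size:", config.effective_batch_size)
print("Approximate optimizer steps:", config.approx_optimizer_steps(29_100))
```

Raising `gradient_accumulation_steps` increases the effective batch without using more GPU memory per step, at the cost of fewer optimizer updates per epoch.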
| 42 |
## Features
|
| 43 |
|
| 44 |
+
- ✅ Fully open model - no license needed.
|
| 45 |
+
- ✅ FP16 precision - faster.
|
| 46 |
- ✅ Gradient checkpointing - saves memory.
|
| 47 |
- ✅ Test/Validation evaluation - tests on both splits.
|
| 48 |
|
| 49 |
## Training Data
|
| 50 |
|
| 51 |
+
Dataset: [amkyawdev/AmkyawDev-Dataset](https://huggingface.co/datasets/amkyawdev/AmkyawDev-Dataset)
|
| 52 |
|
| 53 |
| Split | Samples |
|
| 54 |
|-------|---------|
|
| 55 |
+
| Train | ~29,100 |
|
| 56 |
+
| Validation | ~29,100 |
|
| 57 |
+
| Test | ~29,100 |
|
| 58 |
+
|
| 59 |
+
> **Note:** Each file (`train.jsonl`, `test.jsonl`, `validation.jsonl`) has ~29,100 conversations.
|
| 60 |
|
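To confirm the split sizes yourself, a short script can count the records in each file. This is a sketch under assumptions: it expects the three `.jsonl` files named in the note to be in the current directory, with one JSON object per line. The helper name `count_jsonl_records` is hypothetical and is not part of the repository.

```python
import json
from pathlib import Path


def count_jsonl_records(path):
    """Count the non-empty lines in a JSONL file, checking that each parses as JSON."""
    count = 0
    with open(path, encoding="utf-8") as handle:
        for line in handle:
            line = line.strip()
            if not line:
                continue  # skip blank lines
            json.loads(line)  # raises json.JSONDecodeError on a malformed record
            count += 1
    return count


for name in ("train.jsonl", "validation.jsonl", "test.jsonl"):
    path = Path(name)
    if path.exists():
        print(f"{name}: {count_jsonl_records(path)} records")
    else:
        print(f"{name}: not found in the current directory")
```

Because every line is parsed, the same pass also catches a truncated or malformed record before training starts.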
| 61 |
## 💾 Output
|
| 62 |
|
| 63 |
+
Trained model saved to `./myanmar-qwen-output/`
|
| 64 |
|
| 65 |
## 📤 Upload to HuggingFace
|
| 66 |
|
| 67 |
```bash
|
| 68 |
+
cd myanmar-qwen-output
|
| 69 |
+
huggingface-cli upload amkyawdev/my-myanmar-qwen . --repo-type model
|
| 70 |
```
|
| 71 |
|
| 72 |
## 🖥️ Google Colab
|
| 73 |
|
| 74 |
```python
|
| 75 |
# Install
|
| 76 |
+
!pip install transformers datasets torch accelerate
|
| 77 |
|
| 78 |
# Login
|
| 79 |
from huggingface_hub import login
|
|
|
|
| 83 |
%run train.py
|
| 84 |
```
|
| 85 |
|
| 86 |
---
|
| 87 |
Built by amkyawdev
|