cli08 committed
Commit 602382e · verified · 1 Parent(s): 8889144

Update README.md

Files changed (1): README.md (+22 −37)

README.md CHANGED
```diff
@@ -7,58 +7,35 @@ tags:
 - lora
 - transformers
 metrics:
-- accuracy
 - f1
 model-index:
 - name: qwen3-0.6-finetuned
   results: []
+datasets:
+- sh0416/ag_news
 ---
 
-<!-- This model card has been generated automatically according to the information the Trainer had access to. You
-should probably proofread and complete it, then remove this comment. -->
-
 # qwen3-0.6-finetuned
 
-This model is a fine-tuned version of [Qwen/Qwen3-0.6B](https://huggingface.co/Qwen/Qwen3-0.6B) on an unknown dataset.
-It achieves the following results on the evaluation set:
-- Loss: 0.6120
-- Accuracy: 0.899
-- F1: 0.8984
-
-## Model description
-
-More information needed
-
-## Intended uses & limitations
+This model is a fine-tuned version of [Qwen/Qwen3-0.6B](https://huggingface.co/Qwen/Qwen3-0.6B) on the [sh0416/ag_news](https://huggingface.co/datasets/sh0416/ag_news) dataset.
+It achieved an F1 of 0.911 on the evaluation set.
 
-More information needed
+If you would like to test the fine-tuned adapter yourself, you can load it using `AutoModelForSequenceClassification.from_pretrained()` and pass `cli08/qwen3-0.6-finetuned` as the model.
 
-## Training and evaluation data
-
-More information needed
-
-## Training procedure
+### Fine-tuning Results
+|Initial F1|Fine-tuned F1|
+|----------|-------------|
+|0.133|0.911|
 
 ### Training hyperparameters
 
 The following hyperparameters were used during training:
 - learning_rate: 0.001
-- train_batch_size: 16
-- eval_batch_size: 16
-- seed: 42
+- num_train_epochs: 2
+- lr_scheduler_type: 'linear'
 - gradient_accumulation_steps: 4
-- total_train_batch_size: 64
-- optimizer: Use OptimizerNames.ADAMW_TORCH_FUSED with betas=(0.9,0.999) and epsilon=1e-08 and optimizer_args=No additional optimizer arguments
-- lr_scheduler_type: linear
-- num_epochs: 2
-
-### Training results
-
-| Training Loss | Epoch | Step | Validation Loss | Accuracy | F1 |
-|:-------------:|:-----:|:----:|:---------------:|:--------:|:------:|
-| No log | 1.0 | 79 | 0.6382 | 0.888 | 0.8874 |
-| 3.1399 | 2.0 | 158 | 0.6120 | 0.899 | 0.8984 |
+- weight_decay: 0.01
+- per_device_train_batch_size: 8
 
 ### Framework versions
 
@@ -66,4 +43,12 @@ The following hyperparameters were used during training:
 - Transformers 4.57.1
 - Pytorch 2.8.0+cu126
 - Datasets 4.4.2
-- Tokenizers 0.22.1
+- Tokenizers 0.22.1
+
+### Environment
+
+Kaggle notebook with two Nvidia T4 GPU's
+
+### Source Code
+
+[Training code is hosted on GitHub](https://github.com/calvinli2024/CS614-genai/tree/main)
```
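The updated card says the adapter can be loaded by passing `cli08/qwen3-0.6-finetuned` to `AutoModelForSequenceClassification.from_pretrained()`. A minimal sketch of what that might look like, assuming `transformers` and `peft` are installed (the PEFT integration resolves the LoRA adapter repo against its Qwen/Qwen3-0.6B base) and that network access is available for the download:

```python
def load_finetuned(repo_id: str = "cli08/qwen3-0.6-finetuned"):
    """Load the LoRA-adapted classifier and a matching tokenizer.

    Sketch only: assumes `transformers` with `peft` installed, so that
    passing the adapter repo id resolves against the Qwen/Qwen3-0.6B base.
    Weights are downloaded from the Hugging Face Hub on first call.
    """
    from transformers import AutoModelForSequenceClassification, AutoTokenizer

    model = AutoModelForSequenceClassification.from_pretrained(repo_id)
    # The adapter repo may not ship a tokenizer; fall back to the base model's.
    tokenizer = AutoTokenizer.from_pretrained("Qwen/Qwen3-0.6B")
    return model, tokenizer
```

Usage would then be along the lines of `model, tok = load_finetuned()` followed by `model(**tok("some headline", return_tensors="pt"))`.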
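The card reports F1 throughout (0.133 before fine-tuning, 0.911 after). It does not say which averaging was used, but for a multi-class task such as AG News a macro-averaged F1 is a common choice; a small illustrative sketch, with toy labels that are hypothetical and not from the actual eval set:

```python
def macro_f1(y_true, y_pred):
    """Macro-averaged F1: unweighted mean of per-class F1 scores."""
    classes = sorted(set(y_true) | set(y_pred))
    scores = []
    for c in classes:
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == c and p == c)
        fp = sum(1 for t, p in zip(y_true, y_pred) if t != c and p == c)
        fn = sum(1 for t, p in zip(y_true, y_pred) if t == c and p != c)
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
        scores.append(f1)
    return sum(scores) / len(scores)

# Toy 4-class example (hypothetical data): two class-2/class-3 confusions.
y_true = [0, 1, 2, 3, 0, 1, 2, 3]
y_pred = [0, 1, 2, 3, 0, 1, 3, 2]
print(macro_f1(y_true, y_pred))  # -> 0.75
```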
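The edit drops the old `total_train_batch_size: 64` line, but the value is still implied by the new hyperparameters: with `per_device_train_batch_size: 8`, two GPUs (assuming both T4s mentioned in the Environment section were used for training), and `gradient_accumulation_steps: 4`, the effective batch size works out to the same 64. The arithmetic:

```python
# Effective train batch size implied by the updated hyperparameters:
# per-device batch size x number of GPUs x gradient accumulation steps.
per_device_train_batch_size = 8   # from the README
num_gpus = 2                      # assumption: both T4s from the Environment section
gradient_accumulation_steps = 4   # from the README

effective_batch_size = (
    per_device_train_batch_size * num_gpus * gradient_accumulation_steps
)
print(effective_batch_size)  # -> 64, matching the removed total_train_batch_size
```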