cli08/qwen3-0.6-finetuned


cli08 committed on Feb 22

Commit 602382e · verified · 1 Parent(s): 8889144

Update README.md

Files changed (1)

README.md +22 -37


@@ -7,58 +7,35 @@ tags:
 - lora
 - transformers
 metrics:
-- accuracy
 - f1
 model-index:
 - name: qwen3-0.6-finetuned
   results: []
 ---
-<!-- This model card has been generated automatically according to the information the Trainer had access to. You
-should probably proofread and complete it, then remove this comment. -->
 # qwen3-0.6-finetuned
-This model is a fine-tuned version of [Qwen/Qwen3-0.6B](https://huggingface.co/Qwen/Qwen3-0.6B) on an unknown dataset.
-It achieves the following results on the evaluation set:
-- Loss: 0.6120
-- Accuracy: 0.899
-- F1: 0.8984
-## Model description
-More information needed
-## Intended uses & limitations
-More information needed
-## Training and evaluation data
-More information needed
-## Training procedure
 ### Training hyperparameters
 The following hyperparameters were used during training:
 - learning_rate: 0.001
-- train_batch_size: 16
-- eval_batch_size: 16
-- seed: 42
 - gradient_accumulation_steps: 4
-- total_train_batch_size: 64
-- optimizer: Use OptimizerNames.ADAMW_TORCH_FUSED with betas=(0.9,0.999) and epsilon=1e-08 and optimizer_args=No additional optimizer arguments
-- lr_scheduler_type: linear
-- num_epochs: 2
-### Training results
-| Training Loss | Epoch | Step | Validation Loss | Accuracy | F1     |
-|:-------------:|:-----:|:----:|:---------------:|:--------:|:------:|
-| No log        | 1.0   | 79   | 0.6382          | 0.888    | 0.8874 |
-| 3.1399        | 2.0   | 158  | 0.6120          | 0.899    | 0.8984 |
 ### Framework versions
@@ -66,4 +43,12 @@ The following hyperparameters were used during training:
 - Transformers 4.57.1
 - Pytorch 2.8.0+cu126
 - Datasets 4.4.2
-- Tokenizers 0.22.1

 - lora
 - transformers
 metrics:
 - f1
 model-index:
 - name: qwen3-0.6-finetuned
   results: []
+datasets:
+- sh0416/ag_news
 ---
 # qwen3-0.6-finetuned
+This model is a fine-tuned version of [Qwen/Qwen3-0.6B](https://huggingface.co/Qwen/Qwen3-0.6B) on the [sh0416/ag_news](https://huggingface.co/datasets/sh0416/ag_news) dataset.
+It achieved an F1 of 0.911 on the evaluation set.
+If you would like to test the fine-tuned adapter yourself, you can load it using `AutoModelForSequenceClassification.from_pretrained()` and pass `cli08/qwen3-0.6-finetuned` as the model.
+### Fine-tuning Results
+|Initial F1|Fine-tuned F1|
+|----------|-------------|
+|0.133|0.911|
 ### Training hyperparameters
 The following hyperparameters were used during training:
 - learning_rate: 0.001
+- num_train_epochs: 2
+- lr_scheduler_type: 'linear'
 - gradient_accumulation_steps: 4
+- weight_decay: 0.01
+- per_device_train_batch_size: 8
 ### Framework versions
 - Transformers 4.57.1
 - Pytorch 2.8.0+cu126
 - Datasets 4.4.2
+- Tokenizers 0.22.1
+### Environment
+Kaggle notebook with two NVIDIA T4 GPUs
+### Source Code
+[Training code is hosted on GitHub](https://github.com/calvinli2024/CS614-genai/tree/main)
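
The updated card says the adapter can be loaded with `AutoModelForSequenceClassification.from_pretrained()` and lists the batch-size settings separately from the two-GPU environment. The following is a minimal sketch of both points, not the author's code. It rests on several assumptions: `num_labels=4` is chosen because AG News has four topic classes, the tokenizer is taken from the base `Qwen/Qwen3-0.6B` repository, the `peft` package is installed so that `transformers` can resolve a LoRA adapter, and the helper names `effective_batch_size` and `load_classifier` are hypothetical.

```python
"""Sketch: load the fine-tuned classifier and sanity-check the batch size."""

MODEL_ID = "cli08/qwen3-0.6-finetuned"  # repository named in the model card
BASE_MODEL_ID = "Qwen/Qwen3-0.6B"       # base model named in the model card
NUM_LABELS = 4                          # assumption: AG News has four classes

# Values taken from the card's hyperparameter and environment sections.
PER_DEVICE_TRAIN_BATCH_SIZE = 8
NUM_GPUS = 2
GRADIENT_ACCUMULATION_STEPS = 4


def effective_batch_size(per_device, num_devices, accumulation_steps):
    """Return how many examples contribute to one optimizer step."""
    return per_device * num_devices * accumulation_steps


def load_classifier(model_id=MODEL_ID, num_labels=NUM_LABELS):
    """Load a tokenizer and the sequence-classification model from the Hub.

    transformers is imported lazily so the arithmetic helper above can be
    used without it. Assumption: `peft` is installed, which transformers
    needs in order to load a repository that stores only a LoRA adapter.
    """
    from transformers import AutoModelForSequenceClassification, AutoTokenizer

    # Assumption: the base model's tokenizer is the one used for fine-tuning.
    tokenizer = AutoTokenizer.from_pretrained(BASE_MODEL_ID)
    model = AutoModelForSequenceClassification.from_pretrained(
        model_id, num_labels=num_labels
    )
    model.eval()  # inference mode: disables dropout
    return tokenizer, model


if __name__ == "__main__":
    total = effective_batch_size(
        PER_DEVICE_TRAIN_BATCH_SIZE, NUM_GPUS, GRADIENT_ACCUMULATION_STEPS
    )
    print(f"effective train batch size: {total}")
```

With the values above, the helper multiplies 8 × 2 × 4, which agrees with the `total_train_batch_size: 64` line that this commit removed from the auto-generated card. Calling `load_classifier()` downloads weights from the Hub, so it needs network access and is left out of the `__main__` section.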