qwen3-0.6-finetuned

This model is a fine-tuned version of Qwen/Qwen3-0.6B on the sh0416/ag_news dataset. It achieves an F1 score of 0.911 on the evaluation set.

If you would like to test the fine-tuned adapter yourself, load it with AutoModelForSequenceClassification.from_pretrained() and pass cli08/qwen3-0.6-finetuned as the model name. The peft package must be installed so that the base model listed in the adapter config is resolved automatically.
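A minimal loading-and-inference sketch along those lines (the label order and the example sentence are illustrative assumptions, not taken from the model card):

```python
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

model_id = "cli08/qwen3-0.6-finetuned"  # adapter repo on the Hub

# AG News class names; this index order is an assumption for illustration.
labels = ["World", "Sports", "Business", "Sci/Tech"]

# With peft installed, from_pretrained on the adapter repo also pulls in
# the base model (Qwen/Qwen3-0.6B) named in the adapter config.
model = AutoModelForSequenceClassification.from_pretrained(model_id)
tokenizer = AutoTokenizer.from_pretrained(model_id)

inputs = tokenizer("Stocks rallied after the earnings report.", return_tensors="pt")
with torch.no_grad():
    pred = model(**inputs).logits.argmax(dim=-1).item()
print(labels[pred])
```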

Fine-tuning Results

Initial F1 | Fine-tuned F1
0.133      | 0.911
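For reference, an F1 score like the one above can be computed with scikit-learn; the averaging mode behind the reported number is not stated in the card, so macro averaging over the four classes is assumed here, with toy labels for illustration:

```python
from sklearn.metrics import f1_score

# Toy gold labels and predictions over the four AG News classes.
y_true = [0, 1, 2, 3, 0, 1]
y_pred = [0, 1, 2, 3, 0, 2]

# Macro averaging (unweighted mean of per-class F1) is an assumption.
score = f1_score(y_true, y_pred, average="macro")
print(round(score, 3))  # → 0.833
```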

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 0.001
  • num_train_epochs: 2
  • lr_scheduler_type: linear
  • gradient_accumulation_steps: 4
  • weight_decay: 0.01
  • per_device_train_batch_size: 8

Framework versions

  • PEFT 0.17.1
  • Transformers 4.57.1
  • PyTorch 2.8.0+cu126
  • Datasets 4.4.2
  • Tokenizers 0.22.1

Environment

Kaggle notebook with two NVIDIA T4 GPUs

Source Code

The training code is hosted on GitHub.
