qwen3-0.6-finetuned

This model is a fine-tuned version of Qwen/Qwen3-0.6B on the sh0416/ag_news dataset. It achieves an F1 score of 0.911 on the evaluation set.

If you would like to test the fine-tuned adapter yourself, load it with AutoModelForSequenceClassification.from_pretrained() and pass cli08/qwen3-0.6-finetuned as the model name. The peft package must be installed so that the base model listed in the adapter config is resolved automatically.
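A minimal loading-and-inference sketch along those lines (the label order and the example sentence are illustrative assumptions, not taken from the model card):

```python
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

model_id = "cli08/qwen3-0.6-finetuned"  # adapter repo on the Hub

# AG News class names; this index order is an assumption for illustration.
labels = ["World", "Sports", "Business", "Sci/Tech"]

# With peft installed, from_pretrained on the adapter repo also pulls in
# the base model (Qwen/Qwen3-0.6B) named in the adapter config.
model = AutoModelForSequenceClassification.from_pretrained(model_id)
tokenizer = AutoTokenizer.from_pretrained(model_id)

inputs = tokenizer("Stocks rallied after the earnings report.", return_tensors="pt")
with torch.no_grad():
    pred = model(**inputs).logits.argmax(dim=-1).item()
print(labels[pred])
```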

Fine-tuning Results

Initial F1 | Fine-tuned F1
0.133      | 0.911
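For reference, an F1 score like the one above can be computed with scikit-learn; the averaging mode behind the reported number is not stated in the card, so macro averaging over the four classes is assumed here, with toy labels for illustration:

```python
from sklearn.metrics import f1_score

# Toy gold labels and predictions over the four AG News classes.
y_true = [0, 1, 2, 3, 0, 1]
y_pred = [0, 1, 2, 3, 0, 2]

# Macro averaging (unweighted mean of per-class F1) is an assumption.
score = f1_score(y_true, y_pred, average="macro")
print(round(score, 3))  # → 0.833
```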

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 0.001
  • num_train_epochs: 2
  • lr_scheduler_type: linear
  • gradient_accumulation_steps: 4
  • weight_decay: 0.01
  • per_device_train_batch_size: 8

Framework versions

  • PEFT 0.17.1
  • Transformers 4.57.1
  • PyTorch 2.8.0+cu126
  • Datasets 4.4.2
  • Tokenizers 0.22.1

Environment

Kaggle notebook with two NVIDIA T4 GPUs

Source Code

The training code is hosted on GitHub.
