---
license: cc-by-nc-4.0
base_model: mistralai/Mistral-7B-Instruct-v0.2
tags:
- generated_from_trainer
- classification
- Transformer-heads
- finetune
- chatml
- gpt4
- synthetic data
- distillation
model-index:
- name: Mistral_classification_head_qlora
results: []
datasets:
- dair-ai/emotion
language:
- en
library_name: transformers
pipeline_tag: text-generation
---
# Mistral_classification_head_qlora

**Mistral_classification_head_qlora** has a new transformer head attached for sequence classification; the resulting model was fine-tuned on the [dair-ai/emotion](https://huggingface.co/datasets/dair-ai/emotion) dataset using QLoRA. The model was trained for 1 epoch on 1x A40 GPU. The evaluation loss for the attached **emotion-head-3** was **1.313**. The base model used was
* **[mistralai/Mistral-7B-Instruct-v0.2](https://huggingface.co/mistralai/Mistral-7B-Instruct-v0.2)**

This experiment was performed using the **[Transformer-heads library](https://github.com/center-for-humans-and-machines/transformer-heads/tree/main)**. A minimal sketch of this setup is shown below.
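The sketch below illustrates the general idea: load the base model in 4-bit, attach a classification head, and wrap it with LoRA adapters. It uses the standard `transformers`/`peft` sequence-classification route rather than the Transformer-heads library's own API, and the label count and LoRA settings shown here are illustrative assumptions, not the exact training configuration; refer to the linked training script for the actual setup.

```python
# Illustrative sketch only. Assumptions: 6 emotion labels, generic LoRA settings.
# The actual experiment used the Transformer-heads library; this shows the
# equivalent standard transformers + peft route for QLoRA classification.
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer, BitsAndBytesConfig
from peft import LoraConfig, get_peft_model, prepare_model_for_kbit_training

base = "mistralai/Mistral-7B-Instruct-v0.2"

bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)

tokenizer = AutoTokenizer.from_pretrained(base)
tokenizer.pad_token = tokenizer.eos_token          # Mistral has no pad token by default

model = AutoModelForSequenceClassification.from_pretrained(
    base,
    num_labels=6,                                  # dair-ai/emotion has 6 classes
    quantization_config=bnb_config,
)
model.config.pad_token_id = tokenizer.pad_token_id

# Prepare the quantized model and attach LoRA adapters for the classification task.
model = prepare_model_for_kbit_training(model)
lora_config = LoraConfig(
    task_type="SEQ_CLS",
    r=16,
    lora_alpha=32,
    lora_dropout=0.05,
    target_modules=["q_proj", "v_proj"],
)
model = get_peft_model(model, lora_config)
model.print_trainable_parameters()
```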
### Training Script
The training script for attaching a new transformer head for the classification task and fine-tuning it with QLoRA is available here:
[Training Script Colab](https://colab.research.google.com/drive/1rPaG-Q6d_CutPOlKzjsfmPvwebNg_X6i?usp=sharing)
### Evaluating the Emotion-Head-3
To evaluate the transformer head attached to the base model, refer to the following Colab notebook:
[Colab Notebook for Evaluation](https://colab.research.google.com/drive/15UpNnoKJIWjG3G_WJFOQebjpUWyNoPKT?usp=sharing)
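For orientation, here is a minimal evaluation sketch that computes the average loss on the dair-ai/emotion validation split. It assumes the `model` and `tokenizer` from the training sketch earlier in this card; the batch size and sequence length are arbitrary choices, and the linked notebook remains the authoritative evaluation procedure.

```python
# Minimal evaluation sketch (assumes `model` and `tokenizer` from the sketch above).
import torch
from datasets import load_dataset
from torch.utils.data import DataLoader

dataset = load_dataset("dair-ai/emotion", split="validation")

def collate(batch):
    enc = tokenizer(
        [ex["text"] for ex in batch],
        padding=True, truncation=True, max_length=512, return_tensors="pt",
    )
    enc["labels"] = torch.tensor([ex["label"] for ex in batch])
    return enc

loader = DataLoader(dataset, batch_size=4, collate_fn=collate)

model.eval()
total_loss, n_batches = 0.0, 0
with torch.no_grad():
    for batch in loader:
        batch = {k: v.to(model.device) for k, v in batch.items()}
        out = model(**batch)        # sequence-classification forward returns a loss
        total_loss += out.loss.item()
        n_batches += 1

print(f"eval loss: {total_loss / n_batches:.3f}")
```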
### Training hyperparameters
The following hyperparameters were used during training (a `TrainingArguments` sketch follows the list):
* output_dir = "emotion_linear_probe"
* learning_rate = 0.00002
* num_train_epochs = 1
* eval_epochs = 1
* logging_steps = 1
* do_eval = False
* remove_unused_columns = False
* optim = "paged_adamw_32bit"
* gradient_checkpointing = True
* lr_scheduler_type = "constant"
* ddp_find_unused_parameters = False
* per_device_train_batch_size = 4
* per_device_eval_batch_size = 4
* report_to = ["wandb"]
### Framework versions
- Transformers 4.39.0.dev0
- Pytorch 2.1.2+cu118
- Datasets 2.17.0
- Tokenizers 0.15.0
- Transformer-heads