llama_vlsi_finetuned

This model is a fine-tuned version of unsloth/llama-3.2-3b-unsloth-bnb-4bit on an extracted VLSI dataset. It achieves the following results on the evaluation set:

  • Loss: 4.9901

Model description

As this is a prototype version of the main project, we limited data extraction to a fixed number of research papers, PDFs, and YouTube videos. The data pipeline then integrates all of the extracted data into a single structured JSON dataset. Using Hugging Face, we imported Llama 3.2 3B and fine-tuned the model with PEFT (LoRA adapters). The main goal of this prototype is to bring the fine-tuned model's evaluation loss down to a minimal value (4.9901 here); Unsloth speeds up the fine-tuning process.
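A minimal sketch of this workflow is shown below, assuming the standard Unsloth + TRL recipe. The LoRA rank and alpha, target modules, sequence length, dataset file name (vlsi_dataset.json), and text field are illustrative assumptions, not values recorded in this card.

```python
# Sketch of the Unsloth + PEFT (LoRA) fine-tuning recipe described above.
# Hypothetical values: r, lora_alpha, target_modules, max_seq_length, and
# the dataset file/field names are assumptions, not settings from this card.
from unsloth import FastLanguageModel
from datasets import load_dataset
from trl import SFTConfig, SFTTrainer

# Load the 4-bit quantized Llama 3.2 3B base model via Unsloth.
model, tokenizer = FastLanguageModel.from_pretrained(
    model_name="unsloth/llama-3.2-3b-unsloth-bnb-4bit",
    max_seq_length=2048,   # assumed
    load_in_4bit=True,
)

# Attach LoRA adapters (PEFT) so only a small set of weights is trained.
model = FastLanguageModel.get_peft_model(
    model,
    r=16,                  # assumed LoRA rank
    lora_alpha=16,         # assumed
    lora_dropout=0.0,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],  # assumed
)

# The single structured JSON dataset produced by the extraction pipeline.
dataset = load_dataset("json", data_files="vlsi_dataset.json", split="train")

trainer = SFTTrainer(
    model=model,
    processing_class=tokenizer,  # "tokenizer=" in older trl versions
    train_dataset=dataset,
    args=SFTConfig(
        output_dir="llama_vlsi_finetuned",
        dataset_text_field="text",  # assumed field name in the JSON records
        # Training hyperparameters are listed later in this card.
    ),
)
trainer.train()
```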

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training (see the TrainingArguments sketch after this list):

  • learning_rate: 5e-06
  • train_batch_size: 8
  • eval_batch_size: 8
  • seed: 42
  • gradient_accumulation_steps: 4
  • total_train_batch_size: 32
  • optimizer: adamw_torch (betas=(0.9, 0.999), epsilon=1e-08); no additional optimizer arguments
  • lr_scheduler_type: cosine
  • lr_scheduler_warmup_ratio: 0.1
  • num_epochs: 10
  • mixed_precision_training: Native AMP
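Taken together, these settings correspond roughly to the following transformers TrainingArguments. This is a reconstruction from the list above; output_dir and the fp16 flag (for native AMP) are assumptions.

```python
# Reconstruction of the training configuration from the list above.
# output_dir and fp16 (native AMP) are assumptions; all other values
# are copied from the reported hyperparameters.
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="llama_vlsi_finetuned",  # assumed
    learning_rate=5e-6,
    per_device_train_batch_size=8,
    per_device_eval_batch_size=8,
    seed=42,
    gradient_accumulation_steps=4,      # 8 * 4 = 32 effective batch size
    optim="adamw_torch",
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="cosine",
    warmup_ratio=0.1,
    num_train_epochs=10,
    fp16=True,                          # native AMP (assumed precision mode)
)
```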

Training results

| Training Loss | Epoch | Step | Validation Loss |
|:-------------:|:-----:|:----:|:---------------:|
| No log        | 1.0   | 7    | 5.1811          |
| No log        | 2.0   | 14   | 5.1392          |
| No log        | 3.0   | 21   | 5.0941          |
| No log        | 4.0   | 28   | 5.0554          |
| No log        | 5.0   | 35   | 5.0254          |
| No log        | 6.0   | 42   | 5.0042          |
| No log        | 7.0   | 49   | 4.9935          |
| 5.0049        | 8.0   | 56   | 4.9905          |
| 5.0049        | 8.64  | 60   | 4.9901          |

Framework versions

  • PEFT 0.14.0
  • Transformers 4.50.3
  • Pytorch 2.6.0+cu124
  • Datasets 3.5.0
  • Tokenizers 0.21.1
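With the versions above, a minimal way to load the adapter for inference might look like the following sketch. The adapter path llama_vlsi_finetuned is assumed from the model name and is not a confirmed Hub repo id; the prompt is illustrative.

```python
# Sketch: load the base model plus LoRA adapter for inference via PEFT.
# "llama_vlsi_finetuned" is an assumed adapter path (local dir or Hub repo).
from peft import AutoPeftModelForCausalLM
from transformers import AutoTokenizer

model = AutoPeftModelForCausalLM.from_pretrained("llama_vlsi_finetuned")
tokenizer = AutoTokenizer.from_pretrained("llama_vlsi_finetuned")

prompt = "Explain the purpose of clock tree synthesis in VLSI design."
inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=128)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```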