# llama_vlsi_finetuned
This model is a fine-tuned version of unsloth/llama-3.2-3b-unsloth-bnb-4bit on an extracted VLSI dataset. It achieves the following results on the evaluation set:
- Loss: 4.9901
## Model description
As this is a prototype version of the main project, we have limited data extraction to a fixed number of research papers, PDFs, and YouTube videos. The data pipeline then integrates all of the extracted data into a single structured dataset in JSON format. Using Hugging Face, we imported Llama 3.2 3B and fine-tuned the model using PEFT (LoRA adapters). The main goal of this prototype is to drive the fine-tuned model's evaluation loss down to a minimal value (4.9901 here). Unsloth speeds up the fine-tuning process. A minimal sketch of this setup follows.
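The sketch below shows how such a setup is typically wired together with Unsloth and PEFT. The `max_seq_length`, LoRA rank/alpha/target modules, and the dataset filename `vlsi_dataset.json` are illustrative assumptions, not values taken from this card.

```python
from unsloth import FastLanguageModel
from datasets import load_dataset

# Load the 4-bit quantized base model through Unsloth.
model, tokenizer = FastLanguageModel.from_pretrained(
    model_name="unsloth/llama-3.2-3b-unsloth-bnb-4bit",
    max_seq_length=2048,   # assumed; the card does not state the sequence length
    load_in_4bit=True,
)

# Attach LoRA adapters via PEFT; rank/alpha/target modules are illustrative.
model = FastLanguageModel.get_peft_model(
    model,
    r=16,
    lora_alpha=16,
    lora_dropout=0.0,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj",
                    "gate_proj", "up_proj", "down_proj"],
)

# The pipeline's output is a single structured JSON dataset.
dataset = load_dataset("json", data_files="vlsi_dataset.json", split="train")
```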
## Intended uses & limitations
More information needed
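While usage details are still to be documented, a generic sketch for loading the resulting LoRA adapter on top of the base model with PEFT might look like the following; the adapter path is hypothetical.

```python
from peft import PeftModel
from transformers import AutoModelForCausalLM, AutoTokenizer

# Load the 4-bit base model, then apply the fine-tuned LoRA adapter.
# "path/to/llama_vlsi_finetuned" is a hypothetical local or Hub adapter path.
base_model = AutoModelForCausalLM.from_pretrained(
    "unsloth/llama-3.2-3b-unsloth-bnb-4bit", device_map="auto"
)
tokenizer = AutoTokenizer.from_pretrained("unsloth/llama-3.2-3b-unsloth-bnb-4bit")
model = PeftModel.from_pretrained(base_model, "path/to/llama_vlsi_finetuned")

inputs = tokenizer("What is clock skew in VLSI design?", return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=128)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```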
## Training and evaluation data
More information needed
## Training procedure
### Training hyperparameters
The following hyperparameters were used during training; a matching `TrainingArguments` sketch follows the list:
- learning_rate: 5e-06
- train_batch_size: 8
- eval_batch_size: 8
- seed: 42
- gradient_accumulation_steps: 4
- total_train_batch_size: 32
- optimizer: AdamW (torch) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
- lr_scheduler_type: cosine
- lr_scheduler_warmup_ratio: 0.1
- num_epochs: 10
- mixed_precision_training: Native AMP
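Expressed as a `transformers.TrainingArguments` configuration, the list above corresponds to the sketch below. The `output_dir`, `fp16=True` (a common reading of "Native AMP"), and `eval_strategy` values are assumptions, not stated on this card.

```python
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="llama_vlsi_finetuned",   # assumed output directory
    learning_rate=5e-6,
    per_device_train_batch_size=8,
    per_device_eval_batch_size=8,
    gradient_accumulation_steps=4,       # effective train batch size: 8 * 4 = 32
    num_train_epochs=10,
    optim="adamw_torch",
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="cosine",
    warmup_ratio=0.1,
    seed=42,
    fp16=True,                           # Native AMP mixed precision (assumed fp16)
    eval_strategy="epoch",               # matches the per-epoch losses in the results table
)
```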
### Training results
| Training Loss | Epoch | Step | Validation Loss |
|---|---|---|---|
| No log | 1.0 | 7 | 5.1811 |
| No log | 2.0 | 14 | 5.1392 |
| No log | 3.0 | 21 | 5.0941 |
| No log | 4.0 | 28 | 5.0554 |
| No log | 5.0 | 35 | 5.0254 |
| No log | 6.0 | 42 | 5.0042 |
| No log | 7.0 | 49 | 4.9935 |
| 5.0049 | 8.0 | 56 | 4.9905 |
| 5.0049 | 8.64 | 60 | 4.9901 |
### Framework versions
- PEFT 0.14.0
- Transformers 4.50.3
- Pytorch 2.6.0+cu124
- Datasets 3.5.0
- Tokenizers 0.21.1