parthtamu
/

QLoRA-Finetuning

@@ -1,199 +1,138 @@
 ---
-library_name: transformers
-tags: []
 ---
-# Model Card for Model ID
-<!-- Provide a quick summary of what the model is/does. -->
 ## Model Details
-### Model Description
-<!-- Provide a longer summary of what this model is. -->
-This is the model card of a 🤗 transformers model that has been pushed on the Hub. This model card has been automatically generated.
-- **Developed by:** [More Information Needed]
-- **Funded by [optional]:** [More Information Needed]
-- **Shared by [optional]:** [More Information Needed]
-- **Model type:** [More Information Needed]
-- **Language(s) (NLP):** [More Information Needed]
-- **License:** [More Information Needed]
-- **Finetuned from model [optional]:** [More Information Needed]
-### Model Sources [optional]
-<!-- Provide the basic links for the model. -->
-- **Repository:** [More Information Needed]
-- **Paper [optional]:** [More Information Needed]
-- **Demo [optional]:** [More Information Needed]
-## Uses
-<!-- Address questions around how the model is intended to be used, including the foreseeable users of the model and those affected by the model. -->
-### Direct Use
-<!-- This section is for the model use without fine-tuning or plugging into a larger ecosystem/app. -->
-[More Information Needed]
-### Downstream Use [optional]
-<!-- This section is for the model use when fine-tuned for a task, or when plugged into a larger ecosystem/app -->
-[More Information Needed]
-### Out-of-Scope Use
-<!-- This section addresses misuse, malicious use, and uses that the model will not work well for. -->
-[More Information Needed]
-## Bias, Risks, and Limitations
-<!-- This section is meant to convey both technical and sociotechnical limitations. -->
-[More Information Needed]
-### Recommendations
-<!-- This section is meant to convey recommendations with respect to the bias, risk, and technical limitations. -->
-Users (both direct and downstream) should be made aware of the risks, biases and limitations of the model. More information needed for further recommendations.
-## How to Get Started with the Model
-Use the code below to get started with the model.
-[More Information Needed]
-## Training Details
-### Training Data
-<!-- This should link to a Dataset Card, perhaps with a short stub of information on what the training data is all about as well as documentation related to data pre-processing or additional filtering. -->
-[More Information Needed]
-### Training Procedure
-<!-- This relates heavily to the Technical Specifications. Content here should link to that section when it is relevant to the training procedure. -->
-#### Preprocessing [optional]
-[More Information Needed]
-#### Training Hyperparameters
-- **Training regime:** [More Information Needed] <!--fp32, fp16 mixed precision, bf16 mixed precision, bf16 non-mixed precision, fp16 non-mixed precision, fp8 mixed precision -->
-#### Speeds, Sizes, Times [optional]
-<!-- This section provides information about throughput, start/end time, checkpoint size if relevant, etc. -->
-[More Information Needed]
-## Evaluation
-<!-- This section describes the evaluation protocols and provides the results. -->
-### Testing Data, Factors & Metrics
-#### Testing Data
-<!-- This should link to a Dataset Card if possible. -->
-[More Information Needed]
-#### Factors
-<!-- These are the things the evaluation is disaggregating by, e.g., subpopulations or domains. -->
-[More Information Needed]
-#### Metrics
-<!-- These are the evaluation metrics being used, ideally with a description of why. -->
-[More Information Needed]
-### Results
-[More Information Needed]
-#### Summary
-## Model Examination [optional]
-<!-- Relevant interpretability work for the model goes here -->
-[More Information Needed]
-## Environmental Impact
-<!-- Total emissions (in grams of CO2eq) and additional considerations, such as electricity usage, go here. Edit the suggested text below accordingly -->
-Carbon emissions can be estimated using the [Machine Learning Impact calculator](https://mlco2.github.io/impact#compute) presented in [Lacoste et al. (2019)](https://arxiv.org/abs/1910.09700).
-- **Hardware Type:** [More Information Needed]
-- **Hours used:** [More Information Needed]
-- **Cloud Provider:** [More Information Needed]
-- **Compute Region:** [More Information Needed]
-- **Carbon Emitted:** [More Information Needed]
-## Technical Specifications [optional]
-### Model Architecture and Objective
-[More Information Needed]
-### Compute Infrastructure
-[More Information Needed]
-#### Hardware
-[More Information Needed]
-#### Software
-[More Information Needed]
-## Citation [optional]
-<!-- If there is a paper or blog post introducing the model, the APA and Bibtex information for that should go in this section. -->
-**BibTeX:**
-[More Information Needed]
-**APA:**
-[More Information Needed]
-## Glossary [optional]
-<!-- If relevant, include terms and calculations in this section that can help readers understand the model or model card. -->
-[More Information Needed]
-## More Information [optional]
-[More Information Needed]
-## Model Card Authors [optional]
-[More Information Needed]
-## Model Card Contact
-[More Information Needed]

 ---
+language: en
+license: mit
+base_model: meta-llama/Llama-3.2-3B
+datasets:
+  - sahil2801/CodeAlpaca-20k
+tags:
+  - code-generation
+  - lora
+  - qlora
+  - peft
+  - fine-tuned
+  - llama
+  - instruction-tuning
+library_name: peft
+pipeline_tag: text-generation
 ---
+# Llama-3.2-3B · CodeAlpaca LoRA Adapter
+A LoRA adapter fine-tuned on [CodeAlpaca-20k](https://huggingface.co/datasets/sahil2801/CodeAlpaca-20k)
+for instruction-following code generation tasks. Built on top of
+[meta-llama/Llama-3.2-3B](https://huggingface.co/meta-llama/Llama-3.2-3B) with
+4-bit NF4 quantization via `bitsandbytes`. Only a small fraction of the
+parameters (**well under 1%**) is trainable; the rest of the base model is frozen.
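As a rough check on the trainable-parameter share, the adapter size can be estimated from the LoRA settings listed in this card (rank 8 on `q_proj` and `v_proj`). The sketch below assumes the publicly documented Llama-3.2-3B dimensions (hidden size 3072, 28 layers, 8 key/value heads of dimension 128, about 3.21B parameters in total). These figures are not stated in this card and should be checked against the base model's `config.json`.

```python
# Back-of-the-envelope LoRA parameter count.
# ASSUMPTION: the architecture numbers below come from the public
# Llama-3.2-3B config, not from this model card; verify before relying on them.
HIDDEN_SIZE = 3072
NUM_LAYERS = 28
NUM_KV_HEADS = 8
HEAD_DIM = 128
TOTAL_PARAMS = 3.21e9  # approximate size of the base model

LORA_RANK = 8  # from the Training Configuration table


def lora_params(in_features: int, out_features: int, rank: int) -> int:
    """A LoRA pair adds A (rank x in_features) and B (out_features x rank)."""
    return rank * (in_features + out_features)


def adapter_params() -> int:
    """Total LoRA parameters when only q_proj and v_proj are adapted."""
    q_proj = lora_params(HIDDEN_SIZE, HIDDEN_SIZE, LORA_RANK)
    v_proj = lora_params(HIDDEN_SIZE, NUM_KV_HEADS * HEAD_DIM, LORA_RANK)
    return NUM_LAYERS * (q_proj + v_proj)


if __name__ == "__main__":
    n = adapter_params()
    print(f"trainable LoRA parameters: {n:,}")
    print(f"share of base model: {n / TOTAL_PARAMS:.4%}")
```

Under these assumptions the adapter holds roughly 2.3M trainable parameters. Once the model is wrapped with PEFT, `model.print_trainable_parameters()` reports the exact figure.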
+---
 ## Model Details
+| Field            | Value                                      |
+|------------------|--------------------------------------------|
+| **Base Model**   | meta-llama/Llama-3.2-3B                    |
+| **Adapter Type** | LoRA (via PEFT)                            |
+| **Task**         | Instruction-following code generation      |
+| **Language**     | English                                    |
+| **License**      | MIT                                        |
+| **Author**       | Parth Deshmukh                             |
+| **Date**         | April 2026                                 |
+---
+## Training Configuration
+| Config               | Value                                           |
+|----------------------|-------------------------------------------------|
+| **LoRA Rank (r)**    | 8                                               |
+| **LoRA Alpha**       | 16                                              |
+| **LoRA Dropout**     | 0.05                                            |
+| **Target Modules**   | `q_proj`, `v_proj`                              |
+| **Quantization**     | 4-bit NF4 (`bitsandbytes` BitsAndBytesConfig)   |
+| **Compute dtype**    | float16                                         |
+| **Batch size**       | 2, with 4 gradient accumulation steps (effective batch size 8) |
+| **Mixed Precision**  | fp16                                            |
+| **Hardware**         | Google Colab T4 GPU (16GB VRAM)                 |
+| **Experiment Tracking** | MLflow + Weights & Biases                  |
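The table above maps fairly directly onto `peft` and `transformers` config objects. The following is a minimal sketch of that mapping, not the author's training script: `task_type="CAUSAL_LM"` and the helper names `effective_batch_size` and `build_configs` are assumptions introduced for illustration.

```python
# Hyperparameters copied from the Training Configuration table.
LORA_KWARGS = {
    "r": 8,
    "lora_alpha": 16,
    "lora_dropout": 0.05,
    "target_modules": ["q_proj", "v_proj"],
    "task_type": "CAUSAL_LM",  # ASSUMPTION: not stated in the card
}
PER_DEVICE_BATCH_SIZE = 2
GRADIENT_ACCUMULATION_STEPS = 4


def effective_batch_size(per_device: int, accumulation_steps: int, num_devices: int = 1) -> int:
    """Number of samples contributing to each optimizer step."""
    return per_device * accumulation_steps * num_devices


def build_configs():
    """Create the 4-bit quantization and LoRA configs.

    Imports are local so the constants above can be inspected without
    torch, transformers, peft, or bitsandbytes installed.
    """
    import torch
    from peft import LoraConfig
    from transformers import BitsAndBytesConfig

    bnb_config = BitsAndBytesConfig(
        load_in_4bit=True,
        bnb_4bit_quant_type="nf4",
        bnb_4bit_compute_dtype=torch.float16,
    )
    lora_config = LoraConfig(**LORA_KWARGS)
    return bnb_config, lora_config


if __name__ == "__main__":
    print("effective batch size:",
          effective_batch_size(PER_DEVICE_BATCH_SIZE, GRADIENT_ACCUMULATION_STEPS))
    print("LoRA scaling (alpha / r):", LORA_KWARGS["lora_alpha"] / LORA_KWARGS["r"])
```

With a single GPU, a batch size of 2 and 4 accumulation steps give 8 samples per optimizer step, and alpha 16 at rank 8 gives a LoRA scaling factor of 2.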
+---
+## Dataset
+- **Name:** [CodeAlpaca-20k](https://huggingface.co/datasets/sahil2801/CodeAlpaca-20k)
+- **Size:** ~20,000 code instruction samples
+- **Split:** 90/10 train/test (~18,000 train, ~2,000 test)
+- **Columns:** `instruction`, `input`, `output`
+- **Prompt format:**
+```text
+Instruction:
+{instruction}
+Input:
+{input}
+Response:
+{output}
+```
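The dataset preparation described above can be sketched as follows. This is an illustration under stated assumptions rather than the author's preprocessing code: the helper names, the seed, and the use of `train_test_split` are hypothetical, and the exact whitespace and section markers of the prompt are taken from the template as it appears in this card.

```python
def format_example(record: dict) -> str:
    """Render one CodeAlpaca record (instruction, input, output) as training text."""
    return (
        "Instruction:\n"
        f"{record['instruction']}\n"
        "Input:\n"
        f"{record.get('input', '')}\n"
        "Response:\n"
        f"{record['output']}"
    )


def load_splits(seed: int = 42):
    """Load CodeAlpaca-20k and make a 90/10 train/test split.

    ASSUMPTION: the seed value and split method are illustrative;
    the card only states the 90/10 ratio. Requires the `datasets` package.
    """
    from datasets import load_dataset

    dataset = load_dataset("sahil2801/CodeAlpaca-20k", split="train")
    return dataset.train_test_split(test_size=0.1, seed=seed)


if __name__ == "__main__":
    example = {
        "instruction": "Write a Python function that reverses a string.",
        "input": "",
        "output": "def reverse(s):\n    return s[::-1]",
    }
    print(format_example(example))
```

Records with an empty `input` field keep the `Input:` label with nothing after it, so every sample has the same three-part layout.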
+---
+## Evaluation Results
+Evaluated on **200 held-out test samples** from CodeAlpaca-20k using 4-bit
+quantized inference. Metrics computed with `evaluate` (ROUGE-L) and
+`bert_score` (BERTScore-F1).
+| Model                              | ROUGE-L | BERTScore-F1 |
+|------------------------------------|---------|--------------|
+| Base (Llama-3.2-3B, no adapter)    | 0.3303  | 0.7835       |
+| **Fine-tuned (this adapter)**      | **0.5458**  | **0.8856**   |
+| **Delta**                          | **+0.2155 (+65.2%)** | **+0.1021 (+13.0%)** |
+> A ROUGE-L of 0.5458 falls at the upper end of the 0.43–0.55 range treated
+> here as competitive for fine-tuned code generation models. Together with
+> the BERTScore-F1 gain, this suggests that LoRA fine-tuning taught the model
+> consistent instruction-following and code formatting behavior.
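The delta row in the table can be reproduced from the two score rows, and the metrics themselves can be computed along the following lines. This is a sketch: the function names are hypothetical, and `lang="en"` (which selects `bert_score`'s default English model) is an assumption, since the card does not say which BERTScore model was used.

```python
def summarize_gain(base: float, tuned: float) -> tuple[float, float]:
    """Return (absolute delta, relative gain in percent) between two scores."""
    delta = tuned - base
    return round(delta, 4), round(100 * delta / base, 1)


def score_outputs(predictions: list[str], references: list[str]) -> dict:
    """Compute ROUGE-L and mean BERTScore-F1 for generated vs. reference text.

    Requires the `evaluate`, `rouge_score`, and `bert_score` packages;
    imports are local so `summarize_gain` works without them.
    """
    import evaluate
    from bert_score import score as bert_score

    rouge = evaluate.load("rouge")
    rouge_l = rouge.compute(predictions=predictions, references=references)["rougeL"]
    # ASSUMPTION: default English model; not specified in the card.
    _, _, f1 = bert_score(predictions, references, lang="en")
    return {"rougeL": float(rouge_l), "bertscore_f1": f1.mean().item()}


if __name__ == "__main__":
    print("ROUGE-L gain:", summarize_gain(0.3303, 0.5458))
    print("BERTScore-F1 gain:", summarize_gain(0.7835, 0.8856))
```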
+---
+## How to Use
+Load the base model with 4-bit quantization, then apply this adapter using
+PEFT's `PeftModel.from_pretrained()`.
+**Prompt format:**
+```text
+Instruction:
+Write a Python function that reverses a string.
+Input:
+Response:
+```
+**Inference parameters used during evaluation:**
+- `max_new_tokens`: 200
+- `do_sample`: False
+- `repetition_penalty`: 1.1
+- `pad_token_id`: `tokenizer.eos_token_id`
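The loading steps and inference parameters above can be put together roughly as follows. This is a hedged sketch, not a verified recipe: the adapter repo id is inferred from this page's header, the prompt whitespace follows the template as it appears in this card, and `build_prompt` and `generate` are hypothetical helper names. Running `generate` needs a CUDA GPU, access to the gated base model, and the `torch`, `transformers`, `peft`, and `bitsandbytes` packages.

```python
BASE_MODEL_ID = "meta-llama/Llama-3.2-3B"
# ASSUMPTION: adapter repo id inferred from the page header; adjust if it differs.
ADAPTER_ID = "parthtamu/QLoRA-Finetuning"

# Inference parameters listed in the card (pad_token_id is set at call time).
GENERATION_KWARGS = {
    "max_new_tokens": 200,
    "do_sample": False,
    "repetition_penalty": 1.1,
}


def build_prompt(instruction: str, input_text: str = "") -> str:
    """Build an inference prompt that ends where the model should start writing."""
    return f"Instruction:\n{instruction}\nInput:\n{input_text}\nResponse:\n"


def generate(instruction: str, input_text: str = "") -> str:
    """Load the 4-bit base model, attach the adapter, and generate a response."""
    import torch
    from peft import PeftModel
    from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

    bnb_config = BitsAndBytesConfig(
        load_in_4bit=True,
        bnb_4bit_quant_type="nf4",
        bnb_4bit_compute_dtype=torch.float16,
    )
    tokenizer = AutoTokenizer.from_pretrained(BASE_MODEL_ID)
    base_model = AutoModelForCausalLM.from_pretrained(
        BASE_MODEL_ID, quantization_config=bnb_config, device_map="auto"
    )
    model = PeftModel.from_pretrained(base_model, ADAPTER_ID)
    model.eval()

    inputs = tokenizer(build_prompt(instruction, input_text), return_tensors="pt").to(model.device)
    with torch.no_grad():
        output_ids = model.generate(
            **inputs, pad_token_id=tokenizer.eos_token_id, **GENERATION_KWARGS
        )
    # Drop the prompt tokens so only the newly generated text is returned.
    new_tokens = output_ids[0][inputs["input_ids"].shape[1]:]
    return tokenizer.decode(new_tokens, skip_special_tokens=True)


if __name__ == "__main__":
    print(generate("Write a Python function that reverses a string."))
```

Because `do_sample` is `False`, decoding is greedy, which keeps repeated evaluation runs comparable.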
+---
+## Limitations
+- Trained for only **1–3 epochs** on ~18,000 samples, so it may struggle with
+  highly complex or multi-file code tasks.
+- Optimized for **single-instruction, single-response** code generation;
+  not designed for multi-turn conversation.
+- Performance is measured on CodeAlpaca-style prompts; may degrade on very
+  different prompt formats.
+- The base model has **3B parameters**; larger models (7B+) would likely
+  achieve higher absolute scores.
+---
+## Project
+This adapter was built as part of a 7-day end-to-end LLM fine-tuning project
+covering LoRA/QLoRA concepts, dataset preparation, training, evaluation,
+deployment, and CI/CD. Full project repository:
+[github.com/your-username/llm-lora-finetuning](https://github.com/your-username/llm-lora-finetuning)