---
license: apache-2.0
---
# Model Card for Phi-2 with 4-bit NF4 Quantization and LoRA Fine-Tuning
## Model Details
This project fine-tunes Microsoft's Phi-2 language model with LoRA, a parameter-efficient fine-tuning (PEFT) method, on the Nemotron-Personas-India dataset. The model is loaded in 4-bit NF4 quantization through BitsAndBytes, which reduces memory consumption so that training and inference remain feasible on limited hardware.
### Model Description
- **Developed by:** Sachin Singh
- **Model type:** Causal Language Model
- **Base model:** Phi-2
- **Language(s):** English
- **Quantization:** 4-bit NF4 (BitsAndBytes)
- **Fine-tuning method:** LoRA (PEFT)
- **Dataset:** NVIDIA Nemotron-Personas-India (`en_IN` split)
### Model Sources
- **Base Model:** microsoft/phi-2
- **Dataset:** nvidia/Nemotron-Personas-India
### Direct Use
This model is intended for:
- Persona-conditioned text generation
- Instruction-following experiments
- Low-memory LLM deployment research
- Quantization benchmarking
- LoRA fine-tuning demonstrations
- LLM performance analytics studies
### Downstream Use
The fine-tuned model can serve as a foundation for:
- Persona-based conversational agents
- Lightweight chatbot deployments
- LLM optimization research
- Quantization and efficiency studies
### Out-of-Scope Use
This model is not intended for:
- Medical advice
- Legal advice
- Financial decision making
- Safety-critical systems
- High-risk automated decision systems
## Bias, Risks, and Limitations
The model inherits limitations from:
- The Phi-2 base model
- The Nemotron-Personas-India dataset
- Quantization-induced approximation errors
- Limited fine-tuning duration
Generated responses may contain inaccuracies, hallucinations, biases, or incomplete information.
## How to Get Started with the Model
```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

# Base model checkpoint; the LoRA adapter from this project is not loaded here.
model_id = "microsoft/phi-2"

# 4-bit NF4 quantization with FP16 compute and double quantization.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.float16,
    bnb_4bit_use_double_quant=True,
)

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    quantization_config=bnb_config,
    device_map="auto",  # place layers on the available GPU(s)
)
```
## Training Details
### Training Data
The model is fine-tuned using:
- Dataset: `nvidia/Nemotron-Personas-India`
- Split: `en_IN`
- Sample Size: 5,000 records
Persona records are transformed into instruction-response training examples before fine-tuning.
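The transformation step is not spelled out in this card, so the following is only a minimal sketch of what it might look like. The field names (`persona`, `occupation`) and the instruction wording are hypothetical, and the `Instruct:`/`Output:` layout follows the QA prompt format described on the Phi-2 model card; whether this project used that layout is an assumption.

```python
def persona_to_example(record: dict) -> dict:
    """Turn one persona record into a prompt/response training example.

    The keys "persona" and "occupation" are hypothetical; adapt them to
    the actual dataset columns.
    """
    persona = record.get("persona", "").strip()
    occupation = record.get("occupation", "").strip()

    # Hypothetical instruction wording, for illustration only.
    instruction = "Write a short description of this person."
    if occupation:
        instruction += f" Their occupation is {occupation}."

    # "Instruct: ... / Output:" is Phi-2's documented QA-style prompt layout.
    prompt = f"Instruct: {instruction}\nOutput:"
    return {
        "prompt": prompt,
        "response": persona,
        "text": f"{prompt} {persona}",  # full string used for causal LM training
    }


example = persona_to_example(
    {"persona": "A schoolteacher from Pune who enjoys cricket.", "occupation": "teacher"}
)
print(example["text"])
```

Keeping the prompt and response as separate fields alongside the concatenated `text` makes it possible to mask the prompt tokens from the loss later, should that be desired.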
#### Training Hyperparameters
- Fine-tuning Method: LoRA
- Quantization: 4-bit NF4
- Epochs: 1
- Compute Type: FP16
- Double Quantization: Enabled
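To give a feel for what 4-bit NF4 with double quantization saves, here is a rough back-of-the-envelope estimate. It assumes a block size of 64 weights per scale and a second-level block of 256 scales (the values described for QLoRA-style quantization), roughly 2.7 billion parameters for Phi-2, and that every parameter is quantized. In practice some layers stay in higher precision, so treat the result as an approximate lower bound rather than a measurement from this project.

```python
def nf4_bits_per_param(block_size: int = 64,
                       double_quant: bool = True,
                       second_level_block: int = 256) -> float:
    """Approximate storage cost per quantized weight, in bits."""
    weight_bits = 4.0
    if double_quant:
        # One 8-bit quantized scale per block, plus one 32-bit constant
        # shared by `second_level_block` of those scales.
        scale_bits = 8.0 / block_size + 32.0 / (block_size * second_level_block)
    else:
        # One 32-bit scale per block of weights.
        scale_bits = 32.0 / block_size
    return weight_bits + scale_bits


def estimate_weight_gib(n_params: float, **kwargs) -> float:
    """Estimated weight memory in GiB for n_params quantized weights."""
    return n_params * nf4_bits_per_param(**kwargs) / 8 / 2**30


N_PARAMS = 2.7e9  # assumed approximate Phi-2 parameter count

print(f"NF4 + double quant: {estimate_weight_gib(N_PARAMS):.2f} GiB")
print(f"NF4, single quant : {estimate_weight_gib(N_PARAMS, double_quant=False):.2f} GiB")
print(f"FP16 baseline     : {N_PARAMS * 2 / 2**30:.2f} GiB")
```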
#### Summary
The project evaluates the trade-offs between model efficiency and generation capability when applying 4-bit quantization and LoRA fine-tuning to Phi-2.
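One simple way to quantify the efficiency side of that trade-off is generation throughput. The helper below is a hedged sketch, not the measurement code used in this project: `generate_fn` is a hypothetical wrapper that takes a prompt and returns the newly generated token ids, and a stand-in function is used so the example runs without a model.

```python
import time
from typing import Callable, Sequence


def tokens_per_second(generate_fn: Callable[[str], Sequence[int]],
                      prompt: str,
                      n_runs: int = 3) -> float:
    """Average number of generated tokens per second over n_runs calls."""
    total_tokens = 0
    start = time.perf_counter()
    for _ in range(n_runs):
        total_tokens += len(generate_fn(prompt))
    elapsed = time.perf_counter() - start
    return total_tokens / elapsed if elapsed > 0 else float("inf")


def fake_generate(prompt: str) -> Sequence[int]:
    """Stand-in for a real model call: pretends to emit 20 tokens."""
    time.sleep(0.01)
    return list(range(20))


print(f"{tokens_per_second(fake_generate, 'Hello'):.1f} tokens/s")
```

With a real model, `generate_fn` would wrap `model.generate` and return only the tokens produced after the prompt, so that prompt length does not inflate the figure.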
### Model Architecture and Objective
- Architecture: Phi-2 Transformer
- Objective: Causal Language Modeling
- Adaptation Method: LoRA
- Quantization Method: BitsAndBytes NF4 4-bit Quantization
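The appeal of LoRA on top of a frozen, quantized base is how few parameters it trains. The arithmetic below is illustrative only: the LoRA rank and the set of targeted projections are not stated in this card, so `RANK` and `N_TARGETS` are hypothetical, and the hidden size (2560) and layer count (32) are assumed from the published Phi-2 configuration.

```python
def lora_param_count(d_in: int, d_out: int, rank: int) -> int:
    """Trainable parameters LoRA adds to one (d_out x d_in) linear layer.

    LoRA learns two low-rank factors, B (d_out x rank) and A (rank x d_in),
    while the original weight matrix stays frozen.
    """
    return rank * (d_in + d_out)


HIDDEN = 2560    # assumed Phi-2 hidden size
N_LAYERS = 32    # assumed number of Phi-2 transformer blocks
RANK = 16        # hypothetical LoRA rank
N_TARGETS = 4    # hypothetical: four square attention projections per block

per_layer = N_TARGETS * lora_param_count(HIDDEN, HIDDEN, RANK)
total = N_LAYERS * per_layer

print(f"LoRA trainable parameters: {total:,}")
print(f"Share of ~2.7B base parameters: {total / 2.7e9:.3%}")
```

Under these assumptions the adapter is well under one percent of the base model's size, which is why the adapter weights can be trained in higher precision while the base stays in 4-bit.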
### Compute Infrastructure
2× NVIDIA T4 GPUs
## Citation
```bibtex
@misc{phi2,
title={Phi-2: The surprising power of small language models},
author={Microsoft Research}
}
```
### Dataset
```bibtex
@misc{nemotron_personas_india,
title={Nemotron Personas India Dataset},
author={NVIDIA}
}
```
## Model Card Authors
Sachin Singh
## Model in Notebook
[Kaggle notebook](https://www.kaggle.com/code/shreyasraghav/4-bit-quantization-with-phi-2-with-more-analytics)