---
license: apache-2.0
tags:
- qlora
- finetuned
- transformers
datasets:
- vishalgimhan/uber-report-2024-dataset
---
# Uber-assistant QLoRA Adapter
This is a LoRA adapter fine-tuned with QLoRA on the Uber Annual Report 2024 dataset.
## Base Model
meta-llama/Llama-3.1-8B-Instruct
## Dataset
Fine-tuned on the [Uber Annual Report 2024 Dataset](https://huggingface.co/datasets/vishalgimhan/uber-report-2024-dataset).
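The dataset can be loaded directly with the `datasets` library (a minimal sketch; the split name is left to the library default):

```python
from datasets import load_dataset

# Download the fine-tuning dataset from the Hugging Face Hub.
dataset = load_dataset("vishalgimhan/uber-report-2024-dataset")
print(dataset)
```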
## Quantization & Training Hyperparameters
- **Quantization**: 4-bit (NF4)
- **Compute dtype**: torch.bfloat16
- **Double quantization**: True
- **LoRA rank**: 16
- **LoRA alpha**: 32
- **Learning rate**: 2e-5
- **Max steps**: 100
- **Batch size (effective)**: 16
- **Max length**: 512 tokens
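A minimal QLoRA training setup matching these settings might look like the sketch below. The target modules, LoRA dropout, and the 4 x 4 split of the effective batch size are assumptions rather than details from the original run; the 512-token max length would be applied by the trainer or tokenizer.

```python
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig, TrainingArguments
from peft import LoraConfig, get_peft_model, prepare_model_for_kbit_training

# Load the base model in 4-bit NF4 with bf16 compute and double quantization,
# as listed above.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
    bnb_4bit_use_double_quant=True,
)
base_model = AutoModelForCausalLM.from_pretrained(
    "meta-llama/Llama-3.1-8B-Instruct",
    quantization_config=bnb_config,
    device_map="auto",
)

# LoRA rank 16 and alpha 32 as listed; target modules and dropout are assumed.
lora_config = LoraConfig(
    r=16,
    lora_alpha=32,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],  # assumed
    lora_dropout=0.05,                                        # assumed
    task_type="CAUSAL_LM",
)
model = get_peft_model(prepare_model_for_kbit_training(base_model), lora_config)

# Learning rate 2e-5, 100 steps, effective batch size 16 (4 x 4 split assumed).
training_args = TrainingArguments(
    output_dir="uber-assistant-qlora",
    learning_rate=2e-5,
    max_steps=100,
    per_device_train_batch_size=4,
    gradient_accumulation_steps=4,
    bf16=True,
)
```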
## Usage
```python
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig
from peft import PeftModel
import torch

model_id = "vishalgimhan/uber-assistant"

# 4-bit NF4 quantization with bf16 compute and double quantization,
# matching the settings the adapter was trained with.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_compute_dtype=torch.bfloat16,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_use_double_quant=True,
)

# Load the quantized base model, then attach the LoRA adapter on top of it.
base_model = AutoModelForCausalLM.from_pretrained(
    "meta-llama/Llama-3.1-8B-Instruct",
    quantization_config=bnb_config,
    device_map="auto",
)
model = PeftModel.from_pretrained(base_model, model_id)
tokenizer = AutoTokenizer.from_pretrained(model_id)
```
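With the model and tokenizer loaded as above, a minimal generation sketch follows (the prompt and decoding settings are illustrative, and it assumes the tokenizer carries the Llama 3.1 chat template):

```python
messages = [
    {"role": "user", "content": "Summarize Uber's 2024 revenue highlights."}
]

# Build a prompt with the chat template and move it to the model's device.
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

# Generate a short answer and decode only the newly generated tokens.
outputs = model.generate(inputs, max_new_tokens=256, do_sample=False)
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```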
## License & Attribution
This adapter inherits the license of the base model and dataset. Check those licenses before use or redistribution.