Transformers
Safetensors
qlora
finetuned
File size: 1,419 Bytes
30d1eb6
e71684c
 
7f52599
 
 
e71684c
391ca84
 
7f52599
0517cc7
e71684c
 
 
0517cc7
e71684c
7f52599
0517cc7
e71684c
391ca84
e71684c
f453f0b
 
 
 
7f52599
 
30d1eb6
7f52599
 
 
0517cc7
7f52599
0517cc7
7f52599
f453f0b
7f52599
f453f0b
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
7f52599
e71684c
7f52599
0517cc7
7f52599
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60

---
license: apache-2.0
tags:
- qlora
- finetuned
- transformers
datasets:
- vishalgimhan/uber-report-2024-dataset
---

# Uber-assistant QLoRA Adapter

This is a LoRA adapter finetuned on Uber Annual Report 2024

## Base Model
meta-llama/Llama-3.1-8B-Instruct

## Dataset
Finetuned using the [Uber Annual Report 2024 Dataset](https://huggingface.co/datasets/vishalgimhan/uber-report-2024-dataset)

## Quantization & Training Hyperparameters
- **Quantization**: 4-bit (NF4)
- **Compute Dtype**: torch.bfloat16
- **Double Quantization**: True
- LoRA rank: 16
- LoRA alpha: 32
- Learning rate: 2e-5
- Max steps: 100
- Batch size (effective): 16
- Max length: 512

## Usage

```python
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig
from peft import PeftModel
import torch

model_id = "vishalgimhan/uber-assistant"
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_compute_dtype=torch.bfloat16,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_use_double_quant=True
)

base_model = AutoModelForCausalLM.from_pretrained(
    "meta-llama/Llama-3.1-8B-Instruct",
    quantization_config=bnb_config,
    device_map="auto"
)
model = PeftModel.from_pretrained(base_model, model_id)
tokenizer = AutoTokenizer.from_pretrained(model_id)
```

## License & Attribution

This adapter inherits the license of the base model and dataset. Check those licenses before use or redistribution.