---
license: apache-2.0
tags:
  - qlora
  - finetuned
  - transformers
datasets:
  - vishalgimhan/uber-report-2024-dataset
---

# Uber-assistant QLoRA Adapter

This is a LoRA adapter fine-tuned on the Uber Annual Report 2024.

## Base Model

`meta-llama/Llama-3.1-8B-Instruct`

## Dataset

Fine-tuned on the Uber Annual Report 2024 dataset (`vishalgimhan/uber-report-2024-dataset`).

## Quantization & Training Hyperparameters

- Quantization: 4-bit (NF4)
- Compute dtype: `torch.bfloat16`
- Double quantization: True
- LoRA rank: 16
- LoRA alpha: 32
- Learning rate: 2e-5
- Max steps: 100
- Batch size (effective): 16
- Max length: 512 tokens
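As a rough sketch, the hyperparameters above would map onto a `peft` `LoraConfig` and `transformers` `TrainingArguments` like the following. The `target_modules`, output directory, and the 4 × 4 split of the effective batch size into per-device batch size and gradient accumulation are assumptions for illustration, not details confirmed by this card:

```python
from peft import LoraConfig
from transformers import TrainingArguments

# LoRA settings from the card above; target_modules is an assumption
lora_config = LoraConfig(
    r=16,
    lora_alpha=32,
    target_modules=["q_proj", "v_proj"],
    task_type="CAUSAL_LM",
)

# Training settings from the card; splitting the effective batch size of 16
# into 4 per device x 4 accumulation steps is an assumption
training_args = TrainingArguments(
    output_dir="uber-assistant-qlora",  # hypothetical path
    learning_rate=2e-5,
    max_steps=100,
    per_device_train_batch_size=4,
    gradient_accumulation_steps=4,
    bf16=True,  # matches the bfloat16 compute dtype
)
```

The max length of 512 tokens would be applied at tokenization time (e.g. via the trainer's `max_seq_length` or the tokenizer's `max_length`), not in these config objects.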

## Usage

```python
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig
from peft import PeftModel
import torch

model_id = "vishalgimhan/uber-assistant"

# 4-bit NF4 quantization config matching the training setup
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_compute_dtype=torch.bfloat16,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_use_double_quant=True,
)

# Load the quantized base model, then attach the LoRA adapter
base_model = AutoModelForCausalLM.from_pretrained(
    "meta-llama/Llama-3.1-8B-Instruct",
    quantization_config=bnb_config,
    device_map="auto",
)
model = PeftModel.from_pretrained(base_model, model_id)
tokenizer = AutoTokenizer.from_pretrained(model_id)
```
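Once `model` and `tokenizer` are loaded as above, generation can be sketched as follows. This assumes you have access to the gated Llama 3.1 base weights and a GPU; the prompt is only an example:

```python
# Continuing from the loading snippet above (requires model and tokenizer).
import torch

messages = [
    {"role": "user", "content": "Summarize Uber's 2024 revenue drivers."}
]

# Format the conversation with the model's chat template
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

with torch.no_grad():
    output = model.generate(inputs, max_new_tokens=256)

# Decode only the newly generated tokens, skipping the prompt
print(tokenizer.decode(output[0][inputs.shape[-1]:], skip_special_tokens=True))
```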

## License & Attribution

This adapter inherits the license of the base model and dataset. Check those licenses before use or redistribution.