---
license: apache-2.0
tags:
- qlora
- finetuned
- transformers
datasets:
- vishalgimhan/uber-report-2024-dataset
---

# Uber-assistant QLoRA Adapter

This is a LoRA adapter finetuned with QLoRA on the Uber Annual Report 2024.

## Base Model

meta-llama/Llama-3.1-8B-Instruct

## Dataset

Finetuned using the [Uber Annual Report 2024 Dataset](https://huggingface.co/datasets/vishalgimhan/uber-report-2024-dataset).

## Quantization & Training Hyperparameters

- **Quantization**: 4-bit (NF4)
- **Compute dtype**: torch.bfloat16
- **Double quantization**: True
- **LoRA rank**: 16
- **LoRA alpha**: 32
- **Learning rate**: 2e-5
- **Max steps**: 100
- **Batch size (effective)**: 16
- **Max length**: 512

## Usage

```python
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig
from peft import PeftModel
import torch

model_id = "vishalgimhan/uber-assistant"

# Match the quantization settings used during finetuning (4-bit NF4, bf16 compute)
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_compute_dtype=torch.bfloat16,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_use_double_quant=True,
)

# Load the quantized base model, then attach the LoRA adapter on top
base_model = AutoModelForCausalLM.from_pretrained(
    "meta-llama/Llama-3.1-8B-Instruct",
    quantization_config=bnb_config,
    device_map="auto",
)
model = PeftModel.from_pretrained(base_model, model_id)
tokenizer = AutoTokenizer.from_pretrained(model_id)
```

## License & Attribution

This adapter inherits the licenses of the base model and dataset. Check those licenses before use or redistribution.
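
As a follow-up to the Usage snippet above, here is a minimal inference sketch. It assumes `model` and `tokenizer` are already loaded as shown there; the question string is a hypothetical example, and the generation settings (greedy decoding, 256 new tokens) are illustrative defaults, not the settings used to evaluate this adapter.

```python
# Assumes `model` and `tokenizer` from the Usage snippet are in memory.
# The question below is a hypothetical example prompt.
messages = [
    {"role": "user", "content": "Summarize Uber's 2024 annual report in two sentences."}
]

# Llama-3.1-Instruct expects its chat template; apply_chat_template builds
# the prompt and returns input ids as a tensor.
inputs = tokenizer.apply_chat_template(
    messages,
    add_generation_prompt=True,
    return_tensors="pt",
).to(model.device)

with torch.no_grad():
    outputs = model.generate(
        inputs,
        max_new_tokens=256,
        do_sample=False,  # greedy decoding; illustrative choice
    )

# Decode only the newly generated tokens, skipping the echoed prompt
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```

Note that loading the base model gated behind the Meta license requires an authenticated Hugging Face token, and 4-bit loading requires a GPU with `bitsandbytes` installed.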