new-llama3-nyc-base

This model is a fine-tuned version of unsloth/llama-3-8b, trained with LoRA (Low-Rank Adaptation) on a 4-bit quantized base model.

Model Details

  • Base Model: unsloth/llama-3-8b
  • Fine-tuned Model: comp5331poi/new-llama3-nyc-base
  • Training Run: new-llama3-nyc-base
  • Device: cuda

Training Configuration

Hyperparameters

  • Number of Epochs: 8
  • Batch Size: 4
  • Gradient Accumulation Steps: 2
  • Effective Batch Size: 8
  • Learning Rate: 1e-05
  • Learning Rate Scheduler: constant
  • Warmup Steps: 20
  • Max Sequence Length: 2048
  • Optimizer: paged_adamw_8bit
  • Max Gradient Norm: 0.3
  • Random Seed: 2024
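The effective batch size above is simply the per-device batch size times the gradient accumulation steps. A minimal sketch of the same configuration as a plain Python dictionary (the key names mirror Hugging Face TrainingArguments fields, but no training library is assumed here):

```python
# Hyperparameters from the table above, as a plain dict.
# Key names follow Hugging Face TrainingArguments conventions.
training_config = {
    "num_train_epochs": 8,
    "per_device_train_batch_size": 4,
    "gradient_accumulation_steps": 2,
    "learning_rate": 1e-5,
    "lr_scheduler_type": "constant",
    "warmup_steps": 20,
    "max_seq_length": 2048,
    "optim": "paged_adamw_8bit",
    "max_grad_norm": 0.3,
    "seed": 2024,
}

# Effective batch size = per-device batch size x accumulation steps
effective_batch_size = (
    training_config["per_device_train_batch_size"]
    * training_config["gradient_accumulation_steps"]
)
print(effective_batch_size)  # 8
```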

LoRA Configuration

  • LoRA Rank (r): 16
  • LoRA Alpha: 32
  • LoRA Dropout: 0.1
  • Target Modules: down_proj, v_proj, k_proj, up_proj, gate_proj, q_proj, o_proj
  • Task Type: CAUSAL_LM
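With rank r=16 applied to all seven projection matrices, the trainable adapter size can be estimated from the layer shapes. The sketch below assumes the standard Llama-3-8B dimensions (hidden size 4096, MLP size 14336, 8 key/value heads, 32 layers), which are not stated in this card; note also that the LoRA scaling factor here is alpha/r = 32/16 = 2.

```python
# Assumed Llama-3-8B shapes (not part of this card): hidden 4096,
# intermediate 14336, 32 query heads, 8 key/value heads, 32 layers.
hidden, intermediate, n_layers = 4096, 14336, 32
kv_dim = hidden // 32 * 8  # 8 KV heads of head_dim 128 -> 1024

r = 16  # LoRA rank from the configuration above

# (in_features, out_features) for each target module
shapes = {
    "q_proj": (hidden, hidden),
    "k_proj": (hidden, kv_dim),
    "v_proj": (hidden, kv_dim),
    "o_proj": (hidden, hidden),
    "gate_proj": (hidden, intermediate),
    "up_proj": (hidden, intermediate),
    "down_proj": (intermediate, hidden),
}

# Each LoRA pair (A: d_in x r, B: r x d_out) adds r*(d_in + d_out)
# parameters per module per layer.
per_layer = sum(r * (d_in + d_out) for d_in, d_out in shapes.values())
total = per_layer * n_layers
print(total)  # 41943040 trainable parameters (~0.5% of 8B)
```

Under these assumed shapes, the adapter trains roughly 42M parameters, about half a percent of the 8B base model.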

Quantization

  • Quantization Bits: 4-bit
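4-bit loading stores each weight in roughly half a byte, so the base model's weight memory drops to about a quarter of its fp16 footprint before activations and LoRA parameters are counted. A back-of-the-envelope sketch (the 8.03B parameter count is the nominal Llama-3-8B figure, an assumption rather than something measured here):

```python
# Rough weight-memory estimate; 8.03e9 is the nominal Llama-3-8B
# parameter count (an assumption, not taken from this card).
n_params = 8.03e9

bytes_fp16 = n_params * 2      # 2 bytes per weight
bytes_4bit = n_params * 0.5    # 4 bits = 0.5 bytes per weight

print(f"fp16: {bytes_fp16 / 1e9:.1f} GB, 4-bit: {bytes_4bit / 1e9:.1f} GB")
# fp16: 16.1 GB, 4-bit: 4.0 GB
```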

Usage

from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

# Load base model (optionally pass a BitsAndBytesConfig via
# quantization_config to match the 4-bit setup used in training)
base_model = AutoModelForCausalLM.from_pretrained(
    "unsloth/llama-3-8b",
    device_map="auto",
)

# Load LoRA adapter
model = PeftModel.from_pretrained(base_model, "comp5331poi/new-llama3-nyc-base")

# Load tokenizer
tokenizer = AutoTokenizer.from_pretrained("unsloth/llama-3-8b")

# Generate text (max_new_tokens counts only generated tokens,
# unlike max_length, which also counts the prompt)
inputs = tokenizer("Your prompt here", return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=256)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
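Because the model was trained with a 2048-token maximum sequence length, prompt and generated tokens together should stay within that window. A small hypothetical helper (not part of any library) for sizing max_new_tokens, where the prompt length would come from the tokenizer, e.g. inputs["input_ids"].shape[1]:

```python
# Cap generation so prompt + output fit in the 2048-token training window.
MAX_SEQ_LEN = 2048

def budget_new_tokens(prompt_len: int, requested: int = 512) -> int:
    """Return a max_new_tokens value that keeps the total under MAX_SEQ_LEN."""
    remaining = max(MAX_SEQ_LEN - prompt_len, 0)
    return min(requested, remaining)

print(budget_new_tokens(100))   # 512 (plenty of room)
print(budget_new_tokens(1900))  # 148 (window nearly full)
```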

Framework Versions

  • Transformers
  • PEFT
  • TRL
  • PyTorch
  • BitsAndBytes