LLaMA-3.1-8B-Instruct-DPO-Baseline

LLaMA 3.1 8B Instruct fine-tuned on the Light-R1 DPO dataset for 100 steps.

Model Details

  • Base Model: meta-llama/Llama-3.1-8B-Instruct
  • Architecture: Llama 3.1 8B Instruct
  • Training: Direct Preference Optimization (DPO) with a baseline PyTorch/TRL setup using the AdamW optimizer. For details, see: GitHub
  • Task: Text generation, instruction following, conversational AI

Requirements

  • transformers >= 4.43.0 (required for full Llama 3.1 support)
  • torch (recommended: torch >= 2.0.0)

Usage

Installation

pip install --upgrade transformers torch
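
To verify the installed version meets the requirement above, a quick sanity check (packaging ships as a transformers dependency):

from packaging.version import Version
import transformers

# Full Llama 3.1 support landed in transformers 4.43.0
assert Version(transformers.__version__) >= Version("4.43.0")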

Basic Usage with Transformers

Starting with transformers >= 4.43.0, you can run conversational inference using the Transformers pipeline abstraction or by leveraging the Auto classes with the generate() function.

Using Pipeline

import transformers
import torch

model_id = "ScottBiggs2/LLaMA-3.1-8B-Instruct-DPO-Baseline"

pipeline = transformers.pipeline(
    "text-generation",
    model=model_id,
    model_kwargs={"torch_dtype": torch.bfloat16},
    device_map="auto",
)

messages = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "Explain what machine learning is."},
]

outputs = pipeline(
    messages,
    max_new_tokens=256,
)
print(outputs[0]["generated_text"][-1])
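
When called with a list of messages, the pipeline returns the full conversation, so the last element of generated_text is the newly generated assistant turn as a {"role": ..., "content": ...} dict.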

Using AutoModelForCausalLM

from transformers import AutoModelForCausalLM, AutoTokenizer
import torch

model_id = "ScottBiggs2/LLaMA-3.1-8B-Instruct-DPO-Baseline"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,
    device_map="auto",
)

messages = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "Explain what machine learning is."},
]

input_ids = tokenizer.apply_chat_template(
    messages,
    add_generation_prompt=True,
    return_tensors="pt"
).to(model.device)

# Llama 3.1 ends each assistant turn with <|eot_id|>; include it alongside
# the default EOS token so generation stops at the end of the reply.
terminators = [
    tokenizer.eos_token_id,
    tokenizer.convert_tokens_to_ids("<|eot_id|>")
]

outputs = model.generate(
    input_ids,
    max_new_tokens=256,
    eos_token_id=terminators,
    do_sample=True,
    temperature=0.6,
    top_p=0.9,
)

response = outputs[0][input_ids.shape[-1]:]
print(tokenizer.decode(response, skip_special_tokens=True))
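
The temperature=0.6 and top_p=0.9 settings mirror the sampling defaults recommended in the upstream Llama 3.1 model card; pass do_sample=False instead for deterministic output.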

Tool Use

Llama 3.1 supports tool use through chat templates in Transformers. See the official documentation for detailed examples.
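
A minimal sketch of tool use via the chat template, assuming the tools= argument of apply_chat_template; get_current_temperature is a hypothetical stub, not a real API:

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "ScottBiggs2/LLaMA-3.1-8B-Instruct-DPO-Baseline"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,
    device_map="auto",
)

# A tool is an ordinary Python function; the chat template reads its
# signature and docstring to build the JSON schema shown to the model.
def get_current_temperature(location: str) -> float:
    """
    Get the current temperature at a location.

    Args:
        location: The location, in the format "City, Country"
    """
    return 22.0  # hypothetical stub

messages = [
    {"role": "user", "content": "What's the temperature in Paris, France right now?"},
]

input_ids = tokenizer.apply_chat_template(
    messages,
    tools=[get_current_temperature],
    add_generation_prompt=True,
    return_tensors="pt",
).to(model.device)

outputs = model.generate(input_ids, max_new_tokens=256)
print(tokenizer.decode(outputs[0][input_ids.shape[-1]:], skip_special_tokens=True))

The model should respond with a tool call; parsing the call, running the function, and appending a "tool" message back into the conversation follow the pattern in the official documentation.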

Model Information

This model is based on Meta's Llama 3.1 8B Instruct model, fine-tuned using Direct Preference Optimization (DPO). The model maintains compatibility with the original Llama 3.1 architecture and chat template format.
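
For reference, the training setup described above can be reproduced in outline with TRL's DPOTrainer. A minimal sketch, assuming a recent TRL release and the qihoo360/Light-R1-DPOData dataset id; apart from the 100-step budget, hyperparameters are illustrative, not the exact values used for this model:

from datasets import load_dataset
from transformers import AutoModelForCausalLM, AutoTokenizer
from trl import DPOConfig, DPOTrainer

base_id = "meta-llama/Llama-3.1-8B-Instruct"
tokenizer = AutoTokenizer.from_pretrained(base_id)
model = AutoModelForCausalLM.from_pretrained(base_id)

# DPO expects preference pairs: columns "prompt", "chosen", "rejected".
# The raw dataset may need a mapping step into this schema.
train_dataset = load_dataset("qihoo360/Light-R1-DPOData", split="train")  # assumed id

config = DPOConfig(
    output_dir="llama-3.1-8b-dpo-baseline",
    max_steps=100,                   # the card states 100 training steps
    per_device_train_batch_size=1,
    optim="adamw_torch",             # AdamW, per the model details above
    bf16=True,
)

trainer = DPOTrainer(
    model=model,                     # a reference model is created automatically
    args=config,
    train_dataset=train_dataset,
    processing_class=tokenizer,
)
trainer.train()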

For more information about the base model, see: https://huggingface.co/meta-llama/Llama-3.1-8B-Instruct

Citation

If you use this model, please cite the original Llama 3.1 paper:

@article{meta2024llama,
  title={Llama 3.1},
  author={Meta AI},
  year={2024}
}