Llama-3.2-1B-Instruct-bnb-4bit-lima - Merged Model

Merged model in 16-bit (BF16) precision, with the LoRA adapters folded into the base weights.
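The merged_16bit directory layout matches Unsloth's export convention. As a point of reference, the sketch below shows how such an export is typically produced with Unsloth's documented save_pretrained_merged helper; the actual export script for this repo is not published, so treat it as illustrative only.

# Hedged sketch: exporting LoRA-merged 16-bit weights with Unsloth.
from unsloth import FastLanguageModel

model, tokenizer = FastLanguageModel.from_pretrained(
    "unsloth/Llama-3.2-1B-Instruct-bnb-4bit",
    max_seq_length=2048,
    load_in_4bit=True,
)
# ... attach LoRA adapters and train (see Training Details below) ...
model.save_pretrained_merged(
    "outputs/Llama-3.2-1B-Instruct-bnb-4bit-lima/merged_16bit",
    tokenizer,
    save_method="merged_16bit",  # dequantize base, fold adapters in, save 16-bit
)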

Model Details

Related Models

Prompt Format

This model uses the Llama 3.2 chat template.

Python Usage

Use the tokenizer's apply_chat_template() method:

messages = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "Your question here"}
]
# add_generation_prompt appends the assistant header so generation starts
# a new assistant turn instead of continuing the user message
inputs = tokenizer.apply_chat_template(
    messages, tokenize=True, add_generation_prompt=True, return_tensors="pt"
)
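Passing tokenize=False returns the rendered string instead of token IDs, which is handy for checking the template. The layout sketched in the comment follows the Llama 3.x convention; the exact system preamble and whitespace the Llama 3.2 template inserts may differ.

prompt = tokenizer.apply_chat_template(
    messages, tokenize=False, add_generation_prompt=True
)
print(prompt)
# Approximately (Llama 3.x header tokens):
# <|begin_of_text|><|start_header_id|>system<|end_header_id|>
# You are a helpful assistant.<|eot_id|>
# <|start_header_id|>user<|end_header_id|>
# Your question here<|eot_id|>
# <|start_header_id|>assistant<|end_header_id|>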

Training Details

  • LoRA Rank: 64
  • Training Steps: 480
  • Training Loss: 1.1123
  • Max Seq Length: 2048
  • Training Scope: 1,278 samples (3.0 epochs, full dataset)

For the complete training configuration, see the LoRA adapters repository/directory; a hedged sketch of a comparable setup follows.
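Only r=64, max_seq_length=2048, the dataset, and the epoch count come from this card; every other value below is an illustrative guess, and Unsloth + TRL is assumed as the stack.

# Hedged sketch of a LoRA run consistent with the numbers listed above.
from datasets import load_dataset
from transformers import TrainingArguments
from trl import SFTTrainer
from unsloth import FastLanguageModel

model, tokenizer = FastLanguageModel.from_pretrained(
    "unsloth/Llama-3.2-1B-Instruct-bnb-4bit",
    max_seq_length=2048,  # from Training Details
    load_in_4bit=True,
)
model = FastLanguageModel.get_peft_model(
    model,
    r=64,  # LoRA rank from Training Details
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj",
                    "gate_proj", "up_proj", "down_proj"],  # typical choice, not confirmed
    lora_alpha=64,  # assumption; not stated in this card
)
dataset = load_dataset("GAIR/lima", split="train")
# Formatting LIMA conversations into chat-template text is omitted here.
trainer = SFTTrainer(
    model=model,
    tokenizer=tokenizer,
    train_dataset=dataset,
    args=TrainingArguments(
        output_dir="outputs",
        num_train_epochs=3,
        # an effective batch of 8 (e.g. 2 x 4 accumulation) matches the
        # reported 480 steps: 1,278 samples / 8 ~ 160 steps/epoch x 3 epochs
        per_device_train_batch_size=2,
        gradient_accumulation_steps=4,
        learning_rate=2e-4,  # illustrative
    ),
)
trainer.train()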

Benchmark Results

Evaluated: 2025-11-24 03:10
Comparison: fine-tuned vs. base model

Hugging Face Transformers (16-bit merged model)

IFEval (Instruction Following)

Model        Strict Prompt   Strict Inst   Loose Prompt   Loose Inst
Base         0.4399          0.5731        0.4787         0.6067
Fine-tuned   0.3050          0.4376        0.3327         0.4700
Δ            ↘ -0.1349       ↘ -0.1355     ↘ -0.1460      ↘ -0.1367
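This card does not name the evaluation harness. Assuming EleutherAI's lm-evaluation-harness, an IFEval run over the merged model could look like the sketch below; the batch size and dtype are illustrative.

import lm_eval

# Evaluate the merged model on IFEval; repeat with the base checkpoint
# "unsloth/Llama-3.2-1B-Instruct-bnb-4bit" to reproduce the Δ row above.
results = lm_eval.simple_evaluate(
    model="hf",
    model_args="pretrained=./outputs/Llama-3.2-1B-Instruct-bnb-4bit-lima/merged_16bit,dtype=bfloat16",
    tasks=["ifeval"],
    batch_size=8,  # illustrative
)
print(results["results"]["ifeval"])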

Summary

Benchmark    What It Tests                                                  Base     Fine-tuned   Change
IFEval       Ability to follow specific instructions                       43.99%   30.50%       ↘ -13.49% (-30.7% relative)
GSM8K        Math reasoning and chain-of-thought                           -        -            -
HellaSwag    Real-world knowledge and common sense                         -        -            -
MMLU         Broad knowledge retention (detects catastrophic forgetting)   -        -            -
TruthfulQA   Tendency to generate truthful answers                         -        -            -

Note: the fine-tuned model underperforms the base model on IFEval; the remaining benchmarks ("-") were not evaluated in this run.

Usage

With Transformers

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model = AutoModelForCausalLM.from_pretrained(
    "./outputs/Llama-3.2-1B-Instruct-bnb-4bit-lima/merged_16bit",
    torch_dtype=torch.bfloat16,  # weights are stored in BF16
    device_map="auto"
)
tokenizer = AutoTokenizer.from_pretrained("./outputs/Llama-3.2-1B-Instruct-bnb-4bit-lima/merged_16bit")

messages = [{"role": "user", "content": "Your question here"}]
# add_generation_prompt=True makes the model answer rather than continue the
# user turn; model.device avoids a hard-coded "cuda" mismatch with device_map
inputs = tokenizer.apply_chat_template(
    messages, tokenize=True, add_generation_prompt=True, return_tensors="pt"
).to(model.device)
outputs = model.generate(inputs, max_new_tokens=256)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))

License

Based on unsloth/Llama-3.2-1B-Instruct-bnb-4bit and trained on GAIR/lima. Please refer to the original model and dataset licenses.

Credits

Trained by: Farhan Syah

Training pipeline:

Base components:

  • Base model: unsloth/Llama-3.2-1B-Instruct-bnb-4bit
  • Dataset: GAIR/lima
