Llama 3.1 8B - Fine-tuned with Unsloth

πŸ“ About

This model is a fine-tuned version of Llama 3.1 8B, optimized for math-related tasks using Unsloth. The fine-tuning process was 2x faster than standard approaches while maintaining strong accuracy.

🚀 Usage

You can load the model with Transformers:

from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL_NAME = "eugenemaver/Llama-3.1-8B-MATH"

tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
model = AutoModelForCausalLM.from_pretrained(MODEL_NAME, device_map="auto")

input_text = "Solve the equation: x^2 + 5x + 6 = 0"
inputs = tokenizer(input_text, return_tensors="pt").to(model.device)
output = model.generate(**inputs, max_new_tokens=100)  # max_new_tokens counts only generated tokens, unlike max_length

print(tokenizer.decode(output[0], skip_special_tokens=True))
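As a quick arithmetic check of the example prompt, independent of the model: x^2 + 5x + 6 factors as (x + 2)(x + 3), so the expected roots are -2 and -3.

```python
# Sanity check for the example prompt: verify the roots of x^2 + 5x + 6 = 0.
def f(x):
    return x**2 + 5 * x + 6

roots = [-2, -3]
print([f(r) for r in roots])  # both values should be 0
```

This gives a ground truth to compare the model's answer against.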

Load in 4-bit for Lower Memory Usage

To load the model in roughly 5 GB of memory instead of ~20 GB, load it in 4-bit with bitsandbytes:

from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

bnb_config = BitsAndBytesConfig(load_in_4bit=True, bnb_4bit_compute_dtype="float16")

model = AutoModelForCausalLM.from_pretrained(MODEL_NAME, quantization_config=bnb_config, device_map="auto")
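The memory figures above follow from simple arithmetic on the parameter count. This is a weights-only sketch; real usage adds activation and framework overhead, which is why the actual fp16 footprint lands closer to 20 GB:

```python
# Back-of-envelope weight memory for 8B parameters at common precisions.
# Weights only; activations and framework overhead are not included.
params = 8e9
for name, bytes_per_param in [("fp16", 2.0), ("int8", 1.0), ("4-bit", 0.5)]:
    print(f"{name}: ~{params * bytes_per_param / 1e9:.0f} GB")
```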

📌 Notes

  • The model is stored in 16-bit, but can be loaded in 4-bit to reduce memory usage.
  • Optimized for math-related tasks.
  • Fine-tuned with Unsloth, which made training roughly 2x faster than standard approaches.