Uploaded model

  • Developed by: Sayantan Ghosh
  • License: apache-2.0
  • Finetuned from model: unsloth/Llama-3.2-1B-Instruct

This Llama model was trained 2× faster with Unsloth and Hugging Face's TRL library.


🚀 Inference Code

You can use the following code to run inference with this model using Unsloth:

```python
from unsloth import FastLanguageModel

# Loading parameters (adjust to your hardware)
max_seq_length = 2048   # context length to use at inference
dtype = None            # None = auto-detect (bfloat16 on Ampere+, else float16)
load_in_4bit = True     # 4-bit quantization to reduce VRAM usage

model, tokenizer = FastLanguageModel.from_pretrained(
    model_name = "SGHOSH1999/FineLlama3.1-1B-Instruct",  # Your model repo
    max_seq_length = max_seq_length,
    dtype = dtype,
    load_in_4bit = load_in_4bit,
)

FastLanguageModel.for_inference(model)  # Enable native 2x faster inference

# Create a conversation prompt
messages = [
    {"role": "user", "content": "Describe a tall tower in the capital of France."},
]

# Tokenize input
inputs = tokenizer.apply_chat_template(
    messages,
    tokenize = True,
    add_generation_prompt = True,  # Must be set for generation
    return_tensors = "pt",
).to("cuda")

# Generate text, streaming tokens to stdout as they are produced
from transformers import TextStreamer
text_streamer = TextStreamer(tokenizer, skip_prompt=True)
_ = model.generate(
    input_ids = inputs,
    streamer = text_streamer,
    max_new_tokens = 128,
    use_cache = True,
    temperature = 1.5,
    min_p = 0.1,
)
```
Model details

  • Format: Safetensors
  • Model size: 1B params
  • Tensor types: BF16, F16