# Uploaded model

- **Developed by:** Sayantan Ghosh
- **License:** apache-2.0
- **Finetuned from model:** unsloth/Llama-3.2-1B-Instruct

This LLaMA model was trained 2× faster with Unsloth and Hugging Face's TRL library.
## 🚀 Inference Code

You can use the following code to run inference with this model using Unsloth:
```python
from unsloth import FastLanguageModel

max_seq_length = 2048   # context length to use at inference
dtype = None            # None = auto-detect (bfloat16 on Ampere+, float16 otherwise)
load_in_4bit = True     # load the weights in 4-bit to reduce memory use

model, tokenizer = FastLanguageModel.from_pretrained(
    model_name = "SGHOSH1999/FineLlama3.1-1B-Instruct",  # Your model repo
    max_seq_length = max_seq_length,
    dtype = dtype,
    load_in_4bit = load_in_4bit,
)
FastLanguageModel.for_inference(model)  # Enable native 2x faster inference

# Create a conversation prompt
messages = [
    {"role": "user", "content": "Describe a tall tower in the capital of France."},
]

# Tokenize the input using the model's chat template
inputs = tokenizer.apply_chat_template(
    messages,
    tokenize = True,
    add_generation_prompt = True,  # Must be set for generation
    return_tensors = "pt",
).to("cuda")

# Generate text, streaming tokens to stdout as they are produced
from transformers import TextStreamer

text_streamer = TextStreamer(tokenizer, skip_prompt = True)
_ = model.generate(
    input_ids = inputs,
    streamer = text_streamer,
    max_new_tokens = 128,
    use_cache = True,
    temperature = 1.5,
    min_p = 0.1,
)
```