This model was converted to OpenVINO from meta-llama/Llama-3.2-1B-Instruct using optimum-intel via the export space.

Install packages:

pip install "optimum[openvino]" transformers torch

Sample code:

from optimum.intel import OVModelForCausalLM
from transformers import AutoTokenizer

model_id = "TheAverageDetective/Llama-3.2-1B-Instruct-openvino"
# device="GPU" targets an Intel GPU; use device="CPU" (the default) if none is available.
model = OVModelForCausalLM.from_pretrained(model_id, device="GPU")
tokenizer = AutoTokenizer.from_pretrained(model_id)

prompt = "Explain the theory of relativity in simple terms."

messages = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": prompt}
]

input_text = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
inputs = tokenizer(input_text, return_tensors="pt")
output_ids = model.generate(**inputs, max_new_tokens=150)
result = tokenizer.batch_decode(output_ids, skip_special_tokens=True)[0]

print("\n", result)
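For reference, apply_chat_template expands the messages list into Llama 3's header-delimited prompt format. A simplified pure-Python sketch of that expansion follows; it assumes the standard Llama 3 template (the tokenizer's bundled template is authoritative, and the real Llama 3.2 template may also prepend metadata such as a knowledge-cutoff date):

```python
# Sketch of the string apply_chat_template(..., tokenize=False,
# add_generation_prompt=True) produces for Llama 3-family models.
# Simplified assumption; the tokenizer's own template is the source of truth.
def build_llama3_prompt(messages):
    prompt = "<|begin_of_text|>"
    for msg in messages:
        prompt += (
            f"<|start_header_id|>{msg['role']}<|end_header_id|>\n\n"
            f"{msg['content']}<|eot_id|>"
        )
    # add_generation_prompt=True leaves an open assistant header for the
    # model to complete.
    prompt += "<|start_header_id|>assistant<|end_header_id|>\n\n"
    return prompt

messages = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "Explain the theory of relativity in simple terms."},
]
print(build_llama3_prompt(messages))
```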
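Note that for decoder-only models, generate returns the prompt ids followed by the newly generated ids, so batch_decode above prints the prompt and the reply together. A minimal stand-in sketch of isolating just the reply by slicing past the input length (plain lists stand in for the real tensors):

```python
# generate() output = prompt ids + newly generated ids, so slicing by the
# prompt length keeps only the reply. Token values here are arbitrary.
input_ids = [101, 4521, 88, 203]                   # stand-in for inputs["input_ids"][0]
output_ids = [101, 4521, 88, 203, 311, 77, 942]    # stand-in for model.generate(...)[0]
new_tokens = output_ids[len(input_ids):]
print(new_tokens)  # -> [311, 77, 942]
```

With real tensors the equivalent slice is output_ids[0][inputs["input_ids"].shape[1]:], which can then be passed to tokenizer.decode(..., skip_special_tokens=True).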