This model is an OpenVINO conversion of meta-llama/Llama-3.2-1B-Instruct, exported with optimum-intel via the export space.
Install packages (the extra is quoted so the brackets survive shells like zsh):

```shell
pip install "optimum[openvino]" transformers torch
```
Sample code:

```python
from optimum.intel import OVModelForCausalLM
from transformers import AutoTokenizer

model_id = "TheAverageDetective/Llama-3.2-1B-Instruct-openvino"

# Load the pre-converted OpenVINO model; use device="CPU" if no Intel GPU is available
model = OVModelForCausalLM.from_pretrained(model_id, device="GPU")
tokenizer = AutoTokenizer.from_pretrained(model_id)

prompt = "Explain the theory of relativity in simple terms."
messages = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": prompt},
]

# Render the messages with the model's chat template, then tokenize
input_text = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
inputs = tokenizer(input_text, return_tensors="pt")

output_ids = model.generate(**inputs, max_new_tokens=150)
result = tokenizer.batch_decode(output_ids, skip_special_tokens=True)[0]
print(result)
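Beyond the `device` argument, `OVModelForCausalLM.from_pretrained` also accepts an `ov_config` dictionary of OpenVINO runtime properties. A minimal sketch, assuming the standard `PERFORMANCE_HINT` and `CACHE_DIR` properties (the values here are illustrative, not recommendations for this model):

```python
# Illustrative OpenVINO runtime options; keys are standard OpenVINO
# properties, values are example choices for low-latency local inference.
ov_config = {
    "PERFORMANCE_HINT": "LATENCY",   # optimize for single-stream response time
    "CACHE_DIR": "./ov_cache",       # cache the compiled model to speed up reloads
}

# Usage (reusing model_id from the sample above):
# model = OVModelForCausalLM.from_pretrained(model_id, device="CPU", ov_config=ov_config)
```

Caching the compiled model is mainly useful when the same model is loaded repeatedly on the same device.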
Base model
meta-llama/Llama-3.2-1B-Instruct