This model is a fine-tuned version of TinyLlama/TinyLlama-1.1B-Chat-v1.0 on the mlabonne/guanaco-llama2-1k dataset.
from transformers import AutoModelForCausalLM, AutoTokenizer
import torch

model_name = "SaJiThrenalin/tinyllama-finetuned"
# Load the weights in half precision to reduce memory use
model = AutoModelForCausalLM.from_pretrained(model_name, torch_dtype=torch.float16)
tokenizer = AutoTokenizer.from_pretrained(model_name)

# Example usage
prompt = "What is a large language model?"
# The prompt already contains <s>, so skip the tokenizer's automatic BOS token
inputs = tokenizer(f"<s>[INST] {prompt} [/INST]", return_tensors="pt", add_special_tokens=False)
# max_length counts the prompt tokens plus the generated tokens
outputs = model.generate(**inputs, max_length=200)
response = tokenizer.decode(outputs[0], skip_special_tokens=True)
print(response)
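The decoded output repeats the prompt before the model's answer, so it can help to build the prompt and trim the reply with small helpers. The sketch below assumes the `[INST] ... [/INST]` layout used in the example above; `build_prompt` and `extract_reply` are hypothetical names, not part of transformers.

```python
def build_prompt(user_message: str) -> str:
    """Wrap a user message in the [INST] ... [/INST] layout from the example above."""
    return f"<s>[INST] {user_message.strip()} [/INST]"


def extract_reply(decoded: str) -> str:
    """Return the text after the last [/INST] marker, or the whole string if absent."""
    marker = "[/INST]"
    _, found, tail = decoded.rpartition(marker)
    return tail.strip() if found else decoded.strip()


if __name__ == "__main__":
    # Stand-in for a decoded generation; the answer text here is illustrative only
    decoded = "[INST] What is a large language model? [/INST] A model trained on text."
    print(extract_reply(decoded))
```

In the example above, `extract_reply(response)` would give the answer without the echoed question.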
You can also query this model through the Hugging Face Inference API:
import requests

API_URL = "https://api-inference.huggingface.co/models/SaJiThrenalin/tinyllama-finetuned"
headers = {"Authorization": "Bearer YOUR_HF_TOKEN"}  # replace with your own access token

def query(payload):
    # Send the prompt as JSON and return the decoded JSON response
    response = requests.post(API_URL, headers=headers, json=payload)
    return response.json()

output = query({"inputs": "What is a large language model?"})
print(output)
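The API does not always return generated text: while the model is still loading, or when the token is rejected, the JSON body is typically an object with an `error` field rather than a list of results. A minimal sketch of handling both shapes follows, assuming the usual text-generation response format of `[{"generated_text": "..."}]`; `extract_generated_text` is a hypothetical helper, not part of any library.

```python
def extract_generated_text(output):
    """Pull the generated string out of an Inference API response.

    Assumes a successful text-generation call returns a list such as
    [{"generated_text": "..."}] and a failed call returns {"error": "..."}.
    """
    if isinstance(output, dict) and "error" in output:
        raise RuntimeError(f"Inference API error: {output['error']}")
    if (
        isinstance(output, list)
        and output
        and isinstance(output[0], dict)
        and "generated_text" in output[0]
    ):
        return output[0]["generated_text"]
    # Anything else is a shape this sketch does not know how to read
    raise ValueError(f"Unexpected response shape: {output!r}")
```

With the `query` function above, this would be called as `extract_generated_text(query({"inputs": "What is a large language model?"}))`.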
Base model: TinyLlama/TinyLlama-1.1B-Chat-v1.0