# tinyllama-finetuned
This model is a fine-tuned version of TinyLlama/TinyLlama-1.1B-Chat-v1.0 on the mlabonne/guanaco-llama2-1k dataset.
## Model Details
- Base Model: TinyLlama/TinyLlama-1.1B-Chat-v1.0
- Fine-tuning Dataset: mlabonne/guanaco-llama2-1k
- Training Method: LoRA (Low-Rank Adaptation); see the configuration sketch below
- LoRA Rank: 16
- LoRA Alpha: 32
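
For context, here is a minimal sketch of how an adapter with these hyperparameters could be declared with the `peft` library. The `target_modules` list and the dropout value are assumptions (common choices for Llama-style models), not details stated in this card:

```python
from peft import LoraConfig

# Hypothetical configuration matching the rank/alpha listed above.
# target_modules and lora_dropout are assumed, not taken from this card.
lora_config = LoraConfig(
    r=16,                # LoRA rank
    lora_alpha=32,       # LoRA alpha (scaling factor)
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
    lora_dropout=0.05,
    bias="none",
    task_type="CAUSAL_LM",
)
```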
## Usage
```python
from transformers import AutoModelForCausalLM, AutoTokenizer
import torch

model_name = "SaJiThrenalin/tinyllama-finetuned"
model = AutoModelForCausalLM.from_pretrained(model_name, torch_dtype=torch.float16)
tokenizer = AutoTokenizer.from_pretrained(model_name)

# Example usage: wrap the prompt in the Llama-2 [INST] chat format used during fine-tuning
prompt = "What is a large language model?"
inputs = tokenizer(f"<s>[INST] {prompt} [/INST]", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=200)
response = tokenizer.decode(outputs[0], skip_special_tokens=True)
print(response)
```
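
Equivalently, the model can be called through the high-level `pipeline` helper; this is a convenience sketch using the same model ID and prompt format as above:

```python
from transformers import pipeline
import torch

# Same model and Llama-2 style prompt, via the text-generation pipeline
pipe = pipeline(
    "text-generation",
    model="SaJiThrenalin/tinyllama-finetuned",
    torch_dtype=torch.float16,
)
result = pipe("<s>[INST] What is a large language model? [/INST]", max_new_tokens=200)
print(result[0]["generated_text"])
```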
## API Usage
You can also query this model through the Hugging Face Inference API:

```python
import requests

API_URL = "https://api-inference.huggingface.co/models/SaJiThrenalin/tinyllama-finetuned"
headers = {"Authorization": "Bearer YOUR_HF_TOKEN"}  # replace with your own token

def query(payload):
    response = requests.post(API_URL, headers=headers, json=payload)
    return response.json()

output = query({"inputs": "What is a large language model?"})
print(output)
```
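
The Inference API payload also accepts a `parameters` field to control generation; the values below are illustrative, not recommendations from this card:

```python
# Pass generation options alongside the prompt
output = query({
    "inputs": "<s>[INST] What is a large language model? [/INST]",
    "parameters": {"max_new_tokens": 200, "temperature": 0.7},
})
print(output)
```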