## Use with the Transformers library

```python
# Use a pipeline as a high-level helper
from transformers import pipeline

pipe = pipeline("text-generation", model="amakhov/tiny-random-llama")
messages = [
    {"role": "user", "content": "Who are you?"},
]
pipe(messages)
```
```python
# Load the tokenizer and model directly
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("amakhov/tiny-random-llama")
model = AutoModelForCausalLM.from_pretrained("amakhov/tiny-random-llama")

messages = [
    {"role": "user", "content": "Who are you?"},
]
inputs = tokenizer.apply_chat_template(
    messages,
    add_generation_prompt=True,
    tokenize=True,
    return_dict=True,
    return_tensors="pt",
).to(model.device)

outputs = model.generate(**inputs, max_new_tokens=40)
print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[-1]:]))
```
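The final slice strips the prompt tokens from the generated sequence, since `model.generate` returns the prompt followed by the continuation. A toy illustration of that indexing, with plain Python lists standing in for the tensors:

```python
# Stand-in for inputs["input_ids"][0]: the encoded prompt
prompt_ids = [101, 7592, 102]

# Stand-in for outputs[0]: generate() returns prompt ids + new tokens
full_output = prompt_ids + [2054, 2024]

# Slicing off the prompt length leaves only the newly generated ids,
# which is what gets decoded and printed above
new_tokens = full_output[len(prompt_ids):]
print(new_tokens)  # [2054, 2024]
```

Decoding `outputs[0]` without the slice would echo the chat-template prompt back into the printed text.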
# Tiny Random LLaMA

A really tiny (<10 MB) LLaMA model with random weights for testing and development.

⚠️ Outputs are meaningless; this model is for sandbox/testing purposes only.

Useful for testing LLM applications without downloading large models or relying on external APIs.

Safetensors model size: 4.18M parameters (F32).