Dataset: HuggingFaceTB/smoltalk
Distil-SmolLM2-135M is a distilled version of SmolLM2-1.7B-Instruct, trained on a filtered subset of the Smoltalk dataset. This release aims to provide a more capable and performant 135M-parameter generative language model for small tasks on edge devices or at scale.
The model was produced by distilling SmolLM2-1.7B-Instruct into SmolLM2-135M-Instruct. The distillation process used the Smoltalk dataset, with specific exclusions.
Intended Uses: This model is intended for research, experimentation, and general use in instruction-following and chat applications where a smaller model footprint is desired.
Limitations: The training data did not include the apigen-80k or longalign sources from Smoltalk, which might affect the model's performance on tasks related to function calling or very long context alignment.
You can use this model with the transformers library for text generation tasks.
from transformers import AutoModelForCausalLM, AutoTokenizer
model_name = "OxxoCodes/distil-SmolLM2-135M-Instruct"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)
# Make sure to use a prompt format the model was trained on.
# This example uses a ChatML-style layout with <|im_start|> / <|im_end|> markers.
# Refer to SmolLM2-1.7B-Instruct or SmolLM2-135M-Instruct for specific prompt templates if applicable.
prompt = "<|im_start|>system\nYou are a helpful AI assistant.<|im_end|>\n<|im_start|>user\nWhat is the world's largest sea mammal?<|im_end|>\n<|im_start|>assistant\n"
inputs = tokenizer(prompt, return_tensors="pt")
# do_sample=True is required for temperature and top_p to take effect.
outputs = model.generate(**inputs, max_new_tokens=50, num_return_sequences=1, do_sample=True, temperature=0.3, top_p=0.95)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
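Writing the prompt string by hand is easy to get wrong, so it can help to build it from a list of messages. The sketch below produces a prompt in the same <|im_start|> / <|im_end|> layout as the example above. The helper name `build_chatml_prompt` and its `add_generation_prompt` parameter are hypothetical and not part of this model card or of transformers, and the layout is assumed from the example rather than confirmed against the model's own template.

```python
from typing import Dict, List


def build_chatml_prompt(messages: List[Dict[str, str]],
                        add_generation_prompt: bool = True) -> str:
    """Join chat messages into a ChatML-style prompt string.

    Hypothetical helper: the layout is assumed from the usage example,
    not taken from the model's published chat template.
    """
    parts = []
    for message in messages:
        # Each turn is "<|im_start|>{role}\n{content}<|im_end|>\n".
        parts.append(
            f"<|im_start|>{message['role']}\n{message['content']}<|im_end|>\n"
        )
    if add_generation_prompt:
        # Open an assistant turn so the model continues from here.
        parts.append("<|im_start|>assistant\n")
    return "".join(parts)


messages = [
    {"role": "system", "content": "You are a helpful AI assistant."},
    {"role": "user", "content": "What is the world's largest sea mammal?"},
]
prompt = build_chatml_prompt(messages)
print(prompt)
```

If the tokenizer ships a chat template, `tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)` is likely the safer choice, since it uses whatever template the model repository defines.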
Base model: HuggingFaceTB/SmolLM2-135M