Complete Sparse Autoencoder
#15
by juiceb0xc0de - opened
SAE Performance Metrics
Across all 30 layers of SmolLM2-135M-Instruct, the SAEs have 0% dead features and stable L0 sparsity: mean L0 stays between 43.45 and 52.01.
Note: Reconstruction loss scales up in later layers (from 0.2074 at layer 0 to 196.36 at layer 29) due to shifting activation magnitudes, but Explained Variance (EV) stays above 0.88 at every layer and peaks at 0.9905 at layer 11.
| Layer | EV | Mean L0 | Recon Loss | Dead % |
|---|---|---|---|---|
| 0 | 0.9480 | 48.74 | 0.2074 | 0.0% |
| 1 | 0.9599 | 43.65 | 0.3298 | 0.0% |
| 2 | 0.9631 | 46.81 | 0.5021 | 0.0% |
| 3 | 0.9508 | 46.56 | 0.7462 | 0.0% |
| 4 | 0.9463 | 46.23 | 0.8936 | 0.0% |
| 5 | 0.9350 | 47.57 | 1.1605 | 0.0% |
| 6 | 0.9306 | 48.44 | 1.3838 | 0.0% |
| 7 | 0.9318 | 49.51 | 1.5446 | 0.0% |
| 8 | 0.9432 | 46.52 | 1.6598 | 0.0% |
| 9 | 0.9373 | 47.15 | 2.0706 | 0.0% |
| 10 | 0.9348 | 45.53 | 2.2983 | 0.0% |
| 11 | 0.9905 | 48.58 | 5.8113 | 0.0% |
| 12 | 0.9901 | 48.42 | 6.1039 | 0.0% |
| 13 | 0.9891 | 46.15 | 6.9692 | 0.0% |
| 14 | 0.9884 | 44.76 | 7.1844 | 0.0% |
| 15 | 0.9863 | 47.63 | 8.6521 | 0.0% |
| 16 | 0.9840 | 43.45 | 10.179 | 0.0% |
| 17 | 0.9808 | 45.88 | 12.363 | 0.0% |
| 18 | 0.9811 | 47.17 | 12.300 | 0.0% |
| 19 | 0.9775 | 46.23 | 15.521 | 0.0% |
| 20 | 0.9726 | 48.23 | 18.727 | 0.0% |
| 21 | 0.9667 | 46.49 | 24.817 | 0.0% |
| 22 | 0.9594 | 46.15 | 31.128 | 0.0% |
| 23 | 0.9418 | 45.13 | 47.936 | 0.0% |
| 24 | 0.9397 | 45.35 | 57.931 | 0.0% |
| 25 | 0.9299 | 46.33 | 74.413 | 0.0% |
| 26 | 0.9212 | 45.32 | 92.727 | 0.0% |
| 27 | 0.9139 | 45.69 | 118.75 | 0.0% |
| 28 | 0.8809 | 46.56 | 129.83 | 0.0% |
| 29 | 0.8812 | 52.01 | 196.36 | 0.0% |
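For readers who want to reproduce columns like these, here is a minimal sketch of how the four metrics are commonly computed. The post does not state the exact formulas used, so the definitions below (squared error summed over the hidden dimension, EV as one minus residual over total variance, a feature counted as dead if it never fires on the evaluated batch) are assumptions, and `sae_metrics` is a hypothetical helper name. The synthetic data at the end also illustrates the note above: rescaling the activations multiplies reconstruction loss by the square of the scale while leaving EV unchanged.

```python
import numpy as np


def sae_metrics(x, x_hat, z):
    """Compute common SAE quality metrics (assumed definitions, see lead-in).

    x:     (n_tokens, d_model) original activations
    x_hat: (n_tokens, d_model) SAE reconstructions
    z:     (n_tokens, n_features) SAE feature activations
    """
    # Reconstruction loss: squared error summed over d_model, averaged over tokens.
    recon_loss = float(np.mean(np.sum((x - x_hat) ** 2, axis=1)))
    # Total variance of the activations around their per-dimension mean.
    total_var = float(np.mean(np.sum((x - x.mean(axis=0)) ** 2, axis=1)))
    # Explained variance: fraction of the variance captured by the reconstruction.
    ev = 1.0 - recon_loss / total_var
    # Mean L0: average number of non-zero features per token.
    mean_l0 = float(np.mean(np.count_nonzero(z, axis=1)))
    # Dead %: share of features that never fire on this batch.
    dead_pct = 100.0 * float(np.mean(~np.any(z != 0, axis=0)))
    return {"ev": ev, "mean_l0": mean_l0, "recon_loss": recon_loss, "dead_pct": dead_pct}


# Synthetic stand-in data (not the real model activations).
rng = np.random.default_rng(0)
x = rng.normal(size=(256, 16))
x_hat = x + 0.1 * rng.normal(size=x.shape)  # reconstruction with small error
z = np.zeros((256, 32))
z[:, :4] = 1.0  # by construction, exactly 4 of 32 features fire on every token

base = sae_metrics(x, x_hat, z)
# Same data with activations 10x larger, as might happen in a later layer.
scaled = sae_metrics(10 * x, 10 * x_hat, z)

print(base)
print(scaled)
```

Under these definitions the scaled run reports a reconstruction loss 100 times larger with the same EV, which is why EV is the more comparable number across layers.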
https://huggingface.co/datasets/juiceb0xc0de/smollm2-135m-instruct-SAE