LightweightLLM-135M

Table of Contents

  1. Model Summary
  2. How to Use
  3. Evaluation
  4. Limitations
  5. Training
  6. License
  7. Citation

Model Summary

LightweightLLM-135M is a compact, efficient language model designed for on-device usage and low-resource environments. It is derived from the SmolLM2 family, providing strong instruction-following and reasoning capabilities in a lightweight format.

  • Parameters: 135M
  • Use Cases: Text generation, summarization, rewriting, basic reasoning
  • Language: English
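
The 135M parameter count is what makes on-device use practical: the weights alone fit comfortably in a few hundred megabytes. A rough back-of-the-envelope estimate (weights only, ignoring activations and the KV cache):

```python
# Weights-only memory estimate for a 135M-parameter model.
# Activations and the KV cache add overhead on top of this.
params = 135_000_000

bf16_mb = params * 2 / 1024**2   # bfloat16: 2 bytes per parameter
fp32_mb = params * 4 / 1024**2   # float32: 4 bytes per parameter

print(f"bfloat16: ~{bf16_mb:.0f} MB")
print(f"float32:  ~{fp32_mb:.0f} MB")
```

This works out to roughly 260 MB in bfloat16 and about double that in float32, which is why the bfloat16 loading path shown below is the natural choice for memory-constrained devices.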

How to Use

Installation

pip install transformers

Example Usage

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

checkpoint = "4lph4v3rs3/lightweightl-LLM-135M"
device = "cuda" if torch.cuda.is_available() else "cpu"

tokenizer = AutoTokenizer.from_pretrained(checkpoint)
model = AutoModelForCausalLM.from_pretrained(checkpoint).to(device)

inputs = tokenizer.encode("Gravity is", return_tensors="pt").to(device)
# Without max_new_tokens, generate() stops at the default max_length (20 tokens)
outputs = model.generate(inputs, max_new_tokens=50)

print(tokenizer.decode(outputs[0], skip_special_tokens=True))

Using bfloat16

import torch
from transformers import AutoTokenizer, AutoModelForCausalLM

checkpoint = "4lph4v3rs3/lightweightl-LLM-135M"

tokenizer = AutoTokenizer.from_pretrained(checkpoint)
model = AutoModelForCausalLM.from_pretrained(
    checkpoint,
    device_map="auto",
    torch_dtype=torch.bfloat16
)

# Use model.device so this also works when device_map places the model on CPU
inputs = tokenizer.encode("Gravity is", return_tensors="pt").to(model.device)
outputs = model.generate(inputs, max_new_tokens=50)

print(tokenizer.decode(outputs[0], skip_special_tokens=True))
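
Since the model is advertised as instruction-following, chat-style prompting may work better than raw completion. The sketch below shows the ChatML-style format used by the SmolLM2 family, from which this model is derived; the `build_chat_prompt` helper is hypothetical and shown only to illustrate the format. In practice, prefer `tokenizer.apply_chat_template`, which reads the template actually shipped with the checkpoint.

```python
def build_chat_prompt(messages):
    """Illustrative sketch of a ChatML-style prompt (assumption: this model
    inherits SmolLM2's chat template; verify with tokenizer.apply_chat_template)."""
    parts = []
    for m in messages:
        parts.append(f"<|im_start|>{m['role']}\n{m['content']}<|im_end|>\n")
    parts.append("<|im_start|>assistant\n")  # cue the model to answer
    return "".join(parts)

prompt = build_chat_prompt([{"role": "user", "content": "What is gravity?"}])
print(prompt)
```

The resulting string can be tokenized and passed to `model.generate` exactly like the completion prompts above.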

Evaluation

LightweightLLM-135M has been benchmarked on common NLP tasks using zero-shot evaluation, showing strong performance for a 135M-parameter model in instruction following, commonsense reasoning, and text generation.

Limitations

  • Only supports English
  • May produce incorrect or biased content
  • Not suitable as a definitive knowledge source

Use outputs as an assistive tool and verify important information independently.

Training

  • Architecture: Transformer decoder
  • Pretraining data: large-scale text and code datasets
  • Precision: bfloat16
  • Framework: Hugging Face Nanotron

License

Apache 2.0

Citation

@misc{allal2025smollm2smolgoesbig,
  title={SmolLM2: When Smol Goes Big -- Data-Centric Training of a Small Language Model},
  author={Loubna Ben Allal and others},
  year={2025},
  eprint={2502.02737},
  archivePrefix={arXiv},
  primaryClass={cs.CL}
}