LightweightLLM-135M

Table of Contents

  1. Model Summary
  2. How to Use
  3. Evaluation
  4. Limitations
  5. Training
  6. License
  7. Citation

Model Summary

LightweightLLM-135M is a compact, efficient language model designed for on-device usage and low-resource environments. It is derived from the SmolLM2 family, providing strong instruction-following and reasoning capabilities in a lightweight format.

  • Parameters: 135M
  • Use Cases: Text generation, summarization, rewriting, basic reasoning
  • Language: English
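
The 135M parameter count is what makes on-device use practical: the weights alone fit comfortably in a few hundred megabytes. A rough back-of-the-envelope estimate (weights only, ignoring activations and the KV cache):

```python
# Weights-only memory estimate for a 135M-parameter model.
# Activations and the KV cache add overhead on top of this.
params = 135_000_000

bf16_mb = params * 2 / 1024**2   # bfloat16: 2 bytes per parameter
fp32_mb = params * 4 / 1024**2   # float32: 4 bytes per parameter

print(f"bfloat16: ~{bf16_mb:.0f} MB")
print(f"float32:  ~{fp32_mb:.0f} MB")
```

This works out to roughly 260 MB in bfloat16 and about double that in float32, which is why the bfloat16 loading path shown below is the natural choice for memory-constrained devices.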

How to Use

Installation

pip install transformers

Example Usage

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

checkpoint = "4lph4v3rs3/lightweightl-LLM-135M"
device = "cuda" if torch.cuda.is_available() else "cpu"

tokenizer = AutoTokenizer.from_pretrained(checkpoint)
model = AutoModelForCausalLM.from_pretrained(checkpoint).to(device)

inputs = tokenizer.encode("Gravity is", return_tensors="pt").to(device)
# Without max_new_tokens, generate() stops at the default max_length (20 tokens)
outputs = model.generate(inputs, max_new_tokens=50)

print(tokenizer.decode(outputs[0], skip_special_tokens=True))

Using bfloat16

import torch
from transformers import AutoTokenizer, AutoModelForCausalLM

checkpoint = "4lph4v3rs3/lightweightl-LLM-135M"

tokenizer = AutoTokenizer.from_pretrained(checkpoint)
model = AutoModelForCausalLM.from_pretrained(
    checkpoint,
    device_map="auto",
    torch_dtype=torch.bfloat16
)

# Use model.device so this also works when device_map places the model on CPU
inputs = tokenizer.encode("Gravity is", return_tensors="pt").to(model.device)
outputs = model.generate(inputs, max_new_tokens=50)

print(tokenizer.decode(outputs[0], skip_special_tokens=True))
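
Since the model is advertised as instruction-following, chat-style prompting may work better than raw completion. The sketch below shows the ChatML-style format used by the SmolLM2 family, from which this model is derived; the `build_chat_prompt` helper is hypothetical and shown only to illustrate the format. In practice, prefer `tokenizer.apply_chat_template`, which reads the template actually shipped with the checkpoint.

```python
def build_chat_prompt(messages):
    """Illustrative sketch of a ChatML-style prompt (assumption: this model
    inherits SmolLM2's chat template; verify with tokenizer.apply_chat_template)."""
    parts = []
    for m in messages:
        parts.append(f"<|im_start|>{m['role']}\n{m['content']}<|im_end|>\n")
    parts.append("<|im_start|>assistant\n")  # cue the model to answer
    return "".join(parts)

prompt = build_chat_prompt([{"role": "user", "content": "What is gravity?"}])
print(prompt)
```

The resulting string can be tokenized and passed to `model.generate` exactly like the completion prompts above.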

Evaluation

LightweightLLM-135M has been benchmarked on common NLP tasks using zero-shot evaluation, showing strong performance for a 135M-parameter model in instruction following, commonsense reasoning, and text generation.

Limitations

  • Only supports English
  • May produce incorrect or biased content
  • Not suitable as a definitive knowledge source

Use outputs as an assistive tool and verify important information independently.

Training

  • Architecture: Transformer decoder
  • Pretraining data: large-scale text and code datasets
  • Precision: bfloat16
  • Framework: Hugging Face Nanotron

License

Apache 2.0

Citation

@misc{allal2025smollm2smolgoesbig,
  title={SmolLM2: When Smol Goes Big -- Data-Centric Training of a Small Language Model},
  author={Loubna Ben Allal and others},
  year={2025},
  eprint={2502.02737},
  archivePrefix={arXiv},
  primaryClass={cs.CL}
}