SmolLM2-135M-Instruct for TinyBrain

This repository packages HuggingFaceTB/SmolLM2-135M-Instruct as a compiled Core ML bundle.

Bundle output

The conversion script produces:

  • smollm2_135m_instruct_only_logits.mlmodelc.zip

The zip layout is:

  • smollm2_135m_instruct_only_logits.mlmodelc/
  • tokenizer/
  • metadata.json
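A quick way to sanity-check a downloaded bundle is to confirm the three top-level entries above are present. This is a minimal sketch; `check_bundle_layout` is a hypothetical helper, not part of the conversion script.

```python
import zipfile

# Top-level entries the README lists for the bundle zip.
EXPECTED_ENTRIES = (
    "smollm2_135m_instruct_only_logits.mlmodelc/",
    "tokenizer/",
    "metadata.json",
)

def check_bundle_layout(zip_path):
    """Return the expected top-level entries missing from the zip (empty = OK)."""
    with zipfile.ZipFile(zip_path) as zf:
        names = zf.namelist()
    missing = []
    for entry in EXPECTED_ENTRIES:
        # Directories may appear only implicitly via their members,
        # so accept either an exact entry or any member under the prefix.
        if not any(n == entry or n.startswith(entry) for n in names):
            missing.append(entry)
    return missing
```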

Runtime interface

  • Task: causal language modeling
  • Inputs:
    • input_ids (Int32, shape 1 x 32)
    • attention_mask (Int32, shape 1 x 32)
  • Output:
    • logits
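Because the model takes a fixed 1 x 32 window, callers must pad or truncate token ids before prediction. The sketch below shows one way to shape the two inputs; `prepare_inputs` is a hypothetical helper, `pad_id=0` is an assumption (use the bundled tokenizer's pad or eos id in practice), and whether the traced model expects left- or right-padding depends on how the conversion was done.

```python
import numpy as np

def prepare_inputs(token_ids, seq_len=32, pad_id=0):
    """Pad or truncate token ids to the model's fixed 1 x seq_len window.

    Assumptions: pad_id=0 stands in for the real pad/eos token id, and
    padding is applied on the right; adjust to match the traced model.
    """
    ids = list(token_ids)[-seq_len:]  # keep only the most recent tokens
    mask = [1] * len(ids) + [0] * (seq_len - len(ids))
    ids = ids + [pad_id] * (seq_len - len(ids))
    input_ids = np.asarray([ids], dtype=np.int32)        # shape (1, 32)
    attention_mask = np.asarray([mask], dtype=np.int32)  # shape (1, 32)
    return input_ids, attention_mask
```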

Important

This bundle is deliberately marked as tinybrain-causallm-llama-v1, even though the model is not a Llama checkpoint: SmolLM2 uses a Llama-style tokenizer and chat template, so the Llama bundle type applies.

The bundle also includes TinyBrain runtime hints in metadata.json:

  • chat_template_type
  • tokenizer_family
  • default_system_prompt
  • sampling_defaults
  • runtime_profile

These fields let TinyBrain adapt prompt rendering, history trimming, and sampling defaults without hardcoding the model name in the app.
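A consumer can read those hints with fallbacks so the app still runs if a field is absent. This is a sketch only: `load_runtime_hints` is hypothetical, and the fallback values (and the keys inside `sampling_defaults`) are assumptions, not what the conversion script actually writes.

```python
import json

# Assumed fallbacks; the real defaults live in metadata.json itself.
FALLBACK_SAMPLING = {"temperature": 0.7, "top_p": 0.9}

def load_runtime_hints(metadata_path):
    """Read TinyBrain runtime hints from metadata.json with safe fallbacks."""
    with open(metadata_path) as f:
        meta = json.load(f)
    return {
        "chat_template_type": meta.get("chat_template_type", "llama"),
        "tokenizer_family": meta.get("tokenizer_family", "llama"),
        "system_prompt": meta.get("default_system_prompt", ""),
        # Values written by the script win over the fallbacks.
        "sampling": {**FALLBACK_SAMPLING, **meta.get("sampling_defaults", {})},
        "runtime_profile": meta.get("runtime_profile", {}),
    }
```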

Build

Run:

python3 convert_smollm2_135m_instruct_to_coreml.py

The script:

  1. Downloads the model and tokenizer from Hugging Face.
  2. Converts the model to Core ML.
  3. Compiles the package to .mlmodelc.
  4. Writes runtime metadata.
  5. Packages the compiled model, tokenizer, and metadata into one zip.
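Step 5 can be sketched as a small zip routine over the layout this README lists. `package_bundle` is an illustrative stand-in for what the conversion script does, not the script's actual code.

```python
import os
import zipfile

def package_bundle(out_zip, mlmodelc_dir, tokenizer_dir, metadata_path):
    """Zip the compiled model, tokenizer, and metadata into one bundle.

    Archive entry names mirror the bundle layout described above; this is
    a sketch of the packaging step, not the conversion script itself.
    """
    with zipfile.ZipFile(out_zip, "w", zipfile.ZIP_DEFLATED) as zf:
        for src_dir in (mlmodelc_dir, tokenizer_dir):
            base = os.path.basename(os.path.normpath(src_dir))
            for root, _dirs, files in os.walk(src_dir):
                for name in files:
                    full = os.path.join(root, name)
                    # Store files under their top-level directory name.
                    rel = os.path.join(base, os.path.relpath(full, src_dir))
                    zf.write(full, rel)
        zf.write(metadata_path, "metadata.json")
```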