# SmolLM2-135M-Instruct for TinyBrain

This repository packages `HuggingFaceTB/SmolLM2-135M-Instruct` as a compiled Core ML bundle.
## Bundle output

The conversion script produces:

`smollm2_135m_instruct_only_logits.mlmodelc.zip`

The zip contains:

- `smollm2_135m_instruct_only_logits.mlmodelc/` (the compiled Core ML model)
- `tokenizer/` (tokenizer files)
- `metadata.json` (runtime metadata)
## Runtime interface

- Task: causal language modeling
- Inputs:
  - `input_ids` (Int32, shape 1 x 32)
  - `attention_mask` (Int32, shape 1 x 32)
- Output:
  - `logits`
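Because the model takes fixed-shape 1 x 32 inputs, callers must pad or truncate token ids before every prediction. A minimal sketch of that shaping step, assuming right-padding and a pad id of 0 (neither is specified by this repo):

```python
# Sketch: shaping token ids into the model's fixed 1 x 32 inputs.
# Assumptions (not from this repo): right-padding, pad token id 0.

SEQ_LEN = 32

def make_inputs(token_ids, seq_len=SEQ_LEN, pad_id=0):
    """Truncate/pad token ids and build the matching attention mask."""
    ids = list(token_ids)[-seq_len:]   # keep the most recent tokens
    mask = [1] * len(ids)              # 1 = real token, 0 = padding
    pad = seq_len - len(ids)
    ids += [pad_id] * pad
    mask += [0] * pad
    return {"input_ids": [ids], "attention_mask": [mask]}  # shape 1 x 32
```

The returned dictionary mirrors the input names above; a real caller would convert the nested lists to whatever array type its Core ML binding expects.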
## Important

This bundle is marked as `tinybrain-causallm-llama-v1`. That is intentional: SmolLM2 uses a Llama-style tokenizer and chat template.

The bundle also includes TinyBrain runtime hints in `metadata.json`:

- `chat_template_type`
- `tokenizer_family`
- `default_system_prompt`
- `sampling_defaults`
- `runtime_profile`

These fields let TinyBrain adapt prompt rendering, history trimming, and sampling defaults without hardcoding the model name in the app.
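A sketch of how a runtime might consume these hints. The field names come from this README; the example values and fallback defaults are hypothetical:

```python
import json

# Hypothetical metadata.json content; only the field names are from the README.
example = json.dumps({
    "chat_template_type": "llama",
    "tokenizer_family": "llama",
    "default_system_prompt": "You are a helpful assistant.",
    "sampling_defaults": {"temperature": 0.7, "top_p": 0.9},
    "runtime_profile": "cpu_only",
})

def load_hints(raw: str) -> dict:
    """Read runtime hints with conservative fallbacks for missing fields."""
    meta = json.loads(raw)
    return {
        "chat_template_type": meta.get("chat_template_type", "llama"),
        "system_prompt": meta.get("default_system_prompt", ""),
        "sampling": meta.get("sampling_defaults", {"temperature": 1.0}),
    }
```

Falling back to defaults when a field is absent keeps older or third-party bundles usable without app-side special cases, which is the point of carrying these hints in the bundle.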
## Build

Run:

`python3 convert_smollm2_135m_instruct_to_coreml.py`
The script:
- Downloads the model and tokenizer from Hugging Face.
- Converts the model to Core ML.
- Compiles the package to `.mlmodelc`.
- Writes runtime metadata.
- Packages the compiled model, tokenizer, and metadata into one zip.