Note: for the ExecuWhisper-specific fine-tuned formatter, see younghan-meta/LFM2.5-350M-ExecuWhisper-Formatter. This repo continues to host the upstream-base LFM2.5 export.

LFM2.5 ExecuTorch MLX

Pre-exported ExecuTorch artifacts for LiquidAI LFM2.5 models with the MLX backend for Apple Silicon.

This repo is an artifact companion for ExecuTorch MLX inference. It ships the .pte files so you can skip export and run directly with an MLX-enabled ExecuTorch Llama runner.

Overview

The pipeline has two stages:

Export: convert the Hugging Face checkpoints into ExecuTorch .pte artifacts with MLX delegation.
Inference: run the artifacts with the shared ExecuTorch Llama C++ runner or pybindings runner.

The artifacts were exported with bf16 model dtype and 4-bit weight-only linear quantization.

Files

| File | Size | Contents |
| --- | --- | --- |
| `lfm2_5_350m_mlx_4w.pte` | 308 MiB | LFM2.5 350M lowered to ExecuTorch MLX |
| `lfm2_5_1_2b_mlx_4w.pte` | 849 MiB | LFM2.5 1.2B Instruct lowered to ExecuTorch MLX |

Tokenizers are not included. Download tokenizer files from the matching upstream LiquidAI Hugging Face checkpoints.

Performance

Validated on Apple Silicon with the ExecuTorch C++ llama_main runner at temperature=0, using the prompt:

<|startoftext|><|im_start|>user
Who are you?<|im_end|>
<|im_start|>assistant

Median of 3 fresh runs:

| Artifact | Model load | Prompt eval | Decode | Total | TTFT |
| --- | --- | --- | --- | --- | --- |
| `lfm2_5_350m_mlx_4w.pte` | 0.071 s | 650.00 tok/s | 330.43 tok/s | 312.33 tok/s | 0.020 s |
| `lfm2_5_1_2b_mlx_4w.pte` | 0.073 s | 481.48 tok/s | 147.93 tok/s | 136.24 tok/s | 0.027 s |

These are smoke benchmarks from short-context exports, not a full performance sweep: the 350M artifact was exported with max_seq_length=128 and the 1.2B artifact with max_seq_length=64, so the numbers say nothing about long-context throughput.
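
To summarize your own measurements the same way as the "median of 3 fresh runs" above, the middle value can be picked with standard tools. This is only a sketch, not the script used to produce the table; `median3` is a hypothetical helper name and the sample values are made up.

```shell
# Hypothetical helper: print the median of exactly three numeric samples.
median3() {
    # Sort numerically (C locale so "." is the decimal point),
    # one value per line, and keep the middle line.
    printf '%s\n' "$1" "$2" "$3" | LC_ALL=C sort -n | sed -n '2p'
}

# Example with three made-up model-load times in seconds.
median3 0.074 0.071 0.069   # prints 0.071
```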

Prerequisites

macOS on Apple Silicon.
ExecuTorch built from source with EXECUTORCH_BUILD_MLX=ON.
Tokenizer files from the matching upstream LiquidAI checkpoints.
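
Before building, it can help to confirm that the shell is really running on Apple Silicon macOS. The check below is a minimal sketch; `is_apple_silicon` is a hypothetical helper, and its optional arguments exist only so the logic can be exercised on other machines. Note that a shell running under Rosetta reports `x86_64` and will fail the check.

```shell
# Hypothetical preflight check: succeed only on macOS running natively on arm64.
# Optional arguments override the detected values (useful for testing).
is_apple_silicon() {
    os="${1:-$(uname -s)}"
    arch="${2:-$(uname -m)}"
    [ "$os" = "Darwin" ] && [ "$arch" = "arm64" ]
}

if is_apple_silicon; then
    echo "Apple Silicon macOS detected"
else
    echo "Not Apple Silicon macOS: the MLX backend is not expected to work here" >&2
fi
```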

git clone https://github.com/pytorch/executorch ~/executorch
cd ~/executorch

./install_executorch.sh
pip install -e . --no-build-isolation
make lfm_2_5-mlx

The artifacts were validated against an ExecuTorch branch containing commit:

e4bd2e653e Enable LFM2.5 MLX export and runner build

Download

pip install -U huggingface_hub  # the hf CLI ships with recent huggingface_hub releases

hf download younghan-meta/LFM2.5-ExecuTorch-MLX \
    --local-dir lfm25_mlx

hf download LiquidAI/LFM2.5-350M \
    tokenizer.json tokenizer_config.json \
    --local-dir lfm25_350m_base

hf download LiquidAI/LFM2.5-1.2B-Instruct \
    tokenizer.json tokenizer_config.json \
    --local-dir lfm25_1_2b_base

Run

The commands below use relative paths: run them from the ExecuTorch checkout, and adjust the lfm25_* paths if you downloaded the files elsewhere.

350M:

cmake-out/examples/models/llama/llama_main \
    --model_path lfm25_mlx/lfm2_5_350m_mlx_4w.pte \
    --tokenizer_path lfm25_350m_base/tokenizer.json \
    --prompt="<|startoftext|><|im_start|>user\nWho are you?<|im_end|>\n<|im_start|>assistant\n" \
    --temperature 0.3 \
    --max_new_tokens 64

1.2B Instruct:

cmake-out/examples/models/llama/llama_main \
    --model_path lfm25_mlx/lfm2_5_1_2b_mlx_4w.pte \
    --tokenizer_path lfm25_1_2b_base/tokenizer.json \
    --prompt="<|startoftext|><|im_start|>user\nWho are you?<|im_end|>\n<|im_start|>assistant\n" \
    --temperature 0.3 \
    --max_new_tokens 48
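
Both commands pass the same chat-formatted prompt. To try other questions, a small helper can wrap a user message in that template. This is a sketch under one assumption: it emits literal `\n` sequences exactly as they appear in the `--prompt` strings above, and whether the runner turns them into real newlines is not verified here. `lfm_prompt` is a hypothetical name.

```shell
# Hypothetical helper: wrap one user message in the chat template used above.
# printf '%s' copies its arguments verbatim, so each \n stays a literal
# backslash-n, matching the --prompt strings in the commands above.
lfm_prompt() {
    printf '%s' '<|startoftext|><|im_start|>user\n' "$1" \
        '<|im_end|>\n<|im_start|>assistant\n'
}

# Show the generated prompt, followed by a newline for readability.
lfm_prompt 'Who are you?'; echo
```

The result can be passed to the runner as `--prompt="$(lfm_prompt 'Who are you?')"`.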

Re-export

From an ExecuTorch checkout with the LFM2.5 MLX config:

python -m extension.llm.export.export_llm \
    --config examples/models/lfm2/config/lfm2_mlx_4w.yaml \
    +base.model_class="lfm2_5_350m" \
    +base.params="examples/models/lfm2/config/lfm2_5_350m_config.json" \
    +export.output_name="lfm2_5_350m_mlx_4w.pte"

python -m extension.llm.export.export_llm \
    --config examples/models/lfm2/config/lfm2_mlx_4w.yaml \
    +base.model_class="lfm2_5_1_2b" \
    +base.params="examples/models/lfm2/config/lfm2_5_1_2b_config.json" \
    +export.output_name="lfm2_5_1_2b_mlx_4w.pte"

Checksums

7241608ed90239a5cc7464d010b4fa5a62694c07034a194139b3e7dd543ebaef  lfm2_5_350m_mlx_4w.pte
b243eb401c7f7a428a41fa698d174693655fb3209be160c64947a6b823fbc075  lfm2_5_1_2b_mlx_4w.pte
