LFM2-24B-A2B-Preview-GGUF
LFM2-24B-A2B-Preview in GGUF format for use with NexaSDK, with support for Qualcomm NPU, GPU, and CPU inference.
Model
This repository contains LFM2-24B-A2B-Preview, a 24B-parameter mixture-of-experts model with roughly 2B active parameters per token (the "A2B" in the name), converted to GGUF format. It is intended for on-device inference via NexaSDK on Android, Windows, and other supported platforms.
NPU Setup
Hardware: Qualcomm Snapdragon 8 Gen 4 (or other Snapdragon SoCs with NPU as documented by Nexa).
Tutorial: Run in the Android demo app
Install the app
Install the NexaSDK Android demo app.

Option A - APK: Download the pre-built APK and install it via adb:

```shell
# Download: https://nexa-model-hub-bucket.s3.us-west-1.amazonaws.com/public/android-demo-release/nexaai-demo-app.apk
adb install nexaai-demo-app.apk
```

Option B - Build from source: Clone the repo, open bindings/android in Android Studio, then build and run. See the Android demo README for full steps.

Select the model
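Before installing via adb (Option A), it can help to confirm the device is visible. A minimal sketch using standard adb commands (the `-r` flag simply replaces an existing installation, useful when reinstalling the demo app):

```shell
# List connected devices; the target should report "device", not "unauthorized"
adb devices

# Install the demo app; -r replaces an existing installation if present
adb install -r nexaai-demo-app.apk
```

If the device shows as unauthorized, accept the USB-debugging prompt on the phone and run `adb devices` again.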
Open the model selector (the dropdown next to the model name) and choose LFM2-24B-A2B-Preview-GGUF.

Download

Tap Download to fetch the model to your device, and wait until the download finishes.

Load

Tap Load. In the load model config dialog, choose CPU, GPU, or NPU (for Qualcomm NPU), then tap SURE. Once the model is loaded, the chat area becomes available.

Chat

Type your message in the input field at the bottom, then tap Send to get a response. Use Clear to clear the input or the conversation as needed.
For NPU/GPU/CPU requirements and license, see NPU Setup above and Android SDK Doc.
For the full tutorial with screenshots, see the Tutorial: LFM2-24B-A2B-Preview-GGUF section in the Android demo README on GitHub.
Usage
- Android: See NexaSDK Android documentation for running this model on devices with NPU or other backends.
- PC / other platforms: See NexaSDK for CLI and bindings.
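On PC, GGUF models can also be run straight from the NexaSDK CLI. A minimal sketch: the `nexa infer <repo>` pattern follows NexaSDK's usage examples, and the repo identifier below is assumed from this model card's title, so verify both against the NexaSDK documentation:

```shell
# Fetch the model on first use, then chat with it from the terminal
nexa infer NexaAI/LFM2-24B-A2B-Preview-GGUF
```

Backend selection (NPU/GPU/CPU) and other flags vary by platform; see the NexaSDK docs for your OS.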
Available quantizations: 4-bit, 5-bit, 6-bit, 8-bit, 16-bit
Model tree for NexaAI/LFM2-24B-A2B-GGUF
Base model: LiquidAI/LFM2-24B-A2B-Preview