metadata
language: en
license: gemma
base_model: google/functiongemma-270m-it
tags:
  - coreml
  - apple-neural-engine
  - gemma3
  - function-calling
  - on-device
library_name: coreml

FunctionGemma-270M for Apple CoreML (ANE-optimized)

CoreML conversion of google/functiongemma-270m-it produced with the CoreML-LLM pipeline. Targets iOS 26 / macOS 26.

What's in this repo

| File | Notes |
|---|---|
| `model.mlmodelc/` | Compiled stateful decoder (fp16, 840 MB). Drop-in for `MLModel(contentsOf:)` |
| `model_config.json` | Bundle metadata (architecture, dims, function-call markers) |
| `hf_model/` | Tokenizer + chat template (function-calling format) |
| `cos_*.npy`, `sin_*.npy` | Pre-computed RoPE tables (optional) |
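
For readers unfamiliar with RoPE tables, here is a minimal sketch of how cosine and sine tables are typically computed for rotary position embeddings. It is not the CoreML-LLM pipeline's code: the function name `ropeTables`, the base of 10,000, and the half-dimension layout are illustrative assumptions, and the shipped `.npy` files may use a different base or layout.

```swift
import Foundation

/// Builds RoPE lookup tables of shape [maxPositions][headDim / 2].
/// Assumptions (not taken from this repo): standard RoPE frequencies
/// base^(-2i/headDim) and a half-dimension table layout.
func ropeTables(maxPositions: Int, headDim: Int, base: Double = 10_000)
    -> (cos: [[Float]], sin: [[Float]])
{
    precondition(headDim % 2 == 0, "headDim must be even")
    let half = headDim / 2
    // One inverse frequency per rotated pair of dimensions.
    let invFreq: [Double] = (0..<half).map { i in
        1.0 / pow(base, Double(2 * i) / Double(headDim))
    }
    var cosTable: [[Float]] = []
    var sinTable: [[Float]] = []
    for pos in 0..<maxPositions {
        // Rotation angle for this position and each frequency.
        let angles = invFreq.map { Double(pos) * $0 }
        cosTable.append(angles.map { Float(cos($0)) })
        sinTable.append(angles.map { Float(sin($0)) })
    }
    return (cosTable, sinTable)
}

// Hypothetical dimensions, chosen only for illustration.
let tables = ropeTables(maxPositions: 4, headDim: 8)
```

Position 0 always yields cosine 1 and sine 0, which is a quick sanity check when comparing against tables loaded from disk.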

ANE residency

99.42% of dispatched ops run on the Apple Neural Engine (1893 of 1904, verified via MLComputePlan on macOS 26). The remaining 11 ops run on the CPU only; they are unavoidable input-boundary ops (token gather, argmax, scalar squeeze).
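
A residency figure like this can be approximated with `MLComputePlan`. The sketch below is an assumption about how such a count could be made, not the script used for this repo: it assumes an ML Program whose entry function is named `main`, counts only top-level operations, and treats an op as ANE-resident when the ANE is its preferred device. The helper names are hypothetical.

```swift
import CoreML
import Foundation

/// Percentage of ops resident on the ANE, e.g. 1893 of 1904.
func residencyPercent(aneOps: Int, totalOps: Int) -> Double {
    guard totalOps > 0 else { return 0 }
    return 100.0 * Double(aneOps) / Double(totalOps)
}

/// Counts top-level ops whose preferred device is the Neural Engine.
/// Assumes an ML Program with a "main" function; nested blocks are not visited.
@available(macOS 14.4, iOS 17.4, *)
func countPreferredANEOps(modelURL: URL) async throws -> (ane: Int, total: Int) {
    let config = MLModelConfiguration()
    config.computeUnits = .cpuAndNeuralEngine
    let plan = try await MLComputePlan.load(contentsOf: modelURL,
                                            configuration: config)
    guard case let .program(program) = plan.modelStructure,
          let main = program.functions["main"] else {
        return (0, 0)
    }
    var ane = 0
    var total = 0
    for op in main.block.operations {
        // Ops without device usage info (e.g. constants) are skipped.
        guard let usage = plan.deviceUsage(for: op) else { continue }
        total += 1
        if case .neuralEngine = usage.preferred { ane += 1 }
    }
    return (ane, total)
}

// Arithmetic check against the numbers quoted above.
let quoted = residencyPercent(aneOps: 1893, totalOps: 1904)
```

Counts from this sketch may differ from the quoted figure, depending on which ops are treated as dispatched.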

Use it

Via the CoreML-LLM Swift package:

import CoreMLLLM
let bundleURL = try await Gemma3BundleDownloader.download(
    .functionGemma270m, into: appSupportDir)
let fg = try await FunctionGemma.load(bundleURL: bundleURL)
let text = try fg.generate(prompt: "Turn on the flashlight",
                           maxNewTokens: 64)

For raw Core ML usage, the model expects the same I/O contract as Gemma 4: input_ids (1,1) int32, position_ids (1,) int32, causal_mask (1,1,1,ctx) fp16, update_mask (1,1,ctx,1) fp16, with a stateful kv_cache_0 MLState (2*L, kv_heads, ctx, head_dim).
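
To make the mask shapes concrete, here is a minimal sketch of building the two per-step masks for a single decode position. The semantics are assumptions, since this README does not spell them out: `causal_mask` is taken to be additive (0 for visible positions, a large negative value for future ones) and `update_mask` to be one-hot at the cache slot being written. `StepMasks` and `makeStepMasks` are hypothetical names, and the context length of 8 is illustrative only.

```swift
import Foundation

/// Flattened mask buffers for one decode step.
struct StepMasks {
    let causal: [Float]  // logical shape (1, 1, 1, ctx)
    let update: [Float]  // logical shape (1, 1, ctx, 1)
}

/// Builds masks for the token at `position` (0-based).
/// Assumed semantics: additive causal mask, one-hot update mask.
func makeStepMasks(position: Int,
                   contextLength: Int,
                   maskedValue: Float = -65504) -> StepMasks {
    precondition(position >= 0 && position < contextLength,
                 "position must lie inside the context window")
    // -65504 is the most negative finite fp16 value.
    var causal = [Float](repeating: maskedValue, count: contextLength)
    var update = [Float](repeating: 0, count: contextLength)
    for i in 0...position {
        causal[i] = 0        // current and past positions are visible
    }
    update[position] = 1     // write the new K/V entry into this slot
    return StepMasks(causal: causal, update: update)
}

// Example: third token (position 2) in a hypothetical 8-slot context.
let masks = makeStepMasks(position: 2, contextLength: 8)
```

In an app, these values would be copied into fp16 `MLMultiArray` inputs of the shapes listed above, alongside `input_ids` and `position_ids`.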

License

Inherits Google's Gemma terms of use.