---
language: en
license: gemma
base_model: google/functiongemma-270m-it
tags:
- coreml
- apple-neural-engine
- gemma3
- function-calling
- on-device
library_name: coreml
---

## Use it from Swift
|
|
<!-- swift-usage-begin -->
### Add the package

In `Package.swift`:

```swift
// In your package's dependencies:
.package(url: "https://github.com/john-rocky/CoreML-LLM", branch: "main"),

// In your target's dependencies:
.product(name: "CoreMLLLM", package: "CoreML-LLM"),
```

Platforms: iOS 18+ / macOS 15+.
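
For context, here is a minimal sketch of a complete manifest showing where the two lines above go. The package and target names (`MyApp`) are hypothetical, and the tools version is an assumption chosen so that the `.v18` / `.v15` platform constants are available.

```swift
// swift-tools-version: 6.0
import PackageDescription

let package = Package(
    name: "MyApp",  // hypothetical package name
    platforms: [.iOS(.v18), .macOS(.v15)],  // matches the platforms listed above
    dependencies: [
        .package(url: "https://github.com/john-rocky/CoreML-LLM", branch: "main"),
    ],
    targets: [
        .executableTarget(
            name: "MyApp",  // hypothetical target name
            dependencies: [
                .product(name: "CoreMLLLM", package: "CoreML-LLM"),
            ]
        ),
    ]
)
```

Tracking `branch: "main"` means builds can change as the upstream repository moves; pinning to a specific revision is an option if reproducibility matters.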
|
|
### Download + call

```swift
import CoreMLLLM

let modelsDir = try FileManager.default.url(
    for: .applicationSupportDirectory, in: .userDomainMask,
    appropriateFor: nil, create: true)

// Pulls the bundle from this repo on first call, then loads.
let fg = try await FunctionGemma.downloadAndLoad(modelsDir: modelsDir)

// Plain chat-templated generation
let stream = try await fg.generate("How do I list files in Swift?")
for await chunk in stream { print(chunk, terminator: "") }

// Function-call generation (returns a single JSON object)
let json = try await fg.generateFunctionCall(
    tools: tools,            // your [String: Any] schema
    userMessage: "Get me weather for Tokyo")
print(json)
```

See [`Gemma3FunctionGemma.swift`](https://github.com/john-rocky/CoreML-LLM/blob/main/Sources/CoreMLLLM/Gemma3FunctionGemma.swift)
for the full API.
<!-- swift-usage-end -->
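
The snippet above leaves `tools` undefined. As a rough sketch, a `[String: Any]` schema might look like the following. The key layout (`name`, `description`, `parameters`) follows the JSON-Schema style commonly used for function calling and is an assumption, not something this card confirms; `parseCall` is a hypothetical helper that assumes the call comes back as a JSON string shaped like `{"name": ..., "arguments": {...}}`. Check `Gemma3FunctionGemma.swift` for the actual expected formats.

```swift
import Foundation

// Hypothetical tool schema. The exact keys FunctionGemma expects are an
// assumption; this only illustrates the [String: Any] shape.
let tools: [String: Any] = [
    "name": "get_weather",
    "description": "Look up the current weather for a city.",
    "parameters": [
        "type": "object",
        "properties": [
            "city": ["type": "string", "description": "City name, e.g. Tokyo"]
        ],
        "required": ["city"]
    ] as [String: Any]
]

/// Hypothetical helper: extracts the function name and arguments from a
/// function-call JSON string. Assumes the shape {"name": ..., "arguments": {...}}.
func parseCall(_ text: String) -> (name: String, arguments: [String: Any])? {
    guard let data = text.data(using: .utf8),
          let obj = try? JSONSerialization.jsonObject(with: data) as? [String: Any],
          let name = obj["name"] as? String else { return nil }
    // Missing or malformed arguments fall back to an empty dictionary.
    return (name, obj["arguments"] as? [String: Any] ?? [:])
}
```

Returning `nil` on malformed output lets the caller decide whether to retry generation or fall back to plain chat.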
|
|
|
|
|
|
# FunctionGemma-270M for Apple CoreML (ANE-optimized)

CoreML conversion of `google/functiongemma-270m-it` produced with the
[CoreML-LLM](https://github.com/john-rocky/CoreML-LLM) pipeline. Targets
iOS 26 / macOS 26.
|
|
## What's in this repo

| File | Notes |
|---|---|
| `model.mlmodelc/` | Compiled stateful decoder (fp16, 840 MB). Drop-in for `MLModel(contentsOf:)` |
| `model_config.json` | Bundle metadata (architecture, dims, function-call markers) |
| `hf_model/` | Tokenizer + chat template (function-calling format) |
| `cos_*.npy`, `sin_*.npy` | Pre-computed RoPE tables (optional) |
|
|
## ANE residency

**99.42% on Apple Neural Engine** (1893/1904 dispatched ops, verified via
`MLComputePlan` on macOS 26). The 11 CPU-only ops are unavoidable
input-boundary ops (token gather, argmax, scalar squeeze).
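
A minimal sketch of how such a figure could be reproduced with `MLComputePlan`. The helper names (`anePercentage`, `countANEOps`) are hypothetical, and the counting rule (every operation in the `main` function that reports a device usage) is an assumption that may differ from how the numbers above were obtained.

```swift
import Foundation
#if canImport(CoreML)
import CoreML
#endif

/// Share of operations placed on the Neural Engine, as a percentage.
func anePercentage(aneOps: Int, totalOps: Int) -> Double {
    guard totalOps > 0 else { return 0 }
    return Double(aneOps) / Double(totalOps) * 100
}

#if canImport(CoreML)
/// Counts operations whose preferred device is the Neural Engine.
/// Assumes an ML Program model with a function named "main".
@available(macOS 14.4, iOS 17.4, tvOS 17.4, watchOS 10.4, *)
func countANEOps(modelURL: URL) async throws -> (ane: Int, total: Int) {
    let config = MLModelConfiguration()
    config.computeUnits = .cpuAndNeuralEngine
    let plan = try await MLComputePlan.load(contentsOf: modelURL, configuration: config)
    guard case .program(let program) = plan.modelStructure,
          let main = program.functions["main"] else { return (0, 0) }
    var ane = 0
    var total = 0
    for op in main.block.operations {
        // Operations without a device usage (e.g. constants) are skipped.
        guard let usage = plan.deviceUsage(for: op) else { continue }
        total += 1
        if case .neuralEngine = usage.preferred { ane += 1 }
    }
    return (ane, total)
}
#endif
```

With the counts reported above, `anePercentage(aneOps: 1893, totalOps: 1904)` gives roughly 99.42.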
|
|
## Use it

Via the [CoreML-LLM Swift package](https://github.com/john-rocky/CoreML-LLM):

```swift
import CoreMLLLM

// appSupportDir: a writable directory URL where the bundle is stored.
let bundleURL = try await Gemma3BundleDownloader.download(
    .functionGemma270m, into: appSupportDir)
let fg = try await FunctionGemma.load(bundleURL: bundleURL)
// Generate up to 64 new tokens for the prompt.
let text = try fg.generate(prompt: "Turn on the flashlight",
                           maxNewTokens: 64)
```
|
|
For raw Core ML usage, the model expects the same I/O contract as Gemma 4:
`input_ids (1,1) int32`, `position_ids (1,) int32`, `causal_mask (1,1,1,ctx) fp16`,
`update_mask (1,1,ctx,1) fp16`, with a stateful `kv_cache_0` `MLState` of shape
`(2*L, kv_heads, ctx, head_dim)`.
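
To make the shapes concrete, here is a rough sketch of the per-step mask values and the state shape. The mask semantics are assumptions, not confirmed by this card: `causal_mask` is taken to be additive (0 for visible slots, a large negative value for future slots) and `update_mask` to be 1 at the cache slot being written. `L` is read as the number of decoder layers. Both helper names are hypothetical.

```swift
import Foundation

/// Mask values for decoding the token at `position` in a window of `ctx` slots.
/// Assumed semantics: additive causal mask, one-hot update mask.
func decodeMasks(position: Int, ctx: Int) -> (causal: [Float], update: [Float]) {
    precondition(position >= 0 && position < ctx, "position must be inside the context window")
    // -65504 is the lowest finite fp16 value; the value the model expects is an assumption.
    let causal = (0..<ctx).map { $0 <= position ? Float(0) : Float(-65504) }
    let update = (0..<ctx).map { $0 == position ? Float(1) : Float(0) }
    return (causal, update)
}

/// Shape of the kv_cache_0 state: (2*L, kv_heads, ctx, head_dim).
func kvCacheShape(layers: Int, kvHeads: Int, ctx: Int, headDim: Int) -> [Int] {
    [2 * layers, kvHeads, ctx, headDim]
}

/// Size of an fp16 tensor with the given shape, in bytes (2 bytes per element).
func fp16ByteCount(shape: [Int]) -> Int {
    shape.reduce(1, *) * 2
}
```

The flat arrays correspond to the `ctx` axis of `(1,1,1,ctx)` and `(1,1,ctx,1)`; they would still need to be copied into fp16 `MLMultiArray` inputs before prediction. The layer count, head count, and context length should come from `model_config.json` rather than being hard-coded.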

## License

Inherits Google's [Gemma terms of use](https://ai.google.dev/gemma/terms).