---
language: en
license: gemma
base_model: google/functiongemma-270m-it
tags:
  - coreml
  - apple-neural-engine
  - gemma3
  - function-calling
  - on-device
library_name: coreml
---

## Use it from Swift

<!-- swift-usage-begin -->
### Add the package

`Package.swift`:

```swift
.package(url: "https://github.com/john-rocky/CoreML-LLM", branch: "main"),

// In your target:
.product(name: "CoreMLLLM", package: "CoreML-LLM"),
```

Platforms: iOS 18+ / macOS 15+.
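
For context, the two fragments above could sit in a complete manifest roughly like this. It is a minimal sketch: the package and target name `MyApp` are hypothetical, the tools version is an assumption (`.v18` / `.v15` need Swift 6.0 tools), and the platform versions are taken from the line above.

```swift
// swift-tools-version: 6.0
import PackageDescription

let package = Package(
    name: "MyApp",  // hypothetical name, replace with your own
    platforms: [.iOS(.v18), .macOS(.v15)],
    dependencies: [
        // Tracks the main branch, as in the snippet above.
        .package(url: "https://github.com/john-rocky/CoreML-LLM", branch: "main"),
    ],
    targets: [
        .executableTarget(
            name: "MyApp",  // hypothetical target
            dependencies: [
                .product(name: "CoreMLLLM", package: "CoreML-LLM"),
            ]
        ),
    ]
)
```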

### Download + call

```swift
import CoreMLLLM

let modelsDir = try FileManager.default.url(
    for: .applicationSupportDirectory, in: .userDomainMask,
    appropriateFor: nil, create: true)

// Pulls the bundle from this repo on first call, then loads.
let fg = try await FunctionGemma.downloadAndLoad(modelsDir: modelsDir)

// Plain chat-templated generation
let stream = try await fg.generate("How do I list files in Swift?")
for await chunk in stream { print(chunk, terminator: "") }

// Function-call generation (returns a single JSON object)
let json = try await fg.generateFunctionCall(
    tools: tools,                  // your tool schema as [String: Any], defined elsewhere
    userMessage: "Get me weather for Tokyo")
print(json)
```

See [`Gemma3FunctionGemma.swift`](https://github.com/john-rocky/CoreML-LLM/blob/main/Sources/CoreMLLLM/Gemma3FunctionGemma.swift)
for the full API.
<!-- swift-usage-end -->
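
The `tools` value passed to `generateFunctionCall` is not defined in the snippet above. As an illustration only, a `[String: Any]` tool declaration might look like the sketch below. The key names (`name`, `description`, `parameters`) follow a common JSON-Schema-style layout and are an assumption, not the schema this package is confirmed to expect; check `Gemma3FunctionGemma.swift` for the real format.

```swift
import Foundation

// Hypothetical tool declaration. The key names are an assumption,
// not the schema CoreML-LLM is confirmed to expect.
let tools: [String: Any] = [
    "name": "get_weather",
    "description": "Look up the current weather for a city.",
    "parameters": [
        "type": "object",
        "properties": [
            "city": ["type": "string", "description": "City name, e.g. Tokyo"]
        ],
        "required": ["city"]
    ] as [String: Any]
]

// Confirm the dictionary can be serialized to JSON before handing it over.
precondition(JSONSerialization.isValidJSONObject(tools))
let data = try JSONSerialization.data(
    withJSONObject: tools, options: [.prettyPrinted, .sortedKeys])
print(String(decoding: data, as: UTF8.self))
```

Validating with `JSONSerialization.isValidJSONObject` up front catches non-JSON values (dates, custom types) that a `[String: Any]` dictionary would otherwise accept silently.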



# FunctionGemma-270M for Apple CoreML (ANE-optimized)

CoreML conversion of `google/functiongemma-270m-it` produced with the
[CoreML-LLM](https://github.com/john-rocky/CoreML-LLM) pipeline. Targets
iOS 26 / macOS 26.

## What's in this repo

| File | Notes |
|---|---|
| `model.mlmodelc/` | Compiled stateful decoder (fp16, 840 MB). Drop-in for `MLModel(contentsOf:)` |
| `model_config.json` | Bundle metadata (architecture, dims, function-call markers) |
| `hf_model/` | Tokenizer + chat template (function-calling format) |
| `cos_*.npy`, `sin_*.npy` | Pre-computed RoPE tables (optional) |

## ANE residency

**99.42% on Apple Neural Engine** (1893/1904 dispatched ops, verified via
`MLComputePlan` on macOS 26). The remaining 11 ops run on the CPU; they are
unavoidable boundary ops at the model's input and output (token gather,
argmax, scalar squeeze).
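
A residency figure like this can be reproduced approximately with `MLComputePlan`. The sketch below is not the script used for the number above: it assumes the model is an ML Program whose entry function is named `main`, counts only top-level operations, and treats an op as "dispatched" when Core ML reports a device usage for it.

```swift
import CoreML

/// Share of dispatched ops that prefer the Neural Engine, as a percentage.
func residencyPercent(aneOps: Int, totalOps: Int) -> Double {
    guard totalOps > 0 else { return 0 }
    return Double(aneOps) / Double(totalOps) * 100
}

/// Counts top-level ops in the `main` function (assumed name) and how many
/// of them list the Neural Engine as their preferred device.
@available(macOS 14.4, iOS 17.4, *)
func countANEOps(modelURL: URL) async throws -> (ane: Int, total: Int) {
    let config = MLModelConfiguration()
    config.computeUnits = .cpuAndNeuralEngine
    let plan = try await MLComputePlan.load(contentsOf: modelURL,
                                            configuration: config)
    guard case .program(let program) = plan.modelStructure,
          let main = program.functions["main"] else { return (0, 0) }

    var ane = 0
    var total = 0
    for op in main.block.operations {
        // Ops without a device usage are skipped as "not dispatched".
        guard let usage = plan.deviceUsage(for: op) else { continue }
        total += 1
        if case .neuralEngine = usage.preferred { ane += 1 }
    }
    return (ane, total)
}

// The figures quoted above: 1893 of 1904 ops.
print(String(format: "%.2f%%", residencyPercent(aneOps: 1893, totalOps: 1904)))
// prints "99.42%"
```

Ops nested inside control-flow blocks are not visited here, so the counts may differ slightly from a full traversal.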

## Use it

Via the [CoreML-LLM Swift package](https://github.com/john-rocky/CoreML-LLM):

```swift
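import CoreMLLLM

// Directory the bundle is downloaded into (Application Support here).
let appSupportDir = try FileManager.default.url(
    for: .applicationSupportDirectory, in: .userDomainMask,
    appropriateFor: nil, create: true)

let bundleURL = try await Gemma3BundleDownloader.download(
    .functionGemma270m, into: appSupportDir)
let fg = try await FunctionGemma.load(bundleURL: bundleURL)
let text = try fg.generate(prompt: "Turn on the flashlight",
                           maxNewTokens: 64)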
```

For raw Core ML usage, the model expects the same I/O contract as Gemma 4:
`input_ids (1,1) int32`, `position_ids (1,) int32`, `causal_mask (1,1,1,ctx) fp16`,
`update_mask (1,1,ctx,1) fp16`, with a stateful `kv_cache_0` MLState of shape
`(2*L, kv_heads, ctx, head_dim)`, where `L` is the number of decoder layers and
`ctx` is the context length.
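
As a rough illustration of that contract, the sketch below builds the four inputs for a single decode step. The shapes and dtypes come from the list above; the mask values are an assumption (0 for positions that may be attended, the lowest finite fp16 value for blocked ones, and an `update_mask` that is 1 only at the cache slot being written), and `makeDecodeInputs` is a hypothetical helper, not part of the package.

```swift
import CoreML

/// Builds the per-step inputs for decoding one token at `position`.
/// Mask semantics are assumed, see the note above.
func makeDecodeInputs(token: Int32, position: Int, ctx: Int) throws -> [String: MLMultiArray] {
    precondition(position >= 0 && position < ctx, "position must be inside the context")
    let n = NSNumber(value: ctx)

    let inputIDs = try MLMultiArray(shape: [1, 1], dataType: .int32)
    inputIDs[[0, 0] as [NSNumber]] = NSNumber(value: token)

    let positionIDs = try MLMultiArray(shape: [1], dataType: .int32)
    positionIDs[[0] as [NSNumber]] = NSNumber(value: Int32(position))

    let causalMask = try MLMultiArray(shape: [1, 1, 1, n], dataType: .float16)
    let updateMask = try MLMultiArray(shape: [1, 1, n, 1], dataType: .float16)
    for i in 0..<ctx {
        let idx = NSNumber(value: i)
        // Assumed: attend to slots up to and including the current position.
        causalMask[[0, 0, 0, idx] as [NSNumber]] =
            NSNumber(value: i <= position ? Float(0) : Float(-65504))
        // Assumed: write the new key/value into the current slot only.
        updateMask[[0, 0, idx, 0] as [NSNumber]] =
            NSNumber(value: i == position ? Float(1) : Float(0))
    }

    return [
        "input_ids": inputIDs,
        "position_ids": positionIDs,
        "causal_mask": causalMask,
        "update_mask": updateMask,
    ]
}
```

The resulting dictionary can be wrapped in an `MLDictionaryFeatureProvider` and passed to `MLModel.prediction(from:using:)` together with an `MLState` obtained from `makeState()`, which holds `kv_cache_0` across steps.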

## License

Inherits Google's [Gemma terms of use](https://ai.google.dev/gemma/terms).