mlboydaisuke/functiongemma-270m-coreml

	---
	language: en
	license: gemma
	base_model: google/functiongemma-270m-it
	tags:
	- coreml
	- apple-neural-engine
	- gemma3
	- function-calling
	- on-device
	library_name: coreml
	---

	## Use it from Swift

	<!-- swift-usage-begin -->
	### Add the package

	`Package.swift`:

	```swift
	.package(url: "https://github.com/john-rocky/CoreML-LLM", branch: "main"),

	// In your target:
	.product(name: "CoreMLLLM", package: "CoreML-LLM"),
	```

	Platforms: iOS 18+ / macOS 15+.

	### Download + call

	```swift
	import CoreMLLLM

	let modelsDir = try FileManager.default.url(
	    for: .applicationSupportDirectory, in: .userDomainMask,
	    appropriateFor: nil, create: true)

	// Pulls the bundle from this repo on first call, then loads.
	let fg = try await FunctionGemma.downloadAndLoad(modelsDir: modelsDir)

	// Plain chat-templated generation
	let stream = try await fg.generate("How do I list files in Swift?")
	for await chunk in stream { print(chunk, terminator: "") }

	// Function-call generation (returns a single JSON object)
	let json = try await fg.generateFunctionCall(
	    tools: tools, // your [String: Any] schema
	    userMessage: "Get me weather for Tokyo")
	print(json)
	```

	See [`Gemma3FunctionGemma.swift`](https://github.com/john-rocky/CoreML-LLM/blob/main/Sources/CoreMLLLM/Gemma3FunctionGemma.swift)
	for the full API.
	<!-- swift-usage-end -->
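
The snippet above leaves `tools` undefined and does not show what the returned JSON looks like. The sketch below is one way to fill both gaps, under stated assumptions: the schema uses the common JSON-Schema "function" layout, and the result is assumed to be a JSON string with `name` and `arguments` keys. Neither shape is confirmed by this card, and `get_weather`, `FunctionCall`, and `parseFunctionCall` are hypothetical names; check `Gemma3FunctionGemma.swift` for the format the package actually expects.

```swift
import Foundation

// Hypothetical tool schema in the common JSON-Schema "function" layout.
// The exact keys CoreMLLLM expects are an assumption here.
let cityParam: [String: Any] = ["type": "string", "description": "City name, e.g. Tokyo"]
let parameters: [String: Any] = [
    "type": "object",
    "properties": ["city": cityParam],
    "required": ["city"],
]
let weatherFunction: [String: Any] = [
    "name": "get_weather",
    "description": "Get the current weather for a city",
    "parameters": parameters,
]
let tools: [String: Any] = ["type": "function", "function": weatherFunction]

// Assumed result shape: {"name": ..., "arguments": {...}} with string arguments.
struct FunctionCall: Decodable {
    let name: String
    let arguments: [String: String]
}

// Returns nil when the text is not valid JSON or does not match the assumed shape.
func parseFunctionCall(_ json: String) -> FunctionCall? {
    try? JSONDecoder().decode(FunctionCall.self, from: Data(json.utf8))
}

// Hand-written sample standing in for real model output.
let sample = #"{"name": "get_weather", "arguments": {"city": "Tokyo"}}"#
if let call = parseFunctionCall(sample) {
    print(call.name, call.arguments)
}
```

Decoding into a small `Decodable` type rather than indexing a raw dictionary means malformed model output surfaces as `nil` instead of a crash, which is useful when a 270M model occasionally emits invalid JSON.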



	# FunctionGemma-270M for Apple CoreML (ANE-optimized)

	CoreML conversion of `google/functiongemma-270m-it` produced with the
	[CoreML-LLM](https://github.com/john-rocky/CoreML-LLM) pipeline. Targets
	iOS 26 / macOS 26.

	## What's in this repo

	| File | Notes |
	|---|---|
	| `model.mlmodelc/` | Compiled stateful decoder (fp16, 840 MB). Drop-in for `MLModel(contentsOf:)` |
	| `model_config.json` | Bundle metadata (architecture, dims, function-call markers) |
	| `hf_model/` | Tokenizer + chat template (function-calling format) |
	| `cos_.npy`, `sin_.npy` | Pre-computed RoPE tables (optional) |

	## ANE residency

	99.42% of dispatched ops run on the Apple Neural Engine (1893 of 1904,
	verified via `MLComputePlan` on macOS 26). The remaining 11 ops are CPU-only;
	they are unavoidable input-boundary ops (token gather, argmax, scalar squeeze).
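
A residency figure like this can be approximated with `MLComputePlan`. The sketch below is not the script used for this card: it assumes the compiled model exposes a `main` function, counts only top-level operations, and treats an operation as "dispatched" when the plan reports a device usage for it. `residencyPercent` and `aneResidency` are hypothetical helper names.

```swift
import Foundation

/// Share of dispatched operations whose preferred device is the Neural Engine.
func residencyPercent(aneOps: Int, dispatchedOps: Int) -> Double {
    guard dispatchedOps > 0 else { return 0 }
    return 100.0 * Double(aneOps) / Double(dispatchedOps)
}

#if canImport(CoreML)
import CoreML

/// Counts top-level ops in the "main" function and how many prefer the ANE.
/// Assumption: ops with no reported device usage (e.g. constants) are skipped.
@available(macOS 14.4, iOS 17.4, tvOS 17.4, watchOS 10.4, *)
func aneResidency(of compiledModelURL: URL) async throws -> (ane: Int, dispatched: Int) {
    let config = MLModelConfiguration()
    config.computeUnits = .cpuAndNeuralEngine
    let plan = try await MLComputePlan.load(contentsOf: compiledModelURL,
                                            configuration: config)
    guard case .program(let program) = plan.modelStructure,
          let main = program.functions["main"] else { return (0, 0) }

    var ane = 0
    var dispatched = 0
    for op in main.block.operations {
        guard let usage = plan.deviceUsage(for: op) else { continue }
        dispatched += 1
        if case .neuralEngine = usage.preferred { ane += 1 }
    }
    return (ane, dispatched)
}
#endif

// The figures quoted above: 1893 of 1904 dispatched ops.
print(residencyPercent(aneOps: 1893, dispatchedOps: 1904))
```

Operations nested inside control-flow blocks are not traversed here, so counts from this sketch may differ from the numbers reported above.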

	## Use it

	Via the [CoreML-LLM Swift package](https://github.com/john-rocky/CoreML-LLM):

	```swift
	import CoreMLLLM
	let bundleURL = try await Gemma3BundleDownloader.download(
	    .functionGemma270m, into: appSupportDir)
	let fg = try await FunctionGemma.load(bundleURL: bundleURL)
	let text = try fg.generate(prompt: "Turn on the flashlight",
	                           maxNewTokens: 64)
	```

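	For raw Core ML usage, the model expects the same I/O contract as Gemma 4:
	inputs `input_ids (1,1) int32`, `position_ids (1,) int32`, `causal_mask (1,1,1,ctx) fp16`,
	and `update_mask (1,1,ctx,1) fp16`, plus a stateful `kv_cache_0` `MLState` of shape
	`(2*L, kv_heads, ctx, head_dim)`, where `ctx` is the context length and `L` is the
	number of layers (keys and values are stacked, hence `2*L`).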
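
To illustrate the mask shapes, here is a minimal sketch of how the two masks for a single decode step might be filled. It assumes a common convention that this card does not confirm: `causal_mask` holds 0 for visible positions and a large negative value for masked ones, and `update_mask` is one-hot at the cache slot being written. `DecodeMasks`, `makeDecodeMasks`, and the `-65504` fill value (the most negative finite fp16 number) are hypothetical, and plain `[Float]` arrays stand in for the fp16 `MLMultiArray` inputs.

```swift
/// Plain-Swift stand-ins for the two mask inputs of one decode step.
struct DecodeMasks {
    var causalMask: [Float]  // logical shape (1, 1, 1, ctx)
    var updateMask: [Float]  // logical shape (1, 1, ctx, 1)
}

/// Builds masks for the token at `position` in a cache of `contextLength` slots.
/// Assumed convention: 0 = attend, `maskedValue` = blocked; update mask is one-hot.
func makeDecodeMasks(position: Int,
                     contextLength: Int,
                     maskedValue: Float = -65504) -> DecodeMasks {
    precondition(position >= 0 && position < contextLength, "position out of range")
    var causal = [Float](repeating: maskedValue, count: contextLength)
    var update = [Float](repeating: 0, count: contextLength)
    for i in 0...position { causal[i] = 0 }  // current and earlier tokens are visible
    update[position] = 1                     // write K/V only into the current slot
    return DecodeMasks(causalMask: causal, updateMask: update)
}

let masks = makeDecodeMasks(position: 2, contextLength: 8)
print(masks.causalMask)
print(masks.updateMask)
```

Both arrays have `ctx` elements, so the flat buffers map directly onto the `(1,1,1,ctx)` and `(1,1,ctx,1)` shapes; only the reported shape differs when they are copied into `MLMultiArray` inputs.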

	## License

	Inherits Google's [Gemma terms of use](https://ai.google.dev/gemma/terms).