# FunctionGemma Mobile Models

## Available Formats

0 mobile format(s) available.

## Usage Examples

### PyTorch Mobile (Android)

```java
// Load the model from app assets (assetFilePath is a helper in the
// surrounding app code that copies the asset to a readable file path)
Module module = Module.load(assetFilePath(this, "functiongemma_mobile.pt"));

// Prepare input: token ids padded/truncated to length 128
long[] inputIds = new long[128]; // Fill with tokenized text

// Create a [1, 128] input tensor (batch size 1)
Tensor inputTensor = Tensor.fromBlob(inputIds, new long[]{1, 128});

// Run inference
IValue output = module.forward(IValue.from(inputTensor));
Tensor outputTensor = output.toTensor();
```

### PyTorch Mobile (iOS)

```swift
// Load the model bundled with the app
guard let filePath = Bundle.main.path(forResource: "functiongemma_mobile", ofType: "pt") else { return }
let module = try TorchModule(fileAtPath: filePath)

// Prepare input: token ids padded/truncated to length 128
var inputIds: [Int64] = Array(repeating: 0, count: 128) // Fill with tokenized text

// Create a [1, 128] input tensor (batch size 1)
let inputTensor = try Tensor(shape: [1, 128], data: inputIds)

// Run inference
let outputTensor = try module.forward([inputTensor])
```

### ONNX Runtime (Cross-platform)

```python
import numpy as np
import onnxruntime as ort

# Load model
session = ort.InferenceSession("functiongemma.onnx")

# Prepare input: token ids with shape [1, seq_len]
input_ids = np.array([[...]], dtype=np.int64)  # Fill with tokenized text

# Run inference
outputs = session.run(None, {"input_ids": input_ids})
logits = outputs[0]
```

## Model Details

- **Base Model**: {mobile_info['base_model']}
- **Vocab Size**: {mobile_info['vocab_size']:,}
- **Max Sequence**: {mobile_info['max_seq_length']} tokens
- **Recommended**: {mobile_info['recommended_seq_length']} tokens (mobile)
- **Fine-tuned on**: {mobile_info['fine_tuned_on']}

## Performance

- **Inference Time**: 50-300 ms on mobile devices
- **Memory Usage**: 300-800 MB RAM
- **Quantized Version**: 2-4x faster, ~75% smaller

## Requirements

### PyTorch Mobile

- Android: Min SDK 21, PyTorch Mobile library
- iOS: Min iOS 12.0, LibTorch-Lite

### ONNX Runtime

- ONNX Runtime Mobile
- Android/iOS/Web/Desktop support

## Notes

- Use the quantized version for better mobile performance
- Recommended sequence length: 128 tokens
- Batch size: 1 (mobile optimization)
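The notes above recommend a fixed sequence length of 128 and batch size 1, and all three usage examples assume a `[1, 128]` input tensor. A minimal sketch of that input preparation step, assuming a pad token id of 0 (check your tokenizer's actual pad id):

```python
import numpy as np

RECOMMENDED_SEQ_LEN = 128  # mobile-recommended length from the notes above
PAD_ID = 0                 # assumed pad token id; verify against your tokenizer

def prepare_input_ids(token_ids, seq_len=RECOMMENDED_SEQ_LEN, pad_id=PAD_ID):
    """Pad or truncate a list of token ids to a fixed [1, seq_len] int64 array."""
    ids = list(token_ids)[:seq_len]         # truncate if too long
    ids += [pad_id] * (seq_len - len(ids))  # right-pad if too short
    return np.array([ids], dtype=np.int64)  # batch size 1, as recommended

# Example: a short sequence gets right-padded out to 128 positions
batch = prepare_input_ids([101, 2023, 2003, 102])
print(batch.shape)  # (1, 128)
```

The resulting array can be passed directly as the `input_ids` feed in the ONNX Runtime example; for the mobile runtimes the same padded ids fill the fixed-size `long[]`/`[Int64]` buffers.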
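The ONNX example above stops at `logits = outputs[0]`. For a causal language model the logits typically have shape `[batch, seq_len, vocab_size]`, and the greedy next-token prediction is the argmax over the vocabulary at the last position. A sketch with dummy logits (the shapes here are illustrative, not taken from the model config):

```python
import numpy as np

def next_token_id(logits):
    """Greedy next-token selection from logits of shape [1, seq_len, vocab]."""
    last_step = logits[0, -1, :]      # scores for the final sequence position
    return int(np.argmax(last_step))  # highest-scoring vocabulary id

# Dummy logits: batch 1, 4 positions, vocabulary of 10 tokens
rng = np.random.default_rng(0)
logits = rng.standard_normal((1, 4, 10)).astype(np.float32)
logits[0, -1, 7] = 10.0               # force token 7 to win for the demo
print(next_token_id(logits))  # 7
```

In a real decoding loop the chosen id is appended to the running token sequence and the model is invoked again; with batch size 1 on mobile this loop runs once per generated token.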