# FunctionGemma Mobile Models

## Available Formats

Two mobile formats are covered below:

- **PyTorch Mobile** (`functiongemma_mobile.pt`) for Android and iOS
- **ONNX** (`functiongemma.onnx`) for cross-platform use via ONNX Runtime

## Usage Examples

### PyTorch Mobile (Android)

```java
// Load the model
Module module = Module.load(assetFilePath(this, "functiongemma_mobile.pt"));

// Prepare input
long[] inputIds = new long[128];
// Fill with tokenized text

// Create tensor
Tensor inputTensor = Tensor.fromBlob(inputIds, new long[]{1, 128});

// Run inference
IValue output = module.forward(IValue.from(inputTensor));
Tensor outputTensor = output.toTensor();
```

### PyTorch Mobile (iOS)

```swift
// Load the model. TorchModule here is the Objective-C++ wrapper
// class bridged into Swift (as in PyTorch's iOS demo apps);
// LibTorch-Lite does not ship a native Swift API.
guard let filePath = Bundle.main.path(forResource: "functiongemma_mobile", ofType: "pt"),
      let module = TorchModule(fileAtPath: filePath) else {
    return
}

// Prepare input
var inputIds: [Int64] = Array(repeating: 0, count: 128)
// Fill with tokenized text

// Create the tensor and run inference through the wrapper's
// forward method (both exposed by the custom Objective-C++ bridge)
let inputTensor = try Tensor(shape: [1, 128], data: inputIds)
let outputTensor = try module.forward([inputTensor])
```

### ONNX Runtime (Cross-platform)

```python
import numpy as np
import onnxruntime as ort

# Load model
session = ort.InferenceSession("functiongemma.onnx")

# Prepare input: a [1, seq_len] batch of token ids
# (fill with tokenized text)
input_ids = np.zeros((1, 128), dtype=np.int64)

# Run inference
outputs = session.run(None, {"input_ids": input_ids})
logits = outputs[0]
```
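The ONNX example ends with raw logits. A minimal sketch of turning them into a greedy next-token prediction, using a random array as a stand-in for real model output (the vocabulary size here is illustrative, not taken from this card):

```python
import numpy as np

# Stand-in for model output of shape [batch, seq_len, vocab_size]
rng = np.random.default_rng(0)
logits = rng.standard_normal((1, 128, 32_000)).astype(np.float32)

# Greedy decoding: pick the highest-scoring token at the last position
next_token_id = int(np.argmax(logits[0, -1]))
```

For real use, `next_token_id` would be appended to the input ids and the model run again, one token per step.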

## Model Details

- **Base Model**: {mobile_info['base_model']}
- **Vocab Size**: {mobile_info['vocab_size']:,}
- **Max Sequence**: {mobile_info['max_seq_length']} tokens
- **Recommended**: {mobile_info['recommended_seq_length']} tokens (mobile)
- **Fine-tuned on**: {mobile_info['fine_tuned_on']}

## Performance

- **Inference Time**: 50-300ms on mobile devices
- **Memory Usage**: 300-800 MB RAM
- **Quantized Version**: 2-4x faster, ~75% smaller
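The "~75% smaller" figure follows directly from the storage cost per weight: float32 uses 4 bytes per parameter, int8 uses 1. A back-of-envelope check (the parameter count below is hypothetical, not from this card):

```python
# Rough check of the "~75% smaller" figure for int8 quantization
num_params = 270_000_000  # hypothetical parameter count
fp32_bytes = num_params * 4  # float32: 4 bytes per weight
int8_bytes = num_params * 1  # int8: 1 byte per weight
reduction = 1 - int8_bytes / fp32_bytes
print(f"size reduction: {reduction:.0%}")  # size reduction: 75%
```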

## Requirements

### PyTorch Mobile
- Android: Min SDK 21, PyTorch Mobile library
- iOS: Min iOS 12.0, LibTorch-Lite

### ONNX Runtime
- ONNX Runtime Mobile
- Android/iOS/Web/Desktop support

## Notes

- Use the quantized version for better mobile performance
- Recommended sequence length: 128 tokens
- Batch size: 1 (mobile optimization)
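Since the model expects a fixed sequence length of 128 and a batch size of 1, inputs must be padded or truncated before inference. A minimal sketch of that preprocessing; `pad_id=0` is an assumption, and the real pad token id should come from the FunctionGemma tokenizer:

```python
import numpy as np

def prepare_input(token_ids, max_len=128, pad_id=0):
    """Pad or truncate token ids to a fixed [1, max_len] int64 array.

    pad_id=0 is an assumption; use the pad token id from the
    actual tokenizer configuration.
    """
    ids = list(token_ids)[:max_len]          # truncate to max_len
    ids += [pad_id] * (max_len - len(ids))   # right-pad to max_len
    return np.array([ids], dtype=np.int64)   # batch size 1

input_ids = prepare_input([2, 106, 1645])
print(input_ids.shape)  # (1, 128)
```

The resulting array can be fed directly to the ONNX Runtime session above, or copied element-wise into the `long[]` / `[Int64]` buffers in the mobile examples.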