# FunctionGemma Mobile Models
## Available Formats
This repository provides mobile-optimized builds of FunctionGemma in PyTorch Mobile (`.pt`) and ONNX (`.onnx`) formats.
## Usage Examples
### PyTorch Mobile (Android)
```java
import org.pytorch.IValue;
import org.pytorch.Module;
import org.pytorch.Tensor;

// Load the TorchScript model bundled in the app's assets
// (assetFilePath is the helper used in the PyTorch Android demo apps)
Module module = Module.load(assetFilePath(this, "functiongemma_mobile.pt"));

// Prepare input: token ids padded to the recommended length of 128
long[] inputIds = new long[128];
// Fill with tokenized text

// Wrap the ids in a tensor of shape [1, 128] (batch size 1)
Tensor inputTensor = Tensor.fromBlob(inputIds, new long[]{1, 128});

// Run inference
IValue output = module.forward(IValue.from(inputTensor));
Tensor outputTensor = output.toTensor();
```
### PyTorch Mobile (iOS)
```swift
// TorchModule is the Objective-C wrapper class used in the PyTorch iOS
// demo apps (bridged into Swift); it is not part of LibTorch-Lite itself.

// Load the TorchScript model bundled with the app
guard let filePath = Bundle.main.path(forResource: "functiongemma_mobile", ofType: "pt") else {
    return
}
let module = try TorchModule(fileAtPath: filePath)

// Prepare input: token ids padded to the recommended length of 128
var inputIds: [Int64] = Array(repeating: 0, count: 128)
// Fill with tokenized text

// Wrap the ids in a [1, 128] tensor (batch size 1)
let inputTensor = try Tensor(shape: [1, 128], data: inputIds)

// Run inference
let outputTensor = try module.forward([inputTensor])
```
### ONNX Runtime (Cross-platform)
```python
import numpy as np
import onnxruntime as ort

# Load the exported model
session = ort.InferenceSession("functiongemma.onnx")

# Prepare input: token ids with shape (1, seq_len), dtype int64
input_ids = np.array([[...]], dtype=np.int64)

# Run inference; passing None returns all model outputs
outputs = session.run(None, {"input_ids": input_ids})
logits = outputs[0]
```
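The logits returned by any of the runtimes above can be decoded greedily with NumPy. The shapes below are illustrative only: batch size 1, sequence length 128, and a toy vocabulary size standing in for the real one listed under Model Details.

```python
import numpy as np

# Toy stand-in for outputs[0] from the ONNX Runtime example:
# shape (batch, seq_len, vocab_size). Values are random and the
# vocab size of 32 is hypothetical, purely for illustration.
logits = np.random.rand(1, 128, 32).astype(np.float32)

# Greedy decoding: take the highest-scoring token id at each position.
token_ids = np.argmax(logits, axis=-1)  # shape (1, 128)
```

For mobile use, greedy argmax decoding keeps postprocessing cheap; sampling strategies add latency on-device.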
## Model Details
- **Base Model**: {mobile_info['base_model']}
- **Vocab Size**: {mobile_info['vocab_size']:,}
- **Max Sequence**: {mobile_info['max_seq_length']} tokens
- **Recommended**: {mobile_info['recommended_seq_length']} tokens (mobile)
- **Fine-tuned on**: {mobile_info['fine_tuned_on']}
## Performance
- **Inference Time**: 50-300ms on mobile devices
- **Memory Usage**: 300-800 MB RAM
- **Quantized Version**: 2-4x faster, ~75% smaller
## Requirements
### PyTorch Mobile
- Android: Min SDK 21, PyTorch Mobile library
- iOS: Min iOS 12.0, LibTorch-Lite
### ONNX Runtime
- ONNX Runtime Mobile
- Android/iOS/Web/Desktop support
## Notes
- Use the quantized version for better on-device performance
- Recommended sequence length: 128 tokens
- Batch size: 1 (mobile optimization)
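The fixed shapes noted above (sequence length 128, batch size 1) mean inputs must be truncated or right-padded before inference. A minimal sketch, assuming a pad token id of 0 (the actual pad id depends on the tokenizer and is an assumption here):

```python
import numpy as np

SEQ_LEN = 128  # recommended mobile sequence length

def prepare_input(token_ids, pad_id=0):
    """Truncate or right-pad token ids to SEQ_LEN; returns shape (1, SEQ_LEN)."""
    ids = list(token_ids)[:SEQ_LEN]
    ids += [pad_id] * (SEQ_LEN - len(ids))
    return np.array([ids], dtype=np.int64)

batch = prepare_input([5, 17, 42])
# batch has shape (1, 128); positions past the input hold pad_id
```

The resulting array matches the `(1, 128)` int64 input shape used in all three runtime examples above.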