Upload mobile/README.md with huggingface_hub
Browse files- mobile/README.md +93 -0
mobile/README.md
ADDED
|
@@ -0,0 +1,93 @@
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| 1 |
+
# FunctionGemma Mobile Models
|
| 2 |
+
|
| 3 |
+
## Available Formats
|
| 4 |
+
|
| 5 |
+
0 mobile format(s) available:
|
| 6 |
+
|
| 7 |
+
|
| 8 |
+
## Usage Examples
|
| 9 |
+
|
| 10 |
+
### PyTorch Mobile (Android)
|
| 11 |
+
|
| 12 |
+
```java
|
| 13 |
+
// Load the model
|
| 14 |
+
Module module = Module.load(assetFilePath(this, "functiongemma_mobile.pt"));
|
| 15 |
+
|
| 16 |
+
// Prepare input
|
| 17 |
+
long[] inputIds = new long[128];
|
| 18 |
+
// Fill with tokenized text
|
| 19 |
+
|
| 20 |
+
// Create tensor
|
| 21 |
+
Tensor inputTensor = Tensor.fromBlob(inputIds, new long[]{1, 128});
|
| 22 |
+
|
| 23 |
+
// Run inference
|
| 24 |
+
IValue output = module.forward(IValue.from(inputTensor));
|
| 25 |
+
Tensor outputTensor = output.toTensor();
|
| 26 |
+
```
|
| 27 |
+
|
| 28 |
+
### PyTorch Mobile (iOS)
|
| 29 |
+
|
| 30 |
+
```swift
|
| 31 |
+
// Load model
|
| 32 |
+
guard let filePath = Bundle.main.path(forResource: "functiongemma_mobile", ofType: "pt") else {
|
| 33 |
+
return
|
| 34 |
+
}
|
| 35 |
+
|
| 36 |
+
let module = try TorchModule(fileAtPath: filePath)
|
| 37 |
+
|
| 38 |
+
// Prepare input
|
| 39 |
+
var inputIds: [Int64] = Array(repeating: 0, count: 128)
|
| 40 |
+
// Fill with tokenized text
|
| 41 |
+
|
| 42 |
+
// Create tensor
|
| 43 |
+
let inputTensor = try Tensor(shape: [1, 128], data: inputIds)
|
| 44 |
+
|
| 45 |
+
// Run inference
|
| 46 |
+
let outputTensor = try module.forward([inputTensor])
|
| 47 |
+
```
|
| 48 |
+
|
| 49 |
+
### ONNX Runtime (Cross-platform)
|
| 50 |
+
|
| 51 |
+
```python
|
| 52 |
+
import numpy as np
import onnxruntime as ort
|
| 53 |
+
|
| 54 |
+
# Load model
|
| 55 |
+
session = ort.InferenceSession("functiongemma.onnx")
|
| 56 |
+
|
| 57 |
+
# Prepare input
|
| 58 |
+
input_ids = np.array([[...]], dtype=np.int64)
|
| 59 |
+
|
| 60 |
+
# Run inference
|
| 61 |
+
outputs = session.run(None, {"input_ids": input_ids})
|
| 62 |
+
logits = outputs[0]
|
| 63 |
+
```
|
| 64 |
+
|
| 65 |
+
## Model Details
|
| 66 |
+
|
| 67 |
+
- **Base Model**: {mobile_info['base_model']}
|
| 68 |
+
- **Vocab Size**: {mobile_info['vocab_size']:,}
|
| 69 |
+
- **Max Sequence**: {mobile_info['max_seq_length']} tokens
|
| 70 |
+
- **Recommended**: {mobile_info['recommended_seq_length']} tokens (mobile)
|
| 71 |
+
- **Fine-tuned on**: {mobile_info['fine_tuned_on']}
|
| 72 |
+
|
| 73 |
+
## Performance
|
| 74 |
+
|
| 75 |
+
- **Inference Time**: 50-300ms on mobile devices
|
| 76 |
+
- **Memory Usage**: 300-800 MB RAM
|
| 77 |
+
- **Quantized Version**: 2-4x faster, ~75% smaller
|
| 78 |
+
|
| 79 |
+
## Requirements
|
| 80 |
+
|
| 81 |
+
### PyTorch Mobile
|
| 82 |
+
- Android: Min SDK 21, PyTorch Mobile library
|
| 83 |
+
- iOS: Min iOS 12.0, LibTorch-Lite
|
| 84 |
+
|
| 85 |
+
### ONNX Runtime
|
| 86 |
+
- ONNX Runtime Mobile
|
| 87 |
+
- Android/iOS/Web/Desktop support
|
| 88 |
+
|
| 89 |
+
## Notes
|
| 90 |
+
|
| 91 |
+
- Use quantized version for better mobile performance
|
| 92 |
+
- Recommended sequence length: 128 tokens
|
| 93 |
+
- Batch size: 1 (mobile optimization)
|