---
language:
- en
license: apache-2.0
base_model: google/functiongemma-270m-it
tags:
- robotics
- function-calling
- gemma
- lora
- fine-tuned
- edge-ai
- jetson
pipeline_tag: text-generation
library_name: transformers
---

# FunctionGemma Robot Actions

A fine-tuned [FunctionGemma 270M](https://huggingface.co/google/functiongemma-270m-it) model that converts natural language into structured robot action and emotion function calls. Designed for real-time inference on edge devices like the NVIDIA Jetson AGX Thor.

## Overview

This model takes a user's voice or text input and outputs two function calls:

- **`robot_action`** — a physical action for the robot to perform
- **`show_emotion`** — an emotion to display on the robot's avatar screen (Rive animations)

General conversation defaults to `stand_still` with a contextually appropriate emotion.

## Examples

```
Input: "Can you shake hands with me?"
Output: robot_action(action_name="shake_hand") + show_emotion(emotion="happy")

Input: "What is that?"
Output: robot_action(action_name="stand_still") + show_emotion(emotion="confused")

Input: "I feel sad"
Output: robot_action(action_name="stand_still") + show_emotion(emotion="sad")
```
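
A downstream controller has to pull the action and emotion names back out of the model's text output. A minimal parsing sketch, assuming the output matches the `name(arg="value")` shape shown in the examples (the `parse_calls` helper and its regex are illustrative, not part of the model's API):

```python
import re

# Hypothetical helper: extract the two calls from a raw output string,
# assuming the name(arg="value") shape shown in the examples above.
CALL_RE = re.compile(r'(robot_action|show_emotion)\(\w+="([^"]+)"\)')

def parse_calls(output: str) -> dict[str, str]:
    """Return e.g. {"robot_action": "shake_hand", "show_emotion": "happy"}."""
    return {name: value for name, value in CALL_RE.findall(output)}

calls = parse_calls(
    'robot_action(action_name="shake_hand") + show_emotion(emotion="happy")'
)
print(calls)  # {'robot_action': 'shake_hand', 'show_emotion': 'happy'}
```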

## Supported Actions

| Action | Description |
|--------|-------------|
| `shake_hand` | Handshake gesture |
| `face_wave` | Wave hello |
| `hands_up` | Raise both hands |
| `stand_still` | Stay idle (default for general conversation) |
| `show_hand` | Show open hand |
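
On the robot side, each returned action name has to be routed to a motion handler. A dispatch-table sketch with placeholder handlers (the `do_*` functions are hypothetical, not part of this model or any robot SDK); unknown names fall back to the documented `stand_still` default:

```python
# Placeholder handlers standing in for real motion commands.
def do_shake_hand():  return "performing handshake"
def do_face_wave():   return "waving hello"
def do_hands_up():    return "raising both hands"
def do_stand_still(): return "staying idle"
def do_show_hand():   return "showing open hand"

HANDLERS = {
    "shake_hand": do_shake_hand,
    "face_wave": do_face_wave,
    "hands_up": do_hands_up,
    "stand_still": do_stand_still,
    "show_hand": do_show_hand,
}

def run_action(action_name: str) -> str:
    # Anything unrecognized falls back to the stand_still default.
    return HANDLERS.get(action_name, do_stand_still)()

print(run_action("face_wave"))  # waving hello
print(run_action("moonwalk"))   # staying idle
```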

## Supported Emotions

| Emotion | Animation |
|---------|-----------|
| `happy` | Happy.riv |
| `sad` | Sad.riv |
| `excited` | Excited.riv |
| `confused` | Confused.riv |
| `curious` | Curious.riv |
| `think` | Think.riv |
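
The animation filenames follow a simple `Capitalized.riv` convention, so the mapping can be derived rather than hard-coded. A small sketch (the fallback choice for unknown emotions is a hypothetical design decision, not specified by the model):

```python
EMOTIONS = {"happy", "sad", "excited", "confused", "curious", "think"}

def animation_file(emotion: str) -> str:
    """Map a model-emitted emotion to its Rive asset, e.g. 'think' -> 'Think.riv'."""
    if emotion not in EMOTIONS:
        emotion = "confused"  # hypothetical fallback for unexpected output
    return f"{emotion.capitalize()}.riv"

print(animation_file("think"))  # Think.riv
```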

## Performance on NVIDIA Jetson AGX Thor

Benchmarked with constrained decoding (2 forward passes instead of 33 autoregressive steps):

| Metric | Value |
|--------|-------|
| Min latency | 52 ms |
| Max latency | 72 ms |
| **Avg latency** | **59 ms** |
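
Constrained decoding is possible here because both arguments come from small closed sets: rather than generating tokens freely, a forward pass scores the vocabulary once and the choice is restricted to the allowed candidates, roughly one pass per argument. A pure-Python sketch of the selection step, with made-up token ids and logits (this simplifies each candidate to a single token):

```python
def constrained_pick(logits, candidate_ids):
    """Pick the allowed candidate whose token id has the highest logit.

    `logits` is one forward pass's score vector over the vocabulary;
    `candidate_ids` maps each allowed name to a (hypothetical) token id.
    """
    return max(candidate_ids, key=lambda name: logits[candidate_ids[name]])

# Toy vocabulary of 10 tokens; the "model" strongly prefers token 3.
logits = [0.0] * 10
logits[3] = 5.0
actions = {"shake_hand": 3, "face_wave": 7, "stand_still": 9}
print(constrained_pick(logits, actions))  # shake_hand
```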

## Training Details

| Parameter | Value |
|-----------|-------|
| Base model | `google/functiongemma-270m-it` |
| Method | LoRA (rank 8, alpha 16) |
| Training data | 545 examples (490 train / 55 eval) |
| Epochs | 5 |
| Learning rate | 2e-4 |
| Batch size | 2 (effective 4 with gradient accumulation) |
| Max sequence length | 512 |
| Precision | bf16 |
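
The table above maps onto a `peft`/`transformers` configuration. A sketch under stated assumptions: the `target_modules` list (attention projections) and `output_dir` are illustrative choices, not specified by this card:

```python
from peft import LoraConfig
from transformers import TrainingArguments

# LoRA hyperparameters from the table above; target_modules is an
# assumed choice of attention projections, not confirmed by the card.
lora_config = LoraConfig(
    r=8,
    lora_alpha=16,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
    task_type="CAUSAL_LM",
)

training_args = TrainingArguments(
    output_dir="out",                # illustrative path
    num_train_epochs=5,
    learning_rate=2e-4,
    per_device_train_batch_size=2,
    gradient_accumulation_steps=2,   # effective batch size 4
    bf16=True,
)
```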

## Quick Start

```python
from transformers import AutoTokenizer, AutoModelForCausalLM
import torch

model = AutoModelForCausalLM.from_pretrained(
    "OpenmindAGI/functiongemma-robot-actions",
    torch_dtype=torch.bfloat16,
    device_map="auto",
)
tokenizer = AutoTokenizer.from_pretrained("OpenmindAGI/functiongemma-robot-actions")
model.eval()
```

## Citation

```bibtex
@misc{openmindagi-functiongemma-robot-actions,
  title={FunctionGemma Robot Actions},
  author={OpenmindAGI},
  year={2025},
  url={https://huggingface.co/OpenmindAGI/functiongemma-robot-actions}
}
```