---
language:
- en
license: apache-2.0
base_model: google/functiongemma-270m-it
tags:
- robotics
- function-calling
- gemma
- lora
- fine-tuned
- edge-ai
- jetson
pipeline_tag: text-generation
library_name: transformers
---

# FunctionGemma Robot Actions

A fine-tuned [FunctionGemma 270M](https://huggingface.co/google/functiongemma-270m-it) model that converts natural language into structured robot action and emotion function calls. Designed for real-time inference on edge devices like the NVIDIA Jetson AGX Thor.

## Overview

This model takes a user's voice or text input and outputs two function calls:

- **`robot_action`** — a physical action for the robot to perform
- **`show_emotion`** — an emotion to display on the robot's avatar screen (Rive animations)

General conversation defaults to `stand_still` with a contextually appropriate emotion.

## Examples

```
Input: "Can you shake hands with me?"
Output: robot_action(action_name="shake_hand") + show_emotion(emotion="happy")

Input: "What is that?"
Output: robot_action(action_name="stand_still") + show_emotion(emotion="confused")

Input: "I feel sad"
Output: robot_action(action_name="stand_still") + show_emotion(emotion="sad")
```
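
A downstream controller has to pull the action and emotion names back out of the model's text output. A minimal parsing sketch, assuming the output matches the `name(arg="value")` shape shown in the examples (the `parse_calls` helper and its regex are illustrative, not part of the model's API):

```python
import re

# Hypothetical helper: extract the two calls from a raw output string,
# assuming the name(arg="value") shape shown in the examples above.
CALL_RE = re.compile(r'(robot_action|show_emotion)\(\w+="([^"]+)"\)')

def parse_calls(output: str) -> dict[str, str]:
    """Return e.g. {"robot_action": "shake_hand", "show_emotion": "happy"}."""
    return {name: value for name, value in CALL_RE.findall(output)}

calls = parse_calls(
    'robot_action(action_name="shake_hand") + show_emotion(emotion="happy")'
)
print(calls)  # {'robot_action': 'shake_hand', 'show_emotion': 'happy'}
```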

## Supported Actions

| Action | Description |
|--------|-------------|
| `shake_hand` | Handshake gesture |
| `face_wave` | Wave hello |
| `hands_up` | Raise both hands |
| `stand_still` | Stay idle (default for general conversation) |
| `show_hand` | Show open hand |
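
On the robot side, each returned action name has to be routed to a motion handler. A dispatch-table sketch with placeholder handlers (the `do_*` functions are hypothetical, not part of this model or any robot SDK); unknown names fall back to the documented `stand_still` default:

```python
# Placeholder handlers standing in for real motion commands.
def do_shake_hand():  return "performing handshake"
def do_face_wave():   return "waving hello"
def do_hands_up():    return "raising both hands"
def do_stand_still(): return "staying idle"
def do_show_hand():   return "showing open hand"

HANDLERS = {
    "shake_hand": do_shake_hand,
    "face_wave": do_face_wave,
    "hands_up": do_hands_up,
    "stand_still": do_stand_still,
    "show_hand": do_show_hand,
}

def run_action(action_name: str) -> str:
    # Anything unrecognized falls back to the stand_still default.
    return HANDLERS.get(action_name, do_stand_still)()

print(run_action("face_wave"))  # waving hello
print(run_action("moonwalk"))   # staying idle
```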

## Supported Emotions

| Emotion | Animation |
|---------|-----------|
| `happy` | Happy.riv |
| `sad` | Sad.riv |
| `excited` | Excited.riv |
| `confused` | Confused.riv |
| `curious` | Curious.riv |
| `think` | Think.riv |
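
The animation filenames follow a simple `Capitalized.riv` convention, so the mapping can be derived rather than hard-coded. A small sketch (the fallback choice for unknown emotions is a hypothetical design decision, not specified by the model):

```python
EMOTIONS = {"happy", "sad", "excited", "confused", "curious", "think"}

def animation_file(emotion: str) -> str:
    """Map a model-emitted emotion to its Rive asset, e.g. 'think' -> 'Think.riv'."""
    if emotion not in EMOTIONS:
        emotion = "confused"  # hypothetical fallback for unexpected output
    return f"{emotion.capitalize()}.riv"

print(animation_file("think"))  # Think.riv
```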

## Performance on NVIDIA Jetson AGX Thor

Benchmarked with constrained decoding (2 forward passes instead of 33 autoregressive steps):

| Metric | Value |
|--------|-------|
| Min latency | 52 ms |
| Max latency | 72 ms |
| **Avg latency** | **59 ms** |
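
Constrained decoding is possible here because both arguments come from small closed sets: rather than generating tokens freely, a forward pass scores the vocabulary once and the choice is restricted to the allowed candidates, roughly one pass per argument. A pure-Python sketch of the selection step, with made-up token ids and logits (this simplifies each candidate to a single token):

```python
def constrained_pick(logits, candidate_ids):
    """Pick the allowed candidate whose token id has the highest logit.

    `logits` is one forward pass's score vector over the vocabulary;
    `candidate_ids` maps each allowed name to a (hypothetical) token id.
    """
    return max(candidate_ids, key=lambda name: logits[candidate_ids[name]])

# Toy vocabulary of 10 tokens; the "model" strongly prefers token 3.
logits = [0.0] * 10
logits[3] = 5.0
actions = {"shake_hand": 3, "face_wave": 7, "stand_still": 9}
print(constrained_pick(logits, actions))  # shake_hand
```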

## Training Details

| Parameter | Value |
|-----------|-------|
| Base model | `google/functiongemma-270m-it` |
| Method | LoRA (rank 8, alpha 16) |
| Training data | 545 examples (490 train / 55 eval) |
| Epochs | 5 |
| Learning rate | 2e-4 |
| Batch size | 2 (effective 4 with gradient accumulation) |
| Max sequence length | 512 |
| Precision | bf16 |
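
The table above maps onto a `peft`/`transformers` configuration. A sketch under stated assumptions: the `target_modules` list (attention projections) and `output_dir` are illustrative choices, not specified by this card:

```python
from peft import LoraConfig
from transformers import TrainingArguments

# LoRA hyperparameters from the table above; target_modules is an
# assumed choice of attention projections, not confirmed by the card.
lora_config = LoraConfig(
    r=8,
    lora_alpha=16,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
    task_type="CAUSAL_LM",
)

training_args = TrainingArguments(
    output_dir="out",                # illustrative path
    num_train_epochs=5,
    learning_rate=2e-4,
    per_device_train_batch_size=2,
    gradient_accumulation_steps=2,   # effective batch size 4
    bf16=True,
)
```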

## Quick Start

```python
from transformers import AutoTokenizer, AutoModelForCausalLM
import torch

model = AutoModelForCausalLM.from_pretrained(
    "OpenmindAGI/functiongemma-robot-actions",
    torch_dtype=torch.bfloat16,
    device_map="auto",
)
tokenizer = AutoTokenizer.from_pretrained("OpenmindAGI/functiongemma-robot-actions")
model.eval()
```

## Citation

```bibtex
@misc{openmindagi-functiongemma-robot-actions,
  title={FunctionGemma Robot Actions},
  author={OpenmindAGI},
  year={2025},
  url={https://huggingface.co/OpenmindAGI/functiongemma-robot-actions}
}
```