---
language:
- en
base_model:
- google/functiongemma-270m-it
---

## FunctionGemma-270M-IT RAG

This is a fine-tuned derivative of `google/functiongemma-270m-it`, optimized for **lightweight Retrieval-Augmented Generation (RAG)** on **mobile / edge / low-power devices**. The fine-tune specializes the model to **consistently emit a tool call to `vector_search`**, with a well-formed, high-recall search query, whenever the user asks a natural-language question that should be answered from a document store.

It is intended to serve as the **“retrieval controller”** in a local-first RAG pipeline:

**User question → model generates `vector_search(query=…)` → system retrieves passages → (optional) downstream answer model composes the final response.**

### Base model

- **Base:** `google/functiongemma-270m-it` (Gemma 3 270M family), a small model tuned specifically for function calling. ([Google AI for Developers](https://ai.google.dev/gemma/docs/functiongemma "FunctionGemma model overview | Google AI for Developers"))
- **Interface & formatting:** Uses FunctionGemma’s special control tokens for tool use (e.g., `<start_function_call>…<end_function_call>`) and the `<escape>` delimiter for string fields. ([Google AI for Developers](https://ai.google.dev/gemma/docs/functiongemma/formatting-and-best-practices "FunctionGemma formatting and best practices | Google AI for Developers"))
- **Context length (base):** 32K total input context (and up to 32K output context per request, budget permitting). ([Hugging Face](https://huggingface.co/google/functiongemma-270m-it "google/functiongemma-270m-it · Hugging Face"))

### What’s new in this fine-tune

**Primary behavioral change:** When asked questions in natural language, the model reliably chooses to call:

- `vector_search`
- with a **single string argument**: a retrieval query designed to maximize recall and relevance for downstream passage ranking.

**Example behavior (from the evaluation set):**

- **Prompt:** “Can you compare the political systems of the Roman Republic and the Aztec Empire… succession and social mobility?”

  **Output:** `<start_function_call>call:vector_search{query:<escape>Roman Republic vs Aztec Empire political systems succession social mobility ...<escape>}<end_function_call>` ✅

(Additional examples include VAR vs VAR review, journalism ethics across platforms, intrinsic vs extrinsic motivation, bench vs jury trial, and Rodin image sources.)

### Intended use

**Designed for:**

- On-device or constrained deployments (mobile apps, embedded systems, low-cost CPU boxes) that need **fast, local routing to retrieval**. FunctionGemma is explicitly positioned as a lightweight base for local-first agents and edge workflows. ([Google AI for Developers](https://ai.google.dev/gemma/docs/functiongemma "FunctionGemma model overview | Google AI for Developers"))
- RAG systems where **the most important skill is producing the right search query**, not writing the final answer.

**Not designed for:**

- Being the sole “answer model” for complex, high-stakes, or deeply reasoned tasks (it’s small; use it to retrieve, then answer with a stronger model if needed).
- Multi-step tool plans out of the box (FunctionGemma’s training is strongest for single-turn / parallel calls; multi-step chaining isn’t its primary trained workflow). ([Google AI for Developers](https://ai.google.dev/gemma/docs/functiongemma/formatting-and-best-practices "FunctionGemma formatting and best practices | Google AI for Developers"))

### Tool contract

This fine-tune assumes a tool with the following conceptual signature:

- **Tool name:** `vector_search`
- **Arguments:**
  - `query` (string): a search query describing the user’s information need
- **Returns:** top-k passages/snippets with metadata (titles/URLs/ids), which are then fed into a downstream step.

**Important formatting note:** String values in tool blocks must be wrapped in `<escape>…<escape>` to avoid parsing ambiguity. ([Google AI for Developers](https://ai.google.dev/gemma/docs/functiongemma/formatting-and-best-practices "FunctionGemma formatting and best practices | Google AI for Developers"))

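For reference, the conceptual contract above can be written as a JSON-schema-style tool definition of the kind commonly passed to chat templates. This is an illustrative sketch: the `description` strings and the exact schema layout are assumptions, not taken from this model’s training setup; adapt them to whatever format your serving stack expects.

```python
# Illustrative JSON-schema-style definition of the vector_search tool.
# Field names follow the common "function"-tool convention used by many
# chat templates; the descriptions here are placeholders.
vector_search_tool = {
    "type": "function",
    "function": {
        "name": "vector_search",
        "description": "Retrieve passages from the local document store.",
        "parameters": {
            "type": "object",
            "properties": {
                "query": {
                    "type": "string",
                    "description": "Search query describing the user's information need.",
                }
            },
            "required": ["query"],
        },
    },
}
```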
| 70 |
+
|
| 71 |
+
### How to use (recommended pattern)
|
| 72 |
+
|
| 73 |
+
1. **Run the model** on the user question.
|
| 74 |
+
|
| 75 |
+
2. If the output contains a `vector_search` call, execute retrieval.
|
| 76 |
+
|
| 77 |
+
3. Feed retrieved passages to:
|
| 78 |
+
|
| 79 |
+
- either the same model (if you accept lower-quality synthesis), or
|
| 80 |
+
|
| 81 |
+
- a larger model for final answer generation.
|
| 82 |
+
|
| 83 |
+
|
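The loop above can be sketched in a few lines. This is a minimal, framework-agnostic sketch: the parsing regex assumes the single-string-argument call format shown in the example output, and `generate`, `retrieve`, and `answer` are caller-supplied placeholders (model inference, vector-store lookup, and final synthesis), not functions provided by this repository.

```python
import re

# Matches one FunctionGemma-style call block with a single string argument.
# <escape>…<escape> wraps string values, per the formatting note above.
CALL_RE = re.compile(
    r"<start_function_call>call:(\w+)\{query:<escape>(.*?)<escape>\}<end_function_call>",
    re.DOTALL,
)

def parse_vector_search(model_output: str):
    """Return the query string if the output contains a vector_search call, else None."""
    m = CALL_RE.search(model_output)
    if m and m.group(1) == "vector_search":
        return m.group(2)
    return None

def rag_pipeline(question, generate, retrieve, answer):
    """Steps 1-3: run the controller, detect the tool call, retrieve, compose."""
    output = generate(question)           # step 1: run the controller model
    query = parse_vector_search(output)   # step 2: detect the vector_search call
    if query is None:
        return output                     # no tool call: pass the raw output through
    passages = retrieve(query)            # step 2: execute retrieval
    return answer(question, passages)     # step 3: downstream answer model
```

Keeping the parser separate from the pipeline makes it easy to swap in an official chat-template decoder later without touching the retrieval plumbing.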
If you are using Hugging Face tooling, FunctionGemma models are typically used via chat templates that support tool definitions and function-call decoding. ([Hugging Face](https://huggingface.co/google/functiongemma-270m-it "google/functiongemma-270m-it · Hugging Face"))