independently-platform committed on
Commit b47d230 · verified · 1 Parent(s): 92017d1

Create README.md

Files changed (1)
1. README.md +84 -0
---
language:
- en
base_model:
- google/functiongemma-270m-it
---
## FunctionGemma-270M-IT RAG

This is a fine-tuned derivative of `google/functiongemma-270m-it`, optimized for **lightweight Retrieval-Augmented Generation (RAG)** on **mobile / edge / low-power devices**. The fine-tune specializes the model to **consistently emit a tool call to `vector_search`**—with a well-formed, high-recall search query—when the user asks a natural-language question that should be answered from a document store.

It’s intended to be used as the **“retrieval controller”** in a local-first RAG pipeline:
**User question → model generates `vector_search(query=…)` → system retrieves passages → (optional) downstream answer model composes final response**.
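
For orientation, here is a minimal, self-contained sketch of that flow. All three stage functions are hypothetical stand-ins for your own components (nothing here ships with the model); concrete sketches of each stage appear later in this card.

```python
# Sketch of the retrieval-controller pipeline described above; every helper is
# a hypothetical placeholder, not an API shipped with this model.

def run_controller(question: str) -> str:
    """Stage 1 (this model): emit a vector_search call for the question."""
    return ("<start_function_call>call:vector_search{query:<escape>"
            + question + "<escape>}<end_function_call>")  # illustrative output

def retrieve(query: str) -> list[str]:
    """Stage 2 (your system): look up top-k passages in the vector store."""
    return ["(retrieved passage text)"]  # placeholder result

def compose(question: str, passages: list[str]) -> str:
    """Stage 3 (optional downstream model): write the final answer."""
    return f"Answer to {question!r}, grounded in {len(passages)} passage(s)."

def rag_pipeline(question: str) -> str:
    raw = run_controller(question)
    # Pull the query out from between the <escape> delimiters
    # (a more careful parsing sketch appears under the tool contract below).
    query = raw.split("<escape>")[1]
    return compose(question, retrieve(query))
```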

### Base model

- **Base:** `google/functiongemma-270m-it` (Gemma 3 270M family), a small model tuned specifically for function calling. ([Google AI for Developers](https://ai.google.dev/gemma/docs/functiongemma "FunctionGemma model overview  |  Google AI for Developers"))

- **Interface & formatting:** Uses FunctionGemma’s special control tokens for tool use (e.g., `<start_function_call>…<end_function_call>`) and the `<escape>` delimiter for string fields. ([Google AI for Developers](https://ai.google.dev/gemma/docs/functiongemma/formatting-and-best-practices "FunctionGemma formatting and best practices  |  Google AI for Developers"))

- **Context length (base):** 32K tokens of input context, and up to 32K output tokens per request, budget permitting. ([Hugging Face](https://huggingface.co/google/functiongemma-270m-it "google/functiongemma-270m-it · Hugging Face"))

### What’s new in this fine-tune

**Primary behavioral change:** When asked questions in natural language, the model reliably chooses to call:

- `vector_search`

- with a **single string argument**: a retrieval query designed to maximize recall and relevance for downstream passage ranking.

**Example behavior (from the eval set):**

- **Prompt:** “Can you compare the political systems of the Roman Republic and the Aztec Empire… succession and social mobility?”
  **Output:** `<start_function_call>call:vector_search{query:<escape>Roman Republic vs Aztec Empire political systems succession social mobility ...<escape>}<end_function_call>` ✅

(Additional examples include VAR vs VAR review, journalism ethics across platforms, intrinsic vs extrinsic motivation, bench vs jury trial, and Rodin image sources.)

### Intended use

**Designed for:**

- On-device or constrained deployments (mobile apps, embedded, low-cost CPU boxes) that need **fast, local routing to retrieval**. FunctionGemma is explicitly positioned as a lightweight base for local-first agents and edge workflows. ([Google AI for Developers](https://ai.google.dev/gemma/docs/functiongemma "FunctionGemma model overview  |  Google AI for Developers"))

- RAG systems where **the most important skill is producing the right search query**, not writing the final answer.

**Not designed for:**

- Being the sole “answer model” for complex, high-stakes, or deeply reasoned tasks (it’s small; use it to retrieve, then answer with a stronger model if needed).

- Multi-step tool plans out of the box (FunctionGemma’s training is strongest for single-turn / parallel calls; multi-step chaining isn’t its primary trained workflow). ([Google AI for Developers](https://ai.google.dev/gemma/docs/functiongemma/formatting-and-best-practices "FunctionGemma formatting and best practices  |  Google AI for Developers"))

### Tool contract

This fine-tune assumes a tool with the following conceptual signature (a Python sketch follows the list):

- **Tool name:** `vector_search`

- **Arguments:**

  - `query` (string): a search query describing the user’s information need

- **Returns:** top-k passages/snippets with metadata (titles/URLs/ids), which are then fed into a downstream step.
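
In Python terms, a hypothetical stub for this contract might look as follows; only the tool name `vector_search` and the `query` argument are fixed by the fine-tune, and the `Passage` fields are illustrative:

```python
from dataclasses import dataclass

@dataclass
class Passage:
    # Illustrative metadata fields; adapt to whatever your document store returns.
    text: str
    title: str | None = None
    url: str | None = None
    doc_id: str | None = None

def vector_search(query: str) -> list[Passage]:
    """Search the document store and return the top-k passages for the query.

    Args:
        query: A search query describing the user's information need.
    """
    # Back this with your embedding index (FAISS, sqlite-vec, an on-device ANN, ...).
    raise NotImplementedError
```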

**Important formatting note:** String values in tool blocks must be wrapped in `<escape>…<escape>` to avoid parsing ambiguity. ([Google AI for Developers](https://ai.google.dev/gemma/docs/functiongemma/formatting-and-best-practices "FunctionGemma formatting and best practices  |  Google AI for Developers"))
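
Given that format, a minimal extraction sketch might look like the following; the regex mirrors the call block shown in the example above, and should be checked against the FunctionGemma formatting guide rather than treated as normative:

```python
import re

# Matches: <start_function_call>call:vector_search{query:<escape>...<escape>}<end_function_call>
CALL_RE = re.compile(
    r"<start_function_call>call:vector_search"
    r"\{query:<escape>(.*?)<escape>\}"
    r"<end_function_call>",
    re.DOTALL,
)

def extract_query(model_output: str) -> str | None:
    """Return the vector_search query string, or None if no call was emitted."""
    match = CALL_RE.search(model_output)
    return match.group(1) if match else None

raw = ("<start_function_call>call:vector_search{query:<escape>Roman Republic vs "
       "Aztec Empire political systems succession social mobility<escape>}"
       "<end_function_call>")
assert extract_query(raw) == (
    "Roman Republic vs Aztec Empire political systems succession social mobility"
)
```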

### How to use (recommended pattern)

1. **Run the model** on the user question.

2. If the output contains a `vector_search` call, execute retrieval.

3. Feed retrieved passages to:

   - either the same model (if you accept lower-quality synthesis), or

   - a larger model for final answer generation.

If you are using the Hugging Face tooling, FunctionGemma models are typically used via chat templates that support tool definitions and function-call decoding. ([Hugging Face](https://huggingface.co/google/functiongemma-270m-it "google/functiongemma-270m-it · Hugging Face"))
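
A minimal Transformers sketch of step 1, assuming this fine-tune keeps the base model’s chat template (the repo id below is a placeholder, and generation settings are illustrative):

```python
# Sketch: run the retrieval controller once and print its raw tool-call block.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "your-org/functiongemma-270m-it-rag"  # placeholder: substitute this fine-tune's repo id
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype=torch.bfloat16)

def vector_search(query: str) -> list:
    """Search the document store for passages relevant to the query.

    Args:
        query: A search query describing the user's information need.
    """
    ...  # your retrieval backend; only the signature matters for the tool schema

messages = [{"role": "user", "content": "How did Rodin source images for his sculptures?"}]

# Chat templates can take Python functions as tool specs; the signature and
# docstring are converted into the schema the model sees.
input_ids = tokenizer.apply_chat_template(
    messages,
    tools=[vector_search],
    add_generation_prompt=True,
    return_tensors="pt",
)
output_ids = model.generate(input_ids, max_new_tokens=128)
print(tokenizer.decode(output_ids[0][input_ids.shape[-1]:], skip_special_tokens=False))
# Expected: a <start_function_call>…<end_function_call> block calling vector_search,
# which you then parse (see the extraction sketch above) and execute.
```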