---
datasets:
- independently-platform/tasky
language:
- en
- it
base_model:
- google/functiongemma-270m-it
library_name: transformers
---

# Tasky

## About the model

This model is a fine-tuned **function-calling assistant** for a todo/task application. It maps user requests to one of four tools and produces valid tool arguments according to the schema in `AI-TRAINING-TOOLS.md`.

- **Base model:** `google/functiongemma-270m-it`
- **Primary languages:** English and Italian (with light spelling errors/typos to mimic real users)
- **Task:** Structured tool selection + argument generation

## Intended Use

Use this model to translate natural-language task requests into tool calls for:

- `create_tasks`
- `search_tasks`
- `update_tasks`
- `delete_tasks`

It is designed for **task/todo management** workflows and should be paired with strict validation of tool arguments before execution.
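
A minimal sketch of that validation step (the `ALLOWED_TOOLS` whitelist and helper name here are illustrative; the authoritative schema lives in `AI-TRAINING-TOOLS.md`):

```python
import json

# Illustrative whitelist of the four tools this model targets.
ALLOWED_TOOLS = {"create_tasks", "search_tasks", "update_tasks", "delete_tasks"}

def validate_tool_call(raw: str) -> dict:
    """Parse and sanity-check a model tool call before executing it."""
    call = json.loads(raw)
    tool = call.get("tool_name")
    if tool not in ALLOWED_TOOLS:
        raise ValueError(f"unknown tool: {tool!r}")
    # tool_arguments is a JSON-encoded string, so it must be parsed a second time.
    args = json.loads(call.get("tool_arguments", "{}"))
    if not isinstance(args, dict):
        raise ValueError("tool_arguments must decode to a JSON object")
    return {"tool_name": tool, "arguments": args}
```

Rejecting unknown tool names and malformed argument strings before execution keeps a hallucinated call from reaching the task store.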

### Example

**Input (user):**

```text
Aggiungi un task per pagare la bolletta della luce domani mattina
```

(Italian: "Add a task to pay the electricity bill tomorrow morning.")

**Expected output (model):**

```json
{
  "tool_name": "create_tasks",
  "tool_arguments": "{\"tasks\":[{\"content\":\"pagare la bolletta della luce\",\"dueDate\":\"2026-01-13T09:00:00.000Z\"}]}"
}
```
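Note that `tool_arguments` is itself a JSON-encoded string, so client code decodes the output twice. A minimal sketch of consuming the example above:

```python
import json

# The model output from the example above, as a raw string.
raw_output = (
    '{"tool_name": "create_tasks", '
    '"tool_arguments": "{\\"tasks\\":[{\\"content\\":'
    '\\"pagare la bolletta della luce\\",'
    '\\"dueDate\\":\\"2026-01-13T09:00:00.000Z\\"}]}"}'
)

call = json.loads(raw_output)               # outer object: tool name + argument string
args = json.loads(call["tool_arguments"])   # inner object: the actual arguments

print(call["tool_name"])                    # create_tasks
print(args["tasks"][0]["dueDate"])          # 2026-01-13T09:00:00.000Z
```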
## Training Data

Synthetic, bilingual tool-calling data built from the tool schema, including:

- Multiple phrasings and paraphrases
- Mixed English/Italian prompts
- Light typos and user mistakes in `user_content`
- Broad coverage of optional parameters

Splits:

- Train: 1,500 examples
- Eval: 500 examples
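One way such examples can be synthesized is by pairing templated user requests with their expected tool calls; the templates and helper below are illustrative only, not the actual generation pipeline:

```python
import json
import random

# Illustrative bilingual templates; the real dataset covers the full tool schema.
TEMPLATES = [
    ("en", "Add a task to {content}", "create_tasks"),
    ("it", "Aggiungi un task per {content}", "create_tasks"),
    ("en", "Find my tasks about {content}", "search_tasks"),
]

def make_example(content: str) -> dict:
    """Build one synthetic training record from a templated request."""
    lang, template, tool = random.choice(TEMPLATES)
    if tool == "create_tasks":
        args = {"tasks": [{"content": content}]}
    else:
        args = {"query": content}
    return {
        "lang": lang,
        "user_content": template.format(content=content),
        "tool_name": tool,
        # Arguments are stored as a compact JSON string, matching the model output.
        "tool_arguments": json.dumps(args, separators=(",", ":")),
    }
```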

## Training Procedure

- Fine-tuning on synthetic tool-calling samples
- Deduplicated examples
- Balanced coverage of all tools and key parameters

## Evaluation

Reported success rate: 99.5% on the 500-example eval split, versus 0% for the base model.

Success was measured as an exact match on both the predicted tool name and the JSON arguments after normalization.
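A minimal sketch of such a comparison, assuming "normalization" means parsing both argument strings and comparing the resulting objects (the exact normalization used in the evaluation is not documented here):

```python
import json

def tool_call_matches(predicted: dict, expected: dict) -> bool:
    """Compare tool calls by name and by normalized (parsed) JSON arguments."""
    if predicted.get("tool_name") != expected.get("tool_name"):
        return False
    try:
        pred_args = json.loads(predicted.get("tool_arguments", ""))
        exp_args = json.loads(expected.get("tool_arguments", ""))
    except json.JSONDecodeError:
        return False
    # Parsing makes the comparison insensitive to key order and whitespace.
    return pred_args == exp_args
```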

## Limitations

- Trained for a specific tool schema; not a general-purpose assistant.
- Outputs may include incorrect or incomplete tool arguments; validate before execution.
- Language coverage is strongest in English and Italian.
- Synthetic data may not capture all real-world user phrasing or ambiguity.