---
datasets:
- independently-platform/tasky
language:
- en
- it
base_model:
- google/functiongemma-270m-it
library_name: transformers
---
# Tasky
## About the model
This model is a fine-tuned **function-calling assistant** for a todo/task application. It maps each user request to one of four tools and is trained to produce tool arguments that conform to the schema in `AI-TRAINING-TOOLS.md`.
- **Base model:** `google/functiongemma-270m-it`
- **Primary languages:** English and Italian (training prompts include light spelling errors and typos to mimic real users)
- **Task:** Structured tool selection + argument generation
## Intended Use
Use this model to translate natural language task requests into tool calls for:
- `create_tasks`
- `search_tasks`
- `update_tasks`
- `delete_tasks`
It is designed for **task/todo management** workflows and should be paired with strict validation of tool arguments before execution.
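The validation step recommended above could look roughly like the following sketch. It assumes the model output is a JSON object with a `tool_name` field and a `tool_arguments` field holding a JSON-encoded string, as in this card's example. The helper name `parse_tool_call` is hypothetical, and the check covers only the tool name and the JSON structure. Per-tool argument validation would have to follow `AI-TRAINING-TOOLS.md`, which is not reproduced here.

```python
import json

# The four tool names listed in this card.
ALLOWED_TOOLS = {"create_tasks", "search_tasks", "update_tasks", "delete_tasks"}


def parse_tool_call(raw_output: str) -> tuple[str, dict]:
    """Parse a raw model output and return (tool_name, arguments).

    Hypothetical helper. Raises ValueError when the output is malformed
    or names a tool outside ALLOWED_TOOLS.
    """
    try:
        call = json.loads(raw_output)
    except json.JSONDecodeError as exc:
        raise ValueError(f"output is not valid JSON: {exc}") from exc
    if not isinstance(call, dict):
        raise ValueError("output must be a JSON object")

    tool_name = call.get("tool_name")
    if not isinstance(tool_name, str) or tool_name not in ALLOWED_TOOLS:
        raise ValueError(f"unknown tool: {tool_name!r}")

    # Assumption: tool_arguments is itself a JSON-encoded string,
    # so it needs a second decoding pass.
    raw_args = call.get("tool_arguments")
    if not isinstance(raw_args, str):
        raise ValueError("tool_arguments must be a JSON-encoded string")
    try:
        arguments = json.loads(raw_args)
    except json.JSONDecodeError as exc:
        raise ValueError(f"tool_arguments is not valid JSON: {exc}") from exc
    if not isinstance(arguments, dict):
        raise ValueError("tool_arguments must decode to a JSON object")

    return tool_name, arguments
```

Rejecting anything that fails these checks before dispatching to the application keeps malformed or unexpected calls away from the task store.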
### Example
**Input (user):**
Aggiungi un task per pagare la bolletta della luce domani mattina
**Expected output (model):**
```json
{
  "tool_name": "create_tasks",
  "tool_arguments": "{\"tasks\":[{\"content\":\"pagare la bolletta della luce\",\"dueDate\":\"2026-01-13T09:00:00.000Z\"}]}"
}
```
## Training Data
Synthetic, bilingual tool-calling data built from the tool schema, including:
- Multiple phrasings and paraphrases
- Mixed English/Italian prompts
- Light typos and user mistakes in `user_content`
- Broad coverage of optional parameters
Splits:
- Train: 1,500 examples
- Eval: 500 examples
## Training Procedure
- Fine-tuning on synthetic tool-calling samples
- Deduplicated examples
- Balanced coverage of all tools and key parameters
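The card does not say how examples were deduplicated. One plausible approach, sketched here purely as an assumption, keys each example on its `user_content` after case folding and whitespace collapsing, and keeps the first occurrence. The function name `deduplicate` and the dict-per-example layout are hypothetical.

```python
def deduplicate(examples: list[dict]) -> list[dict]:
    """Drop examples whose user_content repeats an earlier one.

    Assumption: two prompts count as duplicates when they are equal
    after case folding and collapsing runs of whitespace.
    """
    seen: set[str] = set()
    unique: list[dict] = []
    for example in examples:
        key = " ".join(example["user_content"].casefold().split())
        if key not in seen:
            seen.add(key)
            unique.append(example)
    return unique
```

Typos are left untouched by this key, so a misspelled variant of a prompt still counts as a distinct example.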
## Evaluation
Reported success rate: 99.5% on the 500-example eval split, versus 0% for the base model.
An example counts as a success only if the predicted tool name matches exactly and the predicted JSON arguments match the reference after normalization.
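The metric described above can be sketched as follows. The card does not specify the normalization, so this sketch assumes it means parsing the argument string and re-serializing it with sorted keys and no extra whitespace. The function names are hypothetical, and each prediction or reference is assumed to be a dict with `tool_name` and `tool_arguments` fields.

```python
import json


def normalize_arguments(raw_args: str) -> str:
    """Canonicalize a JSON argument string (assumed normalization):
    parse it, then re-serialize with sorted keys and compact separators."""
    return json.dumps(json.loads(raw_args), sort_keys=True, separators=(",", ":"))


def is_exact_match(predicted: dict, expected: dict) -> bool:
    """True when tool names are equal and normalized arguments are equal."""
    if predicted.get("tool_name") != expected.get("tool_name"):
        return False
    try:
        return normalize_arguments(predicted["tool_arguments"]) == normalize_arguments(
            expected["tool_arguments"]
        )
    except (KeyError, TypeError, ValueError):
        # Missing, non-string, or unparseable arguments count as a failure.
        return False


def success_rate(predictions: list[dict], references: list[dict]) -> float:
    """Fraction of predictions that exactly match their reference."""
    if len(predictions) != len(references):
        raise ValueError("predictions and references must have the same length")
    if not references:
        return 0.0
    matches = sum(is_exact_match(p, r) for p, r in zip(predictions, references))
    return matches / len(references)
```

Under this assumption, differences in key order or whitespace do not count as errors, while any difference in values does.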
## Limitations
- Trained for a specific tool schema; not a general-purpose assistant.
- Outputs may include incorrect or incomplete tool arguments; validate before execution.
- Language coverage is strongest in English and Italian.
- Synthetic data may not capture all real-world user phrasing or ambiguity.