---
datasets:
- independently-platform/tasky
language:
- en
- it
base_model:
- google/functiongemma-270m-it
library_name: transformers
---

# Tasky

## About the model
This model is a fine-tuned **function-calling assistant** for a todo/task application. It maps user requests to one of four tools and produces tool arguments that conform to the schema in `AI-TRAINING-TOOLS.md`.

- **Base model:** `google/functiongemma-270m-it`
- **Primary languages:** English and Italian (with light spelling errors and typos to mimic real users)
- **Task:** structured tool selection and argument generation

## Intended Use
Use this model to translate natural-language task requests into tool calls for:
- `create_tasks`
- `search_tasks`
- `update_tasks`
- `delete_tasks`

It is designed for **task/todo management** workflows and should be paired with strict validation of tool arguments before execution.
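A minimal validation sketch of that "validate before execution" step, assuming the output shape shown in the example (a `tool_name` field plus a JSON-encoded `tool_arguments` string); the helper name is illustrative, not part of the model:

```python
import json

# Tools exposed by the task app (the four listed under Intended Use).
ALLOWED_TOOLS = {"create_tasks", "search_tasks", "update_tasks", "delete_tasks"}

def validate_tool_call(raw: str) -> dict:
    """Parse a raw model response and reject anything outside the schema.

    Returns {"tool_name": ..., "tool_arguments": <parsed dict>} on success,
    raises ValueError otherwise.
    """
    call = json.loads(raw)
    name = call.get("tool_name")
    if name not in ALLOWED_TOOLS:
        raise ValueError(f"unknown tool: {name!r}")
    # tool_arguments is itself a JSON *string*, so it needs a second parse.
    args = json.loads(call.get("tool_arguments", "{}"))
    if not isinstance(args, dict):
        raise ValueError("tool_arguments must decode to a JSON object")
    return {"tool_name": name, "tool_arguments": args}
```

A real deployment would additionally check the per-tool argument schema (required fields, date formats) before executing the call.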

### Example
**Input (user):**

> Aggiungi un task per pagare la bolletta della luce domani mattina
> *(Add a task to pay the electricity bill tomorrow morning)*

**Expected output (model):**
```json
{
  "tool_name": "create_tasks",
  "tool_arguments": "{\"tasks\":[{\"content\":\"pagare la bolletta della luce\",\"dueDate\":\"2026-01-13T09:00:00.000Z\"}]}"
}
```
## Training Data

Synthetic, bilingual tool-calling data built from the tool schema, including:

- Multiple phrasings and paraphrases
- Mixed English/Italian prompts
- Light typos and user mistakes in `user_content`
- Broad coverage of optional parameters

Splits:

- Train: 1,500 examples
- Eval: 500 examples

## Training Procedure

- Fine-tuning on synthetic tool-calling samples
- Deduplicated examples
- Balanced coverage of all tools and key parameters

## Evaluation

Reported success rate: 99.5% on the 500-example eval split, versus 0% for the base model.
Success was measured as an exact match on the predicted tool name and on the JSON arguments after normalization.

## Limitations

- Trained for a specific tool schema; not a general-purpose assistant.
- Outputs may include incorrect or incomplete tool arguments; validate them before execution.
- Language coverage is strongest in English and Italian.
- Synthetic data may not capture all real-world user phrasing or ambiguity.