---
library_name: transformers
pipeline_tag: translation
tags:
- t5
- seq2seq
- onnx
- transformers.js
- browser-ml
- timers
- synthetic-data
datasets:
- Satansdeer/timmy-t2-timer-sft
license: apache-2.0
---
# Timmy T2
Timmy T2 stands for **Timmy Timer Translator**. It is a tiny, browser-first seq2seq model for translating natural-language timer requests into Timey's compact action DSL.
This is not a new foundation architecture. It is a task-specific, fine-tuned T5-style encoder-decoder model combined with a compact output language, a lossless slot-annotated input format, a constrained parser, and a browser ONNX runtime package.
## Release
- Version: `v0.1.0`
- Runtime model version: `phase4y-actions-browser-exact-checkpoint-50-dynq8enc-q4dec-ort-beam4`
- Production commit: `6ea2d2a`
- Production deploy: `6a0ed36e0172c100ef1ab8ac`
- Dataset: [Satansdeer/timmy-t2-timer-sft](https://huggingface.co/datasets/Satansdeer/timmy-t2-timer-sft)
## Intended Use
Timmy T2 is intended for Timey-style timer planning:
```text
5 one minute timers and one 30 second
```
For that request, the model emits action commands that refer to extracted slot ids:
```text
REP C0 A0
ADD A1
END
```
The application parses those commands into concrete timers deterministically.
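
As a rough illustration of that deterministic step, here is a minimal Python sketch that expands an action program into timer durations. The command semantics are assumptions inferred from the example above and are not Timey's actual parser: `REP <count-slot> <amount-slot>` is taken to repeat a duration, `ADD <amount-slot>` to append one timer, and `END` to terminate the program. The function name `expand_actions` and the slot tables `counts` and `amounts` (durations in seconds) are hypothetical.

```python
from typing import Dict, List


def expand_actions(program: str, counts: Dict[str, int], amounts: Dict[str, int]) -> List[int]:
    """Expand an action program into timer durations in seconds.

    Assumed semantics (inferred from the model card, not confirmed):
      REP <count-slot> <amount-slot>  -> repeat the duration `count` times
      ADD <amount-slot>               -> append a single timer
      END                             -> stop and return the plan
    """
    timers: List[int] = []
    for raw in program.strip().splitlines():
        tokens = raw.split()
        if not tokens:
            continue  # tolerate blank lines
        op, args = tokens[0], tokens[1:]
        if op == "END":
            return timers
        if op == "REP" and len(args) == 2:
            timers.extend([amounts[args[1]]] * counts[args[0]])
        elif op == "ADD" and len(args) == 1:
            timers.append(amounts[args[0]])
        else:
            raise ValueError(f"unrecognized command: {raw!r}")
    raise ValueError("program did not terminate with END")


# Hypothetical slot tables for "5 one minute timers and one 30 second".
counts = {"C0": 5}
amounts = {"A0": 60, "A1": 30}
print(expand_actions("REP C0 A0\nADD A1\nEND", counts, amounts))
# [60, 60, 60, 60, 60, 30]
```

Only the three commands that appear in this card are handled. Anything else raises an error rather than being guessed at.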
## Files
- Root files are the fp32/safetensors checkpoint for Python Transformers.
- `browser/` contains the production browser artifact:
  - dynamic q8 encoder ONNX
  - q4 decoder ONNX
  - tokenizer/config files used by the Timey browser runtime
- `eval/` contains release evaluation summaries.
- `release_manifest.json` records hashes, sizes, evals, and production smoke checks.
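
One way to use the manifest is to check downloaded files against its recorded hashes. The sketch below is a minimal example under stated assumptions: this card does not describe the manifest schema or the hash algorithm, so SHA-256 and the `expected` mapping of relative path to hex digest are hypothetical, as are the helper names `sha256_of` and `find_mismatches`.

```python
import hashlib
from pathlib import Path
from typing import Dict, List


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Stream a file through SHA-256 so large ONNX files are not loaded at once."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def find_mismatches(root: Path, expected: Dict[str, str]) -> List[str]:
    """Return relative paths that are missing or whose digest differs.

    `expected` maps relative path -> hex digest. This layout is an assumption;
    adapt it to whatever structure release_manifest.json actually uses.
    """
    bad: List[str] = []
    for rel, want in expected.items():
        target = root / rel
        if not target.is_file() or sha256_of(target) != want.lower():
            bad.append(rel)
    return bad
```

An empty list from `find_mismatches` would mean every listed file is present and matches its expected digest.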
## Training Data
Public dataset rows:
| Split | Rows |
| --- | ---: |
| train | 2639 |
| validation | 207 |
| hard_validation | 62 |
| all_public | 2846 |

The 16-row hidden validation split is withheld from the public dataset to preserve a private holdout.
## Evaluation
| Eval | Records | Parseable | Strict exact | Semantic exact | Semantic invalid |
| --- | ---: | ---: | ---: | ---: | ---: |
| onnx-dynq8enc-q4dec-validation | 207 | 100% | 100% | 100% | 0% |
| onnx-dynq8enc-q4dec-hard | 62 | 100% | 100% | 100% | 0% |
| onnx-dynq8enc-q4dec-hidden | 16 | 100% | 100% | 100% | 0% |
| onnx-dynq8enc-q4dec-browser-failures | 3 | 100% | 100% | 100% | 0% |
| fp32-validation | 207 | 100% | 100% | 100% | 0% |
| fp32-hard | 62 | 100% | 100% | 100% | 0% |
| fp32-hidden | 16 | 100% | 100% | 100% | 0% |
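
This card does not define the table columns, so the following Python sketch shows only one plausible reading and is not the release's evaluation code. It assumes "parseable" means the predicted program expands without error, "strict exact" means the predicted tokens match the reference tokens, and "semantic exact" means both programs expand to the same timers. "Semantic invalid" is left out because its definition is not given here. The `score` function and the `expand` callback are hypothetical names.

```python
from typing import Callable, Dict, List, Optional, Sequence, Tuple

# An expander returns the timer list for a program, or None if it cannot be parsed.
Expand = Callable[[str], Optional[List[int]]]


def score(pairs: Sequence[Tuple[str, str]], expand: Expand) -> Dict[str, float]:
    """Score (predicted, reference) program pairs; rates are fractions in [0, 1]."""
    if not pairs:
        raise ValueError("no records to score")
    parseable = strict = semantic = 0
    for predicted, reference in pairs:
        timers = expand(predicted)
        if timers is None:
            continue  # unparseable output counts against every metric
        parseable += 1
        # Whitespace-insensitive token comparison (assumed meaning of "strict").
        strict += predicted.split() == reference.split()
        # Same expanded plan, even if the programs are written differently.
        semantic += timers == expand(reference)
    total = len(pairs)
    return {
        "parseable": parseable / total,
        "strict_exact": strict / total,
        "semantic_exact": semantic / total,
    }
```

Under this reading, two different programs that expand to the same timers would count as a semantic match but not a strict one, which is why the table reports the two columns separately.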
## Browser Smoke
The deployed production browser runtime was smoke-tested with service workers enabled. It loaded `timey-t5-efficient-tiny` and produced the expected timer sequences for:
- `5 one minute timers and one 30 second` -> `[60, 60, 60, 60, 60, 30]`
- `first and last timer 5 minute, 5 one minute timers in between` -> `[300, 60, 60, 60, 60, 60, 300]`
## Limitations
- This is a narrow task model for timer requests, not a general assistant.
- It expects Timey's lossless slot-annotated input format at inference time.
- Correction/edit requests are intentionally handled by deterministic fallback logic in the app.
- Public validation is synthetic and task-targeted; the model should be evaluated on broader natural user traffic before any wider claims are made.