---
license: apache-2.0
language:
- en
library_name: mlx
pipeline_tag: text-generation
base_model: Qwen/Qwen2.5-1.5B-Instruct
tags:
- cron
- systemd
- devops
- schedule
- text-generation
- mlx
- lora
datasets:
- Shpigford/cron-schedule-conversion
---

# Shpigford/cron-mini

A small fine-tuned language model that converts natural-language schedules into cron expressions and systemd `OnCalendar` strings.

## What it does

```
Input:  every Tuesday at 3am except December
Output: {"cron": "0 3 * 1-11 2", "systemd": "Tue *-01..11-* 03:00:00", "note": "Months 1-11 only excludes December."}
```

It handles:

- Standard schedules (daily, weekly, monthly, every N minutes/hours)
- Holidays (Christmas, Thanksgiving, Black Friday, Halloween, etc.)
- Casual time references ("lunchtime", "before bed", "first thing in the morning")
- Ordinal weekdays ("second Tuesday of the month", "last Friday")
- Negative specifications ("every day except Sunday", "all months except December")
- Sub-minute intervals (cron can't express them, systemd can; the model annotates the limitation)
- Awkward intervals (every 90 minutes: cron can't express this directly, so it is expanded across the day)
- Compound schedules requiring multiple cron lines
- systemd-specific features (`OnBootSec=`, `Persistent=`, `RandomizedDelaySec=`)
- Time zones (sets `TZ=` for cron, uses `Asia/Tokyo`-style zone names for systemd)
- Typos and informal phrasings ("evry tues @ 3am")

## Usage

### MLX (Apple Silicon)

```python
from mlx_lm import load, generate

model, tokenizer = load("Shpigford/cron-mini")

SYSTEM = ("You convert natural-language schedules into cron expressions and "
          "systemd OnCalendar strings. Output JSON with keys: cron, systemd, "
          "note. If cron cannot exactly express the schedule, put the closest "
          "valid cron and explain in note. Do not output anything else.")

messages = [
    {"role": "system", "content": SYSTEM},
    {"role": "user", "content": "Convert this schedule to cron and systemd OnCalendar: every weekday at 9am"},
]

# Render the chat template to a plain string; generate() tokenizes it itself.
prompt = tokenizer.apply_chat_template(messages, add_generation_prompt=True, tokenize=False)

# temp=0.0 requests greedy decoding. Depending on your mlx-lm version, the
# temperature may need to be passed through a sampler argument instead.
print(generate(model, tokenizer, prompt=prompt, max_tokens=200, temp=0.0))
```

### Transformers (any platform)

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model = AutoModelForCausalLM.from_pretrained("Shpigford/cron-mini", torch_dtype="auto", device_map="auto")
tokenizer = AutoTokenizer.from_pretrained("Shpigford/cron-mini")

SYSTEM = "..."  # same as above

messages = [
    {"role": "system", "content": SYSTEM},
    {"role": "user", "content": "Convert this schedule to cron and systemd OnCalendar: every weekday at 9am"},
]

inputs = tokenizer.apply_chat_template(messages, add_generation_prompt=True, return_tensors="pt").to(model.device)
out = model.generate(inputs, max_new_tokens=200, do_sample=False)  # greedy decoding

# Decode only the newly generated tokens, skipping the prompt.
print(tokenizer.decode(out[0][inputs.shape[1]:], skip_special_tokens=True))
```

### llama.cpp / Ollama (GGUF)

A GGUF version is available; see the Files tab for `.gguf` files. Load it with llama.cpp or import it into Ollama:

```bash
ollama create cron-mini -f Modelfile
```

## Evaluation

Results on a held-out test set of 91 cases, including all the trick categories above:

- **Overall (cron+systemd both correct):** 63/91 (69.2%)
- **Cron exact match:** 73/91 (80.2%)
- **Cron syntactically valid:** 87/91 (95.6%)
- **systemd exact match:** 71/91 (78.0%)

See `eval_results.json` in this repo for per-case results.

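Since 4 of the 91 test outputs were not syntactically valid cron, callers may want to check a reply before installing it. The following is a minimal sketch, assuming the reply is a single JSON object whose `cron` value is a string (possibly several lines, possibly with a leading `TZ=` line). `parse_schedule` is a hypothetical helper, not part of this repo, and counting five fields is only a shape check, not full cron validation.

```python
import json


def parse_schedule(raw: str) -> dict:
    """Parse the model's JSON reply and run a basic shape check on `cron`.

    Hypothetical helper: assumes `cron` is a string with one or more lines.
    """
    data = json.loads(raw)
    if not isinstance(data, dict):
        raise ValueError("expected a JSON object")

    missing = {"cron", "systemd", "note"} - data.keys()
    if missing:
        raise ValueError(f"missing keys: {sorted(missing)}")

    cron = data["cron"]
    if not isinstance(cron, str):
        raise ValueError("expected `cron` to be a string")

    for line in cron.splitlines():
        fields = line.split()
        # Skip blank lines and environment assignments such as TZ=Asia/Tokyo.
        if not fields or "=" in fields[0]:
            continue
        # A standard cron schedule has five fields:
        # minute, hour, day-of-month, month, day-of-week.
        if len(fields) != 5:
            raise ValueError(f"expected 5 cron fields, got {len(fields)}: {line!r}")

    return data
```

A reply that fails this check could be retried or routed to a human. Passing it says nothing about whether the schedule matches the request, only that it has the expected shape.
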
## Training

- **Base model:** `Qwen/Qwen2.5-1.5B-Instruct` (Apache 2.0)
- **Method:** LoRA fine-tune via [mlx-lm](https://github.com/ml-explore/mlx-examples/tree/main/llms)
- **Hardware:** M4 Mac mini, 16GB unified memory
- **Dataset:** ~3000 examples: hand-crafted hard cases + templated generation + Claude-API paraphrases and synthetic novel cases (verified with a self-check pass)
- **Dataset on HF:** [Shpigford/cron-schedule-conversion](https://huggingface.co/datasets/Shpigford/cron-schedule-conversion)

## Limitations

- The model emits a single best guess for ambiguous fuzzy times (e.g., "morning" → 7am). It will not ask clarifying questions.
- Cron cannot natively express "every other Monday" / "biweekly" / "fortnightly" patterns, so the model emits "every Monday" and notes the limitation. Gate the job in your script with a week-of-year check.
- Cron has no native expression for "last day of month" / "last Friday", so the model approximates with day-of-month ranges and flags the limitation.
- Vixie cron OR-matches DOM (day-of-month) and DOW (day-of-week) when both are restricted; the model emits expressions written for an AND-matching interpretation, so they can fire on more days than intended under OR-matching. Verify on your specific cron implementation.
- Time zone handling: cron has no built-in TZ field; the model emits the schedule in the system's local time and notes when a `TZ=` env var is needed.
- Trained on English only. Quality on other languages will likely degrade significantly.

## License

Apache 2.0, same as the base model.

## Citation

If you find this useful:

```bibtex
@misc{cron-mini,
  author = {Pigford, Josh},
  title = {Cron-Mini: A Small Model for Schedule Conversion},
  year = {2026},
  howpublished = {Hugging Face},
  url = {https://huggingface.co/Shpigford/cron-mini}
}
```
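
## Appendix: gating sketches

The script-side gating mentioned under Limitations could look roughly like the sketch below. It is an illustration under stated assumptions, not code shipped with the model: `is_on_week` and `is_last_day_of_month` are hypothetical helper names, and the anchor date is whatever "on" week you choose. Note that it counts whole weeks from an anchor date rather than using week-of-year parity, because ISO week numbers go from 53 back to 1 in some years, which would make two consecutive weeks look "odd".

```python
import calendar
from datetime import date


def is_on_week(today: date, anchor: date) -> bool:
    """True on alternating weeks, counted in whole weeks from `anchor`.

    Hypothetical helper: `anchor` is any date inside a week when the job
    should run; weeks are 7-day blocks starting on the anchor's weekday.
    """
    return ((today - anchor).days // 7) % 2 == 0


def is_last_day_of_month(today: date) -> bool:
    """True when `today` is the final calendar day of its month."""
    # monthrange returns (weekday of the 1st, number of days in the month).
    return today.day == calendar.monthrange(today.year, today.month)[1]
```

A job scheduled by the model's "every Monday" cron line would call `is_on_week(date.today(), anchor)` at startup and exit early when it returns `False`. Similarly, a job scheduled over a day-of-month range could exit early unless `is_last_day_of_month(date.today())` is true.
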