---
title: OffGridSchedula
emoji: 🗓️
colorFrom: indigo
colorTo: purple
sdk: docker
app_port: 7860
pinned: false
license: apache-2.0
short_description: Local-first chat-to-calendar agent (Gemma-4 E4B + MiniCPM)
tags:
  - track:backyard
  - sponsor:openbmb
  - sponsor:modal
  - achievement:offgrid
  - achievement:welltuned
  - achievement:offbrand
  - achievement:llama
  - achievement:sharing
  - achievement:fieldnotes
models:
  - build-small-hackathon/gemma-4-cal-gguf
  - openbmb/MiniCPM5-1B-GGUF
demo_video:
  - https://youtu.be/m-o0u9X3tI4
social_posts:
  - https://x.com/nate_mauer/status/2065973341651882386
  - https://x.com/nate_mauer/status/2064920352845709419
  - https://x.com/nate_mauer/status/2065661878441750916
  - https://www.linkedin.com/feed/update/urn:li:ugcPost:7471440639969132545
blog_post:
  - https://huggingface.co/blog/build-small-hackathon/offgridschedula
made_by:
  - ParetoOptimal
  - a.k.a., Nate Mauer
---

# 🗓️ Message Scheduling Agent

**OffGridSchedula turns a pasted chat (or a flyer screenshot) into calendar events, catches conflicts, and drafts the reply, right from your phone: no app, no account, no setup. iOS allows neither background iMessage access nor a persistent on-device LLM server, so there's no autonomous on-device agent to install; instead, a foreground Shortcut ([docs/automations.md](./docs/automations.md)) hands a thread or screenshot to the agent in two taps (optionally using a remote model via `INFERENCE_BASE_URL`).**

The model runs on **your own server or even on the phone itself**, not on a cloud AI service. Your chats aren't shipped off to a third-party AI to be read: your snippet is sent to the agent you control, which turns it into ready-to-add calendar events, holds it only in memory, and discards it after replying. The run trace you can optionally share is redacted.

**Hardware-aware.** On under-powered hardware, the app shows an upgrade banner rather than hanging; the real model needs only a small GPU.

## Build Small submission: the idea & the tech

**The idea.** A busy parent's calendar lives in other people's messages: picture day in the class chat, the practice that moved, the party flyer. OffGridSchedula turns those into calendar events: paste the chat (or snap the flyer) from a phone browser, review the extracted events, the conflicts against your own `.ics`, and a drafted reply, then add to Apple/Google Calendar in a tap.

**The tech.** Two small local models do the work. Extraction is [`gemma-cal` E4B](https://huggingface.co/build-small-hackathon/gemma-4-cal-gguf) (~4B effective params), our QLoRA fine-tune of Gemma-4 E4B that emits a single validated **ActionPlan** (events · conflicts · reply · clarifying question), served with **vision** through the official **llama.cpp** server inside this Docker Gradio Space, with no cloud AI APIs. The fine-tune + its 60-example task eval ran entirely on **Modal** serverless GPUs, behind an eval gate that rejected eight regressed models before this one shipped. Conflict math is deterministic Python, the UI is fully custom, the agent doubles as an **MCP tool server**, and redacted run traces are public on the [Hub](https://huggingface.co/datasets/ParetoOptimal/offgridschedula-traces). Click **Run the agents** and a local **OpenBMB MiniCPM** planner (a second local llama-server) drives this same Space's MCP tools as a multi-step agent (extract → check conflicts → render `.ics`) with every step visible. Still zero cloud AI; every model under 32B.

**What's new.** Extraction now reads the *logistics*, not just the date (see below): arrival-aware start times, duration→end conversion, type-based reminders, and calendar-ready titles. Each is guaranteed by deterministic post-processing even when the model wobbles, and each shipped through a measured A/B eval ([full result tables](./training/data/ab_results.md): regex vs text-LLM vs **vision-LLM reading rendered screenshots only**).
Calendar out got one-click too: a unified **Connect your calendar** block (Google OAuth, where the token lives in *your* browser, never on the server; Outlook/Apple need no sign-in) and per-event **Google · Outlook · iCal** links, with the Google push verified end-to-end (push → readback → delete, 11/11).

**The UX.** One decision, **Offline or Online**, re-themes the whole workflow card and sets the path: off-grid `.ics` only, or a **one-click "Connect your calendar"** whose Google OAuth token lives *only in the browser* (server-verified each visit; the client secret never leaves the server). Results land in a single card: events, conflicts, the drafted reply, and per-event **Google · Outlook · iCal · .ics** quick-add links. **Activity → This week** tallies events captured, conflicts caught, and time saved; a per-device **Memory** (localStorage, one-click samples) feeds names and preferences back into extraction.

**Submission links:** [requirement-by-requirement mapping](./docs/build-small-submission.md) · [demo video](https://youtu.be/m-o0u9X3tI4) · social posts [1](https://x.com/nate_mauer/status/2064920352845709419) · [2](https://x.com/nate_mauer/status/2065661878441750916)

## Who this is for

A busy parent whose kid's school and activity events are buried in a noisy class group chat: picture day Thursday, the practice that moved to Tuesday, the birthday-party RSVP. They read it once, mean to add it later, and miss it. With this, they **paste the chat** (or a **screenshot** of a flyer or invite) from their phone's browser and get back the events, a **conflict check** against their calendar, and a **ready-to-send reply**, all surfaced for review before anything is saved. Output is a local `.ics` they can add to any calendar, with optional Google Calendar push. No app to install and no account. It reads nothing automatically; the parent pastes only what they choose.
Inference runs **in the Space** via `llama.cpp` (no cloud AI APIs), and works out of the box with no GPU (see *Accuracy upgrade* below).

## The model: `gemma-cal` E4B, one calendar-native LLM built for exactly this

What makes this platform different isn't a prompt wrapped around a generic chatbot. It's **[`gemma-cal` E4B](https://huggingface.co/build-small-hackathon/gemma-4-cal-gguf), our own fine-tune of Gemma-4 E4B purpose-built for one job: turning messy human conversation into calendar-ready structure.** The model doesn't chat. It reads a thread (or a flyer photo) and emits a single validated **ActionPlan**: events with exact ISO datetimes, conflicts, proposed alternatives, a drafted reply, and a clarifying question when the plan is too vague to schedule. **It is the one and only model the platform runs**, everywhere from the production Space to a laptop.

- **Edge-sized by design.** ~5 GB at Q4; serves on a **~$0.40/hr 16 GB T4** (vs $4+/hr A100-class for big models), a gaming GPU, or an Apple-silicon laptop, with full **vision** (screenshots/flyers) via its mmproj. Local-first isn't a tagline; it's the parameter count.
- **Schema-bulletproof.** The fine-tune holds **100% schema validity even with no system prompt**, with stronger no-event discipline (doesn't invent events from "thanks!") and a higher rate of *asking* when a date is TBD, the failure modes that actually burn users of generic models.
- **Convention-trained.** It learns *this product's* date semantics ("next Tuesday" means next week's Tuesday; weekday-anchored relative dates) instead of whatever a base model absorbed from the internet.
- **Eval-gated, never vibes-shipped.** Every retrain runs a 60-example task eval (start-exact datetime matching, F1, validity, clarification) and **cannot reach production unless it clears the gate**; the pipeline has rejected eight regressed models to date.
  The full, honest scorecard lives in [`docs/eval-roadmap.md`](./docs/eval-roadmap.md) and the [post-mortem write-up](./docs/blog-eval-gated-finetuning.md).

**Hackathon size constraint (≤ 32B):** easily met; E4B is ~4B effective parameters. See the in-app **🏆 Submission** tab for the full compliance scorecard.

### Reads the logistics, not just the date

A confirmation like *"Time: 10:30 AM · Duration: approx. 30–45 min · (Please arrive 15 minutes early to complete intake forms) · 📍 112A West 72nd Street…"* becomes one correct event:

- **Arrival-aware start**: the event starts at **10:15** (when you must show up), the official 10:30 is preserved in the notes, and the **end is anchored to the stated time + duration** (11:00), so the calendar block covers the forms *and* the visit.
- **Type-based notifications**: an explicitly stated lead time always wins ("remind me 2 hours before" → 120); otherwise doctor/medical visits get 60 minutes, parties 30, carpools and school events 45.
- **Real-world addresses**: multi-line and 📍-emoji locations join into one string; "(Upper West Side - 72nd & Columbus)" glosses and SMS footers ("Reply C to confirm… call us at 212-223-0349") don't confuse it.
- **Calendar-ready titles**: an action+subject summary ("Pick up Priya - Terminal 4"), not a quote of the message.

The model is *taught* these conventions (prompt + fine-tune data), but the load-bearing ones are also **guaranteed by deterministic post-processing** (`apply_text_rules` in [`server/agent.py`](./server/agent.py)). It's the same philosophy as the conflict engine: must-hold logistics are never left to model temperament.
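
The logistics rules listed above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the code in `server/agent.py`: the `apply_logistics` helper, the `Logistics` dataclass, the event-type keys, and the reminder regex are all hypothetical, and the duration uses the lower bound of "30–45 min" because that is what yields the 11:00 end in the example.

```python
import re
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Optional

# Default reminder lead times in minutes, taken from the list above.
# The dictionary keys are hypothetical type labels.
DEFAULT_REMINDERS = {"medical": 60, "party": 30, "carpool": 45, "school": 45}


@dataclass
class Logistics:
    start: datetime
    end: datetime
    reminder_minutes: Optional[int]
    notes: str


def apply_logistics(stated_start: datetime, duration_min: int,
                    arrive_early_min: int, event_type: str, text: str) -> Logistics:
    """Hypothetical helper: shift the start earlier, anchor the end, pick a reminder."""
    # Arrival-aware start: the block begins when you must show up.
    start = stated_start - timedelta(minutes=arrive_early_min)
    # The end is anchored to the stated time plus the duration, not the shifted start.
    end = stated_start + timedelta(minutes=duration_min)
    # An explicitly stated lead time always wins over the type-based default.
    match = re.search(r"remind me (\d+)\s*(minute|hour)s? before", text, re.IGNORECASE)
    if match:
        unit = 60 if match.group(2).lower() == "hour" else 1
        reminder = int(match.group(1)) * unit
    else:
        reminder = DEFAULT_REMINDERS.get(event_type)
    # Keep the official time in the notes when the start was shifted.
    notes = f"Official time: {stated_start:%H:%M}" if arrive_early_min else ""
    return Logistics(start, end, reminder, notes)


visit = apply_logistics(datetime(2025, 3, 6, 10, 30), 30, 15, "medical",
                        "Please arrive 15 minutes early to complete intake forms")
```

With these inputs, `visit` starts at 10:15, ends at 11:00, carries a 60-minute reminder, and keeps the official 10:30 in its notes.
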
Every behavior above shipped through a measured A/B eval (regex baseline vs text-LLM vs **vision-LLM reading rendered chat screenshots only**), with the full tables in [`training/data/ab_results.md`](./training/data/ab_results.md) (headline: text-LLM event F1 0.96 structured / 0.89 unstructured vs regex 0.60/0.67; the screenshot-only vision arm lands within a point of text).

## Try it in 30 seconds

Open the Space in your phone's browser → **Schedule** tab → tap **Try a sample** (or paste your own group chat, and optionally a screenshot or your `.ics`) → review the detected events → **Download .ics**. The **Activity → This week** panel then shows what you've captured and the time it saved.

## How it works

```
Paste a thread / screenshot ──▶ HF Space ──▶ llama.cpp ──▶ events + conflicts + reply
      (phone browser)              │                                │
                  custom Gradio UI ◀── review ──┐              ┌────┘
                                                ▼              ▼
                                 .ics download / optional Google Calendar
```

The **primary path needs nothing but a browser**: paste text and/or attach a screenshot in the Schedule tab. (Power users can also auto-feed messages from a Mac; see *Optional: Mac collector*.)

For the full solution-architecture view, covering every workflow and which LLM (if any) it calls, plus the eval-gated fine-tuning loop, see **[docs/architecture.md](./docs/architecture.md)**.

## Can it process multiple invites at once?

**Yes. Multiple invites in one paste is the designed path** (on the live Space, where the real model runs). `ActionPlan.events` is a *list*, and the extraction prompt explicitly tells the model that one thread often holds several events: a drop-off AND a pickup, or two appointments, are separate events (`server/agent.py`).
Everything downstream is built for N events: the results card shows "*N events found*" with one card per invite, the editable table gets one row each, the `.ics` contains one `VEVENT` per event, each event carries its own Google/Outlook/Apple quick-add links, and the conflict check runs across all of them. Screenshot input is multi-file too: attach several flyers and they're all read in one run.

Two caveats:

- **Stub mode extracts only the first invite.** The local-dev heuristic (`_stub_plan` in `server/agent.py`, enabled by `USE_STUB_EXTRACTOR=1`) works with no model and no GPU, and it's now a decent parser in its own right (labeled times, explicit dates, multi-line/📍 locations, durations, arrival-early shifts, type-based reminders), but it still returns at most **one** event. If you paste a multi-invite thread locally and get one event back, that's the stub, not the product; the deployed Space uses the multi-event model path.
- **Simultaneous *runs* are serialized, not parallel.** If two users (or two tabs) hit *Run the agents* at once, both complete, but inference executes one request at a time: `server/model.py` holds the llama.cpp instance behind a `threading.Lock`, and Gradio queues the events. On a single-GPU Space that's intentional (one model copy in memory); the second run simply waits its turn, then streams its own pipeline progress.
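
The serialization described in the second caveat can be illustrated with a small standalone sketch. It is not the code in `server/model.py`: `run_inference` and the `time.sleep` stand-in for the llama.cpp call are hypothetical, and only the `threading.Lock` pattern is taken from the text above.

```python
import threading
import time

# One model copy in memory, so one inference at a time.
_model_lock = threading.Lock()


def run_inference(prompt: str, log: list) -> str:
    """Hypothetical wrapper: callers block here until the lock is free."""
    with _model_lock:
        log.append(("start", prompt))
        time.sleep(0.05)  # stand-in for the actual llama.cpp call
        log.append(("end", prompt))
    return f"plan for {prompt}"


log = []
threads = [threading.Thread(target=run_inference, args=(name, log))
           for name in ("user-a", "user-b")]
for t in threads:
    t.start()
for t in threads:
    t.join()
```

Both runs complete, and each `start` entry in `log` is immediately followed by the matching `end`: the second request waits for the first instead of interleaving with it.
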

## Repo layout

```
app.py                      # Gradio + FastAPI entrypoint (the Space)
server/
  agent.py                  # thread (+images) -> validated ActionPlan
  orchestrator.py           # Run the agents: MiniCPM planner driving our own MCP tools
  schema.py                 # Event / Conflict / ActionPlan pydantic models
  model.py                  # llama.cpp load: GGUF + vision mmproj, constrained JSON
  imageutil.py              # image -> base64 data URI
ui/blocks.py                # custom Gradio Blocks (reasoning, events, conflicts, reply)
static/app.css              # custom CSS (Off-Brand)
calendar_out/
  ics.py                    # .ics generation (off-grid default)
  freebusy.py               # parse existing .ics + deterministic conflict detection
  gcal.py                   # optional Google Calendar push
collector/collector.py      # Mac-side iMessage collector (text + image attachments)
training/                   # dataset build + QLoRA fine-tune + GGUF/mmproj export
Dockerfile                  # dedicated-GPU Space: builds llama.cpp (0.3.28) WITH CUDA
requirements-docker.txt     # runtime deps for the Docker image (llama.cpp built separately)
PLAN.md                     # full design + build plan
```

## Quick start (local dev): no GPU needed

```bash
pip install -r requirements.txt

# Runs the whole app with the built-in heuristic agent (no model, no GPU):
export USE_STUB_EXTRACTOR=1 INGEST_TOKEN="dev-secret"
python app.py   # http://localhost:7860
```

Open it, go to the **Schedule** tab, and tap **Try a sample**, or paste a thread, attach chat **screenshots**, and optionally upload your current calendar **`.ics`** for conflict checks. (Heads-up: the stub agent extracts only the **first** invite in a thread; multi-invite extraction needs the real model. See *Can it process multiple invites at once?* above.)

Tip for self-hosted installs: set `CAL_ICS_PATH=/path/to/calendar.ics` and conflict checks use that file automatically whenever no `.ics` is uploaded; step 4 completes itself, fully offline.
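
The conflict check and the `CAL_ICS_PATH` fallback mentioned above come down to simple deterministic logic. The sketch below is an assumption-laden illustration, not the contents of `calendar_out/freebusy.py`: `resolve_calendar_path` and `find_conflicts` are hypothetical names, `.ics` parsing is left out, and treating intervals as half-open (so back-to-back events do not conflict) is a design choice made for this example.

```python
import os
from datetime import datetime
from typing import List, Optional, Tuple

Interval = Tuple[datetime, datetime]  # (start, end), end exclusive


def resolve_calendar_path(uploaded: Optional[str]) -> Optional[str]:
    """Prefer an uploaded .ics; otherwise fall back to CAL_ICS_PATH if it is set."""
    return uploaded or os.environ.get("CAL_ICS_PATH")


def find_conflicts(new_events: List[Interval], busy: List[Interval]) -> List[Tuple[int, int]]:
    """Return (new_index, busy_index) pairs whose time ranges overlap."""
    conflicts = []
    for i, (new_start, new_end) in enumerate(new_events):
        for j, (busy_start, busy_end) in enumerate(busy):
            # Two half-open intervals overlap when each starts before the other ends.
            if new_start < busy_end and busy_start < new_end:
                conflicts.append((i, j))
    return conflicts


day = datetime(2025, 3, 6)
busy = [(day.replace(hour=9), day.replace(hour=10)),
        (day.replace(hour=16), day.replace(hour=17))]
new_events = [(day.replace(hour=9, minute=30), day.replace(hour=10, minute=30)),
              (day.replace(hour=17), day.replace(hour=18))]
conflicts = find_conflicts(new_events, busy)
```

Here the 9:30 event collides with the 9:00 to 10:00 block, while the 17:00 event starts exactly when the 16:00 block ends and is not flagged.
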

Review the detected events, conflicts, proposed times, and the suggested reply, then add any event with its **Add to: Google · Outlook · iCal · .ics** links (iCal and .ics both download the event's `.ics` file; with 2+ events an **iCal - all N events** link grabs everything at once). The **Activity → This week** panel shows what you've captured.

## This week (impact)

The Activity tab has a **This week** panel that persists across restarts: **events captured**, **conflicts caught**, and **estimated time saved**. A "capture" is counted when a run surfaces events for review (adding to a calendar happens through the per-event links, which the server can't observe).

`minutes_saved` is a deliberately conservative, **configurable estimate, not a measurement**: events captured × `IMPACT_MIN_PER_EVENT` (default **8** min per captured event) + conflicts caught × `IMPACT_MIN_PER_CONFLICT` (default **15** min per conflict caught). With the defaults, 3 captured events and 1 caught conflict count as 3 × 8 + 1 × 15 = 39 minutes. Override either via env. State persists to `IMPACT_PATH` (default `/tmp/impact_weeks.json`; point it at a persistent disk on a Space to survive rebuilds).

## Accuracy upgrade (optional): serve the real `gemma-cal` LLM

The stub agent above makes the demo work with **no GPU**. The production Space serves our fine-tuned **`gemma-cal` E4B** through `llama-server`, with no cloud AI APIs either way. The same config works anywhere llama.cpp runs:

```bash
export USE_STUB_EXTRACTOR=0
export MODEL_HF_REPO="build-small-hackathon/gemma-4-cal-gguf"
export MODEL_FILE="gemma-cal-e4b-Q4_K_M.gguf"      # ~5 GB edge fine-tune (what the Space serves)
export MMPROJ_REPO="unsloth/gemma-4-E4B-it-GGUF"   # the E4B's own vision projector
export MMPROJ_FILE="mmproj-F16.gguf"               # enables screenshot/vision input
bash scripts/start_space.sh
```

This is the platform's **only** model: the same ~5 GB GGUF serves the production Space (16 GB T4), a gaming GPU, or a laptop. (`MODEL_FILE` is explicit on purpose: the model repo also stores legacy training artifacts, so the `-hf repo:Q4_K_M` shorthand is ambiguous.)

## Optional: Mac collector (power users)

The phone-paste path above needs nothing installed. If you'd rather have new iMessages fed in automatically, run the collector on a Mac where iMessages sync (iOS exposes no API for message content, so a Mac is the only auto-feed source):

```bash
cd collector && cp .env.example .env   # edit SPACE_URL + INGEST_TOKEN
python collector.py
```

> ⚠️ The collector needs **Full Disk Access** (System Settings → Privacy & Security) to read `chat.db`.

## Autonomous & on a phone

There's a single backend endpoint, **`POST /agent`** (bearer `INGEST_TOKEN`), that takes a thread (or messages, + optional screenshot/`.ics`) and returns the extracted events, conflicts, and reply as JSON (optionally an `.ics` or a Google Calendar push). Every front-end calls it:

- **Fully autonomous (Mac), set-and-forget:** `INGEST_TOKEN=… MODEL_GGUF=~/models/hermes.gguf scripts/setup_mac.sh` installs three launchd jobs (Hermes `llama-server` + autonomous backend + collector). New iMessages **you send or accept** become calendar events automatically, deduped per chat. Triggers on outgoing messages by default (`TRIGGER_ON=outgoing`; `any` to widen).
- **Hermes "grows-with-you" brain:** point `INFERENCE_BASE_URL` at a Hermes `llama-server`; its personal **memory** (people→roles, "you decline Mondays") improves extraction over time and is shown in the dashboard **Memory** tab. See **[docs/hermes.md](./docs/hermes.md)**.
- **iPhone, one tap:** an iOS **Shortcut** shares a thread/screenshot to `/agent` and adds the events to Apple Calendar natively, with no `.ics` import.
- **Android, hands-off:** a Tasker/MacroDroid rule on a notification/SMS calls `/agent` and inserts events. See **[docs/android-tasker.md](./docs/android-tasker.md)**.
- **On-device model:** set `INFERENCE_BASE_URL` to a local `llama-server` (e.g. Gemma **E4B** or a small Hermes in Termux) so inference runs *on the phone*: same agent, env-selected.
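
For a sense of what a front-end call to `POST /agent` might look like, here is a rough Python sketch using only the standard library. The request schema is not documented in this README, so the `"thread"` JSON field name, the `Authorization: Bearer` header format, and the `build_agent_request` helper are assumptions; check `app.py` for the actual contract before relying on them.

```python
import json
import os
import urllib.request


def build_agent_request(base_url: str, token: str, thread: str) -> urllib.request.Request:
    """Hypothetical helper: build (but do not send) a POST /agent request."""
    # Assumed payload shape: a single "thread" field holding the pasted chat.
    payload = json.dumps({"thread": thread}).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url.rstrip('/')}/agent",
        data=payload,
        method="POST",
        headers={
            "Authorization": f"Bearer {token}",  # assumed bearer-token format
            "Content-Type": "application/json",
        },
    )


req = build_agent_request(
    "http://localhost:7860",
    os.environ.get("INGEST_TOKEN", "dev-secret"),
    "Practice moved to Tuesday 5pm",
)
# To send it against a running instance:
#   with urllib.request.urlopen(req) as resp:
#       plan = json.load(resp)
```

Building the request separately from sending it keeps the sketch runnable without a server.
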

> **iOS can't read iMessage in the background** (no message API), so fully-autonomous iMessage needs
> the Mac collector; the iPhone path is one-gesture. See **[docs/automations.md](./docs/automations.md)**
> and **[docs/on-device.md](./docs/on-device.md)**.

## Build Small: prizes & quests

**Track: 🏑 Backyard AI** (`track:backyard`), a practical app for a specific real person: a busy parent whose family calendar is buried in a noisy class group chat.

### Sponsor awards we compete for

| Award | Why this submission qualifies |
|---|---|
| 🟒 **Modal Awards** (best Modal-powered apps) | **Modal powered the development of the platform's model end-to-end** (required note, gladly given): [`training/modal_train.py`](./training/modal_train.py) (QLoRA fine-tune on serverless A100/H100s, Volumes caching weights), [`training/modal_eval.py`](./training/modal_eval.py) + [`modal_quant_eval.py`](./training/modal_quant_eval.py) (the task eval served on llama.cpp inside Modal, incl. an f16/Q8_0/Q4_K_M quantization study and the regex/text/vision A/B harness), and [`training/gated_retrain.py`](./training/gated_retrain.py) (train → staging → eval → promote *only past the gate*; eight regressed models rejected, every run a Modal job). |
| 🌱 **OpenBMB Awards** (standout MiniCPM builds, per track) | The **agent is planned by OpenBMB MiniCPM** (`openbmb/MiniCPM4.1-8B-GGUF`, Q4; the 1B variant is a config switch) on a second local llama-server, driving this Space's own MCP tools (`extract_events → check_conflicts → make_ics`) as a visible multi-step agent ([`server/orchestrator.py`](./server/orchestrator.py)). MiniCPM is the agent's brain, not a garnish. |

*(Not claimed: the OpenAI Track (no Codex-attributed commits) and the NVIDIA Nemotron Quest (different model family).
We'd rather be honest than eligible.)*

### Special awards: our case

| Award | Our case |
|---|---|
| 🎖️ **Bonus Quest Champion** | All **six** collectable quests claimed with evidence: the full sash (table below). |
| 🎨 **Off-Brand Award** | Custom landing page, hero + carousel, grouped nav, bespoke results cards and Activity dashboard ([`ui/blocks.py`](./ui/blocks.py) + [`static/app.css`](./static/app.css)), far past the stock Gradio look. |
| 🐜 **Tiny Titan** | The platform's one and only model is **Gemma E4B, ~4B *effective* parameters** (~5 GB at Q4, serves on a 16 GB T4 or a laptop), and a 1B MiniCPM planner variant is a config switch. Honest framing: E4B is a MatFormer "effective-4B"; judges' call whether that's tiny enough. |
| 🎬 **Best Demo** | App + demo video + social post as one package, with a storyboard naming every quest on-camera in [`docs/demo-script.md`](./docs/demo-script.md). |
| 🤖 **Best Agent** | The MiniCPM-planned, MCP-tool-driven agent above: real multi-step tool use, every model under the 32B cap. |
| 🃏 **Judges' Wildcard** | No entry needed, but if "eval-gated fine-tuning with a public failure post-mortem" fits no category, we know where to find you. |

### Collectable quests: all six claimed

| Quest | Evidence |
|---|---|
| 🔌 **Off the Grid** (local-first, no cloud APIs) | All inference is llama.cpp inside the Space; the only optional outbound call is the user's own Google Calendar push. |
| 🎯 **Well-Tuned** (published fine-tune) | [`gemma-cal` E4B](https://huggingface.co/build-small-hackathon/gemma-4-cal-gguf): our QLoRA fine-tune **is the model production serves**, shipped through the eval gate with the [honest scorecard public](./docs/eval-roadmap.md). |
| 🎨 **Off-Brand** (custom UI) | See the Off-Brand Award case above. |
| 🦙 **Llama Champion** (llama.cpp runtime) | The official `ghcr.io/ggml-org/llama.cpp` server image runs the GGUF + vision mmproj ([`Dockerfile`](./Dockerfile), [`scripts/start_space.sh`](./scripts/start_space.sh)). |
| 📡 **Sharing is Caring** (open trace on the Hub) | Redacted agent traces published to [`ParetoOptimal/offgridschedula-traces`](https://huggingface.co/datasets/ParetoOptimal/offgridschedula-traces), one click from the Activity tab. |
| 📓 **Field Notes** (write-up) | [`FIELD_NOTES.md`](./FIELD_NOTES.md) + the [eval-gated fine-tuning post-mortem](./docs/blog-eval-gated-finetuning.md) + [project blog](https://huggingface.co/blog/build-small-hackathon/offgridschedula). |

## Fine-tune on Modal (GPU)

`training/modal_train.py` runs the whole fine-tune on a serverless GPU and publishes the GGUF to HF, so no local GPU is needed. It's a thin wrapper that ships this repo to Modal and runs the existing pipeline (`make_dataset.py` → `train_qlora.py` → `export_gguf.sh`) on an A100/H100, then uploads the quantized GGUF + `mmproj` to your HF repo. This is all *offline* prep, so **Off the Grid** is untouched (the rule applies to the running app's inference, not dataset/training prep).

```bash
pip install modal
modal token new
modal secret create huggingface HF_TOKEN=hf_xxxxxxxx   # your HF *write* token

# Validate the full pipeline cheaply first (cheap edge model, ~a couple $):
modal run training/modal_train.py --base-model google/gemma-4-E4B-it

# Then the real run (default A100-80GB; --gpu H100 for speed):
modal run training/modal_train.py
modal run training/modal_train.py --gpu H100 --num-epochs 3
```

On finish it prints the `MODEL_REPO` / `MODEL_FILE` / `MMPROJ_FILE` to set on the Space. Two persistent Modal Volumes cache the base-model download and the outputs across runs, so iterating on `training/data/dataset.jsonl` only re-pays for the training itself.

> Cost (A100-80GB ≈ $2.5/hr, per-second billing): a few-hundred-to-2000-example QLoRA run is
> ~1–3 hr ≈ $5–15, so ~$250 of credit ≈ 15–40 full iterations. Expand the dataset before the
> first real 31B run; the seeds in `make_dataset.py` are a smoke test, not a training set.

### Publish your fine-tune & point the Space at it

The training run is the one step that spends **your** GPU/Modal credits; it's not done for you. Once you've run it, the path is turnkey:

1. **Recommended:** `python training/gated_retrain.py` runs train → staging upload → 60-example eval → **promote only if it beats the gate**. A regressed model cannot reach production. (Raw `modal run training/modal_train.py` is the ungated equivalent for experiments.)
2. Point the Space at *your* model via **Space variables** (`scripts/start_space.sh` reads them at launch; set in *Settings → Variables* or with `HfApi().add_space_variable`):

   ```
   MODEL_HF_REPO = /gemma-cal-gguf
   MODEL_FILE    = gemma-cal-e4b-Q4_K_M.gguf     # explicit file; repo may hold several quants/tiers
   MMPROJ_REPO   = unsloth/gemma-4-E4B-it-GGUF   # projector repo, if different from the LLM's
   MMPROJ_FILE   = mmproj-F16.gguf               # enables screenshot/vision input
   ```

   The deploy workflow stays a plain git mirror; the model is pulled at runtime, never committed.
3. Push to `main` → CI deploys → the Space now serves your fine-tune (**Well-Tuned**).

## Share a trace (Sharing is Caring)

Want others to learn from a run? In the **Activity** tab, click **⬇ Download trace (JSON)**. The trace stays on your device, and the hosted Space holds **no Hub token**. Personal data is redacted by default (the activity log only carries counts + status; the one chat-name field is stripped). Then publish it from your own machine, with your own login:

```bash
huggingface-cli login                                  # or export HF_TOKEN=...
python training/share_trace.py trace.json --public     # -> a HF dataset repo of traces
```

## Field notes

[**FIELD_NOTES.md**](./FIELD_NOTES.md) is the build retrospective: the iOS→`chat.db` pivot, the `attributedBody` trap, why conflict math is deterministic, stub-first architecture, the reframe-around-one-person lesson, and the Off-the-Grid trade-offs.

## Remote automation (runs without an interactive session)

| Workflow | Trigger | What it does | Needs |
|---|---|---|---|
| `.github/workflows/ci.yml` → **test** | push / PR | compile + `pytest` (stub mode, no GPU) | nothing |
| `.github/workflows/ci.yml` → **deploy** | push to `main`, after tests pass | `huggingface-cli upload` the repo to the HF Space (Gradio SDK; model excluded, pulled at runtime) | secret `HF_TOKEN`, var `SPACE_ID` |
| `.github/workflows/maintenance.yml` | daily + manual | ping the Space `/health`, audit outdated deps → open/update a GitHub issue | var `SPACE_HEALTH_URL` |

One-time setup for deploy + monitoring:

```bash
gh secret set HF_TOKEN   # HF write token
gh variable set SPACE_ID -b "/"
gh variable set SPACE_HEALTH_URL -b "https://-.hf.space/health"
```

CI installs `requirements-ci.txt` (excludes `llama-cpp-python` and the Google libs; both are imported lazily and not needed for the stub-mode tests). A weekly Claude `/schedule` routine handles the judgment work (grow `training/data/dataset.jsonl` → PR, triage CI failures).