| --- |
| title: The Apprentice |
| emoji: 🎲 |
| colorFrom: indigo |
| colorTo: yellow |
| sdk: docker |
| app_port: 7860 |
| suggested_hardware: cpu-basic |
| pinned: false |
| license: mit |
| short_description: Five oracles, five trials - branching pixel-art game. |
| tags: |
| - track:wood |
| - sponsor:modal |
| - achievement:offgrid |
| - achievement:welltuned |
| - achievement:offbrand |
| - achievement:llama |
| - achievement:sharing |
| - achievement:fieldnotes |
| |
| - thousand-token-wood |
| |
| - well-tuned |
| - off-brand |
| - field-notes |
| - sharing-is-caring |
| - llama-champion |
| - tiny-titan |
| |
| - branching-narrative |
| - game |
| - pixel-art |
| - gradio |
| - vllm |
| - lora |
| - qwen |
| - bilingual |
| models: |
| - Qwen/Qwen2.5-14B-Instruct |
| - Qwen/Qwen2.5-1.5B-Instruct |
| - AndrewRqy/oracles-wizard-14b-lora |
| - AndrewRqy/oracles-wizard-1.5b-lora |
| --- |
| # Acknowledgement |
|
|
| This app was built by AndrewRqy. |
|
|
| # The Apprentice: Build Small Hackathon |
|
|
| > A pixel-art branching fairy-tale. You inscribe five short oracles before the journey begins; an apprentice has to make every one of them save his life across five trials in a tree that converges on one of five distinct endings. |
|
|
| ## The idea |
|
|
| You play the mentor. You write five short oracles onto a parchment: any words at all, whether advice, gibberish, emoji, names, or typos. After that you don't get to explain anything. Your apprentice walks five trials, and at each one he draws ONE oracle at random. Whatever it says, a fine-tuned Qwen2.5-14B has to take it seriously enough to save his life. The game combines three humor modes (wild imagination / accidental trip / last-minute revelation), a 15-node branching tree that converges on one of five distinct endings, and six themes × two languages (English and Simplified Chinese, 简体中文). The core joke: the player can write nonsense, but the world has to take it seriously. |
|
|
| ## The tech |
|
|
| - **Frontend**: a single-file Gradio Blocks app (~5000 lines), wrapped in a custom Docker image. ~2000 lines of bespoke CSS make sure nothing on the page looks like default Gradio: Press Start 2P + VT323 fonts, NES-style sharp corners, hand-laid pixel-art panels. |
| - **Backend**: Qwen2.5-14B served via vLLM on a Modal-hosted L40S, with a custom-trained humor LoRA (rank 16, 23k examples, ~6.5h on H100, ~$22 of compute). The Gradio app talks to it via the OpenAI SDK. |
| - **Tiny Titan variant**: same 23k corpus trained into a Qwen2.5-1.5B LoRA, eligible for the ≤4B prize. |
| - **Llama Champion path**: the merged 14B exported to GGUF (Q4_K_M, 8.4 GB) and served via `llama-cpp-python`'s OpenAI-compatible server. `./run.sh --local-llama` swaps cloud for fully-local inference. |
| - **All art generated locally** via Klein-4B on a Modal H100, then chroma-keyed offline. ~105 pixel-art sprites. No FLUX, no commercial generators. |
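
Chroma-keying, mentioned in the last bullet, means turning a flat background color transparent. A minimal sketch in pure Python follows; the magenta key color, the tolerance, and the function name are assumptions for illustration, since the actual offline pipeline is not described here.

```python
from typing import List, Tuple

RGB = Tuple[int, int, int]
RGBA = Tuple[int, int, int, int]

KEY_COLOR: RGB = (255, 0, 255)  # assumed key color (magenta), not taken from the repo
TOLERANCE = 40                  # assumed per-channel tolerance


def chroma_key(pixels: List[RGB], key: RGB = KEY_COLOR, tol: int = TOLERANCE) -> List[RGBA]:
    """Return RGBA pixels where anything close to `key` becomes fully transparent."""
    out: List[RGBA] = []
    for r, g, b in pixels:
        # A pixel counts as background when every channel is within `tol` of the key.
        near_key = all(abs(c - k) <= tol for c, k in zip((r, g, b), key))
        out.append((r, g, b, 0 if near_key else 255))
    return out
```

A real pipeline would read and write PNGs with an imaging library; the per-pixel rule is the part this sketch is meant to show.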
|
|
| ## Quick links |
|
|
| - **Track**: Thousand Token Wood |
| - **Stack**: Docker + Gradio + Modal-hosted vLLM + Qwen2.5-14B + custom humor LoRA |
| - **Languages**: English + Simplified Chinese (简体中文) |
| - **Demo video**: https://youtu.be/Ica9BgX5ZDk |
| - **Social post**: https://x.com/AndrewRenqy/status/2066549274930741648 |
| - **Field notes (blog post)**: https://huggingface.co/blog/AndrewRqy/apprentice-blog-url |
| - **Field notes (repo)**: [`docs/FIELD_NOTES_apprentice.md`](../docs/FIELD_NOTES_apprentice.md) |
|
|
| > **Recommended for the best experience: run it locally in full mode.** The HF Space defaults to a stripped-down lean visual variant because of the bandwidth + cold-start constraints below. To see the parallax banner, parchment textures, scene landscapes, mentor/apprentice figures, animated trial scenes, and all the polish the way they were designed, clone the repo, drop the three Modal secrets into `.env.local`, and run `./run.sh --full`. See [Running it](#running-it) below for the full setup. |
| > |
| > **Note on loading time**: this Space ships ~100 pixel-art sprites + theme backdrops. HF Space's free CPU tier has slow egress bandwidth, so first paint of a fresh container can take a minute or two; subsequent page transitions are faster as the browser caches assets. The front-page dropdown lets you flip between **Lean** (small payload, fast loading, default on the Space) and **Full** (parallax banner, scene landscapes, all decorative PNGs; recommended only on a fast connection or once the Space is warm). |
| > |
| > **Note on LLM cold start**: the Modal-hosted LLM container scales to zero when idle to avoid 24/7 billing during the review period. The first LLM call after the container has been idle (~20 min) pays a **~60-120s cold start** while vLLM loads the 14B weights + the LoRA adapter onto an L40S GPU. To hide this from the player, the app fires a background warmup ping to the Modal endpoint at startup, so by the time you've finished inscribing five oracles (~2-5 min of typing), the container should already be warm. If you click "Let the journey begin" immediately on a cold Space, expect the first trial to wait an extra minute. Every subsequent trial in the same session is instant. |
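
The background warmup described in the cold-start note can be sketched as a fire-and-forget thread. This is a minimal sketch under assumptions: the function names, the 180-second timeout, and the idea of a plain GET against some endpoint URL are hypothetical, and a real ping would also need whatever auth headers the endpoint requires.

```python
import threading
import urllib.request
from typing import Callable


def _ping(url: str, timeout: float) -> None:
    """Issue one GET; a cold container may take a long time to answer."""
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        resp.read()


def start_warmup(url: str, timeout: float = 180.0,
                 ping: Callable[[str, float], None] = _ping) -> threading.Thread:
    """Fire the ping on a daemon thread so app startup never blocks on it."""
    def worker() -> None:
        try:
            ping(url, timeout)
        except Exception:
            # Warmup is best-effort: a failed ping must not crash the app.
            pass

    thread = threading.Thread(target=worker, name="llm-warmup", daemon=True)
    thread.start()
    return thread
```

Because the thread is a daemon and swallows errors, a slow or failed warmup only means the first trial pays the cold start itself.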
|
|
| ## What's inside |
|
|
| - **Frontend**: Single-file Gradio app with a hand-authored pixel-art aesthetic. Press Start 2P + VT323 fonts, NES-style sharp corners, custom theme suppressing all default Gradio chrome. |
| - **Backend**: Qwen2.5-14B + custom humor LoRA (`AndrewRqy/oracles-wizard-14b-lora`) served via vLLM on Modal. Frontend talks to it through the OpenAI SDK. |
| - **Tiny Titan path**: Same 23k humor corpus trained into a Qwen2.5-1.5B LoRA (`AndrewRqy/oracles-wizard-1.5b-lora`). Eligible for the ≤4B prize. |
| - **Branching narrative**: Hand-authored 15-node story tree with 5 endings. Each fork at trials 2–4 is decided by an LLM call seeded with one of the player's oracles, so the path the apprentice walks is shaped by what was inscribed. |
| - **6 themes × 2 languages**: Fantasy, Space-Cowboy, Galactic-Light, Black-Land, Mistgate, Quiet-Years. Theme-neutral story nodes + per-theme vocabulary expansion at runtime. |
| - **All art generated locally**: ~105 pixel-art sprites via Klein-4B on a Modal H100, chroma-keyed offline. No FLUX, no commercial generators. |
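
One shape that matches the numbers in the branching-narrative bullet (15 nodes, 5 endings) is a triangular lattice in which trial `t` has `t` nodes and each node forks to two neighbors in the next trial. The sketch below is an assumption about the layout for illustration only, not the repo's actual story graph; in particular it forks after every non-final trial, and all names are hypothetical.

```python
from typing import Dict, List, Tuple

Node = Tuple[int, int]  # (trial number 1..5, position within that trial)


def build_lattice(trials: int = 5) -> Dict[Node, List[Node]]:
    """Map each node to its children; nodes in the last trial have none."""
    graph: Dict[Node, List[Node]] = {}
    for t in range(1, trials + 1):
        for i in range(t):
            if t < trials:
                graph[(t, i)] = [(t + 1, i), (t + 1, i + 1)]
            else:
                graph[(t, i)] = []
    return graph


def walk(graph: Dict[Node, List[Node]], choices: List[int]) -> List[Node]:
    """Follow one fork decision (0 or 1) per non-final trial from the root."""
    path = [(1, 0)]
    for choice in choices:
        path.append(graph[path[-1]][choice])
    return path
```

In the app, each `choice` would come from the LLM's reading of the drawn oracle rather than from a fixed list.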
|
|
| ## How to play |
|
|
| 1. **Inscribe**: pick a language, theme, visual mode, and narration length. Then write five short oracles. Any words; gibberish counts, emoji counts. |
| 2. **Send-off**: the mentor seals the parchments. The apprentice leaves. |
| 3. **Five trials**: at each obstacle, the apprentice draws ONE oracle. The model takes the obstacle + oracle and writes a ~200-word resolution in one of three humor modes (wild imagination / accidental trip / last-minute revelation). |
| 4. **Boss**: trial 5 is the world's finale (dragon, warlord-king, etc.). Different paths through the tree lead to different bosses. |
| 5. **Ending**: one of 5 distinct endings plays, each with a hand-authored framing (why the boss behaved as it did + what the apprentice carried home), expanded by the LLM into a 3-paragraph epilogue. |
| 6. **Summary**: the story tree shows the path you walked lit gold; the four endings you didn't reach blur behind "???" for replay. |
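
The draw in step 3 can be sketched as sampling without replacement, so that each of the five oracles is used exactly once across the five trials. Whether the app draws with or without replacement is an assumption here, as are the function name and the optional seed.

```python
import random
from typing import List, Optional


def draw_order(oracles: List[str], seed: Optional[int] = None) -> List[str]:
    """Return the oracles in the random order they will be drawn, one per trial."""
    if len(oracles) != 5:
        raise ValueError("expected exactly five oracles")
    rng = random.Random(seed)  # a seed makes a playthrough reproducible
    return rng.sample(oracles, k=len(oracles))
```

Sampling into a new list leaves the player's original ordering untouched, which is convenient if the summary screen wants to show the oracles as written.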
|
|
| ## Badge claims |
|
|
| | Badge | Why we claim it | |
| |---|---| |
| | 🎯 **Well-Tuned** | Qwen2.5-14B + a hand-distilled 23k-example humor LoRA (rank 16, ~6.5h on H100). Visibly steers all three humor modes; details in field notes. | |
| | 🎨 **Off-Brand** | ~2000 lines of bespoke CSS, Press Start 2P + VT323 fonts, hand-painted pixel-art sprites, custom story-tree visualization, custom ending banner. No stock Gradio chrome reaches the page. | |
| | 📝 **Field Notes** | Blog post: https://huggingface.co/blog/AndrewRqy/apprentice-blog-url<br>Repo mirror: [`docs/FIELD_NOTES_apprentice.md`](../docs/FIELD_NOTES_apprentice.md), a build diary covering what we designed and what broke. | |
| | 📡 **Sharing-is-Caring** | [`traces/sample/`](traces/sample/): JSONL captures of every LLM call from a real playthrough (prompts, responses, latency, token usage, both requested and returned model id). LLM-call tracing is default-on; opt out with `ORACLES_TRACE_DISABLE=1`. | |
| | 🦙 **Llama Champion** | The LoRA-merged Qwen2.5-14B is exported to GGUF (Q4_K_M, ~8.4 GB) via the conversion job in [`modal_backend/modal_gguf_convert.py`](../modal_backend/modal_gguf_convert.py) and runs locally through `llama-cpp-python`'s OpenAI-compatible server. Launch with `./run.sh --local-llama`; no Modal call required. | |
| | ⚡ **Tiny Titan** | Same 23k corpus trained into a Qwen2.5-1.5B LoRA (~$5.50, ~1.5h on H100). Eligible for the ≤4B prize. | |
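
The default-on tracing with an `ORACLES_TRACE_DISABLE=1` opt-out, as described in the Sharing-is-Caring row, could look roughly like this. It is a sketch only: the function names, the `ts` field, and the record keys are hypothetical and not taken from the repo's trace format.

```python
import json
import os
import time
from pathlib import Path
from typing import Any, Dict


def tracing_enabled() -> bool:
    """Tracing is on unless ORACLES_TRACE_DISABLE is set to 1."""
    return os.environ.get("ORACLES_TRACE_DISABLE") != "1"


def trace_call(path: Path, record: Dict[str, Any]) -> bool:
    """Append one LLM call as a JSON line; return False when tracing is off."""
    if not tracing_enabled():
        return False
    path.parent.mkdir(parents=True, exist_ok=True)
    # ensure_ascii=False keeps non-English narration readable in the file.
    line = json.dumps({"ts": time.time(), **record}, ensure_ascii=False)
    with path.open("a", encoding="utf-8") as fh:
        fh.write(line + "\n")
    return True
```

One JSON object per line keeps the file appendable mid-session and easy to stream when sharing a playthrough.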
|
|
| ## Running it |
|
|
| Three environment variables go in HF Space → **Settings → Variables and secrets** (or `.env.local` for local runs): |
|
|
| ``` |
| MODAL_URL = https://<workspace>--<app>-serve.modal.run |
| MODAL_KEY = wk-… (Modal proxy auth key) |
| MODAL_SECRET = ws-… (Modal proxy auth secret) |
| ``` |
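
The three values above can be combined into an authenticated request roughly as follows. The app itself uses the OpenAI SDK; this sketch swaps in the standard library so it stays self-contained. It assumes vLLM's OpenAI-compatible `/v1/chat/completions` route and Modal's `Modal-Key` / `Modal-Secret` proxy-auth headers; the model name `oracles-wizard`, the `max_tokens` value, and the function name are hypothetical placeholders.

```python
import json
import urllib.request
from typing import Dict, List


def build_chat_request(base_url: str, key: str, secret: str,
                       messages: List[Dict[str, str]],
                       model: str = "oracles-wizard") -> urllib.request.Request:
    """Build (but do not send) a chat-completions POST for the Modal endpoint."""
    body = json.dumps({"model": model, "messages": messages, "max_tokens": 400})
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/v1/chat/completions",
        data=body.encode("utf-8"),
        method="POST",
        headers={
            "Content-Type": "application/json",
            "Modal-Key": key,        # assumed proxy-auth header name
            "Modal-Secret": secret,  # assumed proxy-auth header name
        },
    )
```

Sending it is then a matter of `urllib.request.urlopen(req, timeout=...)`, with a generous timeout to cover a cold container.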
|
|
| Locally: |
|
|
| ```bash |
| ./run.sh # lean mode, default |
| ./run.sh --full # all visual assets enabled (recommended on fast connections) |
| ``` |
|
|
| If `MODAL_URL` is unset OR `ORACLES_FORCE_MOCK=1`, the app runs in **mock mode**: the UI still works, but narrations are hand-written placeholders. |
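
The mock-mode rule amounts to a one-line check. A minimal sketch, assuming an empty `MODAL_URL` is treated the same as an unset one (the function name is hypothetical):

```python
import os
from typing import Mapping


def use_mock_mode(env: Mapping[str, str] = os.environ) -> bool:
    """True when no Modal endpoint is configured or mocking is forced."""
    return not env.get("MODAL_URL") or env.get("ORACLES_FORCE_MOCK") == "1"
```

Passing the environment in as a mapping keeps the check easy to exercise without touching real environment variables.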
|
|
| ## Repo layout |
|
|
| ``` |
| oracles_app/ |
| ├── app.py            # main Gradio file |
| ├── Dockerfile        # HF Space Docker SDK entry |
| ├── requirements.txt |
| ├── oracles/          # state, LLM client, story graph, themes, i18n |
| ├── prompts/          # LLM prompt templates |
| └── assets/sprites/   # ~105 chroma-keyed pixel-art PNGs |
| ``` |
|
|
| Dev-only dirs (`modal_backend/`, `scripts/`, `training/`, `tests/`, `lora-out/`) live on local disk but are `.gitignore`d from the Space upload. |
|
|
| ## Credits |
|
|
| - Base model: Qwen2.5-14B-Instruct + Qwen2.5-1.5B-Instruct (Alibaba) |
| - Distillation teacher: Claude Sonnet 4.5 (Anthropic) via OpenRouter |
| - Sprite generator: Klein-4B (Black Forest Labs) on Modal H100 |
| - Pixel-art fonts: Press Start 2P + VT323 (Google Fonts) |
|
|
| Built for the **Build Small Hackathon, Thousand Token Wood track** (2026-06-15). |
|
|