File size: 9,916 Bytes
5afb7b3 b32b87d 5afb7b3 0da3d81 5afb7b3 0a4cd0e 95df4b5 5afb7b3 95df4b5 5afb7b3 95df4b5 5afb7b3 95df4b5 5afb7b3 95df4b5 5afb7b3 80d1ab1 5afb7b3 f669e5b 5afb7b3 95df4b5 f669e5b 05d71bb de38060 5afb7b3 f60684e 0fc9e33 a777ed3 0fc9e33 5afb7b3 f669e5b 5afb7b3 f669e5b 5afb7b3 f669e5b 5afb7b3 f669e5b de38060 fd94fc8 fd741cf f669e5b 5afb7b3 f669e5b 5afb7b3 f669e5b 5afb7b3 f669e5b 5afb7b3 f669e5b 5afb7b3 f669e5b 5afb7b3 f669e5b 5afb7b3 f669e5b 5afb7b3 f669e5b 5afb7b3 | 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 121 122 123 124 125 126 127 128 129 130 131 132 133 134 135 136 137 138 139 140 141 142 143 144 145 146 147 148 149 150 151 | ---
title: The Apprentice
emoji: ๐ฒ
colorFrom: indigo
colorTo: yellow
sdk: docker
app_port: 7860
suggested_hardware: cpu-basic
pinned: false
license: mit
short_description: Five oracles, five trials โ branching pixel-art game.
tags:
- track:wood
- sponsor:modal
- achievement:offgrid
- achievement:welltuned
- achievement:offbrand
- achievement:llama
- achievement:sharing
- achievement:fieldnotes
# Track
- thousand-token-wood
# Badge claims
- well-tuned
- off-brand
- field-notes
- sharing-is-caring
- llama-champion
- tiny-titan
# Descriptive
- branching-narrative
- game
- pixel-art
- gradio
- vllm
- lora
- qwen
- bilingual
models:
- Qwen/Qwen2.5-14B-Instruct
- Qwen/Qwen2.5-1.5B-Instruct
- AndrewRqy/oracles-wizard-14b-lora
- AndrewRqy/oracles-wizard-1.5b-lora
---
# Acknowledgement
This app is built by AndrewRqy.
# The Apprentice โ Build Small Hackathon
> A pixel-art branching fairy-tale. You inscribe five short oracles before the journey begins; an apprentice has to make every one of them save his life across five trials in a tree that converges on one of five distinct endings.
## The idea
You play the mentor. You write five short oracles into a parchment โ any words at all: advice, gibberish, emoji, names, typos, whatever. After that you don't get to explain anything. Your apprentice walks five trials, and at each one he draws ONE oracle at random. Whatever it says, a fine-tuned Qwen2.5-14B has to take it seriously enough to save his life โ three humor modes (wild imagination / accidental trip / last-minute revelation), a 15-node branching tree that converges on one of five distinct endings, and six themes ร two languages (English + ็ฎไฝไธญๆ). The core joke: the player can write nonsense, but the world has to take it seriously.
## The tech
- **Frontend**: a single-file Gradio Blocks app (~5000 lines), wrapped in a custom Docker image. ~2000 lines of bespoke CSS make sure nothing on the page looks like default Gradio โ Press Start 2P + VT323 fonts, NES-style sharp corners, hand-laid pixel-art panels.
- **Backend**: Qwen2.5-14B served via vLLM on a Modal-hosted L40S, with a custom-trained humor LoRA (rank 16, 23k examples, ~6.5h on H100, ~$22 of compute). The Gradio app talks to it via the OpenAI SDK.
- **Tiny Titan variant**: same 23k corpus trained into a Qwen2.5-1.5B LoRA โ eligible for the โค4B prize.
- **Llama Champion path**: the merged 14B exported to GGUF (Q4_K_M, 8.4 GB) and served via `llama-cpp-python`'s OpenAI-compatible server. `./run.sh --local-llama` swaps cloud for fully-local inference.
- **All art generated locally** via Klein-4B on a Modal H100, then chroma-keyed offline. ~105 pixel-art sprites. No FLUX, no commercial generators.
## Quick links
- **Track**: Thousand Token Wood
- **Stack**: Docker + Gradio + Modal-hosted vLLM + Qwen2.5-14B + custom humor LoRA
- **Languages**: English + ็ฎไฝไธญๆ
- **Demo video**: https://youtu.be/Ica9BgX5ZDk
- **Social post**: https://x.com/AndrewRenqy/status/2066549274930741648
- **Field notes (blog post)**: https://huggingface.co/blog/AndrewRqy/apprentice-blog-url
- **Field notes (repo)**: [`docs/FIELD_NOTES_apprentice.md`](../docs/FIELD_NOTES_apprentice.md)
> **Recommended for the best experience: run it locally in full mode.** The HF Space defaults to a stripped-down lean visual variant because of the bandwidth + cold-start constraints below. To see the parallax banner, parchment textures, scene landscapes, mentor/apprentice figures, animated trial scenes, and all the polish the way they were designed, clone the repo, drop the three Modal secrets into `.env.local`, and run `./run.sh --full`. See [Running it](#running-it) below for the full setup.
>
> **Note on loading time**: this Space ships ~100 pixel-art sprites + theme backdrops. HF Space's free CPU tier has slow egress bandwidth, so first paint of a fresh container can take a minute or two; subsequent page transitions are faster as the browser caches assets. The front-page dropdown lets you flip between **Lean** (small payload, fast loading, default on the Space) and **Full** (parallax banner, scene landscapes, all decorative PNGs โ recommended only on a fast connection or once the Space is warm).
>
> **Note on LLM cold start**: the Modal-hosted LLM container scales to zero when idle to avoid 24/7 billing during the review period. The first LLM call after the container has been idle (~20 min) pays a **~60-120s cold start** while vLLM loads the 14B weights + the LoRA adapter onto an L40S GPU. To hide this from the player, the app fires a background warmup ping to the Modal endpoint at startup, so by the time you've finished inscribing five oracles (~2-5 min of typing), the container should already be warm. If you click "Let the journey begin" immediately on a cold Space, expect the first trial to wait an extra minute. Every subsequent trial in the same session is instant.
## What's inside
- **Frontend** โ Single-file Gradio app with a hand-authored pixel-art aesthetic. Press Start 2P + VT323 fonts, NES-style sharp corners, custom theme suppressing all default Gradio chrome.
- **Backend** โ Qwen2.5-14B + custom humor LoRA (`AndrewRqy/oracles-wizard-14b-lora`) served via vLLM on Modal. Frontend talks to it through the OpenAI SDK.
- **Tiny Titan path** โ Same 23k humor corpus trained into a Qwen2.5-1.5B LoRA (`AndrewRqy/oracles-wizard-1.5b-lora`). Eligible for the โค4B prize.
- **Branching narrative** โ Hand-authored 15-node story tree with 5 endings. Each fork at trials 2โ4 is decided by an LLM call seeded with one of the player's oracles, so the path the apprentice walks is shaped by what was inscribed.
- **6 themes ร 2 languages** โ Fantasy, Space-Cowboy, Galactic-Light, Black-Land, Mistgate, Quiet-Years. Theme-neutral story nodes + per-theme vocabulary expansion at runtime.
- **All art generated locally** โ ~105 pixel-art sprites via Klein-4B on a Modal H100, chroma-keyed offline. No FLUX, no commercial generators.
## How to play
1. **Inscribe** โ pick a language, theme, visual mode, and narration length. Then write five short oracles. Any words; gibberish counts, emoji counts.
2. **Send-off** โ the mentor seals the parchments. The apprentice leaves.
3. **Five trials** โ at each obstacle, the apprentice draws ONE oracle. The model takes the obstacle + oracle and writes a ~200-word resolution in one of three humor modes (wild imagination / accidental trip / last-minute revelation).
4. **Boss** โ trial 5 is the world's finale (dragon, warlord-king, etc.). Different paths through the tree lead to different bosses.
5. **Ending** โ one of 5 distinct endings plays, each with a hand-authored framing (why the boss behaved as it did + what the apprentice carried home), expanded by the LLM into a 3-paragraph epilogue.
6. **Summary** โ the story tree shows the path you walked lit gold; the four endings you didn't reach blur behind "???" for replay.
## Badge claims
| Badge | Why we claim it |
|---|---|
| ๐ฏ **Well-Tuned** | Qwen2.5-14B + a hand-distilled 23k-example humor LoRA (rank 16, ~6.5h on H100). Visibly steers all three humor modes; details in field notes. |
| ๐จ **Off-Brand** | ~2000 lines of bespoke CSS, Press Start 2P + VT323 fonts, hand-painted pixel-art sprites, custom story-tree visualization, custom ending banner. No stock Gradio chrome reaches the page. |
| ๐ **Field Notes** | Blog post: https://huggingface.co/blog/AndrewRqy/apprentice-blog-url โ
Repo mirror: [`docs/FIELD_NOTES_apprentice.md`](../docs/FIELD_NOTES_apprentice.md) โ a build diary covering what we designed and what broke. |
| ๐ก **Sharing-is-Caring** | [`traces/sample/`](traces/sample/) โ JSONL captures of every LLM call from a real playthrough (prompts, responses, latency, token usage, both requested and returned model id). LLM-call tracing is default-on; opt out with `ORACLES_TRACE_DISABLE=1`. |
| ๐ฆ **Llama Champion** | The LoRA-merged Qwen2.5-14B is exported to GGUF (Q4_K_M, ~8.4 GB) via the conversion job in [`modal_backend/modal_gguf_convert.py`](../modal_backend/modal_gguf_convert.py) and runs locally through `llama-cpp-python`'s OpenAI-compatible server. Launch with `./run.sh --local-llama` โ no Modal call required. |
| โก **Tiny Titan** | Same 23k corpus trained into a Qwen2.5-1.5B LoRA (~$5.50, ~1.5h on H100). Eligible for the โค4B prize. |
## Running it
Three environment variables go in HF Space โ **Settings โ Variables and secrets** (or `.env.local` for local runs):
```
MODAL_URL = https://<workspace>--<app>-serve.modal.run
MODAL_KEY = wk-โฆ (Modal proxy auth key)
MODAL_SECRET = ws-โฆ (Modal proxy auth secret)
```
Locally:
```bash
./run.sh # lean mode, default
./run.sh --full # all visual assets enabled (recommended on fast connections)
```
If `MODAL_URL` is unset OR `ORACLES_FORCE_MOCK=1`, the app runs in **mock mode** โ the UI still works, but narrations are hand-written placeholders.
## Repo layout
```
oracles_app/
โโโ app.py # main Gradio file
โโโ Dockerfile # HF Space Docker SDK entry
โโโ requirements.txt
โโโ oracles/ # state, LLM client, story graph, themes, i18n
โโโ prompts/ # LLM prompt templates
โโโ assets/sprites/ # ~105 chroma-keyed pixel-art PNGs
```
Dev-only dirs (`modal_backend/`, `scripts/`, `training/`, `tests/`, `lora-out/`) live on local disk but are `.gitignore`d from the Space upload.
## Credits
- Base model โ Qwen2.5-14B-Instruct + Qwen2.5-1.5B-Instruct (Alibaba)
- Distillation teacher โ Claude Sonnet 4.5 (Anthropic) via OpenRouter
- Sprite generator โ Klein-4B (Anthropic) on Modal H100
- Pixel-art fonts โ Press Start 2P + VT323 (Google Fonts)
Built for the **Build Small Hackathon โ Thousand Token Wood track** (2026-06-15).
|