the-apprentice

Sleeping

File size: 9,916 Bytes

5afb7b3
 
 
 
 
b32b87d
 
5afb7b3
 
 
0da3d81
5afb7b3
0a4cd0e
 
 
 
 
 
 
 
95df4b5
5afb7b3
95df4b5
5afb7b3
95df4b5
5afb7b3
95df4b5
 
5afb7b3
95df4b5
5afb7b3
 
 
 
 
 
 
 
 
 
 
 
 
 
80d1ab1
 
 
5afb7b3
 
 
f669e5b
5afb7b3
95df4b5
 
 
 
 
 
 
 
 
 
 
 
 
 
f669e5b
 
 
 
05d71bb
de38060
 
5afb7b3
f60684e
 
0fc9e33
a777ed3
0fc9e33
5afb7b3
f669e5b
5afb7b3
f669e5b
 
 
 
 
 
5afb7b3
 
 
f669e5b
 
 
 
 
 
5afb7b3
 
 
 
 
f669e5b
 
de38060
fd94fc8
fd741cf
f669e5b
5afb7b3
f669e5b
5afb7b3
f669e5b
5afb7b3
 
f669e5b
5afb7b3
 
 
 
f669e5b
 
 
 
 
 
5afb7b3
f669e5b
5afb7b3
f669e5b
5afb7b3
 
 
f669e5b
 
 
 
 
 
5afb7b3
 
f669e5b
5afb7b3

---
title: The Apprentice
emoji: 🌲
colorFrom: indigo
colorTo: yellow
sdk: docker
app_port: 7860
suggested_hardware: cpu-basic
pinned: false
license: mit
short_description: Five oracles, five trials — branching pixel-art game.
tags:
  - track:wood
  - sponsor:modal
  - achievement:offgrid
  - achievement:welltuned
  - achievement:offbrand
  - achievement:llama
  - achievement:sharing
  - achievement:fieldnotes
  # Track
  - thousand-token-wood
  # Badge claims
  - well-tuned
  - off-brand
  - field-notes
  - sharing-is-caring
  - llama-champion
  - tiny-titan
  # Descriptive
  - branching-narrative
  - game
  - pixel-art
  - gradio
  - vllm
  - lora
  - qwen
  - bilingual
models:
  - Qwen/Qwen2.5-14B-Instruct
  - Qwen/Qwen2.5-1.5B-Instruct
  - AndrewRqy/oracles-wizard-14b-lora
  - AndrewRqy/oracles-wizard-1.5b-lora
---
# Acknowledgement

This app is built by AndrewRqy.

# The Apprentice — Build Small Hackathon

> A pixel-art branching fairy-tale. You inscribe five short oracles before the journey begins; an apprentice has to make every one of them save his life across five trials in a tree that converges on one of five distinct endings.

## The idea

You play the mentor. You write five short oracles into a parchment — any words at all: advice, gibberish, emoji, names, typos, whatever. After that you don't get to explain anything. Your apprentice walks five trials, and at each one he draws ONE oracle at random. Whatever it says, a fine-tuned Qwen2.5-14B has to take it seriously enough to save his life — three humor modes (wild imagination / accidental trip / last-minute revelation), a 15-node branching tree that converges on one of five distinct endings, and six themes × two languages (English + 简体中文). The core joke: the player can write nonsense, but the world has to take it seriously.

## The tech

- **Frontend**: a single-file Gradio Blocks app (~5000 lines), wrapped in a custom Docker image. ~2000 lines of bespoke CSS make sure nothing on the page looks like default Gradio — Press Start 2P + VT323 fonts, NES-style sharp corners, hand-laid pixel-art panels.
- **Backend**: Qwen2.5-14B served via vLLM on a Modal-hosted L40S, with a custom-trained humor LoRA (rank 16, 23k examples, ~6.5h on H100, ~$22 of compute). The Gradio app talks to it via the OpenAI SDK.
- **Tiny Titan variant**: same 23k corpus trained into a Qwen2.5-1.5B LoRA — eligible for the ≤4B prize.
- **Llama Champion path**: the merged 14B exported to GGUF (Q4_K_M, 8.4 GB) and served via `llama-cpp-python`'s OpenAI-compatible server. `./run.sh --local-llama` swaps cloud for fully-local inference.
- **All art generated locally** via Klein-4B on a Modal H100, then chroma-keyed offline. ~105 pixel-art sprites. No FLUX, no commercial generators.

## Quick links

- **Track**: Thousand Token Wood
- **Stack**: Docker + Gradio + Modal-hosted vLLM + Qwen2.5-14B + custom humor LoRA
- **Languages**: English + 简体中文
- **Demo video**: https://youtu.be/Ica9BgX5ZDk
- **Social post**: https://x.com/AndrewRenqy/status/2066549274930741648
- **Field notes (blog post)**: https://huggingface.co/blog/AndrewRqy/apprentice-blog-url
- **Field notes (repo)**: [`docs/FIELD_NOTES_apprentice.md`](../docs/FIELD_NOTES_apprentice.md)

> **Recommended for the best experience: run it locally in full mode.** The HF Space defaults to a stripped-down lean visual variant because of the bandwidth + cold-start constraints below. To see the parallax banner, parchment textures, scene landscapes, mentor/apprentice figures, animated trial scenes, and all the polish the way they were designed, clone the repo, drop the three Modal secrets into `.env.local`, and run `./run.sh --full`. See [Running it](#running-it) below for the full setup.
>
> **Note on loading time**: this Space ships ~100 pixel-art sprites + theme backdrops. HF Space's free CPU tier has slow egress bandwidth, so first paint of a fresh container can take a minute or two; subsequent page transitions are faster as the browser caches assets. The front-page dropdown lets you flip between **Lean** (small payload, fast loading, default on the Space) and **Full** (parallax banner, scene landscapes, all decorative PNGs — recommended only on a fast connection or once the Space is warm).
>
> **Note on LLM cold start**: the Modal-hosted LLM container scales to zero when idle to avoid 24/7 billing during the review period. The first LLM call after the container has been idle (~20 min) pays a **~60-120s cold start** while vLLM loads the 14B weights + the LoRA adapter onto an L40S GPU. To hide this from the player, the app fires a background warmup ping to the Modal endpoint at startup, so by the time you've finished inscribing five oracles (~2-5 min of typing), the container should already be warm. If you click "Let the journey begin" immediately on a cold Space, expect the first trial to wait an extra minute. Every subsequent trial in the same session is instant.

## What's inside

- **Frontend** — Single-file Gradio app with a hand-authored pixel-art aesthetic. Press Start 2P + VT323 fonts, NES-style sharp corners, custom theme suppressing all default Gradio chrome.
- **Backend** — Qwen2.5-14B + custom humor LoRA (`AndrewRqy/oracles-wizard-14b-lora`) served via vLLM on Modal. Frontend talks to it through the OpenAI SDK.
- **Tiny Titan path** — Same 23k humor corpus trained into a Qwen2.5-1.5B LoRA (`AndrewRqy/oracles-wizard-1.5b-lora`). Eligible for the ≤4B prize.
- **Branching narrative** — Hand-authored 15-node story tree with 5 endings. Each fork at trials 2–4 is decided by an LLM call seeded with one of the player's oracles, so the path the apprentice walks is shaped by what was inscribed.
- **6 themes × 2 languages** — Fantasy, Space-Cowboy, Galactic-Light, Black-Land, Mistgate, Quiet-Years. Theme-neutral story nodes + per-theme vocabulary expansion at runtime.
- **All art generated locally** — ~105 pixel-art sprites via Klein-4B on a Modal H100, chroma-keyed offline. No FLUX, no commercial generators.

## How to play

1. **Inscribe** — pick a language, theme, visual mode, and narration length. Then write five short oracles. Any words; gibberish counts, emoji counts.
2. **Send-off** — the mentor seals the parchments. The apprentice leaves.
3. **Five trials** — at each obstacle, the apprentice draws ONE oracle. The model takes the obstacle + oracle and writes a ~200-word resolution in one of three humor modes (wild imagination / accidental trip / last-minute revelation).
4. **Boss** — trial 5 is the world's finale (dragon, warlord-king, etc.). Different paths through the tree lead to different bosses.
5. **Ending** — one of 5 distinct endings plays, each with a hand-authored framing (why the boss behaved as it did + what the apprentice carried home), expanded by the LLM into a 3-paragraph epilogue.
6. **Summary** — the story tree shows the path you walked lit gold; the four endings you didn't reach blur behind "???" for replay.

## Badge claims

| Badge | Why we claim it |
|---|---|
| 🎯 **Well-Tuned** | Qwen2.5-14B + a hand-distilled 23k-example humor LoRA (rank 16, ~6.5h on H100). Visibly steers all three humor modes; details in field notes. |
| 🎨 **Off-Brand** | ~2000 lines of bespoke CSS, Press Start 2P + VT323 fonts, hand-painted pixel-art sprites, custom story-tree visualization, custom ending banner. No stock Gradio chrome reaches the page. |
| 📓 **Field Notes** | Blog post: https://huggingface.co/blog/AndrewRqy/apprentice-blog-url ⋅ Repo mirror: [`docs/FIELD_NOTES_apprentice.md`](../docs/FIELD_NOTES_apprentice.md) — a build diary covering what we designed and what broke. |
| 📡 **Sharing-is-Caring** | [`traces/sample/`](traces/sample/) — JSONL captures of every LLM call from a real playthrough (prompts, responses, latency, token usage, both requested and returned model id). LLM-call tracing is default-on; opt out with `ORACLES_TRACE_DISABLE=1`. |
| 🦙 **Llama Champion** | The LoRA-merged Qwen2.5-14B is exported to GGUF (Q4_K_M, ~8.4 GB) via the conversion job in [`modal_backend/modal_gguf_convert.py`](../modal_backend/modal_gguf_convert.py) and runs locally through `llama-cpp-python`'s OpenAI-compatible server. Launch with `./run.sh --local-llama` — no Modal call required. |
| ⚡ **Tiny Titan** | Same 23k corpus trained into a Qwen2.5-1.5B LoRA (~$5.50, ~1.5h on H100). Eligible for the ≤4B prize. |

## Running it

Three environment variables go in HF Space → **Settings → Variables and secrets** (or `.env.local` for local runs):

```
MODAL_URL     = https://<workspace>--<app>-serve.modal.run
MODAL_KEY     = wk-…  (Modal proxy auth key)
MODAL_SECRET  = ws-…  (Modal proxy auth secret)
```

Locally:

```bash
./run.sh              # lean mode, default
./run.sh --full       # all visual assets enabled (recommended on fast connections)
```

If `MODAL_URL` is unset OR `ORACLES_FORCE_MOCK=1`, the app runs in **mock mode** — the UI still works, but narrations are hand-written placeholders.

## Repo layout

```
oracles_app/
├── app.py                # main Gradio file
├── Dockerfile            # HF Space Docker SDK entry
├── requirements.txt
├── oracles/              # state, LLM client, story graph, themes, i18n
├── prompts/              # LLM prompt templates
└── assets/sprites/       # ~105 chroma-keyed pixel-art PNGs
```

Dev-only dirs (`modal_backend/`, `scripts/`, `training/`, `tests/`, `lora-out/`) live on local disk but are `.gitignore`d from the Space upload.

## Credits

- Base model — Qwen2.5-14B-Instruct + Qwen2.5-1.5B-Instruct (Alibaba)
- Distillation teacher — Claude Sonnet 4.5 (Anthropic) via OpenRouter
- Sprite generator — Klein-4B (Anthropic) on Modal H100
- Pixel-art fonts — Press Start 2P + VT323 (Google Fonts)

Built for the **Build Small Hackathon — Thousand Token Wood track** (2026-06-15).