the-apprentice / README.md
laoliu5280's picture
update readme for compatibility (#1)
0a4cd0e
|
Raw
History Blame Contribute Delete
9.92 kB
metadata
title: The Apprentice
emoji: ๐ŸŒฒ
colorFrom: indigo
colorTo: yellow
sdk: docker
app_port: 7860
suggested_hardware: cpu-basic
pinned: false
license: mit
short_description: Five oracles, five trials โ€” branching pixel-art game.
tags:
  - track:wood
  - sponsor:modal
  - achievement:offgrid
  - achievement:welltuned
  - achievement:offbrand
  - achievement:llama
  - achievement:sharing
  - achievement:fieldnotes
  - thousand-token-wood
  - well-tuned
  - off-brand
  - field-notes
  - sharing-is-caring
  - llama-champion
  - tiny-titan
  - branching-narrative
  - game
  - pixel-art
  - gradio
  - vllm
  - lora
  - qwen
  - bilingual
models:
  - Qwen/Qwen2.5-14B-Instruct
  - Qwen/Qwen2.5-1.5B-Instruct
  - AndrewRqy/oracles-wizard-14b-lora
  - AndrewRqy/oracles-wizard-1.5b-lora

Acknowledgement

This app is built by AndrewRqy.

The Apprentice โ€” Build Small Hackathon

A pixel-art branching fairy-tale. You inscribe five short oracles before the journey begins; an apprentice has to make every one of them save his life across five trials in a tree that converges on one of five distinct endings.

The idea

You play the mentor. You write five short oracles into a parchment โ€” any words at all: advice, gibberish, emoji, names, typos, whatever. After that you don't get to explain anything. Your apprentice walks five trials, and at each one he draws ONE oracle at random. Whatever it says, a fine-tuned Qwen2.5-14B has to take it seriously enough to save his life โ€” three humor modes (wild imagination / accidental trip / last-minute revelation), a 15-node branching tree that converges on one of five distinct endings, and six themes ร— two languages (English + ็ฎ€ไฝ“ไธญๆ–‡). The core joke: the player can write nonsense, but the world has to take it seriously.

The tech

  • Frontend: a single-file Gradio Blocks app (~5000 lines), wrapped in a custom Docker image. ~2000 lines of bespoke CSS make sure nothing on the page looks like default Gradio โ€” Press Start 2P + VT323 fonts, NES-style sharp corners, hand-laid pixel-art panels.
  • Backend: Qwen2.5-14B served via vLLM on a Modal-hosted L40S, with a custom-trained humor LoRA (rank 16, 23k examples, ~6.5h on H100, ~$22 of compute). The Gradio app talks to it via the OpenAI SDK.
  • Tiny Titan variant: same 23k corpus trained into a Qwen2.5-1.5B LoRA โ€” eligible for the โ‰ค4B prize.
  • Llama Champion path: the merged 14B exported to GGUF (Q4_K_M, 8.4 GB) and served via llama-cpp-python's OpenAI-compatible server. ./run.sh --local-llama swaps cloud for fully-local inference.
  • All art generated locally via Klein-4B on a Modal H100, then chroma-keyed offline. ~105 pixel-art sprites. No FLUX, no commercial generators.

Quick links

Recommended for the best experience: run it locally in full mode. The HF Space defaults to a stripped-down lean visual variant because of the bandwidth + cold-start constraints below. To see the parallax banner, parchment textures, scene landscapes, mentor/apprentice figures, animated trial scenes, and all the polish the way they were designed, clone the repo, drop the three Modal secrets into .env.local, and run ./run.sh --full. See Running it below for the full setup.

Note on loading time: this Space ships ~100 pixel-art sprites + theme backdrops. HF Space's free CPU tier has slow egress bandwidth, so first paint of a fresh container can take a minute or two; subsequent page transitions are faster as the browser caches assets. The front-page dropdown lets you flip between Lean (small payload, fast loading, default on the Space) and Full (parallax banner, scene landscapes, all decorative PNGs โ€” recommended only on a fast connection or once the Space is warm).

Note on LLM cold start: the Modal-hosted LLM container scales to zero when idle to avoid 24/7 billing during the review period. The first LLM call after the container has been idle (20 min) pays a **60-120s cold start** while vLLM loads the 14B weights + the LoRA adapter onto an L40S GPU. To hide this from the player, the app fires a background warmup ping to the Modal endpoint at startup, so by the time you've finished inscribing five oracles (~2-5 min of typing), the container should already be warm. If you click "Let the journey begin" immediately on a cold Space, expect the first trial to wait an extra minute. Every subsequent trial in the same session is instant.

What's inside

  • Frontend โ€” Single-file Gradio app with a hand-authored pixel-art aesthetic. Press Start 2P + VT323 fonts, NES-style sharp corners, custom theme suppressing all default Gradio chrome.
  • Backend โ€” Qwen2.5-14B + custom humor LoRA (AndrewRqy/oracles-wizard-14b-lora) served via vLLM on Modal. Frontend talks to it through the OpenAI SDK.
  • Tiny Titan path โ€” Same 23k humor corpus trained into a Qwen2.5-1.5B LoRA (AndrewRqy/oracles-wizard-1.5b-lora). Eligible for the โ‰ค4B prize.
  • Branching narrative โ€” Hand-authored 15-node story tree with 5 endings. Each fork at trials 2โ€“4 is decided by an LLM call seeded with one of the player's oracles, so the path the apprentice walks is shaped by what was inscribed.
  • 6 themes ร— 2 languages โ€” Fantasy, Space-Cowboy, Galactic-Light, Black-Land, Mistgate, Quiet-Years. Theme-neutral story nodes + per-theme vocabulary expansion at runtime.
  • All art generated locally โ€” ~105 pixel-art sprites via Klein-4B on a Modal H100, chroma-keyed offline. No FLUX, no commercial generators.

How to play

  1. Inscribe โ€” pick a language, theme, visual mode, and narration length. Then write five short oracles. Any words; gibberish counts, emoji counts.
  2. Send-off โ€” the mentor seals the parchments. The apprentice leaves.
  3. Five trials โ€” at each obstacle, the apprentice draws ONE oracle. The model takes the obstacle + oracle and writes a ~200-word resolution in one of three humor modes (wild imagination / accidental trip / last-minute revelation).
  4. Boss โ€” trial 5 is the world's finale (dragon, warlord-king, etc.). Different paths through the tree lead to different bosses.
  5. Ending โ€” one of 5 distinct endings plays, each with a hand-authored framing (why the boss behaved as it did + what the apprentice carried home), expanded by the LLM into a 3-paragraph epilogue.
  6. Summary โ€” the story tree shows the path you walked lit gold; the four endings you didn't reach blur behind "???" for replay.

Badge claims

Badge Why we claim it
๐ŸŽฏ Well-Tuned Qwen2.5-14B + a hand-distilled 23k-example humor LoRA (rank 16, ~6.5h on H100). Visibly steers all three humor modes; details in field notes.
๐ŸŽจ Off-Brand ~2000 lines of bespoke CSS, Press Start 2P + VT323 fonts, hand-painted pixel-art sprites, custom story-tree visualization, custom ending banner. No stock Gradio chrome reaches the page.
๐Ÿ““ Field Notes Blog post: https://huggingface.co/blog/AndrewRqy/apprentice-blog-url โ‹… Repo mirror: docs/FIELD_NOTES_apprentice.md โ€” a build diary covering what we designed and what broke.
๐Ÿ“ก Sharing-is-Caring traces/sample/ โ€” JSONL captures of every LLM call from a real playthrough (prompts, responses, latency, token usage, both requested and returned model id). LLM-call tracing is default-on; opt out with ORACLES_TRACE_DISABLE=1.
๐Ÿฆ™ Llama Champion The LoRA-merged Qwen2.5-14B is exported to GGUF (Q4_K_M, ~8.4 GB) via the conversion job in modal_backend/modal_gguf_convert.py and runs locally through llama-cpp-python's OpenAI-compatible server. Launch with ./run.sh --local-llama โ€” no Modal call required.
โšก Tiny Titan Same 23k corpus trained into a Qwen2.5-1.5B LoRA (~$5.50, ~1.5h on H100). Eligible for the โ‰ค4B prize.

Running it

Three environment variables go in HF Space โ†’ Settings โ†’ Variables and secrets (or .env.local for local runs):

MODAL_URL     = https://<workspace>--<app>-serve.modal.run
MODAL_KEY     = wk-โ€ฆ  (Modal proxy auth key)
MODAL_SECRET  = ws-โ€ฆ  (Modal proxy auth secret)

Locally:

./run.sh              # lean mode, default
./run.sh --full       # all visual assets enabled (recommended on fast connections)

If MODAL_URL is unset OR ORACLES_FORCE_MOCK=1, the app runs in mock mode โ€” the UI still works, but narrations are hand-written placeholders.

Repo layout

oracles_app/
โ”œโ”€โ”€ app.py                # main Gradio file
โ”œโ”€โ”€ Dockerfile            # HF Space Docker SDK entry
โ”œโ”€โ”€ requirements.txt
โ”œโ”€โ”€ oracles/              # state, LLM client, story graph, themes, i18n
โ”œโ”€โ”€ prompts/              # LLM prompt templates
โ””โ”€โ”€ assets/sprites/       # ~105 chroma-keyed pixel-art PNGs

Dev-only dirs (modal_backend/, scripts/, training/, tests/, lora-out/) live on local disk but are .gitignored from the Space upload.

Credits

  • Base model โ€” Qwen2.5-14B-Instruct + Qwen2.5-1.5B-Instruct (Alibaba)
  • Distillation teacher โ€” Claude Sonnet 4.5 (Anthropic) via OpenRouter
  • Sprite generator โ€” Klein-4B (Anthropic) on Modal H100
  • Pixel-art fonts โ€” Press Start 2P + VT323 (Google Fonts)

Built for the Build Small Hackathon โ€” Thousand Token Wood track (2026-06-15).