# Trans-Temporal Express: Developer Implementation Guide
> **For: AI Development Agent**
> **From: UX Design Agent**
> **Status: All visual assets generated. Audio files need sourcing. Ready for implementation.**
---
## 1. Vision & Experience Summary
You are re-skinning the "AI Time Machine", a voice-to-voice conversation app where users travel to different historical/future eras and have live conversations with AI-generated characters. The current UI uses a dark steampunk/industrial theme with concentric rings and abstract portals. **You are replacing it entirely** with a warm, whimsical "train cab" experience called **"The Trans-Temporal Express."**
### The Core Metaphor
The user is the **engineer** of a magical time-traveling locomotive. They sit inside the cab, look out through a large arched windshield, pull a brass throttle lever, and hurtle through time, watching era signs fly past, until they brake at a destination. Steam clears to reveal the world, and a historical character walks up to the window for a voice conversation.
### Art Style
Warm, polished cartoon illustration, like a high-quality animated film. NOT photorealistic. Think Studio Ghibli meets Art Deco. Color palette: warm browns, amber brass, copper accents, cream text, cyan holographic accents.
### Reference Image
See `docs/images/Gemini_Generated_Image_6eqvyh6eqvyh6eqv(1).png`: this is the approved visual target for the cockpit view. Note the simple warm wood panels, arched window, brass throttle at center, holographic comm screen at bottom-right, and a character visible through the windshield.
---
## 2. What Already Exists (DO NOT REBUILD)
The backend is fully functional. You only need to rebuild the **frontend layer** (HTML/CSS/JS). The following pieces already work:
### Backend (Leave Untouched)
| File | What It Does |
|------|-------------|
| `app.py` | FastAPI + Gradio mount. Entry point. `/blank` → intro, `/app` → cockpit |
| `src/time_machine/ui/handlers.py` | `GradioHandlers`: `launch()`, `send_text()`, `send_audio()`, `bring_souvenir()`, `save()` |
| `src/time_machine/ui/view_models.py` | `UiRenderState`: processes events, builds `render_immersive_payload()` JSON |
| `src/time_machine/ui/realtime.py` | WebSocket realtime voice: STT/TTS pipeline |
| `src/time_machine/ui/gradio_app.py` | Gradio Blocks layout. **You WILL modify this** to inject new HTML/CSS/JS |
| `src/time_machine/domain/models.py` | Domain models (Destination, Persona, ImmersiveScene, etc.) |
| `src/time_machine/application/` | Service layer β€” encounter orchestration, image gen, TTS |
### Key Communication Pattern
The frontend receives state via a hidden Gradio `Textbox` with `elem_id="tm-immersive-payload"`. The JavaScript polls this every 750ms via `currentPayload()` and calls `updateCockpitState(payload)` to update the DOM.
**The payload JSON structure** (from `render_immersive_payload()` in `view_models.py`):
```json
{
"session_id": "...",
"encounter_id": "...",
"machine_state": "dormant|launching|destination|persona|conversation_ready",
"current_year": 2026,
"target_year": 1492,
"direction": "rewind|fast_forward|hold",
"destination": {
"place": "Port of Palos, Spain",
"mode": "past",
"atmosphere": "Salt-stung harbor at dawn...",
"visual_preset": "harbor_dawn",
"motifs": ["wooden_ships", "compass_roses"]
},
"persona": {
"name": "Isabella",
"role": "Harbor master's daughter",
"situation": "Watching strange ships being loaded..."
},
"scene": { "image_b64": "...", "prompt": "...", "ambient_key": "ocean" },
"portrait": { "image_b64": "...", "prompt": "..." },
"narration": { "text": "...", "audio_path": "data:audio/wav;base64,..." },
"artifact": { "title": "...", "kind": "...", "image_b64": "..." },
"ambient_key": "ocean"
}
```
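
The polling pattern described above can be sketched as a small change detector. This is a minimal sketch, not the code in `cockpit.js`: `parsePayload` and `PayloadWatcher` are hypothetical names, and it assumes the caller supplies the raw textbox string on each 750ms tick.

```javascript
// Sketch only: parsePayload and PayloadWatcher are hypothetical helpers,
// not functions that exist in cockpit.js.

// Parse the raw textbox value; return null for empty or malformed JSON.
function parsePayload(raw) {
  if (typeof raw !== "string" || raw.trim() === "") return null;
  try {
    const parsed = JSON.parse(raw);
    const isObject = parsed !== null && typeof parsed === "object" && !Array.isArray(parsed);
    return isObject ? parsed : null;
  } catch (err) {
    return null;
  }
}

// Remember the last raw string so identical polls do not trigger a re-render.
class PayloadWatcher {
  constructor(onChange) {
    this.lastRaw = null;
    this.onChange = onChange;
  }

  // Call once per poll tick. Returns true when onChange was invoked.
  check(raw) {
    if (raw === this.lastRaw) return false;
    this.lastRaw = raw;
    const payload = parsePayload(raw);
    if (payload === null) return false;
    this.onChange(payload);
    return true;
  }
}
```

In the browser, the watcher's callback would be `updateCockpitState`, and the raw string would be read from the hidden `#tm-immersive-payload` component on each tick. Skipping unchanged strings avoids touching the DOM when nothing has changed.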
### Realtime Voice (Leave Untouched)
The voice system connects via WebSocket at `/realtime/voice`. The existing `cockpit.js` handles:
- Microphone capture → PCM audio → WebSocket
- WebSocket → PCM playback (character voice)
- JSON control messages for turn management
- VAD (Voice Activity Detection) with configurable thresholds
**You must preserve the realtime voice section of cockpit.js** (the `initRealtimeVoice()` IIFE starting at ~line 509). The UI just needs the toggle button and status display wired up.
---
## 3. What You Need To Build
### 3A. Intro Sequence (replaces `intro.html` + `intro.css` + `intro.js`)
**Current**: A 32-second industrial facility walkthrough (corridors, vault doors, observatory clock). Too dark/sterile.
**New**: A ~20-second magical ticket-punching sequence:
| Beat | Duration | Visual | Audio | Assets |
|------|----------|--------|-------|--------|
| **1. Ticket Appears** | 0–6s | Dark void. A brass "Temporal Ticket" materializes, floating center-screen with a gentle glow and slow rotation | Low procedural hum (Web Audio oscillator) | `static/img/intro/temporal_ticket.png` |
| **2. Conductor Box** | 6–11s | An ornate mechanical conductor box appears. Ticket slides into it. Flash of light on punch. Gears begin to grind. Energy lines ripple outward | `ticket_clang.mp3`, `charge_up.mp3`, `gears_grinding.mp3` | `static/img/intro/conductor_box.png`, `static/img/overlays/energy_lines.png` |
| **3. Train Materializes** | 11–18s | Shockwave explosion of golden energy. The Trans-Temporal Express materializes through the energy, fading from transparent to solid. Train whistle sounds. Steam billows | `shockwave.mp3`, `materialize.mp3`, `train_whistle.mp3`, `steam_burst.mp3` | `static/img/intro/train_materialization.png`, `static/img/overlays/spark_particles.png` |
| **4. Board the Train** | 18–22s | Screen flashes white, then fades into the cockpit view. Transition to `/app` | Flash transition (CSS) | None (redirect to cockpit) |
**Skip button**: Preserve the existing skip-intro button behavior (click or 32s timeout → redirect to `/app`). The 32s timeout comes from the current 32-second intro; the new sequence ends at about 22s.
**Intro HTML/CSS**: Build entirely new. The current corridor/vault/clock HTML is not reused.
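
The four beats in the table can be driven from one timing table. The sketch below is one possible shape under stated assumptions: the beat names, `introBeatAt`, and `scheduleIntro` are hypothetical, and only the second boundaries come from the table above.

```javascript
// Beat boundaries in seconds, copied from the intro table. Names are hypothetical.
const INTRO_BEATS = [
  { name: "ticket", start: 0, end: 6 },
  { name: "conductor", start: 6, end: 11 },
  { name: "train", start: 11, end: 18 },
  { name: "board", start: 18, end: 22 },
];

// Return the beat active at `seconds`, or "done" once the sequence is over.
function introBeatAt(seconds) {
  if (seconds < 0) return INTRO_BEATS[0].name;
  const beat = INTRO_BEATS.find((b) => seconds >= b.start && seconds < b.end);
  return beat ? beat.name : "done";
}

// Schedule one callback per beat start plus a final callback at the end.
// `schedule` defaults to setTimeout and can be swapped out in tests.
function scheduleIntro(onBeat, onDone, schedule = setTimeout) {
  const handles = INTRO_BEATS.map((b) => schedule(() => onBeat(b.name), b.start * 1000));
  const lastBeat = INTRO_BEATS[INTRO_BEATS.length - 1];
  handles.push(schedule(onDone, lastBeat.end * 1000));
  return handles; // pass each handle to clearTimeout when the user skips
}
```

Here `onBeat` would swap a class on the intro container and `onDone` would redirect to `/app`. Keeping the returned handles lets the skip button cancel pending beats before redirecting.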
---
### 3B. Cockpit View (replaces `cockpit.html` + `cockpit.css` + cockpit portions of `cockpit.js`)
**Current**: Dark steampunk portal with concentric rings, timeline bar, separate portrait aperture.
**New**: First-person train cab with arched windshield. Reference: `docs/images/Gemini_Generated_Image_6eqvyh6eqvyh6eqv(1).png`
#### DOM Structure
```
#tm-cockpit (full viewport container, carries state classes)
├── .tte-viewport (z:10, windshield content area)
│   ├── .tte-idle-bg (idle_viewport.png, default tracks-to-sky)
│   ├── .tte-travel-bg (travel_future.png or travel_past.png, during travel)
│   ├── .tte-world-scene (#tm-world-image, AI-generated destination)
│   ├── .tte-character (#tm-portrait-image, AI-generated character)
│   └── .tte-era-signs-container (dynamically spawned HTML signs)
├── .tte-overlays (z:50-90, effect layers)
│   ├── .tte-speed-lines (speed_lines.png, mix-blend-mode:screen)
│   ├── .tte-energy-lines (energy_lines.png, for materialization)
│   ├── .tte-steam-cloud (steam_cloud.png, for landing reveal)
│   ├── .tte-vignette (CSS radial-gradient, always on)
│   └── .tte-film-grain (CSS SVG noise, always on, very subtle)
├── .tte-cockpit-frame (z:100, the cab interior frame)
│   └── Built with CSS gradients (see approach below)
├── .tte-dashboard (z:110, bottom dashboard)
│   ├── .tte-gauges (gauge_cluster.png, bottom-left)
│   ├── .tte-throttle (chrono_throttle_neutral.png, bottom-center)
│   └── .tte-comm-screen (comm_screen.png, bottom-right)
├── .tte-year-module (z:130, top center)
│   ├── .tte-year-label ("DESTINATION")
│   └── .tte-year-counter (#tm-year-counter, animated year display)
├── .tte-narration (#tm-narration-caption, bottom text overlay)
├── .tte-live-voice (voice toggle button + meter + status)
├── #tm-narration-audio (hidden audio element)
├── #tm-ambient-audio (hidden audio element)
└── #tm-artifact-panel (artifact display, preserved from current)
```
#### Cockpit Frame Approach (IMPORTANT)
**Do NOT use cockpit_frame.png as the main frame.** Instead, build the frame with CSS gradients:
```css
/* Layered backgrounds: the first gradient listed paints on top */
.tte-cockpit-frame {
  background:
    /* Top arch shadow */
    radial-gradient(ellipse 120% 30% at 50% 0%, #2A1810 0%, transparent 70%),
    /* Left panel */
    linear-gradient(90deg, #2A1810 0%, transparent 12%),
    /* Right panel */
    linear-gradient(270deg, #2A1810 0%, transparent 12%),
    /* Bottom dashboard */
    linear-gradient(0deg, #1A0F08 0%, #2A1810 15%, transparent 35%);
}
```
Add a brass trim border using `::after` pseudo-element with a rounded top border-radius.
This gives full flexibility for responsive layouts and the windshield transparency.
#### State-Driven CSS Classes
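The cockpit element `#tm-cockpit` should carry one of these state classes (same pattern as current code). Note that `traveling` and `braking` have no matching `machine_state` value in the payload, so the frontend must derive them itself (see 3C):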
| State | Class | Visual |
|-------|-------|--------|
| Dormant | `.tm-state-dormant` | Idle viewport visible, throttle neutral, gauges calm, launch pulse |
| Launching | `.tm-state-launching` | Throttle rotates, charge-up effect, energy builds |
| Traveling | `.tm-state-traveling` | Travel bg visible, speed lines rotating, screen shake, era signs flying, year counter animating |
| Braking | `.tm-state-braking` | Violent shake, sparks, speed lines intensify |
| Destination | `.tm-state-destination` | Steam cloud covers viewport |
| Persona | `.tm-state-persona` | Steam clears, world image visible |
| Conversation Ready | `.tm-state-conversation-ready` | World + character visible, cockpit frame dimmed, character breathing animation, ambient audio |
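
Toggling these classes could look like the sketch below. It is illustrative only: `stateClassFor` and `applyStateClass` are hypothetical names, and falling back to the dormant class for an unknown state is an assumption rather than documented behavior.

```javascript
// State names as they appear in the class table (hyphenated form).
const COCKPIT_STATES = [
  "dormant", "launching", "traveling", "braking",
  "destination", "persona", "conversation-ready",
];

// Map a state such as "conversation_ready" to "tm-state-conversation-ready".
// Assumption: unknown states fall back to the dormant class.
function stateClassFor(state) {
  const normalized = String(state).replace(/_/g, "-");
  return COCKPIT_STATES.includes(normalized)
    ? `tm-state-${normalized}`
    : "tm-state-dormant";
}

// Remove every tm-state-* class, then add the one for `state`.
// `el` only needs a DOM-style classList with add() and remove().
function applyStateClass(el, state) {
  for (const name of COCKPIT_STATES) {
    el.classList.remove(`tm-state-${name}`);
  }
  const cls = stateClassFor(state);
  el.classList.add(cls);
  return cls;
}
```

Clearing all state classes before adding the new one keeps exactly one state active on `#tm-cockpit` and leaves unrelated classes alone.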
---
### 3C. Travel Sequence (NEW, no current equivalent)
When the user clicks "Launch" and the backend starts processing, you need to show a travel animation. Map the `machine_state` transitions to the visual sequence:
1. **`launching`** → Throttle rotates to PAST or FUTURE (based on `direction`). Charge-up SFX.
2. **Wait ~2s** → Transition to travel animation (CSS-driven)
3. **`destination`** → Era signs start flying. Speed lines activate. Year counter animates from `current_year` → `target_year`. Travel background visible.
4. **When world image arrives** (`scene.image_b64` populated) → Begin braking: violent shake, brake screech SFX.
5. **`persona`** → Steam cloud rises, world image loads behind it. Steam clears.
6. **`conversation_ready`** → Character visible, ambient audio playing, voice channel enabled.
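
The six steps above can be condensed into one function that derives the visual phase from the payload. This is a sketch under assumptions: `deriveVisualPhase` and `msSinceLaunch` are hypothetical, the 2000ms hold stands in for "Wait ~2s", and it follows this numbered list rather than the state-class table in 3B.

```javascript
const LAUNCH_HOLD_MS = 2000; // stands in for "Wait ~2s" in step 2

// Derive the frontend visual phase from the backend payload.
// "traveling" and "braking" are frontend-only phases; the backend never sends them.
function deriveVisualPhase(payload, msSinceLaunch) {
  const hasWorldImage = Boolean(payload.scene && payload.scene.image_b64);
  switch (payload.machine_state) {
    case "launching":
      // Steps 1-2: throttle and charge-up first, then the travel animation.
      return msSinceLaunch < LAUNCH_HOLD_MS ? "launching" : "traveling";
    case "destination":
      // Steps 3-4: keep traveling until the world image arrives, then brake.
      return hasWorldImage ? "braking" : "traveling";
    case "persona":
      return "persona"; // Step 5: steam reveal
    case "conversation_ready":
      return "conversation_ready"; // Step 6
    default:
      return "dormant";
  }
}
```

A caller would record a timestamp when Launch is clicked, compute `msSinceLaunch` on each poll, and apply the returned phase as the cockpit state class.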
#### Era Signs (HTML elements, NOT images)
During travel, spawn HTML `<div>` elements with era names that fly across the viewport:
```html
<div class="tte-era-sign future" style="--sign-speed: 1.2s">THE SPACE AGE: 1960s</div>
```
CSS animates them from right-to-left (future) or left-to-right (past). They should have:
- Dark green background with brass border
- Playfair Display font, cream text, uppercase
- Brass post extending downward (::after pseudo-element)
- Varying vertical positions (30-60% from top) for depth
- Increasing speed as travel progresses
See `static/css/tte_reference.css` for the complete animation keyframes.
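
As a rough illustration of the sign rules above, the helper below computes the class and inline style values for one sign. It is a sketch, not the spawner in `tte_animation_reference.js`: `makeEraSignSpec` is a hypothetical name, the `past` class is assumed to mirror the `future` class in the example markup, and the 1.6s to 0.6s speed range is an invented placeholder.

```javascript
// Build the class and style values for one era sign.
// progress: 0 at the start of travel, 1 at the end.
// rand: injectable random source returning a value in [0, 1).
function makeEraSignSpec(label, direction, progress, rand = Math.random) {
  const p = Math.min(1, Math.max(0, progress));
  const topPercent = 30 + rand() * 30; // 30-60% from the top, per the list above
  const speedSeconds = 1.6 - p; // assumed range: 1.6s early, 0.6s late
  return {
    className: `tte-era-sign ${direction === "past" ? "past" : "future"}`,
    text: label,
    style: {
      top: `${topPercent.toFixed(1)}%`,
      "--sign-speed": `${speedSeconds.toFixed(2)}s`,
    },
  };
}
```

In the browser, the caller would create a `<div>`, set `className` and `textContent`, apply each style entry with `style.setProperty`, append it to `.tte-era-signs-container`, and remove it when its animation ends.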
#### Year Counter
The existing `animateYearCounter()` function in cockpit.js already handles this, so keep it.
---
### 3D. Audio System Enhancements
#### Current Audio
- Procedural ambient via Web Audio API oscillators (keep this as fallback)
- Narration audio via `<audio>` element
- Ambient audio via `<audio>` element with loop
#### Enhancement: SFX Playback
Add an SFX system that loads and plays one-shot sound effects:
```javascript
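// Assumes `audioContext` (an AudioContext) and `sfxGainNode` (a GainNode
// connected to audioContext.destination) are created during cockpit init.
const sfxBuffers = {};

// Preload during init. A missing or undecodable file is skipped so the
// visuals keep working without that sound.
async function preloadSFX(key, url) {
  try {
    const response = await fetch(url);
    if (!response.ok) return;
    sfxBuffers[key] = await audioContext.decodeAudioData(await response.arrayBuffer());
  } catch (err) {
    console.warn(`SFX "${key}" unavailable:`, err);
  }
}

// Play a one-shot effect; does nothing if the buffer never loaded.
function playSFX(key) {
  if (!sfxBuffers[key]) return;
  const source = audioContext.createBufferSource();
  source.buffer = sfxBuffers[key];
  source.connect(sfxGainNode);
  source.start();
}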
```
#### Audio Ducking
When the character speaks (realtime voice), duck the ambient audio to ~15% volume. Restore full volume when the user is speaking or during silence.
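
One way to express the ducking rule is sketched below. It assumes the ambient track is routed through a Web Audio `GainNode`; for a plain `<audio>` element the same target value could be assigned to its `volume` property instead. The speaker labels, the function names, and the 0.2s smoothing constant are all hypothetical.

```javascript
const DUCKED_GAIN = 0.15; // "~15% volume" while the character speaks
const FULL_GAIN = 1.0;

// speaker: "character", "user", or "silence" (hypothetical labels).
function ambientGainFor(speaker) {
  return speaker === "character" ? DUCKED_GAIN : FULL_GAIN;
}

// Smoothly move an AudioParam (for example ambientGainNode.gain) to the target.
// `now` is the audio clock time, typically audioContext.currentTime.
function duckAmbient(gainParam, speaker, now, timeConstant = 0.2) {
  const target = ambientGainFor(speaker);
  gainParam.cancelScheduledValues(now); // drop any ramp still in progress
  gainParam.setTargetAtTime(target, now, timeConstant);
  return target;
}
```

`setTargetAtTime` approaches the target exponentially, which avoids the audible click that an instant volume change can produce.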
#### Audio Files
Sound effects are NOT yet downloaded. See `static/audio/sound_sources.md` for specific search queries to find each sound on Pixabay/Mixkit/Freesound. All must be royalty-free.
**Directory structure:**
```
static/audio/
├── sfx/
│   ├── ticket_clang.mp3
│   ├── steam_burst.mp3
│   ├── brake_screech.mp3
│   ├── gears_grinding.mp3
│   ├── train_chug.mp3
│   ├── train_whistle.mp3
│   ├── materialize.mp3
│   ├── time_warp.mp3
│   ├── charge_up.mp3
│   ├── shockwave.mp3
│   ├── chime.mp3
│   ├── lever_pull.mp3
│   └── button_click.mp3
└── ambient/
    ├── temporal_storm.mp3
    ├── rain.mp3
    ├── marketplace.mp3
    ├── fire_hearth.mp3
    ├── wind.mp3
    ├── machinery.mp3
    ├── ocean.mp3
    ├── night_insects.mp3
    ├── desert.mp3
    └── scifi_hum.mp3
```
**If audio files are not available, gracefully degrade.** The experience must still work when sounds are missing, just without audio feedback.
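
Graceful degradation during preload might be handled as in the sketch below, where one missing file never blocks the others. `preloadAllSFX` is a hypothetical helper, and `loadOne` is whatever per-file loader the implementation uses, passed in as a parameter.

```javascript
// Preload every entry of a { key: url } manifest and report which keys
// loaded and which failed. A rejected or throwing loader only marks
// its own key as missing.
async function preloadAllSFX(manifest, loadOne) {
  const entries = Object.entries(manifest);
  const results = await Promise.allSettled(
    // Promise.resolve().then(...) also captures synchronous throws from loadOne.
    entries.map(([key, url]) => Promise.resolve().then(() => loadOne(key, url)))
  );
  const loaded = [];
  const missing = [];
  results.forEach((result, i) => {
    const key = entries[i][0];
    (result.status === "fulfilled" ? loaded : missing).push(key);
  });
  return { loaded, missing };
}
```

The `missing` list can be logged once at startup so it is obvious which sounds still need downloading, while the cockpit carries on with whatever did load.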
---
## 4. Asset Locations
### Generated Image Assets
| Category | File | Usage |
|----------|------|-------|
| **Cockpit** | `static/img/cockpit/cockpit_frame.png` | Reference only; use CSS gradients instead |
| **Cockpit** | `static/img/cockpit/chrono_throttle_neutral.png` | Throttle lever; CSS rotate: -25° FUTURE, 0° NEUTRAL, +25° PAST |
| **Cockpit** | `static/img/cockpit/comm_screen.png` | Dashboard right; pulse brightness when character speaks |
| **Cockpit** | `static/img/cockpit/gauge_cluster.png` | Dashboard left; shake during travel |
| **Cockpit** | `static/img/cockpit/idle_viewport.png` | Default windshield content: magical tracks leading to glowing sky |
| **Intro** | `static/img/intro/temporal_ticket.png` | Floating ticket prop in intro |
| **Intro** | `static/img/intro/conductor_box.png` | Ticket punch machine in intro |
| **Intro** | `static/img/intro/ticket_scene.png` | Reference composition only; do not use directly |
| **Intro** | `static/img/intro/train_materialization.png` | Train appearing with golden energy in intro |
| **Travel** | `static/img/travel/travel_future.png` | Viewport bg during future travel (cyan/blue energy tunnel) |
| **Travel** | `static/img/travel/travel_past.png` | Viewport bg during past travel (warm amber/sepia tunnel) |
| **Overlay** | `static/img/overlays/steam_cloud.png` | Landing steam reveal; normal blend, z:90 |
| **Overlay** | `static/img/overlays/energy_lines.png` | Intro shockwave; screen blend, z:80 |
| **Overlay** | `static/img/overlays/speed_lines.png` | Travel velocity; screen blend, z:50, CSS rotate |
| **Overlay** | `static/img/overlays/spark_particles.png` | Particle sprite reference; screen blend |
| **Reference** | `static/img/reference/concept_cab_interior.png` | Initial concept art; reference only |
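
The throttle row above gives three rotation angles, which could be tied to the payload's `direction` field as sketched here. The mapping of `rewind` to PAST and `fast_forward` to FUTURE is an assumption based on the field names, and both function names are hypothetical.

```javascript
// Angles in degrees, from the throttle row of the asset table.
const THROTTLE_ANGLES = { future: -25, neutral: 0, past: 25 };

// Assumed mapping: "rewind" -> PAST, "fast_forward" -> FUTURE,
// "hold" or anything unknown -> NEUTRAL.
function throttleAngleFor(direction) {
  switch (direction) {
    case "fast_forward":
      return THROTTLE_ANGLES.future;
    case "rewind":
      return THROTTLE_ANGLES.past;
    default:
      return THROTTLE_ANGLES.neutral;
  }
}

// CSS transform value for the .tte-throttle element.
function throttleTransform(direction) {
  return `rotate(${throttleAngleFor(direction)}deg)`;
}
```

Setting the result as the element's `transform` style, combined with a CSS transition on `.tte-throttle`, would animate the lever pull without extra JavaScript.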
### Reference Code
| File | What It Contains |
|------|-----------------|
| `static/css/tte_reference.css` | Complete CSS with design tokens, layout, 15+ animation keyframes, state transitions, responsive breakpoints |
| `static/js/tte_animation_reference.js` | AudioManager class, intro orchestrator, era sign spawner, year counter, landing sequence |
| `static/asset_manifest.json` | Complete JSON manifest with all assets, timing, z-index, animation specs, state machine |
---
## 5. Design System
### Colors (CSS Custom Properties)
```css
:root {
  --tte-brass: #C4943D;
  --tte-amber: #D4A843;
  --tte-burgundy: #6B2D3E;
  --tte-copper: #B87333;
  --tte-cyan: #8ED8D1;
  --tte-cream: #F5E6C8;
  --tte-dark: #1A0F08;
  --tte-darker: #0A0605;
  --tte-mahogany: #2A1810;
}
```
### Typography
```css
:root {
  --tte-font-heading: 'Playfair Display', Georgia, serif; /* Era signs, year counter, titles */
  --tte-font-body: 'Outfit', system-ui, sans-serif; /* Status text, captions, UI labels */
}
```
Add Google Fonts link in the HTML head:
```html
<link href="https://fonts.googleapis.com/css2?family=Playfair+Display:wght@600;700&family=Outfit:wght@300;400;500&display=swap" rel="stylesheet">
```
### Key Animations (see `tte_reference.css` for full keyframes)
- `tte-screen-shake`: gentle cockpit vibration
- `tte-violent-shake`: braking impact with rotation
- `tte-speed-rotate`: speed lines 360° rotation over 30s
- `tte-shockwave`: energy expanding from center
- `tte-steam-drift`: steam cloud rising and dissipating
- `tte-idle-drift`: subtle parallax on idle viewport
- `tte-sign-fly-future` / `tte-sign-fly-past`: era signs crossing viewport
- `tte-comm-pulse`: comm screen glow when speaking
- `tte-character-breathe`: subtle scale 1.0→1.008 on character portrait
- `tte-launch-pulse`: ambient glow on launch button
---
## 6. Files You Need To Modify/Create
### Files to REPLACE (new content):
| File | Action |
|------|--------|
| `src/time_machine/ui/assets/intro.html` | **Replace entirely**: new ticket/train intro |
| `src/time_machine/ui/assets/intro.css` | **Replace entirely**: new intro styling |
| `src/time_machine/ui/assets/intro.js` | **Replace entirely**: new intro sequence logic |
| `src/time_machine/ui/assets/cockpit.html` | **Replace entirely**: new train cab layout |
| `src/time_machine/ui/assets/cockpit.css` | **Replace entirely**: new train cab styling |
### Files to MODIFY (careful edits):
| File | What to Change |
|------|---------------|
| `src/time_machine/ui/assets/cockpit.js` | Replace `updateCockpitState()` and the immersive cockpit init. **PRESERVE the `initRealtimeVoice()` IIFE** (lines ~509-1031) untouched. Add travel sequence logic, era sign spawning, SFX system |
| `src/time_machine/ui/gradio_app.py` | Update CSS/HTML injection. Load new assets. Add Google Fonts. **Keep all Gradio component wiring identical**: the handler inputs/outputs must not change |
| `src/time_machine/ui/blank_app.py` | May need update for intro page if it serves the intro HTML separately |
### Files to CREATE:
| File | Purpose |
|------|---------|
| (none required; all new code goes in the existing files above) | |
### Files to LEAVE ALONE:
Everything in `src/time_machine/application/`, `src/time_machine/domain/`, `src/time_machine/ports/`, `src/time_machine/adapters/`, `src/time_machine/prompts/`, `config/`, `app.py`
---
## 7. Gradio Integration Notes
### How the current system injects frontend code (`gradio_app.py`):
```python
css = (ASSET_DIR / "cockpit.css").read_text()
cockpit_html = (ASSET_DIR / "cockpit.html").read_text()
cockpit_js = (ASSET_DIR / "cockpit.js").read_text()
# CSS is embedded via: gr.HTML(f"<style>{css}</style>{cockpit_html}")
# JS is embedded via: gr.Blocks(head=f"<script>{cockpit_js}</script>")
```
### Image paths in CSS/HTML
Gradio serves static files from the `static/` directory. Reference images as:
```
/file=static/img/cockpit/idle_viewport.png
```
The current `bg.png` is base64-encoded and embedded in CSS by `gradio_app.py`. You can either:
1. Continue this pattern for critical images (faster load, no extra requests)
2. Use `/file=static/...` paths for non-critical images (cleaner code)
### Hidden Gradio Components (MUST PRESERVE)
These invisible components carry state between Python and JS:
- `#tm-immersive-payload`: JSON payload polled by cockpit.js every 750ms
- `#tm-realtime-session`: realtime voice config (encounter_id, ready state)
- All Gradio inputs (mode dropdown, coordinate prompt, launch button, etc.)
### The Gradio Controls Row
The current Gradio layout has visible controls (mode dropdown, coordinate prompt, launch button, send text, microphone, etc.) in `gr.Row()` blocks below the cockpit HTML. These **must remain functional** but can be visually restyled to match the new theme. Consider:
- Styling the Launch button as a brass-colored action button
- Making the control rows blend with the dark theme
- Keeping all `elem_id` values the same
---
## 8. Critical Implementation Order
1. **Cockpit HTML + CSS first**: get the static layout right
2. **State-driven CSS transitions**: wire up class toggling
3. **cockpit.js updateCockpitState()**: adapt to new DOM structure
4. **Travel sequence**: era signs, speed lines, year counter
5. **Landing/steam reveal**: test with actual launch flow
6. **Intro sequence**: ticket → conductor → materialization → redirect
7. **Audio SFX system**: layer on top of working visuals
8. **Polish**: responsive, reduced-motion, edge cases
---
## 9. Testing
### Manual Test Flow
1. Go to `/blank` → intro should play → skip or wait → redirect to `/app`
2. At `/app` → cockpit visible in dormant state (idle viewport, throttle neutral)
3. Select mode "past", type "ancient rome" → click Launch
4. Observe: launching → traveling (signs, speed lines) → destination → steam → conversation_ready
5. Click "Live voice" → speak → character responds via voice
6. Verify ambient audio matches destination (`ambient_key` from payload)
### Edge Cases
- Launch without coordinates (mode="surprise"): should still work
- Multiple rapid launches: state should reset cleanly
- No audio permission: SFX silently fails, visuals still work
- Mobile viewport: responsive layout for smaller screens
- `prefers-reduced-motion`: disable all animations
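
For the `prefers-reduced-motion` case, a JavaScript-side check could complement the CSS media query so that script-driven effects (era sign spawning, screen shake timers) are skipped as well. This is a sketch under assumptions: the `tte-reduced-motion` class and both function names are hypothetical.

```javascript
// Return true when the user has asked for reduced motion.
// `win` is the browser window; passing it in keeps the helper testable.
function prefersReducedMotion(win) {
  if (!win || typeof win.matchMedia !== "function") return false;
  return win.matchMedia("(prefers-reduced-motion: reduce)").matches === true;
}

// Add or remove a marker class so CSS and JS can both switch animations off.
function applyMotionPreference(rootEl, win) {
  const reduced = prefersReducedMotion(win);
  rootEl.classList.toggle("tte-reduced-motion", reduced);
  return reduced;
}
```

Calling `applyMotionPreference(document.getElementById("tm-cockpit"), window)` once during init would be enough for a first pass; the travel code can then check the returned flag before spawning signs.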
---
## 10. Asset Status Summary
| Category | Count | Status |
|----------|-------|--------|
| Cockpit images | 5 | ✅ Ready |
| Intro images | 4 | ✅ Ready (1 is reference-only) |
| Travel backgrounds | 2 | ✅ Ready |
| Overlay effects | 4 | ✅ Ready |
| Reference images | 1 | ✅ Ready |
| CSS reference | 1 | ✅ Ready (`static/css/tte_reference.css`) |
| JS reference | 1 | ✅ Ready (`static/js/tte_animation_reference.js`) |
| Asset manifest | 1 | ✅ Ready (`static/asset_manifest.json`) |
| Sound effects | 13 | ⬜ Need to download (see `static/audio/sound_sources.md`) |
| Ambient loops | 10 | ⬜ Need to download (see `static/audio/sound_sources.md`) |
> **Note**: The app must work WITHOUT audio files. Build visuals first, then layer audio on top. The SFX system should gracefully degrade if files are missing.