Field Notes: Building The Immersive AI Time Machine
Date: 2026-06-14
What Changed
The project started as a working voice conversation loop. The immersive upgrade turns each launch into a staged time-travel scene:
- A portal animation hides generation latency.
- A year counter creates the illusion of moving through time.
- The world and character are represented as generated visual assets.
- A distinct narrator introduces the scene before the character speaks.
- Ambient sound is synthesized procedurally in the browser, with hooks for recorded loops later.
- Souvenirs now become visual artifacts, not just markdown text.
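As an illustration of the year counter, here is a minimal sketch of how the displayed years could be computed. The function name `year_counter_frames`, the cubic ease-out curve, and the step count are assumptions for illustration, not the project's actual animation code.

```python
def year_counter_frames(start_year, target_year, steps=10):
    """Return the years to display, decelerating toward the target.

    Hypothetical helper: uses a cubic ease-out so the counter moves
    quickly at first and slows down as it approaches the destination.
    """
    frames = []
    for i in range(1, steps + 1):
        t = i / steps                      # progress in (0, 1]
        eased = 1 - (1 - t) ** 3           # cubic ease-out
        year = start_year + (target_year - start_year) * eased
        frames.append(round(year))
    return frames


if __name__ == "__main__":
    # Travel from the present back to an example destination year.
    for year in year_counter_frames(2026, 1347, steps=12):
        print(year)
```

The UI could show one frame per animation tick while generation runs in the background, so the counter lands on the destination year at about the time the scene is ready.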
Design Decisions
Transport First, Chat Second
The most important improvement visible to judges is the shift from "voice chat" to "arrival." The launch sequence, world reveal, portrait, narration, and artifact give users a physical sense that they have crossed into a scene before the conversation begins.
Ordinary People Stay Central
The character should not be a famous historical figure or a generic narrator. They should be an ordinary person with a practical concern, a limited worldview, and a believable misunderstanding of the user.
Generated Assets Are Optional At Runtime
Real image generation uses FLUX.1 Schnell through Together AI when credentials are present. Fixture/fallback SVG assets keep the app reliable in local development, tests, and demos without network access.
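The credential-gated fallback could look roughly like the sketch below. It assumes the key is read from an environment variable named `TOGETHER_API_KEY`, and that the real generation call is passed in as a callable; `load_scene_asset`, `generate`, and `FALLBACK_SVG` are hypothetical names, and the actual Together AI request is deliberately left out.

```python
import os

# Hypothetical placeholder asset used when no image can be generated.
FALLBACK_SVG = (
    '<svg xmlns="http://www.w3.org/2000/svg" width="640" height="360">'
    '<rect width="100%" height="100%" fill="#1b1b2f"/></svg>'
)


def load_scene_asset(prompt, generate=None, env=None):
    """Return a generated image when possible, else the fallback SVG.

    `generate` is an assumed callable (prompt, api_key) -> image data that
    wraps the real image API. It is injected so tests never hit the network.
    """
    env = os.environ if env is None else env
    api_key = env.get("TOGETHER_API_KEY")  # assumed variable name
    fallback = {"source": "fallback", "data": FALLBACK_SVG}

    if not api_key or generate is None:
        return fallback
    try:
        return {"source": "generated", "data": generate(prompt, api_key)}
    except Exception:
        # Network or API failure: keep the demo running with the fixture.
        return fallback


if __name__ == "__main__":
    # With no generator wired in, the fixture asset is returned.
    print(load_scene_asset("a medieval market at dawn")["source"])
```

Injecting the generator keeps the fallback path testable offline, which matches the goal of reliable local development and demos.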
Narrator And Character Voices Are Separate
Narration uses a distinct voice profile. It introduces the world like the beginning of a film, then gets out of the way so the character can own the conversation.
No Heavy Avatar Yet
A full talking head, WebXR, or 3D world would increase risk. For this hackathon, image-backed world/portrait reveal plus audio-reactive UI gives most of the perceived immersion with less fragility.
Hackathon Fit
- Delight: portal launch, time movement, cinematic reveal, voice, and artifact.
- AI Is Essential: destination, persona, conversation, voice, scene, portrait, narration, and artifact are all AI-shaped.
- Originality: the app is a time-travel encounter with ordinary people, not a chatbot wrapper.
- Gradio Polish: custom cockpit UI, animations, audio hooks, and visual artifact panel.
- Field Notes: this document.
- Sharing Is Caring: JSONL traces already record event streams.
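For readers unfamiliar with JSONL traces, the sketch below shows one way an event stream can be appended to and read back from such a file. The field names (`ts`, `type`, `payload`) and the helper names are assumptions for illustration; the project's actual trace schema may differ.

```python
import json
import time


def append_event(path, event_type, payload, timestamp=None):
    """Append one event as a single JSON object on its own line."""
    record = {
        "ts": time.time() if timestamp is None else timestamp,
        "type": event_type,      # assumed field name
        "payload": payload,      # assumed field name
    }
    with open(path, "a", encoding="utf-8") as handle:
        handle.write(json.dumps(record, ensure_ascii=False) + "\n")


def read_events(path):
    """Read every non-empty line back into a list of dicts."""
    with open(path, encoding="utf-8") as handle:
        return [json.loads(line) for line in handle if line.strip()]
```

Because each line is an independent JSON object, a trace can be shared, tailed, or truncated without a parser needing the whole file.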
Model Rule
The 32B cap is treated as a per-model limit. The registry and code now check the largest enabled model against the cap instead of summing all enabled models.
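The difference between the two interpretations can be sketched as follows. The registry shape (a list of dicts with `enabled` and `params_b`, sizes in billions of parameters) and the model names are assumptions for illustration, not the project's actual registry format.

```python
CAP_B = 32.0  # cap from the model rule, assumed to be in billions of parameters


def largest_enabled(models):
    """Size of the largest enabled model, or 0.0 if none are enabled."""
    return max((m["params_b"] for m in models if m["enabled"]), default=0.0)


def within_cap_per_model(models, cap_b=CAP_B):
    """Per-model rule: only the largest enabled model must fit the cap."""
    return largest_enabled(models) <= cap_b


def within_cap_summed(models, cap_b=CAP_B):
    """Earlier interpretation: all enabled models together must fit."""
    return sum(m["params_b"] for m in models if m["enabled"]) <= cap_b


if __name__ == "__main__":
    # Hypothetical registry entries.
    registry = [
        {"name": "chat-model", "params_b": 27.0, "enabled": True},
        {"name": "speech-model", "params_b": 8.0, "enabled": True},
        {"name": "big-model", "params_b": 70.0, "enabled": False},
    ]
    print(within_cap_per_model(registry))  # largest enabled is 27B
    print(within_cap_summed(registry))     # enabled total is 35B
```

Under the per-model reading, a registry with several mid-sized models passes as long as no single enabled model exceeds the cap, while the summed reading would reject it.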