# Field Notes: Building The Immersive AI Time Machine
Date: 2026-06-14
## What Changed
The project started as a working voice conversation loop. The immersive upgrade turns each launch into a staged time-travel scene:
- A portal animation hides generation latency.
- A year counter creates the illusion of moving through time.
- The world and character are represented as generated visual assets.
- A distinct narrator introduces the scene before the character speaks.
- Ambient sound is generated procedurally in the browser, with hooks for real loops later.
- Souvenirs now become visual artifacts, not just markdown text.
## Design Decisions
### Transport First, Chat Second
The most important judge-visible improvement is the transition from "voice chat" to "arrival." The launch sequence, world reveal, portrait, narration, and artifact give users a physical sense that they have crossed into a scene before the conversation begins.
### Ordinary People Stay Central
The character should not be a famous historical figure or a generic narrator. They should be an ordinary person with a practical concern, limited worldview, and a believable misunderstanding of the user.
### Generated Assets Are Optional At Runtime
Real image generation uses FLUX.1 Schnell through Together AI when credentials are present. Fixture/fallback SVG assets keep the app reliable in local development, tests, and demos without network access.
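The credential-gated fallback described above can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: the `TOGETHER_API_KEY` variable name, the `assets/fallback` directory, and the injected `generate_image` callable are all assumptions made for the example, and the real image call is deliberately left out.

```python
import os
from pathlib import Path

# Hypothetical location of the bundled SVG fixtures (assumption).
FALLBACK_DIR = Path("assets/fallback")


def resolve_asset(kind, prompt, generate_image=None, env=None):
    """Return (source, value) for a world or portrait asset.

    The injected generator is used only when a credential is present.
    Any failure falls back to a bundled SVG so the app keeps working offline.
    """
    env = os.environ if env is None else env
    fallback = ("fallback", str(FALLBACK_DIR / f"{kind}.svg"))

    # No generator wired in, or no credential (assumed variable name): use the fixture.
    if generate_image is None or not env.get("TOGETHER_API_KEY"):
        return fallback

    try:
        return ("generated", generate_image(prompt))
    except Exception:
        # Network or quota errors should never block the launch sequence.
        return fallback
```

Passing the generator and environment in as arguments keeps the function testable without network access, which matches the goal of reliable local development and demos.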
### Narrator And Character Voices Are Separate
Narration uses a distinct voice profile. It introduces the world like the beginning of a film, then gets out of the way so the character can own the conversation.
### No Heavy Avatar Yet
A full talking head, WebXR, or 3D world would increase risk. For this hackathon, image-backed world/portrait reveal plus audio-reactive UI gives most of the perceived immersion with less fragility.
## Hackathon Fit
- **Delight:** portal launch, time movement, cinematic reveal, voice, and artifact.
- **AI Is Essential:** destination, persona, conversation, voice, scene, portrait, narration, and artifact are all AI-shaped.
- **Originality:** the app is a time-travel encounter with ordinary people, not a chatbot wrapper.
- **Gradio Polish:** custom cockpit UI, animations, audio hooks, and visual artifact panel.
- **Field Notes:** this document.
- **Sharing Is Caring:** JSONL traces already record event streams.
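
For readers unfamiliar with JSONL traces, the idea is one JSON object per line, appended as events happen. The sketch below is illustrative only: the function names and the `ts`/`type`/`payload` fields are assumptions, not the project's actual trace schema.

```python
import json
import time


def append_event(path, event_type, payload):
    """Append one event as a single JSON line and return the record."""
    record = {"ts": time.time(), "type": event_type, "payload": payload}
    with open(path, "a", encoding="utf-8") as f:
        f.write(json.dumps(record, ensure_ascii=False) + "\n")
    return record


def read_events(path):
    """Load every non-empty line of a JSONL trace back into dicts."""
    with open(path, encoding="utf-8") as f:
        return [json.loads(line) for line in f if line.strip()]
```

Because each line is independent, a trace can be shared, tailed, or partially read without parsing the whole file.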
## Model Rule
The 32B cap is treated as a per-model limit. The registry and code now check the largest enabled model against the cap instead of summing all enabled models.
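
The difference between the two interpretations can be shown with a small sketch. The registry shape, model names, and the reading of "32B" as billions of parameters are assumptions for illustration; the actual registry format is not shown in these notes.

```python
CAP_B = 32.0  # assumed unit: billions of parameters


def largest_enabled_params(registry):
    """Size of the largest enabled model, or 0.0 if none are enabled."""
    sizes = [m["params_b"] for m in registry if m["enabled"]]
    return max(sizes, default=0.0)


def within_cap(registry, cap_b=CAP_B):
    """Per-model rule: only the largest enabled model is compared to the cap."""
    return largest_enabled_params(registry) <= cap_b


# Hypothetical registry entries.
registry = [
    {"name": "chat-model", "params_b": 32.0, "enabled": True},
    {"name": "voice-model", "params_b": 1.0, "enabled": True},
    {"name": "big-model", "params_b": 70.0, "enabled": False},
]
```

With this example registry, the enabled models sum to 33B and would fail a summed check, while the per-model check passes because the largest enabled model is exactly 32B.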