SignBridge β Demo Video Script
Target length: 2:30 (β€ 3 min). Format: 1080p MP4, MP3 audio. Aspect ratio 16:9. Tools: QuickTime Player (Mac) for screen + camera capture, iMovie or CapCut for editing.
Story arc (3 acts)
| Time | Act | Beat |
|---|---|---|
| 0:00β0:20 | Hook | Open with the human problem; viewer must feel the gap. |
| 0:20β1:30 | Demo | Live SignBridge in action β both fingerspelling AND a motion sign. |
| 1:30β2:30 | Why AMD + close | Architecture diagram + concrete MI300X comparison + open-source ethics + URL. |
Hard rule: no slide-by-slide voice-over reading. The demo should play live; voice-over should narrate what we're seeing, not summarise text on screen.
Shot list
Act 1 β Hook (0:00 β 0:20)
Visual A (5 s): Plain background, bold text card fades in:
70 million deaf people. Interpreters cost $50β200 / hour. They're scarce.
Visual B (5 s): Text card β "What if your phone could just translate?"
Visual C (10 s): Camera shot of you (Lucas) in a quiet room, signing HELLO at the camera silently. No voice-over yet. Hold the silence β let the viewer feel that the sign means nothing to them.
Voice-over: (starts at 0:15)
"Most of us can't read this. SignBridge can."
Act 2 β Live demo (0:20 β 1:30)
Setup (0:20 β 0:25): 5-second screen-recording of the live HF Space loading at huggingface.co/spaces/lablab-ai-amd-developer-hackathon/signbridge. URL bar visible. Tabs visible: "Snapshot" and "Record sign". This proves it's a live deployed product, not a slide deck.
Beat 2A β Fingerspelling (0:25 β 0:55):
Visual (split screen recommended): Left = your face/hand on webcam, right = the Gradio app receiving frames.
- Sign L clearly. Click the π· camera button in the preview. App shows "β added L (98%)".
- Sign U. Click π· again.
- Sign C. π·.
- Sign A. π·.
- Sign S. π·.
- Click π Speak. App composes β speaks: "Lucas."
Voice-over during this beat:
"First, fingerspelling. I sign each letter, the app captures it, andβ" (pause for the speak) β "composed in natural English."
Beat 2B β Motion sign (0:55 β 1:25):
Visual: Switch tabs to Record sign. Hit Record, sign HELLO (the wave-from-forehead motion), stop, click Submit.
- Detected: hello (85%). Click Speak.
- App says: "Hello."
Repeat one more sign for variety: THANK_YOU.
Voice-over:
"But fingerspelling alone isn't real ASL β most signs are motion. Hold-to-record captures the whole gesture, not just one frame. The system detects the motion across frames and..." (pause for the speak)
Beat 2C β Two-person scene (1:25 β 1:30): (optional but high-impact)
Visual: You sign something to a hearing person; they hear the AI say it; they react. Hold the human reaction for 2 seconds.
No voice-over during this beat β let the moment land.
Act 3 β Architecture + AMD pitch (1:30 β 2:30)
Beat 3A β Architecture diagram (1:30 β 1:55):
Visual: Static slide showing the pipeline:
Webcam recording β ffmpeg β fine-tuned Qwen3-VL-8B (native video_url)
β
Qwen3-8B (composer)
β
gTTS (speech)
Both LLMs concurrent on a single AMD Instinct MI300X
Voice-over:
"Under the hood: our fine-tuned Qwen3-VL-8B receives the recorded clip natively via vLLM's video_url block, Qwen3-8B composes the sentence, gTTS speaks it β both Qwen models running concurrently on a single AMD Instinct MI300X. Vision and reasoning on one GPU."
Beat 3B β The MI300X comparison (1:55 β 2:15):
Visual: The comparison table from the walkthrough:
| MI300X 1Γ | H100 80 GB | |
|---|---|---|
| V1 pipeline (~34 GB) | β comfortable | β tight |
| V2 with Llama-3.1-70B FP8 (~70 GB extra) | β still fits | β doesn't fit |
Voice-over:
"192 GB of HBM3. Same workload on NVIDIA H100 needs three GPUs. Practical accessibility tools running globally need the cost-and-availability profile that AMD enables."
Beat 3C β Substrate + close (2:15 β 2:30):
Visual: Final slide:
- "Open source, MIT β github.com/seekerPrice/signbridge"
- "Hugging Face Space β huggingface.co/spaces/lablab-ai-amd-developer-hackathon/signbridge"
- "ASL V1. Deaf-led teams own the rest."
- π€ SignBridge
Voice-over:
"SignBridge is open source under MIT. It's a substrate β Deaf-led organisations deploy it for their own languages. The hardest part of accessibility isn't building. It's deploying. AMD makes the deploying possible. Thanks for watching."
Voice-over recording tips
- Record voice separately from screen capture (better audio quality). Use QuickTime "New Audio Recording" with a mic 6β12 inches away.
- One take, then cut. Don't try to dub multiple takes line-by-line.
- Cadence: ~140 words/min. Pause for 0.5 s after each section.
- If you have a good pop filter / lavalier, use it. AirPods Pro built-in mic is workable but compresses dynamics.
Editing notes
- Captions/subtitles required. Burn in the spoken English text below the speaker's face throughout β both for accessibility and so judges can follow with sound off.
- Highlight the recognized token visually. When the app shows "detected: hello (85%)", zoom in or add a brief highlight box on that text β judges' eyes need to find it fast.
- Music: skip. The demo is loud enough on its own; background music distracts from the speech-output beats.
- Smooth transitions only β don't use fancy wipes; cut on action.
- Final cut export: 1080p, H.264, MP4, β€100 MB if possible (lablab uploader has size limits).
Prep before recording
- AMD Dev Cloud credit landed (so the live demo uses MI300X β this is the hackathon talk-track); fall back to HF Inference if not.
- Lighting: front-facing soft light. No back-window glare.
- Plain background (white wall ideal).
- Wear a contrasting solid colour (not patterns) β VLM accuracy improves.
- Webcam height: at eye level. Hands need to be in frame for signs.
- Test the live HF Space URL once before recording. If it errors, fix before pressing record.
- One dry run end-to-end with a stopwatch. Trim if over 2:45.
Recording order (don't shoot in story order)
- Live demo screen recording first β 3 takes of the full demo flow, pick the cleanest.
- Voice-over second β record continuous narration over the picked demo take.
- B-roll of you signing alone (Act 1 silent shot, Act 2C two-person reaction) β last, since they're easier to re-shoot.
- Edit it together in iMovie / CapCut.
- Export.
- Upload to YouTube as Unlisted, copy URL.
- Paste URL into lablab.ai submission form's "Video Presentation" field.
Export checklist
- Length 2:00β3:00
- Captions visible throughout
- AMD Dev Cloud / MI300X mentioned by name β₯3 times
- Qwen3-VL mentioned by name β₯2 times (Qwen Special Reward eligibility)
- HF Space URL shown on screen at least once
- GitHub URL shown on screen at least once
- No copyrighted music / footage
- Speaker face visible (judges remember faces)
- Final shot: SignBridge logo + URLs