title: Memory Bridge
emoji: 🕯️
colorFrom: indigo
colorTo: red
sdk: gradio
python_version: '3.12'
app_file: app.py
pinned: false
license: mit
short_description: Talk to a loved one via letters, photos & voice.
tags:
- track:wood
- sponsor:openbmb
- sponsor:nvidia
- sponsor:modal
- achievement:sharing
Memory Bridge (Memory Keeper)
Try it live here: https://huggingface.co/spaces/build-small-hackathon/memory-bridge
Hackathon Submission: Build Small Hackathon
Value Proposition: Memory Bridge offers a comforting way to preserve the legacy and essence of loved ones who have passed away. By uploading fragments of their life (letters, photos, and voice recordings), users can interact with an AI persona that embodies their personality, memories, and voice.
Intended Users: Individuals grieving the loss of a loved one or families wishing to preserve the digital memory of their ancestors for future generations.
- Demo Video URL: [Placeholder: Add Video URL]
- Social Post URL: https://x.com/sheikhMdRakib23/status/2066510302041235609
Overview
Memory Bridge (also known as Memory Keeper) is a multimodal AI application that allows you to "talk" to a deceased loved one. By analyzing letters, photographs, scanned documents, and voice notes, the system constructs a rich AI persona that preserves their memories, personality traits, and even their voice.
💡 Problem Being Solved
When we lose someone, their voice, their stories, and their unique way of speaking slowly fade from our immediate memory. While photos and letters remain static, Memory Bridge brings these artifacts to life, offering a dynamic, conversational way to interact with the essence of a loved one to help with the grieving process and to keep their memory alive.
✨ Key Features
- Multimodal Persona Generation: Upload text (letters, diaries), photos, scanned documents, and audio (voice notes) to build a comprehensive persona.
- Voice Synthesis: Hear their responses in a synthesized voice.
- Interactive Chat Interface: Talk to the generated persona in real-time.
- Multilingual Support: Chat in English, Bengali, Hindi, Chinese, Japanese, Korean, and Thai.
- Persistent Memories: Save personas and retrieve them later using a unique Persona ID.
🛠️ Tech Stack
- Frontend / UI: Gradio
- Backend Infrastructure: Modal
- Hosting: Hugging Face Spaces
- Storage: Modal Volumes
- Inference: Open-source AI models running on Modal and MiniCPM endpoints
🤖 Models Used
All models used in Memory Bridge are individually below the 32B parameter limit required by the Build Small Hackathon.
| Model | Purpose |
|---|---|
| MiniCPM4.1-8B | Persona generation and conversational AI |
| MiniCPM-V 4.6 (8B) | Photo understanding and visual memory extraction |
| Cohere Transcribe 03-2026 (~2B) | Speech-to-text transcription of voice recordings |
| NVIDIA Nemotron Parse v1.2 (<1B) | OCR and document understanding for scanned letters and documents |
| VoxCPM2 (~1B) | Text-to-speech voice synthesis |
| Tiny Aya Fire (3.35B) | South Asian multilingual language support |
| Tiny Aya Water (3.35B) | Asia-Pacific multilingual language support |
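As a quick sanity check of the size claim, the sketch below compares the approximate parameter counts listed in the table against the 32B limit. The names `MODEL_SIZES_B` and `models_over_limit` are illustrative, and the rounded values (for example, treating "<1B" as 1.0) are assumptions rather than figures taken from the project's code.

```python
# Approximate parameter counts in billions, copied from the table above.
# "<1B" and "~1B" entries are rounded to 1.0 as a conservative assumption.
MODEL_SIZES_B = {
    "MiniCPM4.1-8B": 8.0,
    "MiniCPM-V 4.6": 8.0,
    "Cohere Transcribe 03-2026": 2.0,
    "NVIDIA Nemotron Parse v1.2": 1.0,
    "VoxCPM2": 1.0,
    "Tiny Aya Fire": 3.35,
    "Tiny Aya Water": 3.35,
}
LIMIT_B = 32.0  # per-model limit stated for the hackathon


def models_over_limit(sizes, limit=LIMIT_B):
    """Return the sorted names of models at or above the limit."""
    return sorted(name for name, size in sizes.items() if size >= limit)
```

With the values above, `models_over_limit(MODEL_SIZES_B)` returns an empty list, which matches the claim that every model is individually under the limit.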
AI Pipeline
Voice Notes → Cohere Transcribe
- Converts uploaded audio into text.
Photos → MiniCPM-V 4.6
- Generates detailed descriptions of people, scenes, and emotional context.
Scanned Documents → Nemotron Parse
- Extracts text from handwritten or printed documents.
Persona Creation → MiniCPM4.1-8B
Builds a structured memory profile including:
- Personality traits
- Speech style
- Key memories
- Values
- Voice characteristics
Conversation → MiniCPM4.1-8B
- Powers real-time conversations with the preserved persona.
Voice Response → VoxCPM2
- Converts generated responses into speech.
Multilingual Support → Tiny Aya Fire & Tiny Aya Water
- Supports Bengali, Hindi, Chinese, Japanese, Korean, Thai, and other languages.
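The extraction stages of the pipeline above can be pictured as a small dispatcher that groups uploads by the model that would process them. This is a minimal illustration, not the project's actual code: the function `plan_extraction`, the upload kind labels, and the `EXTRACTION_STAGE` mapping are all hypothetical names introduced here.

```python
# Hypothetical mapping from upload kind to the model named in the pipeline.
EXTRACTION_STAGE = {
    "voice_note": "Cohere Transcribe",
    "photo": "MiniCPM-V 4.6",
    "scanned_document": "Nemotron Parse",
}


def plan_extraction(uploads):
    """Group uploaded file names by the stage that would handle them.

    `uploads` is a list of (kind, filename) pairs. Plain text such as a
    typed letter needs no extraction, so it is grouped under "text".
    """
    plan = {}
    for kind, filename in uploads:
        if kind == "text":
            stage = "text"
        elif kind in EXTRACTION_STAGE:
            stage = EXTRACTION_STAGE[kind]
        else:
            raise ValueError(f"unsupported upload kind: {kind!r}")
        plan.setdefault(stage, []).append(filename)
    return plan
```

In this sketch, the text produced by every stage would then be combined and handed to the persona creation step; how the real application merges those outputs is not described in this README.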
Repository Structure
.
├── app.py            # Main Gradio application frontend and API orchestration
├── modal_app.py      # Backend Modal endpoints and AI model inference logic
├── requirements.txt  # Python dependencies
└── README.md         # Project documentation
(Note: The backend model inference code in modal_app.py runs externally on Modal endpoints.)
🔧 Configuration
The application connects to external Modal endpoints. The URLs are hardcoded in app.py. No additional local configuration or API keys are required to run the frontend if the Modal endpoints are active and public.
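If you redeploy the backend under your own Modal account, one common way to avoid editing hardcoded URLs is to read them from environment variables and fall back to a default. The sketch below is only an assumption about how that could look: the helper `endpoint_url`, the `MEMORY_BRIDGE_<NAME>_URL` naming scheme, the endpoint names, and the placeholder URLs are hypothetical and do not come from app.py.

```python
import os

# Placeholder defaults; the real URLs live in app.py and are not shown here.
DEFAULT_ENDPOINTS = {
    "chat": "https://example.invalid/chat",
    "build_persona": "https://example.invalid/build-persona",
}


def endpoint_url(name, env=os.environ):
    """Return the URL for a named endpoint, preferring an env override.

    An override is read from MEMORY_BRIDGE_<NAME>_URL (assumed scheme).
    """
    if name not in DEFAULT_ENDPOINTS:
        raise KeyError(f"unknown endpoint: {name}")
    override = env.get(f"MEMORY_BRIDGE_{name.upper()}_URL")
    return override or DEFAULT_ENDPOINTS[name]
```

On Hugging Face Spaces, such variables could be set in the Space settings, so the frontend code would not need to change when the backend URLs do.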
💻 Usage Instructions
First, visit the live application at https://huggingface.co/spaces/build-small-hackathon/memory-bridge.
- Preserve a Memory (Tab 1): Enter the person's name, your relationship, and upload any available texts, photos, or voice notes. Click "Preserve Their Memory" and wait for the Persona ID to be generated.
- Talk to Them (Tab 2): Paste the generated Persona ID, select a language, and enable "Voice Response" if desired. Start chatting!
- Saved Memories (Tab 3): View a list of previously created personas and their IDs.
Deployment Instructions
This application consists of two parts: a backend hosted on Modal and a frontend hosted on Hugging Face Spaces.
Backend (Modal)
- Install Modal and set up your token: `pip install modal` and `modal setup`.
- Deploy the backend endpoints: `modal deploy modal_app.py`
(Note: Update the endpoint URLs in app.py if your deployed Modal URLs differ.)
Frontend (Hugging Face Spaces)
- Create a new Hugging Face Space and select the Gradio SDK.
- Push `app.py`, `requirements.txt`, and `README.md` to the Space repository.
- The Space will automatically build and launch the Gradio app.
⚠️ Limitations
- Cold Starts: The backend Modal endpoints may go to sleep. Initial requests (building a persona or the first chat message) might take 1 to 3 minutes or require a retry.
- Backend Dependency: The frontend is entirely reliant on the specific Modal endpoints defined in the code. If these endpoints are taken down, the app will not function.
- Emotional Impact: Interacting with a simulated deceased loved one can be emotionally taxing.
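Because a sleeping endpoint may need a retry, a client can wrap its request in a simple retry loop with increasing waits. The following is a minimal sketch under stated assumptions: `call_with_retry`, its parameters, and the delay values are hypothetical, and `request_fn` stands in for whatever function actually sends the request to the backend.

```python
import time


def call_with_retry(request_fn, attempts=4, first_delay=15.0, sleep=time.sleep):
    """Call request_fn() until it succeeds or the attempts run out.

    The wait doubles after each failure (15s, 30s, 60s by default), which
    gives a cold backend time to wake up. The last error is re-raised.
    """
    delay = first_delay
    for attempt in range(1, attempts + 1):
        try:
            return request_fn()
        except Exception:
            if attempt == attempts:
                raise  # out of attempts: surface the final error
            sleep(delay)
            delay *= 2
```

The default waits add up to a little under two minutes, which sits inside the cold-start window mentioned above. Passing `sleep` as a parameter keeps the helper easy to test without real delays.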
🔮 Future Improvements
- Local inference support to remove reliance on external endpoints.
- More granular control over voice cloning parameters.
- Support for video uploads to extract dynamic mannerisms.
- Enhanced long-term memory retrieval in chat.
Credits
- Backend Modal endpoints developed by Sheikh Md Rakib.
- Powered by open-source models from Hugging Face, Cohere, OpenBMB, and NVIDIA.
- Hosted on Hugging Face Spaces and Modal.