---
title: Memory Bridge
emoji: 🕯️
colorFrom: indigo
colorTo: red
sdk: gradio
python_version: '3.12'
app_file: app.py
pinned: false
license: mit
short_description: Talk to a loved one via letters, photos & voice.
tags:
  - track:wood
  - sponsor:openbmb
  - sponsor:nvidia
  - sponsor:modal
  - achievement:sharing
---

# Memory Bridge (Memory Keeper)

🚀 **Try it live here:** [https://huggingface.co/spaces/build-small-hackathon/memory-bridge](https://huggingface.co/spaces/build-small-hackathon/memory-bridge)

## 🏆 Hackathon Submission: Build Small Hackathon

**Value Proposition:** Memory Bridge offers a comforting way to preserve the legacy and essence of loved ones who have passed away. By uploading fragments of their life (letters, photos, and voice recordings), users can interact with an AI persona that embodies their personality, memories, and voice.

**Intended Users:** Individuals grieving the loss of a loved one or families wishing to preserve the digital memory of their ancestors for future generations.

- **Demo Video URL:** [Placeholder: Add Video URL]
- **Social Post URL:** [https://x.com/sheikhMdRakib23/status/2066510302041235609](https://x.com/sheikhMdRakib23/status/2066510302041235609)

---

## 📖 Overview

Memory Bridge (also known as Memory Keeper) is a multimodal AI application that allows you to "talk" to a deceased loved one. By analyzing letters, photographs, scanned documents, and voice notes, the system constructs a rich AI persona that preserves their memories, personality traits, and even their voice.

## 💡 Problem Being Solved

When we lose someone, their voice, their stories, and their unique way of speaking slowly fade from our immediate memory. While photos and letters remain static, Memory Bridge brings these artifacts to life, offering a dynamic, conversational way to interact with the essence of a loved one to help with the grieving process and to keep their memory alive.

## ✨ Key Features

- **Multimodal Persona Generation:** Upload text (letters, diaries), photos, scanned documents, and audio (voice notes) to build a comprehensive persona.
- **Voice Synthesis:** Hear their responses in a synthesized voice.
- **Interactive Chat Interface:** Talk to the generated persona in real time.
- **Multilingual Support:** Chat in English, Bengali, Hindi, Chinese, Japanese, Korean, and Thai.
- **Persistent Memories:** Save personas and retrieve them later using a unique Persona ID.

## 🛠️ Tech Stack

* **Frontend / UI:** Gradio
* **Backend Infrastructure:** Modal
* **Hosting:** Hugging Face Spaces
* **Storage:** Modal Volumes
* **Inference:** Open-source AI models running on Modal and MiniCPM endpoints

## 🤖 Models Used

Every model used in Memory Bridge is individually below the 32B parameter limit required by the Build Small Hackathon.

| Model | Purpose |
| --- | --- |
| **MiniCPM4.1-8B** | Persona generation and conversational AI |
| **MiniCPM-V 4.6 (8B)** | Photo understanding and visual memory extraction |
| **Cohere Transcribe 03-2026 (~2B)** | Speech-to-text transcription of voice recordings |
| **NVIDIA Nemotron Parse v1.2 (<1B)** | OCR and document understanding for scanned letters and documents |
| **VoxCPM2 (~1B)** | Text-to-speech voice synthesis |
| **Tiny Aya Fire (3.35B)** | South Asian multilingual language support |
| **Tiny Aya Water (3.35B)** | Asia-Pacific multilingual language support |

### AI Pipeline

1. **Voice Notes → Cohere Transcribe**
   * Converts uploaded audio into text.
2. **Photos → MiniCPM-V 4.6**
   * Generates detailed descriptions of people, scenes, and emotional context.
3. **Scanned Documents → Nemotron Parse**
   * Extracts text from handwritten or printed documents.
4. **Persona Creation → MiniCPM4.1-8B**
   * Builds a structured memory profile including:
     * Personality traits
     * Speech style
     * Key memories
     * Values
     * Voice characteristics
5. **Conversation → MiniCPM4.1-8B**
   * Powers real-time conversations with the preserved persona.
6. **Voice Response → VoxCPM2**
   * Converts generated responses into speech.
7. **Multilingual Support → Tiny Aya Fire & Tiny Aya Water**
   * Supports Bengali, Hindi, Chinese, Japanese, Korean, Thai, and other languages.

## 📁 Repository Structure

```
.
├── app.py             # Main Gradio application frontend and API orchestration
├── modal_app.py       # Backend Modal endpoints and AI model inference logic
├── requirements.txt   # Python dependencies
└── README.md          # Project documentation
```

*(Note: The backend model inference code in `modal_app.py` runs externally on Modal endpoints.)*

## 🔧 Configuration

The application connects to external Modal endpoints whose URLs are hardcoded in `app.py`. No additional local configuration or API keys are required to run the frontend, provided the Modal endpoints are active and public.

## 💻 Usage Instructions

First, visit the live application at [Memory Bridge](https://huggingface.co/spaces/build-small-hackathon/memory-bridge).

1. **Preserve a Memory (Tab 1):** Enter the person's name and your relationship to them, then upload any available texts, photos, or voice notes. Click "Preserve Their Memory" and wait for the Persona ID to be generated.
2. **Talk to Them (Tab 2):** Paste the generated Persona ID, select a language, and enable "Voice Response" if desired. Start chatting!
3. **Saved Memories (Tab 3):** View a list of previously created personas and their IDs.

## ☁️ Deployment Instructions

This application consists of two parts: a backend hosted on Modal and a frontend hosted on Hugging Face Spaces.

### Backend (Modal)

1. Install Modal and set up your token: `pip install modal` and `modal setup`.
2. Deploy the backend endpoints:

   ```bash
   modal deploy modal_app.py
   ```

   *(Note: Update the endpoint URLs in `app.py` if your deployed Modal URLs differ.)*

### Frontend (Hugging Face Spaces)

1. Create a new Hugging Face Space and select the **Gradio** SDK.
2. Push `app.py`, `requirements.txt`, and `README.md` to the Space repository.
3. The Space will automatically build and launch the Gradio app.

## ⚠️ Limitations

- **Cold Starts:** The backend Modal endpoints may go to sleep when idle. Initial requests (building a persona or sending the first chat message) may take 1 to 3 minutes or require a retry.
- **Backend Dependency:** The frontend depends entirely on the specific Modal endpoints defined in the code. If these endpoints are taken down, the app will not function.
- **Emotional Impact:** Interacting with a simulated deceased loved one can be emotionally taxing.

## 🔮 Future Improvements

- Local inference support to remove reliance on external endpoints.
- More granular control over voice cloning parameters.
- Support for video uploads to extract dynamic mannerisms.
- Enhanced long-term memory retrieval in chat.

## 🙏 Credits

- Backend Modal endpoints developed by Sheikh Md Rakib.
- Powered by open-source models from Hugging Face, Cohere, OpenBMB, and NVIDIA.
- Hosted on Hugging Face Spaces and Modal.
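
## 🔁 Sketch: Retrying Cold-Start Requests

The Limitations section notes that the first request to a sleeping Modal endpoint may take a few minutes or need a retry. The snippet below is a minimal sketch of one way a client could handle that with exponential backoff. It is not taken from `app.py`: the helper name `call_with_retry`, its parameters, and the default delays are all hypothetical and chosen only for illustration.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def call_with_retry(
    request: Callable[[], T],
    attempts: int = 5,
    initial_delay: float = 10.0,
    backoff: float = 2.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call `request`, retrying with exponential backoff if it raises.

    Hypothetical helper: `request` stands in for whatever function
    sends one HTTP call to a backend endpoint.
    """
    if attempts < 1:
        raise ValueError("attempts must be at least 1")

    delay = initial_delay
    for attempt in range(1, attempts + 1):
        try:
            return request()
        except Exception:
            if attempt == attempts:
                raise  # out of retries: surface the last error
            sleep(delay)      # give the endpoint time to wake up
            delay *= backoff  # wait longer before the next attempt
    raise AssertionError("unreachable")


if __name__ == "__main__":
    # Simulated endpoint that fails twice (cold start) and then succeeds.
    state = {"calls": 0}

    def flaky_endpoint() -> str:
        state["calls"] += 1
        if state["calls"] < 3:
            raise TimeoutError("endpoint still waking up")
        return "persona ready"

    # Pass a no-op sleep so the demo finishes instantly.
    print(call_with_retry(flaky_endpoint, sleep=lambda seconds: None))
```

With the illustrative defaults, the waits between attempts are 10, 20, 40, and 80 seconds, which roughly covers the 1 to 3 minute cold-start window mentioned above. In a real client it would be safer to catch only timeout and connection errors rather than every exception, so that genuine bugs are not silently retried.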