memory-bridge / README.md

Sheikh Mohammad Rakib

docs: update README with live link, social media URL, and refined technical documentation

a9ffeba 16 days ago

metadata

title: Memory Bridge
emoji: 🕯️
colorFrom: indigo
colorTo: red
sdk: gradio
python_version: '3.12'
app_file: app.py
pinned: false
license: mit
short_description: Talk to a loved one via letters, photos & voice.
tags:
  - track:wood
  - sponsor:openbmb
  - sponsor:nvidia
  - sponsor:modal
  - achievement:sharing

Memory Bridge (Memory Keeper)

🚀 Try it live here: https://huggingface.co/spaces/build-small-hackathon/memory-bridge

🏆 Hackathon Submission: Build Small Hackathon

Value Proposition: Memory Bridge offers a comforting way to preserve the legacy and essence of loved ones who have passed away. By uploading fragments of their life—letters, photos, and voice recordings—users can interact with an AI persona that embodies their personality, memories, and voice.

Intended Users: Individuals grieving the loss of a loved one or families wishing to preserve the digital memory of their ancestors for future generations.

Demo Video URL: [Placeholder: Add Video URL]
Social Post URL: https://x.com/sheikhMdRakib23/status/2066510302041235609

📖 Overview

Memory Bridge (also known as Memory Keeper) is a multimodal AI application that allows you to "talk" to a deceased loved one. By analyzing letters, photographs, scanned documents, and voice notes, the system constructs a rich AI persona that preserves their memories, personality traits, and even their voice.

💡 Problem Being Solved

When we lose someone, their voice, their stories, and their unique way of speaking slowly fade from memory. Photos and letters remain, but they are static. Memory Bridge turns these artifacts into a conversation with the essence of a loved one, with the aim of supporting the grieving process and keeping their memory alive.

✨ Key Features

Multimodal Persona Generation: Upload text (letters, diaries), photos, scanned documents, and audio (voice notes) to build a comprehensive persona.
Voice Synthesis: Hear their responses in a synthesized voice.
Interactive Chat Interface: Talk to the generated persona in real time.
Multilingual Support: Chat in English, Bengali, Hindi, Chinese, Japanese, Korean, and Thai.
Persistent Memories: Save personas and retrieve them later using a unique Persona ID.
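
The save-and-retrieve flow behind a Persona ID can be pictured with a minimal sketch. Everything here is an assumption made for illustration: the `Persona` fields, the 12-character ID format, and the one-JSON-file-per-persona layout are hypothetical, and a local directory stands in for the Modal Volume that the real backend uses.

```python
import json
import tempfile
import uuid
from dataclasses import asdict, dataclass, field
from pathlib import Path


@dataclass
class Persona:
    # Hypothetical fields; the real schema in modal_app.py is not documented here.
    name: str
    relationship: str
    personality_traits: list[str] = field(default_factory=list)
    key_memories: list[str] = field(default_factory=list)


def save_persona(persona: Persona, root: Path) -> str:
    """Write the persona as JSON and return a newly generated Persona ID."""
    persona_id = uuid.uuid4().hex[:12]  # assumed ID format
    root.mkdir(parents=True, exist_ok=True)
    path = root / f"{persona_id}.json"
    path.write_text(json.dumps(asdict(persona)), encoding="utf-8")
    return persona_id


def load_persona(persona_id: str, root: Path) -> Persona:
    """Read a persona back by ID; raises FileNotFoundError for unknown IDs."""
    path = root / f"{persona_id}.json"
    return Persona(**json.loads(path.read_text(encoding="utf-8")))


def list_personas(root: Path) -> list[str]:
    """Return all stored Persona IDs, sorted for stable display."""
    return sorted(p.stem for p in root.glob("*.json"))


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        pid = save_persona(Persona("Ada", "grandmother", ["warm"]), root)
        print(pid, load_persona(pid, root).name, list_personas(root))
```

Keeping the ID independent of the persona's name means two personas with the same name never collide, at the cost of the user having to store the ID somewhere.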

🛠️ Tech Stack

Frontend / UI: Gradio
Backend Infrastructure: Modal
Hosting: Hugging Face Spaces
Storage: Modal Volumes
Inference: Open-source AI models running on Modal and MiniCPM endpoints

🤖 Models Used

All models used in Memory Bridge are individually below the 32B parameter limit required by the Build Small Hackathon.

| Model | Size | Purpose |
| --- | --- | --- |
| MiniCPM4.1-8B | 8B | Persona generation and conversational AI |
| MiniCPM-V 4.6 | 8B | Photo understanding and visual memory extraction |
| Cohere Transcribe 03-2026 | ~2B | Speech-to-text transcription of voice recordings |
| NVIDIA Nemotron Parse v1.2 | <1B | OCR and document understanding for scanned letters and documents |
| VoxCPM2 | ~1B | Text-to-speech voice synthesis |
| Tiny Aya Fire | 3.35B | South Asian multilingual language support |
| Tiny Aya Water | 3.35B | Asia-Pacific multilingual language support |
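
As a quick sanity check of the per-model limit, the sizes in the table can be compared against 32B in a few lines. The numbers are copied from the table; treating "~2B", "<1B", and "~1B" as 2, 1, and 1 is a rounding assumption made for this sketch.

```python
# Approximate sizes in billions of parameters, copied from the table above.
# "~2B", "<1B" and "~1B" are rounded to 2, 1 and 1 (an assumption).
MODEL_SIZES_B = {
    "MiniCPM4.1-8B": 8.0,
    "MiniCPM-V 4.6": 8.0,
    "Cohere Transcribe 03-2026": 2.0,
    "NVIDIA Nemotron Parse v1.2": 1.0,
    "VoxCPM2": 1.0,
    "Tiny Aya Fire": 3.35,
    "Tiny Aya Water": 3.35,
}
LIMIT_B = 32.0


def over_limit(sizes: dict[str, float], limit: float = LIMIT_B) -> list[str]:
    """Return the names of models at or above the per-model limit."""
    return [name for name, size in sizes.items() if size >= limit]


if __name__ == "__main__":
    largest = max(MODEL_SIZES_B, key=MODEL_SIZES_B.get)
    print("largest model:", largest, MODEL_SIZES_B[largest], "B")
    print("over the limit:", over_limit(MODEL_SIZES_B))
```

With these rounded figures the largest single model is 8B, a quarter of the limit.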

AI Pipeline

Voice Notes → Cohere Transcribe
- Converts uploaded audio into text.
Photos → MiniCPM-V 4.6
- Generates detailed descriptions of people, scenes, and emotional context.
Scanned Documents → Nemotron Parse
- Extracts text from handwritten or printed documents.
Persona Creation → MiniCPM4.1-8B
- Builds a structured memory profile including:
  - Personality traits
  - Speech style
  - Key memories
  - Values
  - Voice characteristics
Conversation → MiniCPM4.1-8B
- Powers real-time conversations with the preserved persona.
Voice Response → VoxCPM2
- Converts generated responses into speech.
Multilingual Support → Tiny Aya Fire & Tiny Aya Water
- Supports Bengali, Hindi, Chinese, Japanese, Korean, Thai, and other languages.

📁 Repository Structure

.
├── app.py             # Main Gradio application frontend and API orchestration
├── modal_app.py       # Backend Modal endpoints and AI model inference logic
├── requirements.txt   # Python dependencies
└── README.md          # Project documentation

(Note: The backend model inference code in modal_app.py runs externally on Modal endpoints.)

🔧 Configuration

The application connects to external Modal endpoints. The URLs are hardcoded in app.py. No additional local configuration or API keys are required to run the frontend if the Modal endpoints are active and public.
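
For readers adapting the frontend to their own backend, one possible pattern is a hardcoded default with an optional environment override. The sketch below is not taken from app.py: the placeholder URL, the `MEMORY_BRIDGE_CHAT_URL` variable name, and the JSON payload keys are all hypothetical. It builds a request without sending it.

```python
import json
import os
import urllib.request

# Placeholder only: the real Modal endpoint URLs live in app.py.
DEFAULT_CHAT_URL = "https://example.invalid/chat"


def chat_url() -> str:
    """Prefer an environment override (hypothetical name), else the default."""
    return os.environ.get("MEMORY_BRIDGE_CHAT_URL", DEFAULT_CHAT_URL)


def build_chat_request(
    persona_id: str, message: str, language: str = "English"
) -> urllib.request.Request:
    """Build, but do not send, a JSON POST. The payload keys are assumptions."""
    body = json.dumps(
        {"persona_id": persona_id, "message": message, "language": language}
    ).encode("utf-8")
    return urllib.request.Request(
        chat_url(),
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


if __name__ == "__main__":
    req = build_chat_request("abc123", "Hello")
    print(req.get_method(), req.full_url)
```

An override like this would let a redeployed backend be swapped in without editing app.py, while the default keeps the zero-configuration behaviour described above.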

💻 Usage Instructions

First, visit the live application at https://huggingface.co/spaces/build-small-hackathon/memory-bridge.

Preserve a Memory (Tab 1): Enter the person's name, your relationship, and upload any available texts, photos, or voice notes. Click "Preserve Their Memory" and wait for the Persona ID to be generated.
Talk to Them (Tab 2): Paste the generated Persona ID, select a language, and enable "Voice Response" if desired. Start chatting!
Saved Memories (Tab 3): View a list of previously created personas and their IDs.

☁️ Deployment Instructions

This application consists of two parts: a backend hosted on Modal and a frontend hosted on Hugging Face Spaces.

Backend (Modal)

Install Modal and set up your token: run pip install modal, then modal setup.
Deploy the backend endpoints:
```
modal deploy modal_app.py
```

(Note: Update the endpoint URLs in app.py if your deployed Modal URLs differ.)

Frontend (Hugging Face Spaces)

Create a new Hugging Face Space and select the Gradio SDK.
Push app.py, requirements.txt, and README.md to the Space repository.
The Space will automatically build and launch the Gradio app.

⚠️ Limitations

Cold Starts: The backend Modal endpoints may go to sleep when idle. The first request after an idle period (building a persona or sending the first chat message) might take 1–3 minutes or require a retry.
Backend Dependency: The frontend is entirely reliant on the specific Modal endpoints defined in the code. If these endpoints are taken down, the app will not function.
Emotional Impact: Interacting with a simulated deceased loved one can be emotionally taxing.
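
The cold-start limitation is the kind of failure a client can absorb with retries and growing delays. The following is a minimal sketch, not the behaviour of app.py: the function name, the default delays, and the choice to retry on any `OSError` (which covers timeouts, connection errors, and `urllib.error.URLError`) are assumptions.

```python
import time


def call_with_retry(fn, attempts=5, first_delay=15.0, factor=2.0, sleep=time.sleep):
    """Call fn() until it succeeds, waiting longer after each failure.

    With the defaults, the waits are 15, 30, 60 and 120 seconds, which
    covers the 1-3 minute cold start mentioned above.
    """
    delay = first_delay
    for attempt in range(1, attempts + 1):
        try:
            return fn()
        except OSError:
            if attempt == attempts:
                raise  # out of attempts: surface the last error
            sleep(delay)
            delay *= factor


if __name__ == "__main__":
    state = {"calls": 0}

    def flaky():
        # Stand-in for a request to a sleeping endpoint: fails twice, then works.
        state["calls"] += 1
        if state["calls"] < 3:
            raise TimeoutError("endpoint still waking up")
        return "ok"

    print(call_with_retry(flaky, first_delay=0.1))
```

Passing `sleep` as a parameter keeps the helper testable without real waiting.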

🔮 Future Improvements

Local inference support to remove reliance on external endpoints.
More granular control over voice cloning parameters.
Support for video uploads to extract dynamic mannerisms.
Enhanced long-term memory retrieval in chat.

🙏 Credits

Backend Modal endpoints developed by Sheikh Md Rakib.
Powered by open-source models from Hugging Face, Cohere, OpenBMB, and NVIDIA.
Hosted on Hugging Face Spaces and Modal.