memory-keeper / README.md
likki1715's picture
Update README.md
cc20c37 verified
|
Raw
History Blame Contribute Delete
3 kB
metadata
title: Memory Keeper
emoji: πŸƒ
colorFrom: indigo
colorTo: pink
sdk: docker
pinned: false
license: mit
short_description: Turn voice and photos into memory books
tags:
  - build-small-hackathon
  - thousand-token-wood
  - modal
  - custom-ui

Memory Keeper 🌲 (Track: Thousand Token Wood)

πŸ’» GitHub Repository | πŸŽ₯ Watch the Demo Video | πŸ“ LinkedIn Post

πŸ“– The Story: Why I Built Memory Keeper

We all have those little momentsβ€”a beautiful sunset, a fleeting thought we record as a voice note, or a random photo that captures a specific feeling. But more often than not, these memories get lost in the endless scroll of our camera rolls or the unorganized abyss of our voice memos.

I built Memory Keeper for the Hugging Face "Build Small" Hackathon because I wanted a whimsical, personal digital archive that actually understands these fragments. I wanted a tool that could take my raw audio notes and spontaneous photos, and weave them together into beautifully structured storybooks and letters to my future self.

I chose the Thousand Token Wood track because this project isn't just about utility; it’s about creating something deeply personal, experimental, and delightful.

✨ The Magic: How It Works

Memory Keeper is an entirely open-weight, multi-modal AI pipeline that acts as your personal archivist:

  1. Upload: You upload a photo, record a voice note, or simply type a thought into the custom glassmorphic UI.
  2. Perception: The backend immediately spins up specialized "small models" to perceive the inputs. It runs openai/whisper-base to transcribe the audio and Salesforce/blip-image-captioning-base to generate rich semantic descriptions of the photos.
  3. Synthesis: A central orchestrator LLM (Qwen/Qwen2.5-7B-Instruct) takes all these pieces, looks at your history, and writes a narrative timeline, a structured story, and a personal letter summarizing the memory.

πŸ—οΈ Architecture & Deployment

To keep the application incredibly lightweight while maintaining a premium feel, the system uses a decoupled frontend-backend architecture:

  • Frontend (Hugging Face Spaces): A completely custom HTML/CSS UI built on top of gradio.Server. This bypasses the standard Gradio blocks to deliver a stunning visual experience while strictly adhering to the hackathon's Gradio requirement.
  • Compute Engine (Modal): Heavy AI perception tasks are offloaded to A10G GPUs via Modal. These endpoints are entirely serverless, meaning they scale to zero when not in use, keeping the memory footprint minimal.

Local Development

To run the frontend locally for testing:

pip install -r requirements.txt
python app.py