
Rescored Documentation

Project Vision

Rescored is an AI-powered music transcription and notation editor that converts YouTube videos into editable sheet music. Musicians can paste a URL, get a transcribed score, correct it in the built-in editor, and export it in standard formats.

What This Documentation Covers

This documentation serves as the technical blueprint for implementing Rescored. It focuses on:

  • Architecture & Design: System structure, technology choices, and rationale
  • Backend Processing: Audio extraction, ML transcription pipeline, API design
  • Frontend Interface: Sheet music rendering, interactive editing, playback
  • Integration: How components communicate and data flows through the system
  • Implementation Roadmap: MVP scope and future feature phases

Reading Guide

For Full-Stack Developers (Start Here)

  1. Getting Started - Overview and context
  2. Architecture Overview - How everything fits together
  3. MVP Scope - What to build first
  4. Deep dive into Backend and Frontend sections

For Backend Engineers

  1. Architecture Overview
  2. Technology Stack - Backend choices
  3. Audio Processing Pipeline - Core workflow
  4. API Design
  5. Background Workers
  6. Testing Guide - Writing and running tests

For Frontend Engineers

  1. Architecture Overview
  2. Technology Stack - Frontend choices
  3. Notation Rendering
  4. Interactive Editor
  5. Playback System
  6. Data Flow

For Product/Design

  1. MVP Scope
  2. Architecture Overview - User flow
  3. Known Challenges - Limitations to design around

Documentation Structure

Architecture

Backend

Frontend

Integration

Features

Research

Reference

Key Principles

  1. MVP First: Focus on single-instrument (piano) transcription before multi-instrument
  2. Quality Over Speed: Prioritize transcription accuracy over processing time
  3. Editable Output: Transcription won't be perfect, so the editor is critical for fixing errors
  4. Standard Formats: Use MusicXML/MIDI for maximum compatibility
  5. Async Everything: Audio processing is slow, so use queues and WebSocket updates
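
The "Async Everything" principle can be illustrated with a minimal sketch. It uses only the Python standard library: an asyncio.Queue stands in for the Celery/Redis job queue, and a plain list of progress messages stands in for WebSocket updates. The Job class, the stage names, and submit_and_wait are all hypothetical and are not taken from the Rescored codebase.

```python
import asyncio
from dataclasses import dataclass, field


@dataclass
class Job:
    """A hypothetical transcription job; field names are illustrative."""
    job_id: str
    url: str
    # Stands in for messages that would be pushed over a WebSocket.
    updates: list = field(default_factory=list)


# Hypothetical stage names; the real pipeline may split work differently.
STAGES = ["download", "separate", "transcribe", "render"]


async def worker(queue: asyncio.Queue) -> None:
    """Pull jobs off the queue and record a progress update after each stage."""
    while True:
        job = await queue.get()
        for i, stage in enumerate(STAGES, start=1):
            await asyncio.sleep(0)  # placeholder for slow audio/ML work
            job.updates.append({
                "job_id": job.job_id,
                "stage": stage,
                "progress": round(100 * i / len(STAGES)),
            })
        queue.task_done()


async def submit_and_wait(url: str) -> Job:
    """Enqueue one job and return it once the worker has finished it."""
    queue: asyncio.Queue = asyncio.Queue()
    task = asyncio.create_task(worker(queue))
    job = Job(job_id="job-1", url=url)
    await queue.put(job)   # the submitter returns immediately in a real API
    await queue.join()     # here we wait only so the sketch can show the result
    task.cancel()
    return job
```

Calling asyncio.run(submit_and_wait("https://example.com/video")) returns a job whose updates list holds one message per stage. In a real deployment the request handler would return the job ID right after enqueueing, and the client would receive the same messages over a WebSocket instead of waiting.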

Quick Reference

Primary Use Case: YouTube URL → Transcribed piano sheet music → Edit → Export

Tech Stack Summary:

  • Frontend: React + VexFlow + Tone.js
  • Backend: Python/FastAPI + Celery + Redis
  • ML: Demucs (source separation) + basic-pitch (transcription)
  • Formats: MusicXML (primary), MIDI (export)
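
As a rough picture of how the ML and format entries above might chain together, the sketch below models the backend as an ordered list of stages, each passing its output to the next. The Stage class, the stub functions, and the artifact descriptions are assumptions made for illustration. No real Demucs or basic-pitch calls are made, and the actual stage boundaries may differ.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass(frozen=True)
class Stage:
    """One step of a hypothetical processing pipeline."""
    name: str
    component: str             # tool or format named in the stack summary
    run: Callable[[str], str]  # takes the previous artifact, returns the next


def _stub(output: str) -> Callable[[str], str]:
    """Build a placeholder step that only records what it would produce."""
    def run(artifact: str) -> str:
        return f"{artifact} -> {output}"
    return run


# Stage order follows the stack summary; the artifact names are assumptions.
PIPELINE: List[Stage] = [
    Stage("separate", "Demucs", _stub("separated stem")),
    Stage("transcribe", "basic-pitch", _stub("note events")),
    Stage("notate", "MusicXML", _stub("MusicXML score")),
]


def run_pipeline(source: str, stages: List[Stage] = PIPELINE) -> str:
    """Feed each stage's output into the next and return the final artifact."""
    artifact = source
    for stage in stages:
        artifact = stage.run(artifact)
    return artifact
```

Keeping the stages in a list means a step can be swapped or inserted without touching run_pipeline, which may help when comparing separation or transcription models.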

MVP Timeline: Focused on getting piano transcription working end-to-end with basic editing

Contributing to Documentation

As implementation progresses, update these docs with:

  • Actual code examples and API samples
  • Performance benchmarks and metrics
  • Lessons learned and gotchas
  • Configuration details and environment setup

Need Help?