--- title: SYNTHIA emoji: ๐ŸŽน colorFrom: purple colorTo: blue sdk: gradio sdk_version: 5.49.1 app_file: app.py pinned: false short_description: Browser-based MIDI keyboard with recording and synthesis --- # SYNTHIA Play, record, and let AI continue your musical phrases in real-time. ๐ŸŽน ## ๏ฟฝ Quick Start ```bash # Install dependencies uv sync # Run the app uv run python app.py ``` Open **http://127.0.0.1:7860** --- ## ๐Ÿ—๏ธ Architecture Overview **SYNTHIA** is a browser-based MIDI keyboard with three main layers: 1. **Backend** (Python/Gradio): Configuration, MIDI engines, model loading 2. **Frontend** (JavaScript/Tone.js): Audio synthesis, keyboard rendering, event handling 3. **Communication**: Gradio bridge for sending recorded MIDI to backend for processing ### Data Flow ``` User plays keyboard โ†“ JavaScript captures MIDI events โ†’ records to array โ†“ User clicks "Play/Process" โ†“ Backend engine processes recorded events โ†“ Result returned as MIDI events โ†“ JavaScript plays result through Tone.js synth ``` --- ## ๐Ÿ“‚ File Responsibilities ### Backend Files | File | Purpose | |------|---------| | **app.py** | Gradio app setup, UI layout, instrument definitions, API endpoints | | **config.py** | Global settings (audio parameters, model paths, inference defaults) | | **engines.py** | Three MIDI processing engines: `parrot` (repeat), `reverse_parrot` (reverse), `godzilla_continue` (AI generation) | | **midi_model.py** | Godzilla model loading, tokenization, inference | | **midi.py** | MIDI file utilities (encode/decode, cleanup, utilities) | ### Frontend Files | File | Purpose | |------|---------| | **keyboard.html** | DOM structure (keyboard grid, controls, terminal) | | **keyboard.js** | Main application logic: keyboard rendering, audio synthesis (Tone.js), recording, UI event binding, engine communication | | **styles.css** | Styling and animations | ### Configuration & Dependencies | File | Purpose | |------|---------| | **requirements.txt** | Python dependencies | | **pyproject.toml** | Project metadata | --- ## ๐ŸŽน Core Functionality ### Keyboard Controls - **Click keys** or **press computer keys** to play notes - **Record button**: Capture MIDI events from keyboard - **Play button**: Play back recorded events - **Save button**: Download recording as .mid file - **Game mode**: Take turns with AI completing phrases ### MIDI Engines 1. **Parrot**: Repeats your exact melody 2. **Reverse Parrot**: Plays melody backward 3. **Godzilla**: AI generates musical continuations using transformer model ### UI Features - **Engine selector**: Choose processing method - **Style selector**: AI style (melodic, energetic, ornate, etc.) - **Response mode**: Control AI generation behavior - **Runtime selector**: GPU (fast) vs CPU (reliable) - **Instrument selector**: Change synth sound - **AI voice selector**: Change AI synth sound - **Terminal**: Real-time event logging --- ## ๐Ÿ”ง How to Add New Functionality ### Adding a New MIDI Engine 1. **In `engines.py`**, add a new function: ```python def my_new_engine(events, options): # Process MIDI events return processed_events ``` 2. **In `app.py`**, register the engine in `process_events()`: ```python elif engine == 'my_engine': result_events = my_new_engine(events, options) ``` 3. **In `app.py`**, add to engine dropdown: ```python with gr.Group(label="Engine"): engine = gr.Dropdown( choices=['parrot', 'reverse_parrot', 'godzilla_continue', 'my_engine'], # ... ) ``` 4. **In `keyboard.js`**, add tooltip (line ~215 in `populateEngineSelect()`): ```javascript const engineTooltips = { 'my_engine': 'Description of what your engine does' }; ``` ### Adding a New Control Selector 1. **In `app.py`**, create the selector in the UI: ```python my_control = gr.Dropdown( choices=['option1', 'option2'], label="My Control", value='option1' ) ``` 2. **In `keyboard.js`** (line ~1510), add to `selectControls` array: ```javascript { element: myControlSelect, getter: () => ({ label: myControlSelect.value }), message: (result) => `Control switched to: ${result.label}` } ``` 3. **In `keyboard.js`**, pass control to engine via `processEventsThroughEngine()`: ```javascript const engineOptions = { my_control: document.getElementById('myControl').value, // ... other options }; ``` ### Adding a New Response Mode 1. **In `keyboard.js`** (line ~175), add preset definition: ```javascript const RESPONSE_MODES = { 'my_mode': { label: 'My Mode', processFunction: (events) => { // Processing logic return processedEvents; } } }; ``` 2. **In `app.py`**, add to response mode dropdown 3. **Use in engine logic** via `getSelectedResponseMode()` --- ## ๐Ÿ”„ Recent Refactoring (Feb 2026) Code consolidation to improve maintainability: - **Consolidated getter functions**: Single `getSelectedPreset()` replaces 3 similar functions - **Unified event listeners**: Loop-based pattern for select controls (runtime, style, mode, length) - **Extracted helper functions**: `resetAllNotesAndVisuals()` replaces 3 duplicated blocks - **Result**: Reduced redundancy, easier to modify preset logic, consistent patterns --- ## โšก Benchmarking `benchmark.py` measures Godzilla model generation speed across all combinations of input length and generation length, with CPU and GPU compared side by side. ### What it tests | Axis | Values | |------|--------| | Input length | Short (8 notes, ~4 s) ยท Long (90 notes, ~18 s) | | Generation length | 32 ยท 64 ยท 96 ยท 128 tokens (matches the four UI presets) | | Devices | CPU always ยท CUDA if available | Each combination runs a warm-up pass (model load, timing discarded) followed by `--runs` timed passes. The summary tables report mean, std, min, max in both ms and seconds, plus tokens/sec and GPU speedup. ### Usage ```bash # Full sweep โ€” CPU + GPU (if available), 5 runs per combination uv run python benchmark.py # CPU only (useful for verifying the script or on CPU-only machines) uv run python benchmark.py --cpu-only # Increase runs for tighter statistics uv run python benchmark.py --runs 10 # Multi-candidate generation (higher quality, slower) uv run python benchmark.py --candidates 3 ``` Results are printed to stdout and saved to `benchmark_results.txt` (override with `--output`). ### Example output ``` ============================================================ Device: CUDA | candidates=1 ============================================================ [warm-up] loading model + first inference... input=short (8 notes, ~4s) gen= 32 tokens [1:85ms] [2:82ms] ... ... ================================================================================ SUMMARY โ€” CUDA | candidates=1 ================================================================================ Input Gen tok Mean ms Mean s Std ms Min ms Max ms tok/s ----------------------------------------------------------------------------------------- short (8 notes, ~4s) 32 85 0.09 2.1 82 89 376.5 short (8 notes, ~4s) 128 290 0.29 4.3 284 297 441.4 long (90 notes, ~18s) 32 91 0.09 1.8 88 94 351.6 long (90 notes, ~18s) 128 305 0.31 3.9 299 312 419.7 ``` --- ## ๐Ÿ› ๏ธ Development Tips ### Debugging - **Terminal in UI**: Shows all MIDI events and engine responses - **Browser console**: `F12` for JavaScript errors - **Python terminal**: Check server-side logs for model loading, inference errors ### Testing New Engines 1. Record a simple 3-5 note progression 2. Play back with different engines 3. Check terminal for processing details 4. Verify output notes are in valid range (0-127) ### Performance - **Recording**: Event capture happens in JavaScript (fast, local) - **Processing**: May take 2-5 seconds depending on engine and model - **Playback**: Tone.js synthesis is real-time (instant) --- ## ๐Ÿ”ง Technology Stack - **Frontend**: Tone.js v6+ (Web Audio API) - **Backend**: Gradio 5.49.1 + Python 3.10+ - **MIDI**: mido library - **Model**: Godzilla Piano Transformer (via Hugging Face) --- ## ๐Ÿ“ License Open source - free to use and modify.