---
title: SYNTHIA
emoji: 🎹
colorFrom: purple
colorTo: blue
sdk: gradio
sdk_version: 5.49.1
app_file: app.py
pinned: false
short_description: Browser-based MIDI keyboard with recording and synthesis
---

# SYNTHIA

Play, record, and let AI continue your musical phrases in real-time. 🎹

## 🚀 Quick Start

```bash
# Install dependencies
uv sync

# Run the app
uv run python app.py
```

Open **http://127.0.0.1:7860**

---

## πŸ—οΈ Architecture Overview

**SYNTHIA** is a browser-based MIDI keyboard with three main layers:

1. **Backend** (Python/Gradio): Configuration, MIDI engines, model loading
2. **Frontend** (JavaScript/Tone.js): Audio synthesis, keyboard rendering, event handling
3. **Communication**: Gradio bridge for sending recorded MIDI to backend for processing

### Data Flow

```
User plays keyboard
    ↓
JavaScript captures MIDI events → records to array
    ↓
User clicks "Play/Process"
    ↓
Backend engine processes recorded events
    ↓
Result returned as MIDI events
    ↓
JavaScript plays result through Tone.js synth
```

---

## 📂 File Responsibilities

### Backend Files

| File | Purpose |
|------|---------|
| **app.py** | Gradio app setup, UI layout, instrument definitions, API endpoints |
| **config.py** | Global settings (audio parameters, model paths, inference defaults) |
| **engines.py** | Three MIDI processing engines: `parrot` (repeat), `reverse_parrot` (reverse), `godzilla_continue` (AI generation) |
| **midi_model.py** | Godzilla model loading, tokenization, inference |
| **midi.py** | MIDI file utilities (encode/decode, cleanup) |

### Frontend Files

| File | Purpose |
|------|---------|
| **keyboard.html** | DOM structure (keyboard grid, controls, terminal) |
| **keyboard.js** | Main application logic: keyboard rendering, audio synthesis (Tone.js), recording, UI event binding, engine communication |
| **styles.css** | Styling and animations |

### Configuration & Dependencies

| File | Purpose |
|------|---------|
| **requirements.txt** | Python dependencies |
| **pyproject.toml** | Project metadata |

---

## 🎹 Core Functionality

### Keyboard Controls
- **Click keys** or **press computer keys** to play notes
- **Record button**: Capture MIDI events from keyboard
- **Play button**: Play back recorded events
- **Save button**: Download recording as .mid file
- **Game mode**: Take turns with AI completing phrases
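
The Save button's `.mid` export hinges on one detail worth knowing when extending it: MIDI files store delta times (ticks since the previous event), while the recorder captures absolute times. A minimal conversion sketch, assuming each event carries an absolute `time` in seconds (the real export lives in `midi.py` and may differ):

```python
def to_delta_ticks(events, ticks_per_beat=480, bpm=120):
    """Convert absolute event times (seconds) to MIDI delta ticks."""
    ticks_per_second = ticks_per_beat * bpm / 60
    previous = 0.0
    out = []
    for event in sorted(events, key=lambda e: e["time"]):
        # Delta = ticks elapsed since the previous event, not since t=0.
        delta = round((event["time"] - previous) * ticks_per_second)
        out.append({**event, "delta": delta})
        previous = event["time"]
    return out
```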

### MIDI Engines
1. **Parrot**: Repeats your exact melody
2. **Reverse Parrot**: Plays melody backward
3. **Godzilla**: AI generates musical continuations using transformer model
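
The two non-AI engines are simple enough to sketch in a few lines each. A minimal version, assuming events are dicts with `note` and `time` fields (the real implementations live in `engines.py` and may differ):

```python
def parrot(events, options=None):
    """Repeat the recorded phrase unchanged."""
    return list(events)

def reverse_parrot(events, options=None):
    """Play the melody backward: reverse note order, keep the timing grid."""
    notes_reversed = [event["note"] for event in reversed(events)]
    return [{**event, "note": note} for event, note in zip(events, notes_reversed)]
```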

### UI Features
- **Engine selector**: Choose processing method
- **Style selector**: AI style (melodic, energetic, ornate, etc.)
- **Response mode**: Control AI generation behavior
- **Runtime selector**: GPU (fast) vs CPU (reliable)
- **Instrument selector**: Change synth sound
- **AI voice selector**: Change AI synth sound
- **Terminal**: Real-time event logging

---

## 🔧 How to Add New Functionality

### Adding a New MIDI Engine

1. **In `engines.py`**, add a new function:
   ```python
   def my_new_engine(events, options):
       # Process MIDI events
       return processed_events
   ```

2. **In `app.py`**, register the engine in `process_events()`:
   ```python
   elif engine == 'my_engine':
       result_events = my_new_engine(events, options)
   ```

3. **In `app.py`**, add to engine dropdown:
   ```python
   with gr.Group(label="Engine"):
       engine = gr.Dropdown(
           choices=['parrot', 'reverse_parrot', 'godzilla_continue', 'my_engine'],
           # ...
       )
   ```

4. **In `keyboard.js`**, add tooltip (line ~215 in `populateEngineSelect()`):
   ```javascript
   const engineTooltips = {
       'my_engine': 'Description of what your engine does'
   };
   ```
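
Putting step 1 into practice, a complete hypothetical engine might look like this. The `semitones` option and the event fields are illustrative assumptions, not part of the existing API; match them to the real event shape used in `engines.py`:

```python
def transpose_engine(events, options=None):
    """Hypothetical engine: shift every note up by a fixed interval."""
    semitones = (options or {}).get("semitones", 12)
    transposed = []
    for event in events:
        shifted = event["note"] + semitones
        # Clamp to the valid MIDI note range (0-127) so playback never fails.
        transposed.append({**event, "note": max(0, min(127, shifted))})
    return transposed
```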

### Adding a New Control Selector

1. **In `app.py`**, create the selector in the UI:
   ```python
   my_control = gr.Dropdown(
       choices=['option1', 'option2'],
       label="My Control",
       value='option1'
   )
   ```

2. **In `keyboard.js`** (line ~1510), add to `selectControls` array:
   ```javascript
   {
       element: myControlSelect,
       getter: () => ({ label: myControlSelect.value }),
       message: (result) => `Control switched to: ${result.label}`
   }
   ```

3. **In `keyboard.js`**, pass control to engine via `processEventsThroughEngine()`:
   ```javascript
   const engineOptions = {
       my_control: document.getElementById('myControl').value,
       // ... other options
   };
   ```

### Adding a New Response Mode

1. **In `keyboard.js`** (line ~175), add preset definition:
   ```javascript
   const RESPONSE_MODES = {
       'my_mode': {
           label: 'My Mode',
           processFunction: (events) => {
               // Processing logic
               return processedEvents;
           }
       }
   };
   ```

2. **In `app.py`**, add to response mode dropdown

3. **Use in engine logic** via `getSelectedResponseMode()`

---

## 🔄 Recent Refactoring (Feb 2026)

Code consolidation to improve maintainability:

- **Consolidated getter functions**: Single `getSelectedPreset()` replaces 3 similar functions
- **Unified event listeners**: Loop-based pattern for select controls (runtime, style, mode, length)
- **Extracted helper functions**: `resetAllNotesAndVisuals()` replaces 3 duplicated blocks
- **Result**: Reduced redundancy, easier to modify preset logic, consistent patterns

---

## ⚡ Benchmarking

`benchmark.py` measures Godzilla model generation speed across all combinations of input length and generation length, with CPU and GPU compared side by side.

### What it tests

| Axis | Values |
|------|--------|
| Input length | Short (8 notes, ~4 s) · Long (90 notes, ~18 s) |
| Generation length | 32 · 64 · 96 · 128 tokens (matches the four UI presets) |
| Devices | CPU always · CUDA if available |

Each combination runs a warm-up pass (model load, timing discarded) followed by `--runs` timed passes. The summary tables report mean, std, min, max in both ms and seconds, plus tokens/sec and GPU speedup.
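
The warm-up-then-measure pattern can be sketched as follows. This is a simplified stand-in for `benchmark.py`'s harness (the function name and return shape here are illustrative):

```python
import statistics
import time

def time_generation(generate, runs=5):
    """Run one discarded warm-up pass, then `runs` timed passes.

    `generate` is a zero-argument callable standing in for a single model
    inference; the real harness also varies input and generation length.
    """
    generate()  # warm-up: model load / first inference, timing discarded
    samples_ms = []
    for _ in range(runs):
        start = time.perf_counter()
        generate()
        samples_ms.append((time.perf_counter() - start) * 1000)
    return {
        "mean_ms": statistics.mean(samples_ms),
        "std_ms": statistics.stdev(samples_ms) if runs > 1 else 0.0,
        "min_ms": min(samples_ms),
        "max_ms": max(samples_ms),
    }
```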

### Usage

```bash
# Full sweep: CPU + GPU (if available), 5 runs per combination
uv run python benchmark.py

# CPU only (useful for verifying the script or on CPU-only machines)
uv run python benchmark.py --cpu-only

# Increase runs for tighter statistics
uv run python benchmark.py --runs 10

# Multi-candidate generation (higher quality, slower)
uv run python benchmark.py --candidates 3
```

Results are printed to stdout and saved to `benchmark_results.txt` (override with `--output`).

### Example output

```
============================================================
  Device: CUDA  |  candidates=1
============================================================
  [warm-up] loading model + first inference...
  input=short (8 notes, ~4s)   gen= 32 tokens  [1:85ms] [2:82ms] ...
  ...

================================================================================
  SUMMARY — CUDA  |  candidates=1
================================================================================
  Input                     Gen tok   Mean ms    Mean s   Std ms   Min ms   Max ms   tok/s
  -----------------------------------------------------------------------------------------
  short (8 notes, ~4s)           32        85      0.09      2.1       82       89   376.5
  short (8 notes, ~4s)          128       290      0.29      4.3      284      297   441.4
  long  (90 notes, ~18s)         32        91      0.09      1.8       88       94   351.6
  long  (90 notes, ~18s)        128       305      0.31      3.9      299      312   419.7
```

---

## πŸ› οΈ Development Tips

### Debugging
- **Terminal in UI**: Shows all MIDI events and engine responses
- **Browser console**: `F12` for JavaScript errors
- **Python terminal**: Check server-side logs for model loading, inference errors

### Testing New Engines
1. Record a simple 3-5 note progression
2. Play back with different engines
3. Check terminal for processing details
4. Verify output notes are in valid range (0-127)
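
Step 4's range check is easy to automate. A small helper you could drop into an engine test (illustrative, not part of the codebase):

```python
def valid_midi_notes(events):
    """True when every event's note number fits the MIDI range 0-127."""
    return all(0 <= event["note"] <= 127 for event in events)
```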

### Performance
- **Recording**: Event capture happens in JavaScript (fast, local)
- **Processing**: May take 2-5 seconds depending on engine and model
- **Playback**: Tone.js synthesis is real-time (instant)

---

## 🔧 Technology Stack

- **Frontend**: Tone.js v6+ (Web Audio API)
- **Backend**: Gradio 5.49.1 + Python 3.10+
- **MIDI**: mido library
- **Model**: Godzilla Piano Transformer (via Hugging Face)

---

## πŸ“ License

Open source - free to use and modify.