susnato committed
Commit bfedd54 · 1 Parent(s): 352d412

Upload README.md

Files changed (1)
  1. README.md +67 -14
README.md CHANGED
@@ -4,8 +4,6 @@
  {}
  ---

- DISCLAIMER : I don't own the weights of Pop2Piano, this repo was created during the integration of Pop2Piano to HF transformers.
-
  # POP2PIANO

  Pop2Piano, a Transformer network that generates piano covers given waveforms of pop
@@ -14,44 +12,99 @@ music.

  Pop2Piano was proposed in the paper [Pop2Piano : Pop Audio-based Piano Cover Generation](https://arxiv.org/abs/2211.00895) by Jongho Choi and Kyogu Lee.

- Inspired by [T5](https://arxiv.org/abs/1910.10683), Pop2Piano
- is the first model to generate a piano cover directly from pop audio without using melody and
- chord extraction modules.

  ## Model Sources

- - [**Original Repository**](https://github.com/sweetcocoa/pop2piano)
  - [**Paper**](https://arxiv.org/abs/2211.00895)
- - [**Demo**]# TODO (after the ongoing PR is merged)

  # Usage

- First, install the required packages:

  ```
- pip install --upgrade transformers
  ```

  ## Pop music to Piano

- TODO (after the ongoing PR is merged)

  ## Example

- ### Pop Music

  <audio controls>
  <source src="https://datasets-server.huggingface.co/assets/sweetcocoa/pop2piano_ci/--/sweetcocoa--pop2piano_ci/test/0/audio/audio.mp3" type="audio/mpeg">
  Your browser does not support the audio element.
  </audio>

- ### Generated MIDI

- TODO (after the MIDI version is uploaded to the same repo above)

  ## Tips

- TODO

  # Citation
@@ -4,8 +4,6 @@
  {}
  ---

  # POP2PIANO

  Pop2Piano, a Transformer network that generates piano covers given waveforms of pop
@@ -14,44 +12,99 @@ music.

  Pop2Piano was proposed in the paper [Pop2Piano : Pop Audio-based Piano Cover Generation](https://arxiv.org/abs/2211.00895) by Jongho Choi and Kyogu Lee.

+ Piano covers of pop music are widely enjoyed, but generating them from a recording is not a trivial task. It requires
+ considerable piano-playing skill as well as familiarity with a song's characteristics and melodies. With Pop2Piano you
+ can generate a cover directly from a song's audio waveform. It is the first model to generate a piano cover directly
+ from pop audio without melody and chord extraction modules.
+
+ Pop2Piano is an encoder-decoder Transformer model based on [T5](https://arxiv.org/pdf/1910.10683.pdf). The input audio
+ waveform is converted to input features and passed to the encoder, which maps them to a latent representation. The decoder
+ uses this latent representation to generate token ids autoregressively. Each token id corresponds to one of four
+ token types: time, velocity, note and 'special'. The token ids are then decoded to their equivalent MIDI file.
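To make the four token types concrete, here is a toy sketch of how an event-style token stream can be turned into note events. It is not the Pop2Piano tokenizer: the `(kind, value)` tuple format, the `decode_events` name, and the convention that a velocity of 0 ends a note are all assumptions made for illustration.

```python
def decode_events(tokens):
    """Turn (kind, value) tokens into (pitch, start, end, velocity) notes.

    Toy format (assumed, not the real Pop2Piano vocabulary):
      ("time", t)      -> move the clock to step t
      ("velocity", v)  -> set the velocity for the notes that follow; 0 means note-off
      ("note", p)      -> start pitch p (velocity > 0) or end it (velocity == 0)
      ("special", s)   -> control token; "eos" stops decoding
    """
    clock, velocity = 0, 0
    active = {}  # pitch -> (start_time, velocity) for notes that are still sounding
    notes = []
    for kind, value in tokens:
        if kind == "time":
            clock = value
        elif kind == "velocity":
            velocity = value
        elif kind == "note":
            if velocity > 0:
                active[value] = (clock, velocity)
            elif value in active:
                start, vel = active.pop(value)
                notes.append((value, start, clock, vel))
        elif kind == "special" and value == "eos":
            break
    return notes


tokens = [
    ("time", 0), ("velocity", 1), ("note", 60),  # pitch 60 starts at step 0
    ("time", 4), ("velocity", 0), ("note", 60),  # pitch 60 ends at step 4
    ("special", "eos"),
]
print(decode_events(tokens))  # [(60, 0, 4, 1)]
```

In the real model this decoding step is handled by `processor.batch_decode`, which returns `pretty_midi` objects rather than tuples.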
 
  ## Model Sources

  - [**Paper**](https://arxiv.org/abs/2211.00895)
+ - [**Original Repository**](https://github.com/sweetcocoa/pop2piano)
+ - [**Hugging Face Space Demo**](https://huggingface.co/spaces/sweetcocoa/pop2piano)

  # Usage

+ To use Pop2Piano, you will need to install the 🤗 Transformers library, as well as the following third-party modules:

  ```
+ pip install git+https://github.com/huggingface/transformers.git
+ pip install pretty-midi==0.2.9 essentia==2.1b6.dev1034 librosa scipy
  ```
+ Please note that you may need to restart your runtime after installation.

  ## Pop music to Piano

+ ### Code Example
+
+ - Using your own Audio
+
+ ```python
+ >>> import librosa
+ >>> from transformers import Pop2PianoForConditionalGeneration, Pop2PianoProcessor
+
+ >>> audio, sr = librosa.load("<your_audio_file_here>", sr=44100)  # feel free to change the sr to a suitable value.
+ >>> model = Pop2PianoForConditionalGeneration.from_pretrained("sweetcocoa/pop2piano")
+ >>> processor = Pop2PianoProcessor.from_pretrained("sweetcocoa/pop2piano")
+
+ >>> inputs = processor(audio=audio, sampling_rate=sr, return_tensors="pt")
+ >>> model_output = model.generate(input_features=inputs["input_features"], composer="composer1")
+ >>> tokenizer_output = processor.batch_decode(
+ ...     token_ids=model_output, feature_extractor_output=inputs
+ ... )["pretty_midi_objects"][0]
+ >>> tokenizer_output.write("./Outputs/midi_output.mid")
+ ```
+
+ - Audio from Hugging Face Hub
+
+ ```python
+ >>> from datasets import load_dataset
+ >>> from transformers import Pop2PianoForConditionalGeneration, Pop2PianoProcessor
+
+ >>> model = Pop2PianoForConditionalGeneration.from_pretrained("sweetcocoa/pop2piano")
+ >>> processor = Pop2PianoProcessor.from_pretrained("sweetcocoa/pop2piano")
+ >>> ds = load_dataset("sweetcocoa/pop2piano_ci", split="test")
+
+ >>> inputs = processor(
+ ...     audio=ds["audio"][0]["array"], sampling_rate=ds["audio"][0]["sampling_rate"], return_tensors="pt"
+ ... )
+ >>> model_output = model.generate(input_features=inputs["input_features"], composer="composer1")
+ >>> tokenizer_output = processor.batch_decode(
+ ...     token_ids=model_output, feature_extractor_output=inputs
+ ... )["pretty_midi_objects"][0]
+ >>> tokenizer_output.write("./Outputs/midi_output.mid")
+ ```
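Both snippets write to `./Outputs/midi_output.mid`. Python cannot create a file inside a directory that does not exist, so the `Outputs` folder has to be there before `write` is called. A minimal sketch using only the standard library; the helper name `ensure_parent_dir` is hypothetical and not part of Transformers or pretty_midi.

```python
from pathlib import Path


def ensure_parent_dir(path):
    """Create the parent directory of `path` if it is missing and return `path` as a Path."""
    path = Path(path)
    # parents=True creates intermediate folders; exist_ok=True makes repeated calls safe.
    path.parent.mkdir(parents=True, exist_ok=True)
    return path


# Usage with the examples above (left as comments so this block has no side effects):
# midi_path = ensure_parent_dir("./Outputs/midi_output.mid")
# tokenizer_output.write(str(midi_path))
```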
 
  ## Example
+ Here we present an example of generated MIDI.

+ - Actual Pop Music

  <audio controls>
  <source src="https://datasets-server.huggingface.co/assets/sweetcocoa/pop2piano_ci/--/sweetcocoa--pop2piano_ci/test/0/audio/audio.mp3" type="audio/mpeg">
  Your browser does not support the audio element.
  </audio>

+ - Generated MIDI

+ <audio controls>
+ <source src="https://datasets-server.huggingface.co/assets/sweetcocoa/pop2piano_ci/--/sweetcocoa--pop2piano_ci/test/1/audio/audio.mp3" type="audio/mpeg">
+ Your browser does not support the audio element.
+ </audio>

  ## Tips

+ 1. Pop2Piano is an encoder-decoder model like T5.
+ 2. Pop2Piano can be used to generate MIDI files for a given audio sequence.
+ 3. Choosing different composers in `Pop2PianoForConditionalGeneration.generate()` can lead to a variety of different results.
+ 4. Setting the sampling rate to 44.1 kHz when loading the audio file can give good performance.
+ 5. Though Pop2Piano was mainly trained on Korean pop music, it also does pretty well on Western pop and hip-hop songs.
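Tip 3 can be tried by running generation once per composer and comparing the results. A minimal sketch that reuses the `model.generate(...)` call from the code examples; the helper name `generate_per_composer` is hypothetical, and any composer name other than `"composer1"` (for example `"composer2"`) is an assumption about what the checkpoint's configuration provides.

```python
def generate_per_composer(model, inputs, composers):
    """Run generation once per composer name and collect the raw model outputs.

    `model` and `inputs` are expected to be the objects built in the code examples;
    `composers` is a list of composer names chosen by the caller.
    """
    outputs = {}
    for name in composers:
        outputs[name] = model.generate(
            input_features=inputs["input_features"], composer=name
        )
    return outputs


# Usage with the objects from the code examples (composer names beyond "composer1" are assumed):
# results = generate_per_composer(model, inputs, ["composer1", "composer2"])
# Each value can then be passed to processor.batch_decode as shown above.
```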
+

  # Citation