Thatguy099 committed on
Commit 2d9747a · verified · 1 Parent(s): 9967888

Upload dataset_downloader.ipynb

Files changed (1)
  1. dataset_downloader.ipynb +89 -0
dataset_downloader.ipynb ADDED
@@ -0,0 +1,89 @@
+ # Dataset Downloader for URMP, MDB-stem-synth, and MIR-1K
+
+ This Google Colab notebook provides commands and instructions for obtaining and extracting the URMP, MDB-stem-synth, and MIR-1K datasets.
+
+ ## Setup
+
+ No additional libraries need to be installed: every download and extraction step below uses shell commands run directly in the notebook.
+
+ ```python
+ # No specific Python libraries are needed for direct downloads via shell commands.
+ # We will use shell commands directly in the notebook.
+ ```
+
+ ## URMP Dataset
+
+ The URMP (University of Rochester Multi-Modal Music Performance) dataset is approximately 12.5 GB.
+
+ **Source:** [https://labsites.rochester.edu/air/projects/URMP.html](https://labsites.rochester.edu/air/projects/URMP.html)
+
+ ```bash
+ # A direct download link for the URMP dataset is currently unavailable or requires special access.
+ # Please visit the official website and follow the download instructions there.
+ # Source: https://labsites.rochester.edu/air/projects/URMP.html
+
+ # Once you have the zip file, extract it with (adjust the file name if yours differs):
+ # unzip URMP_dataset.zip
+
+ # Optional: Remove the zip file after extraction to save space
+ # rm URMP_dataset.zip
+ ```
+
+ ## MDB-stem-synth Dataset
+
+ The MDB-stem-synth dataset is approximately 1.8 GB.
+
+ **Source:** [https://zenodo.org/records/1481172](https://zenodo.org/records/1481172)
+
+ ```bash
+ # Download the MDB-stem-synth dataset (-c resumes an interrupted download)
+ wget -c https://zenodo.org/record/1481172/files/MDB-stem-synth.tar.gz
+
+ # Extract the dataset
+ tar -xzf MDB-stem-synth.tar.gz
+
+ # Optional: Remove the tar.gz file after extraction
+ # rm MDB-stem-synth.tar.gz
+ ```
+
+ ## MIR-1K Dataset
+
+ The MIR-1K dataset is designed for singing voice separation.
+
+ **Source:** [http://mirlab.org/dataset/public/MIR-1K.rar](http://mirlab.org/dataset/public/MIR-1K.rar)
+
+ ```bash
+ # Download the MIR-1K dataset
+ # The MIR-1K dataset on Zenodo requires login for download. Please visit the link below and download it manually after logging in.
+ # Source: https://zenodo.org/records/3532216
+
+ # Alternatively, if the direct link listed under Source above is reachable, you can try:
+ # wget -c http://mirlab.org/dataset/public/MIR-1K.rar
+
+ # Install unrar to extract .rar files
+ apt-get update
+ apt-get install -y unrar
+
+ # Extract the dataset (MIR-1K.rar must already be in the current directory)
+ unrar x MIR-1K.rar
+
+ # Optional: Remove the rar file after extraction
+ # rm MIR-1K.rar
+ ```
+
+ ## Verification
+
+ After running the cells above, and after extracting any archives you had to download manually, the datasets should be in your Colab environment's file system. You can confirm this with the `ls` command.
+
+ ```bash
+ # List contents of the current directory to see downloaded datasets
+ ls -F
+
+ # List contents of the URMP dataset directory (adjust path if needed)
+ ls -F URMP_dataset/
+
+ # List contents of the MDB-stem-synth dataset directory (adjust path if needed)
+ ls -F MDB-stem-synth/
+
+ # List contents of the MIR-1K dataset directory (adjust path if needed)
+ ls -F MIR-1K/
+ ```
+
+
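The Verification step in the notebook above can also be run from a Python cell. The following is a minimal sketch, assuming the datasets extract to the directory names used in the `ls` commands (`URMP_dataset`, `MDB-stem-synth`, `MIR-1K`), which the notebook itself notes may need adjusting. The names `check_datasets` and `EXPECTED_DIRS` are hypothetical and not part of the uploaded notebook.

```python
from pathlib import Path

# Directory names copied from the Verification cell. They are assumptions:
# the archives may extract to differently named folders.
EXPECTED_DIRS = ("URMP_dataset", "MDB-stem-synth", "MIR-1K")


def check_datasets(root=".", names=EXPECTED_DIRS):
    """Return {name: bool}: True if the directory exists under root and is non-empty."""
    base = Path(root)
    status = {}
    for name in names:
        path = base / name
        # An empty directory usually means extraction failed or never ran.
        status[name] = path.is_dir() and any(path.iterdir())
    return status


if __name__ == "__main__":
    for name, ok in check_datasets().items():
        print(f"{name}: {'found' if ok else 'missing or empty'}")
```

Unlike the `ls` commands, this reports every dataset in one pass instead of printing an error for each missing directory, which may be convenient when some archives still have to be downloaded manually.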