Heinrich Dinkel committed on
Commit
6c0b1e0
·
1 Parent(s): f8e3b40
Files changed (2)
  1. README.md +5 -5
  2. notebook.ipynb +180 -15
README.md CHANGED
@@ -10,14 +10,14 @@ license: apache-2.0
10
  # DashengTokenizer
11
 
12
  DashengTokenizer is a high-performance continuous audio tokenizer designed for audio understanding and generation tasks.
13
- Compared to previous works, our framework simply trains a single linear layer to enable audio generation for semantically strong encoders.
14
 
15
  Achievements:
16
 
17
- * State-of-the-Art Audio Understanding: DashengTokenizer consistently outperforms most previous self-supervised and supervised audio encoders.
18
- * High-Fidelity Signal Reconstruction: Maintains exceptional signal integrity, ensuring that audio remains crisp and accurate after processing.
19
- * Accelerated Audio Generation Training: Achieves optimal performance significantly faster than standard VAE models, reducing training time and costs.
20
- * Superior Speech Enhancement: Provides a more robust encoding foundation for isolating and clarifying speech in noisy environments.
21
 
22
 
23
  ![Framework](./figures/framework.png)
 
10
  # DashengTokenizer
11
 
12
  DashengTokenizer is a high-performance continuous audio tokenizer designed for audio understanding and generation tasks.
13
+ Compared to previous works, our framework trains a **single linear layer** to enable audio generation for semantically strong encoders.
14
 
15
  Achievements:
16
 
17
+ * **State-of-the-Art** Audio Understanding: DashengTokenizer consistently outperforms most previous self-supervised and supervised audio encoders.
18
+ * **High-Fidelity** Signal Reconstruction: Maintains exceptional signal integrity, ensuring that audio remains crisp and accurate after processing.
19
+ * Accelerated **Audio Generation** Training: Achieves optimal performance significantly faster than standard VAE models, reducing training time and costs.
20
+ * Superior **Speech Enhancement**: Provides a more robust encoding foundation for isolating and clarifying speech in noisy environments.
21
 
22
 
23
  ![Framework](./figures/framework.png)
notebook.ipynb CHANGED
@@ -2,18 +2,115 @@
2
  "cells": [
3
  {
4
  "cell_type": "code",
5
- "execution_count": null,
6
  "metadata": {},
7
- "outputs": [],
8
  "source": [
9
  "!pip install transformers torch torchaudio librosa pandas scikit-learn tqdm"
10
  ]
11
  },
12
  {
13
  "cell_type": "code",
14
- "execution_count": null,
15
  "metadata": {},
16
- "outputs": [],
17
  "source": [
18
  "import torch\n",
19
  "import torch.nn as nn\n",
@@ -30,7 +127,7 @@
30
  },
31
  {
32
  "cell_type": "code",
33
- "execution_count": null,
34
  "metadata": {},
35
  "outputs": [],
36
  "source": [
@@ -60,7 +157,7 @@
60
  },
61
  {
62
  "cell_type": "code",
63
- "execution_count": null,
64
  "metadata": {},
65
  "outputs": [],
66
  "source": [
@@ -82,7 +179,7 @@
82
  },
83
  {
84
  "cell_type": "code",
85
- "execution_count": null,
86
  "metadata": {},
87
  "outputs": [],
88
  "source": [
@@ -110,9 +207,61 @@
110
  },
111
  {
112
  "cell_type": "code",
113
- "execution_count": null,
114
  "metadata": {},
115
- "outputs": [],
116
  "source": [
117
  "# Download dataset\n",
118
  "download_esc50()\n",
@@ -140,9 +289,17 @@
140
  },
141
  {
142
  "cell_type": "code",
143
- "execution_count": null,
144
  "metadata": {},
145
- "outputs": [],
146
  "source": [
147
  "# Create datasets\n",
148
  "audio_dir = 'ESC-50/audio'\n",
@@ -165,7 +322,15 @@
165
  "cell_type": "code",
166
  "execution_count": null,
167
  "metadata": {},
168
- "outputs": [],
169
  "source": [
170
  "# Training setup\n",
171
  "optimizer = torch.optim.Adam(classifier.parameters(), lr=1e-3)\n",
@@ -260,7 +425,7 @@
260
  ],
261
  "metadata": {
262
  "kernelspec": {
263
- "display_name": "Python 3",
264
  "language": "python",
265
  "name": "python3"
266
  },
@@ -274,9 +439,9 @@
274
  "name": "python",
275
  "nbconvert_exporter": "python",
276
  "pygments_lexer": "ipython3",
277
- "version": "3.8.0"
278
  }
279
  },
280
  "nbformat": 4,
281
  "nbformat_minor": 4
282
- }
 
2
  "cells": [
3
  {
4
  "cell_type": "code",
5
+ "execution_count": 1,
6
  "metadata": {},
7
+ "outputs": [
8
+ {
9
+ "name": "stdout",
10
+ "output_type": "stream",
11
+ "text": [
12
+ "Requirement already satisfied: transformers in ./.venv/lib/python3.11/site-packages (5.1.0)\n",
13
+ "Requirement already satisfied: torch in ./.venv/lib/python3.11/site-packages (2.10.0)\n",
14
+ "Requirement already satisfied: torchaudio in ./.venv/lib/python3.11/site-packages (2.10.0)\n",
15
+ "Requirement already satisfied: librosa in ./.venv/lib/python3.11/site-packages (0.11.0)\n",
16
+ "Collecting pandas\n",
17
+ " Downloading pandas-3.0.0-cp311-cp311-manylinux_2_24_x86_64.manylinux_2_28_x86_64.whl.metadata (79 kB)\n",
18
+ "Requirement already satisfied: scikit-learn in ./.venv/lib/python3.11/site-packages (1.8.0)\n",
19
+ "Requirement already satisfied: tqdm in ./.venv/lib/python3.11/site-packages (4.67.3)\n",
20
+ "Requirement already satisfied: huggingface-hub<2.0,>=1.3.0 in ./.venv/lib/python3.11/site-packages (from transformers) (1.4.1)\n",
21
+ "Requirement already satisfied: numpy>=1.17 in ./.venv/lib/python3.11/site-packages (from transformers) (2.3.5)\n",
22
+ "Requirement already satisfied: packaging>=20.0 in ./.venv/lib/python3.11/site-packages (from transformers) (26.0)\n",
23
+ "Requirement already satisfied: pyyaml>=5.1 in ./.venv/lib/python3.11/site-packages (from transformers) (6.0.3)\n",
24
+ "Requirement already satisfied: regex!=2019.12.17 in ./.venv/lib/python3.11/site-packages (from transformers) (2026.1.15)\n",
25
+ "Requirement already satisfied: tokenizers<=0.23.0,>=0.22.0 in ./.venv/lib/python3.11/site-packages (from transformers) (0.22.2)\n",
26
+ "Requirement already satisfied: typer-slim in ./.venv/lib/python3.11/site-packages (from transformers) (0.23.0)\n",
27
+ "Requirement already satisfied: safetensors>=0.4.3 in ./.venv/lib/python3.11/site-packages (from transformers) (0.7.0)\n",
28
+ "Requirement already satisfied: filelock in ./.venv/lib/python3.11/site-packages (from huggingface-hub<2.0,>=1.3.0->transformers) (3.21.2)\n",
29
+ "Requirement already satisfied: fsspec>=2023.5.0 in ./.venv/lib/python3.11/site-packages (from huggingface-hub<2.0,>=1.3.0->transformers) (2026.2.0)\n",
30
+ "Requirement already satisfied: hf-xet<2.0.0,>=1.2.0 in ./.venv/lib/python3.11/site-packages (from huggingface-hub<2.0,>=1.3.0->transformers) (1.2.0)\n",
31
+ "Requirement already satisfied: httpx<1,>=0.23.0 in ./.venv/lib/python3.11/site-packages (from huggingface-hub<2.0,>=1.3.0->transformers) (0.28.1)\n",
32
+ "Requirement already satisfied: shellingham in ./.venv/lib/python3.11/site-packages (from huggingface-hub<2.0,>=1.3.0->transformers) (1.5.4)\n",
33
+ "Requirement already satisfied: typing-extensions>=4.1.0 in ./.venv/lib/python3.11/site-packages (from huggingface-hub<2.0,>=1.3.0->transformers) (4.15.0)\n",
34
+ "Requirement already satisfied: anyio in ./.venv/lib/python3.11/site-packages (from httpx<1,>=0.23.0->huggingface-hub<2.0,>=1.3.0->transformers) (4.12.1)\n",
35
+ "Requirement already satisfied: certifi in ./.venv/lib/python3.11/site-packages (from httpx<1,>=0.23.0->huggingface-hub<2.0,>=1.3.0->transformers) (2026.1.4)\n",
36
+ "Requirement already satisfied: httpcore==1.* in ./.venv/lib/python3.11/site-packages (from httpx<1,>=0.23.0->huggingface-hub<2.0,>=1.3.0->transformers) (1.0.9)\n",
37
+ "Requirement already satisfied: idna in ./.venv/lib/python3.11/site-packages (from httpx<1,>=0.23.0->huggingface-hub<2.0,>=1.3.0->transformers) (3.11)\n",
38
+ "Requirement already satisfied: h11>=0.16 in ./.venv/lib/python3.11/site-packages (from httpcore==1.*->httpx<1,>=0.23.0->huggingface-hub<2.0,>=1.3.0->transformers) (0.16.0)\n",
39
+ "Requirement already satisfied: sympy>=1.13.3 in ./.venv/lib/python3.11/site-packages (from torch) (1.14.0)\n",
40
+ "Requirement already satisfied: networkx>=2.5.1 in ./.venv/lib/python3.11/site-packages (from torch) (3.6.1)\n",
41
+ "Requirement already satisfied: jinja2 in ./.venv/lib/python3.11/site-packages (from torch) (3.1.6)\n",
42
+ "Requirement already satisfied: cuda-bindings==12.9.4 in ./.venv/lib/python3.11/site-packages (from torch) (12.9.4)\n",
43
+ "Requirement already satisfied: nvidia-cuda-nvrtc-cu12==12.8.93 in ./.venv/lib/python3.11/site-packages (from torch) (12.8.93)\n",
44
+ "Requirement already satisfied: nvidia-cuda-runtime-cu12==12.8.90 in ./.venv/lib/python3.11/site-packages (from torch) (12.8.90)\n",
45
+ "Requirement already satisfied: nvidia-cuda-cupti-cu12==12.8.90 in ./.venv/lib/python3.11/site-packages (from torch) (12.8.90)\n",
46
+ "Requirement already satisfied: nvidia-cudnn-cu12==9.10.2.21 in ./.venv/lib/python3.11/site-packages (from torch) (9.10.2.21)\n",
47
+ "Requirement already satisfied: nvidia-cublas-cu12==12.8.4.1 in ./.venv/lib/python3.11/site-packages (from torch) (12.8.4.1)\n",
48
+ "Requirement already satisfied: nvidia-cufft-cu12==11.3.3.83 in ./.venv/lib/python3.11/site-packages (from torch) (11.3.3.83)\n",
49
+ "Requirement already satisfied: nvidia-curand-cu12==10.3.9.90 in ./.venv/lib/python3.11/site-packages (from torch) (10.3.9.90)\n",
50
+ "Requirement already satisfied: nvidia-cusolver-cu12==11.7.3.90 in ./.venv/lib/python3.11/site-packages (from torch) (11.7.3.90)\n",
51
+ "Requirement already satisfied: nvidia-cusparse-cu12==12.5.8.93 in ./.venv/lib/python3.11/site-packages (from torch) (12.5.8.93)\n",
52
+ "Requirement already satisfied: nvidia-cusparselt-cu12==0.7.1 in ./.venv/lib/python3.11/site-packages (from torch) (0.7.1)\n",
53
+ "Requirement already satisfied: nvidia-nccl-cu12==2.27.5 in ./.venv/lib/python3.11/site-packages (from torch) (2.27.5)\n",
54
+ "Requirement already satisfied: nvidia-nvshmem-cu12==3.4.5 in ./.venv/lib/python3.11/site-packages (from torch) (3.4.5)\n",
55
+ "Requirement already satisfied: nvidia-nvtx-cu12==12.8.90 in ./.venv/lib/python3.11/site-packages (from torch) (12.8.90)\n",
56
+ "Requirement already satisfied: nvidia-nvjitlink-cu12==12.8.93 in ./.venv/lib/python3.11/site-packages (from torch) (12.8.93)\n",
57
+ "Requirement already satisfied: nvidia-cufile-cu12==1.13.1.3 in ./.venv/lib/python3.11/site-packages (from torch) (1.13.1.3)\n",
58
+ "Requirement already satisfied: triton==3.6.0 in ./.venv/lib/python3.11/site-packages (from torch) (3.6.0)\n",
59
+ "Requirement already satisfied: cuda-pathfinder~=1.1 in ./.venv/lib/python3.11/site-packages (from cuda-bindings==12.9.4->torch) (1.3.4)\n",
60
+ "Requirement already satisfied: audioread>=2.1.9 in ./.venv/lib/python3.11/site-packages (from librosa) (3.1.0)\n",
61
+ "Requirement already satisfied: numba>=0.51.0 in ./.venv/lib/python3.11/site-packages (from librosa) (0.63.1)\n",
62
+ "Requirement already satisfied: scipy>=1.6.0 in ./.venv/lib/python3.11/site-packages (from librosa) (1.17.0)\n",
63
+ "Requirement already satisfied: joblib>=1.0 in ./.venv/lib/python3.11/site-packages (from librosa) (1.5.3)\n",
64
+ "Requirement already satisfied: decorator>=4.3.0 in ./.venv/lib/python3.11/site-packages (from librosa) (5.2.1)\n",
65
+ "Requirement already satisfied: soundfile>=0.12.1 in ./.venv/lib/python3.11/site-packages (from librosa) (0.13.1)\n",
66
+ "Requirement already satisfied: pooch>=1.1 in ./.venv/lib/python3.11/site-packages (from librosa) (1.9.0)\n",
67
+ "Requirement already satisfied: soxr>=0.3.2 in ./.venv/lib/python3.11/site-packages (from librosa) (1.0.0)\n",
68
+ "Requirement already satisfied: lazy_loader>=0.1 in ./.venv/lib/python3.11/site-packages (from librosa) (0.4)\n",
69
+ "Requirement already satisfied: msgpack>=1.0 in ./.venv/lib/python3.11/site-packages (from librosa) (1.1.2)\n",
70
+ "Requirement already satisfied: python-dateutil>=2.8.2 in ./.venv/lib/python3.11/site-packages (from pandas) (2.9.0.post0)\n",
71
+ "Requirement already satisfied: threadpoolctl>=3.2.0 in ./.venv/lib/python3.11/site-packages (from scikit-learn) (3.6.0)\n",
72
+ "Requirement already satisfied: llvmlite<0.47,>=0.46.0dev0 in ./.venv/lib/python3.11/site-packages (from numba>=0.51.0->librosa) (0.46.0)\n",
73
+ "Requirement already satisfied: platformdirs>=2.5.0 in ./.venv/lib/python3.11/site-packages (from pooch>=1.1->librosa) (4.7.0)\n",
74
+ "Requirement already satisfied: requests>=2.19.0 in ./.venv/lib/python3.11/site-packages (from pooch>=1.1->librosa) (2.32.5)\n",
75
+ "Requirement already satisfied: six>=1.5 in ./.venv/lib/python3.11/site-packages (from python-dateutil>=2.8.2->pandas) (1.17.0)\n",
76
+ "Requirement already satisfied: charset_normalizer<4,>=2 in ./.venv/lib/python3.11/site-packages (from requests>=2.19.0->pooch>=1.1->librosa) (3.4.4)\n",
77
+ "Requirement already satisfied: urllib3<3,>=1.21.1 in ./.venv/lib/python3.11/site-packages (from requests>=2.19.0->pooch>=1.1->librosa) (2.6.3)\n",
78
+ "Requirement already satisfied: cffi>=1.0 in ./.venv/lib/python3.11/site-packages (from soundfile>=0.12.1->librosa) (2.0.0)\n",
79
+ "Requirement already satisfied: pycparser in ./.venv/lib/python3.11/site-packages (from cffi>=1.0->soundfile>=0.12.1->librosa) (3.0)\n",
80
+ "Requirement already satisfied: mpmath<1.4,>=1.1.0 in ./.venv/lib/python3.11/site-packages (from sympy>=1.13.3->torch) (1.3.0)\n",
81
+ "Requirement already satisfied: MarkupSafe>=2.0 in ./.venv/lib/python3.11/site-packages (from jinja2->torch) (3.0.3)\n",
82
+ "Requirement already satisfied: typer>=0.23.0 in ./.venv/lib/python3.11/site-packages (from typer-slim->transformers) (0.23.0)\n",
83
+ "Requirement already satisfied: click>=8.0.0 in ./.venv/lib/python3.11/site-packages (from typer>=0.23.0->typer-slim->transformers) (8.3.1)\n",
84
+ "Requirement already satisfied: rich>=10.11.0 in ./.venv/lib/python3.11/site-packages (from typer>=0.23.0->typer-slim->transformers) (14.3.2)\n",
85
+ "Requirement already satisfied: annotated-doc>=0.0.2 in ./.venv/lib/python3.11/site-packages (from typer>=0.23.0->typer-slim->transformers) (0.0.4)\n",
86
+ "Requirement already satisfied: markdown-it-py>=2.2.0 in ./.venv/lib/python3.11/site-packages (from rich>=10.11.0->typer>=0.23.0->typer-slim->transformers) (4.0.0)\n",
87
+ "Requirement already satisfied: pygments<3.0.0,>=2.13.0 in ./.venv/lib/python3.11/site-packages (from rich>=10.11.0->typer>=0.23.0->typer-slim->transformers) (2.19.2)\n",
88
+ "Requirement already satisfied: mdurl~=0.1 in ./.venv/lib/python3.11/site-packages (from markdown-it-py>=2.2.0->rich>=10.11.0->typer>=0.23.0->typer-slim->transformers) (0.1.2)\n",
89
+ "Downloading pandas-3.0.0-cp311-cp311-manylinux_2_24_x86_64.manylinux_2_28_x86_64.whl (11.2 MB)\n",
90
+ "\u001b[2K \u001b[38;2;114;156;31m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001b[0m \u001b[32m11.2/11.2 MB\u001b[0m \u001b[31m10.7 MB/s\u001b[0m \u001b[33m0:00:01\u001b[0m0.9 MB/s\u001b[0m eta \u001b[36m0:00:01\u001b[0m01\u001b[0m\n",
91
+ "\u001b[?25hInstalling collected packages: pandas\n",
92
+ "Successfully installed pandas-3.0.0\n"
93
+ ]
94
+ }
95
+ ],
96
  "source": [
97
  "!pip install transformers torch torchaudio librosa pandas scikit-learn tqdm"
98
  ]
99
  },
100
  {
101
  "cell_type": "code",
102
+ "execution_count": 2,
103
  "metadata": {},
104
+ "outputs": [
105
+ {
106
+ "name": "stderr",
107
+ "output_type": "stream",
108
+ "text": [
109
+ "/home/richman/Programming/dashengtokenizer_hf/.venv/lib/python3.11/site-packages/tqdm/auto.py:21: TqdmWarning: IProgress not found. Please update jupyter and ipywidgets. See https://ipywidgets.readthedocs.io/en/stable/user_install.html\n",
110
+ " from .autonotebook import tqdm as notebook_tqdm\n"
111
+ ]
112
+ }
113
+ ],
114
  "source": [
115
  "import torch\n",
116
  "import torch.nn as nn\n",
 
127
  },
128
  {
129
  "cell_type": "code",
130
+ "execution_count": 3,
131
  "metadata": {},
132
  "outputs": [],
133
  "source": [
 
157
  },
158
  {
159
  "cell_type": "code",
160
+ "execution_count": 4,
161
  "metadata": {},
162
  "outputs": [],
163
  "source": [
 
179
  },
180
  {
181
  "cell_type": "code",
182
+ "execution_count": 5,
183
  "metadata": {},
184
  "outputs": [],
185
  "source": [
 
207
  },
208
  {
209
  "cell_type": "code",
210
+ "execution_count": 6,
211
  "metadata": {},
212
+ "outputs": [
213
+ {
214
+ "name": "stdout",
215
+ "output_type": "stream",
216
+ "text": [
217
+ "Downloading ESC-50 dataset...\n",
218
+ "ESC-50 dataset downloaded and extracted\n",
219
+ "class_reference='configuration_dasheng_tokenizer.DashengTokenizerConfig'\n"
220
+ ]
221
+ },
222
+ {
223
+ "name": "stderr",
224
+ "output_type": "stream",
225
+ "text": [
226
+ "A new version of the following files was downloaded from https://huggingface.co/mispeech/dashengtokenizer:\n",
227
+ "- configuration_dasheng_tokenizer.py\n",
228
+ ". Make sure to double-check they do not contain any added malicious code. To avoid downloading new versions of the code file, you can pin a revision.\n"
229
+ ]
230
+ },
231
+ {
232
+ "name": "stdout",
233
+ "output_type": "stream",
234
+ "text": [
235
+ "class_reference='modeling_dasheng_tokenizer.DashengTokenizerModel'\n"
236
+ ]
237
+ },
238
+ {
239
+ "name": "stderr",
240
+ "output_type": "stream",
241
+ "text": [
242
+ "A new version of the following files was downloaded from https://huggingface.co/mispeech/dashengtokenizer:\n",
243
+ "- modeling_dasheng_encoder.py\n",
244
+ ". Make sure to double-check they do not contain any added malicious code. To avoid downloading new versions of the code file, you can pin a revision.\n",
245
+ "A new version of the following files was downloaded from https://huggingface.co/mispeech/dashengtokenizer:\n",
246
+ "- vocos.py\n",
247
+ ". Make sure to double-check they do not contain any added malicious code. To avoid downloading new versions of the code file, you can pin a revision.\n",
248
+ "A new version of the following files was downloaded from https://huggingface.co/mispeech/dashengtokenizer:\n",
249
+ "- modeling_dasheng_tokenizer.py\n",
250
+ "- modeling_dasheng_encoder.py\n",
251
+ "- vocos.py\n",
252
+ ". Make sure to double-check they do not contain any added malicious code. To avoid downloading new versions of the code file, you can pin a revision.\n",
253
+ "Loading weights: 100%|███████████████████████████████████████████████████████| 522/522 [00:00<00:00, 1545.80it/s, Materializing param=upsampler.weight]\n"
254
+ ]
255
+ },
256
+ {
257
+ "name": "stdout",
258
+ "output_type": "stream",
259
+ "text": [
260
+ "Model embedding dimension: 1280\n",
261
+ "Using device: cpu\n"
262
+ ]
263
+ }
264
+ ],
265
  "source": [
266
  "# Download dataset\n",
267
  "download_esc50()\n",
 
289
  },
290
  {
291
  "cell_type": "code",
292
+ "execution_count": 8,
293
  "metadata": {},
294
+ "outputs": [
295
+ {
296
+ "name": "stdout",
297
+ "output_type": "stream",
298
+ "text": [
299
+ "Train samples: 1600, Val samples: 400\n"
300
+ ]
301
+ }
302
+ ],
303
  "source": [
304
  "# Create datasets\n",
305
  "audio_dir = 'ESC-50/audio'\n",
 
322
  "cell_type": "code",
323
  "execution_count": null,
324
  "metadata": {},
325
+ "outputs": [
326
+ {
327
+ "name": "stderr",
328
+ "output_type": "stream",
329
+ "text": [
330
+ "Epoch 1/10 Training: 5%|███▊ | 19/400 [00:53<15:40, 2.47s/it, loss=3.9561]"
331
+ ]
332
+ }
333
+ ],
334
  "source": [
335
  "# Training setup\n",
336
  "optimizer = torch.optim.Adam(classifier.parameters(), lr=1e-3)\n",
 
425
  ],
426
  "metadata": {
427
  "kernelspec": {
428
+ "display_name": "Python 3 (ipykernel)",
429
  "language": "python",
430
  "name": "python3"
431
  },
 
439
  "name": "python",
440
  "nbconvert_exporter": "python",
441
  "pygments_lexer": "ipython3",
442
+ "version": "3.11.11"
443
  }
444
  },
445
  "nbformat": 4,
446
  "nbformat_minor": 4
447
+ }
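
For orientation, the notebook touched by this commit trains a `classifier` with `torch.optim.Adam(classifier.parameters(), lr=1e-3)` on ESC-50, and its output reports an embedding dimension of 1280. The sketch below shows what one training step of such a probe could look like. It is not the notebook's code: the choice of a single `nn.Linear` head, the mean-pooling over time, the batch and frame sizes, and the random tensors standing in for encoder output are all assumptions made for illustration.

```python
import torch
import torch.nn as nn

EMBED_DIM = 1280   # "Model embedding dimension: 1280" from the notebook output
NUM_CLASSES = 50   # ESC-50 has 50 sound classes

# Hypothetical stand-in for frozen encoder features, shaped
# (batch, frames, embed_dim). Random values, not real embeddings.
torch.manual_seed(0)
features = torch.randn(8, 25, EMBED_DIM)
labels = torch.randint(0, NUM_CLASSES, (8,))

# Assumed probe: one linear layer on top of time-averaged embeddings.
classifier = nn.Linear(EMBED_DIM, NUM_CLASSES)
optimizer = torch.optim.Adam(classifier.parameters(), lr=1e-3)
criterion = nn.CrossEntropyLoss()

# One optimization step.
pooled = features.mean(dim=1)      # (batch, embed_dim)
logits = classifier(pooled)        # (batch, num_classes)
loss = criterion(logits, labels)
optimizer.zero_grad()
loss.backward()
optimizer.step()

print(tuple(logits.shape), round(loss.item(), 4))
```

In a real run the random `features` would be replaced by embeddings computed with the tokenizer under `torch.no_grad()`, so that only the probe's parameters are updated.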