cheoljun95 committed · Commit 4290101 · verified · 1 parent: 77a7cb9

Update README.md

Files changed (1)
  1. README.md +47 -3
README.md CHANGED
@@ -1,3 +1,47 @@
- ---
- license: apache-2.0
- ---
+ # Sylber
+
+ This is the official implementation of [Sylber: Syllabic Embedding Representation of Speech from Raw Audio](https://arxiv.org/abs/2410.07168).
+
+ Sylber is the first model of its kind to yield extremely short token sequences from raw audio (4.27 tokens/sec on average) through dynamic tokenization at syllable granularity.
+
+ The model was developed and trained by the Berkeley Speech Group.
+
+
+ ## Installation
+
+ The model can be installed from PyPI for inference.
+
+ ```
+ pip install sylber
+ ```
+ Please see the [demo notebook](demo.ipynb) for usage.
+ For training, please follow the instructions below.
+
+ ### Usage
+
+ ```python
+ from sylber import Segmenter
+
+ # Load Sylber
+ segmenter = Segmenter(model_ckpt="sylber")
+
+ # Run Sylber
+ wav_file = "samples/sample.wav"
+
+ outputs = segmenter(wav_file, in_second=True) # in_second can be False to output segments in frame numbers.
+
+ # outputs = {"segments": numpy array of [start, end] of each segment,
+ #            "segment_features": numpy array of segment-averaged features,
+ #            "hidden_states": numpy array of raw features used for segmentation}
+ ```
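A minimal sketch of how the `outputs` fields described above might be consumed, using made-up NumPy arrays in place of real Sylber output. The helper names `token_rate` and `average_per_segment` are hypothetical and not part of the `sylber` package. Treating segment ends as exclusive frame indices is an assumption, and the library's own pooling may differ.

```python
import numpy as np


def token_rate(segments_sec: np.ndarray, audio_duration_sec: float) -> float:
    """Number of segments (tokens) per second of audio."""
    return len(segments_sec) / audio_duration_sec


def average_per_segment(hidden_states: np.ndarray, segments_frames: np.ndarray) -> np.ndarray:
    """Mean-pool frame-level features inside each [start, end) frame range.

    Assumption: `end` is exclusive. This is an illustration only.
    """
    return np.stack([hidden_states[s:e].mean(axis=0) for s, e in segments_frames])


# Made-up stand-ins for the arrays described above (not real Sylber output).
segments_sec = np.array([[0.10, 0.32], [0.32, 0.58], [0.70, 0.96]])  # like in_second=True
segments_frames = np.array([[0, 2], [2, 5], [6, 10]])                # like in_second=False
hidden_states = np.arange(20, dtype=np.float64).reshape(10, 2)       # 10 frames, 2 dims

durations = segments_sec[:, 1] - segments_sec[:, 0]      # per-segment length in seconds
rate = token_rate(segments_sec, audio_duration_sec=1.0)  # segments per second of audio
pooled = average_per_segment(hidden_states, segments_frames)  # one vector per segment
```

Dividing the number of rows in `segments` by the audio length gives a per-file token rate, the quantity the paper reports as an average.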
+
+
+ ### Training
+
+ Please see the [Sylber GitHub repository](https://github.com/Berkeley-Speech-Group/sylber) for training instructions.
+
+ ---
+ license: apache-2.0
+ ---