cheoljun95/sylber

cheoljun95 committed on Mar 2, 2025

Commit

4290101

verified ·

1 Parent(s): 77a7cb9

Update README.md

Files changed (1)

README.md CHANGED

@@ -1,3 +1,47 @@
----
-license: apache-2.0
----

+# Sylber
+This is the official implementation of [Sylber: Syllabic Embedding Representation of Speech from Raw Audio](https://arxiv.org/abs/2410.07168).
+Sylber is the first model of its kind to produce extremely short token sequences from raw audio (4.27 tokens/sec on average) through dynamic tokenization at syllable granularity.
+The model was developed and trained by the Berkeley Speech Group.
+## Installation
+The model can be installed from PyPI for inference.
+```
+pip install sylber
+```
+See the [demo notebook](demo.ipynb) for usage.
+For training, follow the instructions below.
+### Usage
+```python
+from sylber import Segmenter
+# Loading Sylber
+segmenter = Segmenter(model_ckpt="sylber")
+# Run Sylber
+wav_file = "samples/sample.wav"
+outputs = segmenter(wav_file, in_second=True)  # set in_second=False to get segments in frame numbers
+# outputs = {"segments": numpy array of [start, end] for each segment,
+#            "segment_features": numpy array of segment-averaged features,
+#            "hidden_states": numpy array of raw features used for segmentation}
+```
+### Training
+See the [Sylber GitHub repository](https://github.com/Berkeley-Speech-Group/sylber) for training instructions.
+---
+license: apache-2.0
+---
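
The Usage snippet in this diff returns `segments` as an array of `[start, end]` pairs, in seconds when `in_second=True`. A minimal sketch of how those pairs could be summarized into a token rate is shown here. The helper `summarize_segments` and its `total_duration` parameter are hypothetical and not part of the `sylber` package, and the sample values are made up for illustration rather than real model output.

```python
def summarize_segments(segments, total_duration=None):
    """Summarize [start, end] segments given in seconds.

    Hypothetical helper, not part of the sylber package. `segments` can be
    any iterable of (start, end) pairs, e.g. a list of lists or a 2-D
    numpy array such as outputs["segments"].
    """
    pairs = [(float(start), float(end)) for start, end in segments]
    if not pairs:
        return {"num_segments": 0, "mean_duration": 0.0, "tokens_per_second": 0.0}

    durations = [end - start for start, end in pairs]
    if total_duration is None:
        # Assumption: without the true audio length, use the latest segment end.
        total_duration = max(end for _, end in pairs)

    rate = len(pairs) / total_duration if total_duration > 0 else 0.0
    return {
        "num_segments": len(pairs),
        "mean_duration": sum(durations) / len(durations),
        "tokens_per_second": rate,
    }


# Illustrative values only; real input would be outputs["segments"].
example_segments = [[0.0, 0.25], [0.25, 0.5], [0.75, 1.0], [1.0, 1.5]]
summary = summarize_segments(example_segments, total_duration=2.0)
print(summary)
```

Passing the actual audio length as `total_duration` gives a rate that accounts for leading and trailing silence; the fallback only approximates it from the last segment boundary.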