Improve model card: add paper link, code link, and metadata

#1
by nielsr HF Staff - opened
Files changed (1)
  1. README.md +32 -14
README.md CHANGED
@@ -1,32 +1,50 @@
  ---
- license: mit
  library_name: pytorch
  pipeline_tag: text-to-speech
  tags:
- - accent-tts
- - mandarin
- - joycent
- - grad-tts
  ---

- # Joycent Mandarin Accent TTS

- This repository stores the pretrained Joycent acoustic-model checkpoint for
- Mandarin accent text-to-speech synthesis.

- The model implementation and inference instructions are available in the
- [Joycent repository](https://github.com/anonymous-accent-tts/Joycent_demo).

- Download the checkpoint with:

  ```python
  from huggingface_hub import hf_hub_download

  checkpoint_path = hf_hub_download(
-     repo_id="<namespace>/<model-name>",
      filename="grad_210.pt",
  )
  ```

- Then pass the downloaded path to `joycent/inference_joycent.py` using
- `--acoustic-checkpoint`.
  ---
  library_name: pytorch
+ license: mit
  pipeline_tag: text-to-speech
  tags:
+ - accent-tts
+ - mandarin
+ - joycent
+ - grad-tts
  ---

+ # Joycent: Diffusion-based Accent TTS without Accented Phone Prediction
+
+ Joycent is a diffusion-based Mandarin accent text-to-speech (TTS) framework that synthesizes accented speech directly from standard phone sequences and speech references without requiring accented phone prediction. It integrates accent and speaker representations through conditional layer normalization (CLN) in the text encoder.
+
+ - **Paper:** [Joycent: Diffusion-based Accent TTS without Accented Phone Prediction](https://huggingface.co/papers/2606.16417)
+ - **Code:** [oshindow/Joycent-code](https://github.com/oshindow/Joycent-code)
+ - **Demo:** [Joycent Demo Page](https://anonymous-accent-tts.github.io/Joycent-demo/)

+ ## Usage

+ This repository stores the pretrained Joycent acoustic-model checkpoint (`grad_210.pt`). The model implementation and full inference instructions are available in the [official GitHub repository](https://github.com/oshindow/Joycent-code).

+ You can download the checkpoint using the following snippet:

  ```python
  from huggingface_hub import hf_hub_download

  checkpoint_path = hf_hub_download(
+     repo_id="walston/joycent",
      filename="grad_210.pt",
  )
  ```

+ Then, pass the downloaded path to `joycent/inference_joycent.py` using the `--acoustic-checkpoint` argument. Note that you will also need the [Joycent vocoder](https://huggingface.co/walston/joycent-vocoder) for full synthesis.
+
+ ## Citation
+
+ If you find Joycent useful for your research, please cite:
+
+ ```bibtex
+ @misc{wang2026joycentdiffusionbasedaccenttts,
+       title={Joycent: Diffusion-based Accent TTS without Accented Phone Prediction},
+       author={Xintong Wang and Ye Wang},
+       year={2026},
+       eprint={2606.16417},
+       archivePrefix={arXiv},
+       primaryClass={cs.SD},
+ }
+ ```
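
The two usage steps in the updated README (download the checkpoint, then hand its path to `joycent/inference_joycent.py` via `--acoustic-checkpoint`) could be chained roughly as follows. This is a minimal sketch and not part of the diff: the helper names `build_inference_command` and `download_checkpoint` are hypothetical, and only the script path, the `--acoustic-checkpoint` flag, the repo id, and the filename come from the README. Any other arguments the script may require (input text, vocoder checkpoint, output path) are not stated here and are left to `extra_args`.

```python
import os
import sys


def build_inference_command(
    checkpoint_path,
    script="joycent/inference_joycent.py",
    extra_args=(),
):
    """Return the argv list that runs the inference script with a local checkpoint.

    Only --acoustic-checkpoint is taken from the README; everything in
    extra_args is an assumption the caller must fill in from the code repo.
    """
    return [
        sys.executable,
        script,
        "--acoustic-checkpoint",
        os.fspath(checkpoint_path),
        *extra_args,
    ]


def download_checkpoint(repo_id="walston/joycent", filename="grad_210.pt"):
    """Download the checkpoint from the Hub and return its local cache path."""
    # Imported lazily so that building a command does not require the package.
    from huggingface_hub import hf_hub_download

    return hf_hub_download(repo_id=repo_id, filename=filename)
```

From the root of a clone of the code repository, `subprocess.run(build_inference_command(download_checkpoint()), check=True)` would then launch the script. Consult the repository for the remaining required arguments.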