Pendrokar committed on
Commit 774cb85 · verified · 1 Parent(s): 61b10e6

languages; origins; papers

Files changed (1)
  1. README.md +48 -3
README.md CHANGED
@@ -2,11 +2,56 @@
  license: cc-by-4.0
  language:
  - en
+ - de
+ - es
+ - it
+ - nl
+ - pt
+ - pl
+ - ro
+ - sv
+ - da
+ - fi
+ - hu
+ - el
+ - fr
+ - ru
+ - uk
+ - tr
+ - ar
+ - hi
+ - jp
+ - ko
+ - zh
+ - vi
+ - la
+ - ha
+ - sw
+ - yo
+ - wo
+ thumbnail: >-
+   https://raw.githubusercontent.com/DanRuta/xVA-Synth/master/assets/x-icon.png
+ library: xvasynth
+ tags:
+ - emotion
+ - audio
+ - text-to-speech
+ - tts
  pipeline_tag: text-to-speech
  ---
 
- xVASynth's xVAPitch (v3) type of voice model.
+ xVASynth's xVAPitch (v3) voice models, based on NVIDIA HIFI NeMo datasets.
 
- Legal note: While model is trained on a CC dataset, xVATrainer pretrained models used to train this model include non-CC datasets.
+ Models created by Dan Ruta, origin link:
+ - https://www.nexusmods.com/skyrimspecialedition/mods/65022?tab=files
 
- NVIDIA HIFI 6670 M
+ Presumed dataset origin:
+ - https://docs.nvidia.com/deeplearning/nemo/user-guide/docs/en/stable/core/core.html
+
+ Papers referenced by the xVAPitch model:
+ - Multi-head attention with relative positional embedding - https://arxiv.org/pdf/1809.04281.pdf
+ - Transformer with relative positional encoding - https://arxiv.org/abs/1803.02155
+ - Stochastic duration predictor (SDP) - https://arxiv.org/pdf/2106.06103.pdf
+ - Spline flow - https://arxiv.org/abs/1906.04032
+
+ Legal note: Although these datasets are licensed under CC BY 4.0, the base v3 model from which these models are fine-tuned was pre-trained on non-permissive data.