jailen / tortoise-base / models

Commit History

remove redundant phonemize for vall-e (oops), quantize all files and then phonemize all files for cope optimization, load alignment model once instead of for every transcription (speedup with whisperx)

d2a9ab9

mrq committed on Mar 23, 2023

oops

da96161

mrq committed on Mar 22, 2023

cleanups, realigning vall-e training

f822c87

mrq committed on Mar 22, 2023

VALL-E config edits

34ef046

mrq committed on Mar 20, 2023

added japanese tokenizer (experimental)

b17260c

mrq committed on Mar 17, 2023

cleanup, metrics are grabbed for vall-e trainer

249c601

mrq committed on Mar 17, 2023

forgot to separate phonemes by spaces for [redacted]

1b72d0b

mrq committed on Mar 17, 2023

cleaned up some prepare dataset code

d4c5096

mrq committed on Mar 17, 2023

unk hunting

1a8c5de

mrq committed on Mar 16, 2023

oops

da4f926

mrq committed on Mar 16, 2023

preparations for training an IPA-based finetune

ee8270b

mrq committed on Mar 16, 2023

added options to pick tokenizer json and diffusion model (so I don't have to add it in later when I get bored and add in diffusion training)

363d0b0

mrq committed on Mar 15, 2023

removed redundant training data (they exist within tortoise itself anyways), added utility: view tokenized text

07b684c

mrq committed on Mar 14, 2023

;)

7b16b3e

mrq committed on Mar 14, 2023

(:

c85e32f

mrq committed on Mar 14, 2023

:)

54036fd

mrq committed on Mar 14, 2023

added mel LR weight (as I finally understand when to adjust the text), added text validation on dataset creation

66ac8ba

mrq committed on Mar 13, 2023

cleanups and fixes, fix DLAS throwing errors from '''too short of sound files''' by just culling them during transcription

2feb6da

ecker committed on Mar 11, 2023

only God knows why the YAML spec lets you specify string values without quotes

d318400

ecker committed on Mar 10, 2023

added the mysterious tortoise_compat flag mentioned in DLAS repo

b8867a5

ecker committed on Mar 9, 2023

forgot template

b0baa19

ecker committed on Mar 9, 2023

big cleanup to make my life easier when i add more parameters

3f321fe

ecker committed on Mar 9, 2023

actually make using adamw_zero optimizer for multi-gpus work

34dcb84

ecker committed on Mar 8, 2023

disable validation if validation dataset not found, clamp validation batch size to validation dataset size instead of simply reusing batch size, switch to adamw_zero optimizer when training with multi-gpus (because the yaml comment said to and I think it might be why I'm absolutely having garbage luck training this japanese dataset)

ff07f70

ecker committed on Mar 8, 2023

made validation work (will document later)

b4098dc

ecker committed on Mar 8, 2023

set validation to save rate and validation file if exists (need to test later)

e862169

ecker committed on Mar 7, 2023

added option to set worker size in training config generator (because the default is overkill), for whisper transcriptions, load a specialized language model if it exists (for now, only english), output transcription to web UI when done transcribing

3e220ed

ecker committed on Mar 5, 2023

renamed mega batch factor to an actual real term: gradient accumulation factor, fixed halting training not actually killing the training process and freeing up resources, some logic cleanup for gradient accumulation (so many brain worms and wrong assumptions from testing on low batch sizes) (read the training section in the wiki for more details)

df24827

ecker committed on Mar 4, 2023

added new training tunable: loss_text_ce_loss weight, added option to specify source model in case you want to finetune a finetuned model (for example, train a Japanese finetune on a large dataset, then finetune for a specific voice, need to truly validate if it produces usable output), some bug fixes that came up for some reason now and not earlier

c2726fa

ecker committed on Mar 1, 2023

huge success

225dee2

ecker committed on Feb 23, 2023

Added very experimental float16 training for cards with not enough VRAM (10GiB and below, maybe) !NOTE! this is VERY EXPERIMENTAL, I have zero free time to validate it right now, I'll do it later

8a1a48f

ecker committed on Feb 21, 2023

added more safeties and parameters to training yaml generator, I think I tested it extensively enough

092dd7b

ecker committed on Feb 19, 2023

oops

cf758f4

ecker committed on Feb 18, 2023

added dropdown to select autoregressive model for TTS, fixed a bug where the settings saver constantly fires; I hate gradio so much, why is dropdown.change broken to continuously fire and send an empty array

2615caf

ecker committed on Feb 18, 2023

a bit of UI cleanup, import multiple audio files at once, actually shows progress when importing voices, hides audio metadata / latents if no generated settings are detected, preparing datasets shows its progress, saving a training YAML shows a message when done, training now works within the web UI, training output shows to web UI, provided notebook is cleaned up and uses a venv, etc.

d5c1433

ecker committed on Feb 18, 2023

almost

229be0b

ecker committed on Feb 17, 2023
