Update README.md
Some current "architectural features" are in use, but their effects need further experimentation:

* `split_classifier_heads`: each RVQ level gets its own output head, but it remains unclear whether this is truly helpful.
* `audio_embeddings_sum`: it is also unclear whether each later RVQ level should "see" the past levels through summed embeddings, or whether skipping the summation is preferable.
* Disabling `unified_position_ids` seems to help quality more often than not, but I'm still unsure whether it's beneficial in practice.
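As a rough illustration of what the first two options mean (a minimal sketch with made-up names and shapes, not this repo's actual code): with `split_classifier_heads`, each RVQ level is decoded by its own projection instead of one shared head; with `audio_embeddings_sum`, the embedding fed in for level *n* is the running sum of the code embeddings from levels 0..*n*, so later levels "see" the coarser ones beneath them.

```python
# Illustrative sketch only -- names and shapes are assumptions, not the repo's API.

def sum_audio_embeddings(per_level_embeddings, audio_embeddings_sum=True):
    """per_level_embeddings: one embedding vector (list of floats) per RVQ level.
    With summing enabled, level n's input is the sum of levels 0..n."""
    if not audio_embeddings_sum:
        return [list(vec) for vec in per_level_embeddings]
    summed, acc = [], [0.0] * len(per_level_embeddings[0])
    for vec in per_level_embeddings:
        acc = [a + v for a, v in zip(acc, vec)]  # running sum over levels
        summed.append(list(acc))
    return summed

def classify(hidden, heads, level, split_classifier_heads=True):
    """hidden: feature vector; heads: one weight matrix per RVQ level.
    With split heads, each level uses its own projection; otherwise every
    level shares heads[0]."""
    W = heads[level] if split_classifier_heads else heads[0]
    return [sum(w * h for w, h in zip(row, hidden)) for row in W]
```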
## LoRAs

This repo also contains some LoRAs to serve as a reference under `./loras/`.

Using a LoRA is the same as using a base model, except that you're required to already have the base model. Just load from the LoRA's config YAML instead to use it.

The only caveat is that my original dataset *does* already contain these samples, but given its sheer size, they're probably underutilized.
* However, the base model's output from these speakers is already *almost adequate*, but not enough to be satisfactory.

* `config.lora.glados.yaml` / `lora-glados-r128-a128`:
  + A simple LoRA of GLaDOS from both Portal and Portal 2.
  + Trained for 250 steps (48000 samples, 821 samples per epoch).
* `config.lora.sam.yaml` / `lora-sam-r128-a128`:
  + A simple LoRA of Sam from the non-remastered Sam and Max Telltale games.
  + Trained for 250 steps (48000 samples, 1555 samples per epoch).
* `config.lora.max.yaml` / `lora-max-r128-a128`:
  + A simple LoRA of Max from the non-remastered Sam and Max Telltale games.
  + Trained for 250 steps (48000 samples, 1292 samples per epoch).
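Assuming the `r128-a128` suffixes follow the usual convention of rank 128 and alpha 128, the standard LoRA weight update these adapters encode can be sketched as follows; this is the underlying math, not this repo's loading code, and the function name is illustrative.

```python
# Standard LoRA merge: W' = W + (alpha / rank) * (B @ A).
# With rank == alpha (as in r128-a128), the scale factor is 1.0.

def lora_merge(W, A, B, rank, alpha):
    """W: (out x in) base weight; A: (rank x in); B: (out x rank).
    All matrices are nested lists of floats; returns the merged weight."""
    scale = alpha / rank
    out_dim, in_dim = len(W), len(W[0])
    return [
        [W[i][j] + scale * sum(B[i][k] * A[k][j] for k in range(rank))
         for j in range(in_dim)]
        for i in range(out_dim)
    ]
```

Because rank and alpha are both 128 here, the low-rank update is applied unscaled on top of the base weights.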