kronosta
/

ColliErgoSum

Corrected checkpoint descriptions

17439bf verified about 1 year ago


	---
	license: mit
	base_model:
	- kronosta/blunstron
	library_name: diffusers
	---

	This is another Dance Diffusion model similar to Blunstron, except this time it generates harmonizing voices and sings in either E or G half sharp.
	I finetuned it from Blunstron on a dataset of 3-second clips of Jacob Collier's "In The Bleak Midwinter" (which, like Blunstron's source, The Alan Parsons Project's
	"Old and Wise", I have sampled dozens of times in my "audio collage" music). However, the generated samples were too quiet for my taste and always sung in
	F, because the model latched onto the beginning of the song. So I halted the training, removed all the quieter parts from the dataset, then resumed from where I left off
	so it could keep what it had already learned. This time I decided to preserve the entire history of checkpoints, since each one is unique.
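
For anyone who wants to try a similar cleanup, here is a minimal sketch of dropping quiet clips from a dataset of 3-second segments. It is not the script used for this model: the RMS loudness measure, the `min_rms` threshold, and the helper names `split_into_clips`, `rms`, and `drop_quiet_clips` are all assumptions for illustration, and the demo runs on a synthetic signal rather than the real song.

```python
import numpy as np

CLIP_SECONDS = 3.0  # clip length described above


def split_into_clips(audio, sample_rate, clip_seconds=CLIP_SECONDS):
    """Cut a mono signal into consecutive fixed-length clips (hypothetical helper)."""
    clip_len = int(sample_rate * clip_seconds)
    n_clips = len(audio) // clip_len  # any trailing partial clip is discarded
    return [audio[i * clip_len:(i + 1) * clip_len] for i in range(n_clips)]


def rms(clip):
    """Root-mean-square level of a clip, used here as a simple loudness proxy."""
    samples = np.asarray(clip, dtype=np.float64)
    return float(np.sqrt(np.mean(samples ** 2)))


def drop_quiet_clips(clips, min_rms):
    """Keep only clips whose RMS level is at least min_rms (assumed threshold)."""
    return [clip for clip in clips if rms(clip) >= min_rms]


# Synthetic demo: 6 seconds of a 220 Hz tone whose first 3 seconds are very quiet.
sample_rate = 16000
t = np.arange(int(sample_rate * 6)) / sample_rate
audio = np.sin(2 * np.pi * 220 * t)
audio[: sample_rate * 3] *= 0.05

clips = split_into_clips(audio, sample_rate)
loud_clips = drop_quiet_clips(clips, min_rms=0.1)
print(len(clips), len(loud_clips))
```

A fixed RMS threshold is the simplest possible choice. A perceptual loudness measure or a percentile cutoff over the whole dataset might suit real recordings better.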

	# Checkpoints
	- `epoch=948-step=192000.ckpt` is probably the best one, and the one I'll most likely use. It sings in E and G half sharp, varies from the original dataset,
	and is sometimes composed and sometimes wild, which gives it variety.
	- `epoch=1019-step=192500.ckpt` seems to be overfit: for a lot of what it generates, I can immediately tell from memory alone where in the song it came from.
	Its perfect sound doesn't leave much room for creativity and also definitely qualifies as art theft. Like the previous one, it sings in E and G half sharp.
	- `epoch=876-step=191500.ckpt` is from before the deletion of most of the dataset. It sings in F and is relatively quiet and composed.
	- `epoch=845-step=191000.ckpt` is one I have no training samples from and haven't tested. It likely shares some aspects with Blunstron.
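
The checkpoint names above encode an epoch and a global step, so they can be put in training order and compared. The sketch below parses the four filenames listed here and estimates how many steps each epoch took between consecutive checkpoints. It assumes the epoch counter advances once per pass over the dataset, which the filenames alone do not confirm. `parse_checkpoint` is a hypothetical helper.

```python
import re

# Filenames exactly as listed in the Checkpoints section.
CHECKPOINTS = [
    "epoch=948-step=192000.ckpt",
    "epoch=1019-step=192500.ckpt",
    "epoch=876-step=191500.ckpt",
    "epoch=845-step=191000.ckpt",
]

NAME_PATTERN = re.compile(r"epoch=(\d+)-step=(\d+)\.ckpt")


def parse_checkpoint(name):
    """Return (epoch, step) parsed from a checkpoint filename (hypothetical helper)."""
    match = NAME_PATTERN.fullmatch(name)
    if match is None:
        raise ValueError(f"unexpected checkpoint name: {name!r}")
    return int(match.group(1)), int(match.group(2))


# Sort by global step so the checkpoints appear in training order.
ordered = sorted((parse_checkpoint(n) for n in CHECKPOINTS), key=lambda pair: pair[1])

# Average steps per epoch between each pair of consecutive checkpoints.
steps_per_epoch = [
    (step_b - step_a) / (epoch_b - epoch_a)
    for (epoch_a, step_a), (epoch_b, step_b) in zip(ordered, ordered[1:])
]

for (epoch, step), rate in zip(ordered[1:], steps_per_epoch):
    print(f"up to epoch={epoch} step={step}: {rate:.2f} steps per epoch")
```

Under that assumption, the estimate drops from about 16 steps per epoch before step 191500 to about 7 afterwards, which fits the dataset having been trimmed around that point.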