---
license: cc-by-nc-4.0
---
# RAVE Models
This is a collection of [RAVE](https://github.com/acids-ircam/RAVE) models trained by the [Intelligent Instruments Lab](https://iil.is) for various projects.
Most of these models are encoder-decoder only, with no prior. All of them use the `--causal` mode and are exported for streaming inference with [nn~](https://github.com/acids-ircam/nn_tilde), [NN.ar](https://github.com/elgiano/nn.ar) or [rave-supercollider](https://github.com/victor-shepardson/rave-supercollider).
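The filenames below appear to encode each model's settings: `b` for block size, `r` (or `sr` in one case) for sample rate in Hz, and `z` for the number of latent dimensions. This reading is inferred from comparing the names with the "Model:" lines and is not a documented convention. Under that assumption, a minimal Python sketch for recovering the settings from a filename could look like the following; `ModelSpec`, `parse_model_filename` and `block_duration_ms` are hypothetical helper names, and the duration calculation assumes the block size is counted in audio samples.

```python
import re
from typing import NamedTuple


class ModelSpec(NamedTuple):
    name: str         # everything before the _b<block size> field
    block_size: int   # assumed to be in audio samples
    sample_rate: int  # Hz
    latent_dims: int  # number of latent dimensions


# Accepts "_r48000" as well as the "_sr48000" variant used by one filename.
_PATTERN = re.compile(
    r"^(?P<name>.+)_b(?P<block>\d+)_s?r(?P<rate>\d+)_z(?P<latent>\d+)\.ts$"
)


def parse_model_filename(filename: str) -> ModelSpec:
    """Split a model filename into its (assumed) settings."""
    match = _PATTERN.match(filename)
    if match is None:
        raise ValueError(f"unrecognized model filename: {filename!r}")
    return ModelSpec(
        name=match["name"],
        block_size=int(match["block"]),
        sample_rate=int(match["rate"]),
        latent_dims=int(match["latent"]),
    )


def block_duration_ms(spec: ModelSpec) -> float:
    """Length of one block of audio in milliseconds."""
    return 1000.0 * spec.block_size / spec.sample_rate


if __name__ == "__main__":
    spec = parse_model_filename("guitar_iil_b2048_r48000_z16.ts")
    print(spec)
    print(f"{block_duration_ms(spec):.2f} ms per block")
```

The block duration is plain arithmetic on the two numbers in the filename. It is not a measurement of end-to-end latency in any of the hosts listed above.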
## Musical Instruments
### guitar_iil_b2048_r48000_z16.ts
Dataset: [IILGuitarTimbre](https://github.com/Intelligent-Instruments-Lab/IILGuitarTimbre), a timbre-oriented collection of plucking, strumming, striking, scraping and more, recorded dry from an electric guitar.
Model: modified RAVE v1, 48kHz, block size 2048, 16 latent dimensions.
### organ_archive_b2048_r48000_z16.ts
Dataset: various recordings of organ music sourced from archive.org. Small amounts of voice and other instruments were included, and vinyl record noises are prominent.
Model: modified RAVE v1, 48kHz, block size 2048, 16 latent dimensions.
### organ_bach_b2048_sr48000_z16.ts
Dataset: various recordings of J.S. Bach music for church organ.
Model: modified RAVE v1, 48kHz, block size 2048, 16 latent dimensions.
### sax_soprano_franziskaschroeder_b2048_r48000_z20.ts
Dataset: Soprano sax improvisation by [Franziska Schroeder](https://improvisationai.wordpress.com/).
Model: modified RAVE v1, 48kHz, block size 2048, 20 latent dimensions.
## Voice
### voice_vocalset_b2048_r48000_z16.ts
Dataset: [VocalSet](https://zenodo.org/record/1193957) singing voice dataset.
Model: modified RAVE v1, 48kHz, block size 2048, 16 latent dimensions.
### voice_hifitts_b2048_r48000_z16.ts
Dataset: [Hi-Fi TTS](http://arxiv.org/abs/2104.01497) audiobooks dataset.
Model: modified RAVE v1, 48kHz, block size 2048, 16 latent dimensions.
### voice_jvs_b2048_r44100_z16.ts
Dataset: [Hi-Fi TTS](http://arxiv.org/abs/2104.01497) speaker 9017 (John Van Stan).
Model: RAVE v3, 44.1kHz, block size 2048, 16 latent dimensions.
### voice_vctk_b2048_r44100_z22.ts
Dataset: [CSTR VCTK Corpus](https://datashare.ed.ac.uk/handle/10283/3443) multispeaker read speech dataset.
Model: RAVE v3, 44.1kHz, block size 2048, 22 latent dimensions.
## *Pluma* Birds
This model of bird sounds was curated by Giacomo Lepri for his instrument *[Pluma](http://www.giacomolepri.com/pluma)*.
### birds_pluma_b2048_r48000_z12.ts
Dataset: bird sounds.
Model: modified RAVE v1, 48kHz, block size 2048, 12 latent dimensions.
## *Pond Brain* Marine Sounds
These models of marine sounds were trained for [Jenna Sutela](https://jennasutela.com/)'s *Pond Brain* installations at [Copenhagen Contemporary](https://copenhagencontemporary.org/en/yet-it-moves-read-online/) and the [Helsinki Biennial](https://helsinkibiennaali.fi/en/artist/jenna-sutela/).
### water_pondbrain_b2048_r48000_z16.ts
Dataset: water recordings from freesound.org.
<details>
<summary>list of freesound users</summary>
`inspectorj`, `inchadney`, `aesqe`, `vonfleisch`, `javetakami`, `atomediadesign`, `kolezan`, `zabuhailo`, `zaziesound`, `repdac3`, `al_sub`, `lgarrett`, `uzbazur`, `lydmakeren`, `frenkfurth`, `edo333`, `boredtoinsanity`, `owl`, `kaydinhamby`, `tliedes`, `ilmari_freesound`, `manoslindos`, `l3ardoc`, `alexbuk`, `s-light`
</details>
Model: modified RAVE v1, 48kHz, block size 2048, 16 latent dimensions.
### humpbacks_pondbrain_b2048_r48000_z20.ts
Dataset: humpback whale recordings from the [Watkins database](https://cis.whoi.edu/science/B/whalesounds/index.cfm), [MBARI](https://freesound.org/people/MBARI_MARS/), and BBC.
Model: modified RAVE v1, 48kHz, block size 2048, 20 latent dimensions.
### marinemammals_pondbrain_b2048_r48000_z20.ts
Dataset: various marine mammal sounds from [NOAA](https://www.fisheries.noaa.gov/national/science-data/sounds-ocean-mammals), the [Watkins database](https://cis.whoi.edu/science/B/whalesounds/index.cfm), freesound users `felixblume` and `geraldfiebig`, and sound effects databases.
Model: modified RAVE v1, 48kHz, block size 2048, 20 latent dimensions.