Training data

#1
by bernd-bread - opened

Dear Malte,
the model info says it is trained "on a dataset consisting of mentions of german political parties". Would you share which dataset this is?
Thanks a million!

For the dataset, I scraped subtitle data from German news YouTube channels and VODs from public network television. For the exact sources and distributions, you can look at my bachelor's thesis, for which I trained this model. I can share it with you if you tell me your email address.

That would be fantastic! I couldn't find a DM option here on Hugging Face, so I'll just post the address and edit the post afterwards: [...]
Thanks so much!
