Submitted by Stefan Schweter
The German Commons - 154 Billion Tokens of Openly Licensed Text for German Language Models
CORAL NLP Research