kyutai/dactory-models

EdouardGrave committed on Apr 30, 2025

Commit 974663c (verified) · 1 parent: 7735f05

Update README.md

Files changed (1)

README.md +8 -8


@@ -28,14 +28,6 @@ language:
 ---
 # Dactory models
-* **Model name**: Dactory models
-* **Languages**: Bulgarian, Czech, Danish, German, Greek, English, Spanish, Estonian, Finnish, French, Irish, Croatian, Hungarian, Italian, Lithuanian, Latvian, Maltese, Dutch, Polish, Portuguese, Romanian, Slovak, Slovenian, Swedish
-* **Author**: Kyutai
-* **Model type**: Classification
-* **License**: CC-BY-SA 4.0
-* **Version**: 1.0
-* **Released**: April 2025
 ## Model description
 This is a set of fastText-based models to evaluate the quality and domain of text, in the 24 official languages of the European Union.
@@ -48,6 +40,14 @@ Stack Exchange websites related to STEM (`stem`), Humanities (`hum`), pop cultur
 The models were trained to distinguish lines sampled uniformly from these different sources.
 To get training data for the languages other than English, we translated the English training set with MADLAD, except for the `rand` and `wiki` labels, for which data is readily available in all languages.
+* **Model name**: Dactory models
+* **Languages**: Bulgarian, Czech, Danish, German, Greek, English, Spanish, Estonian, Finnish, French, Irish, Croatian, Hungarian, Italian, Lithuanian, Latvian, Maltese, Dutch, Polish, Portuguese, Romanian, Slovak, Slovenian, Swedish
+* **Developed by**: Kyutai
+* **Model type**: Classification
+* **License**: CC-BY-SA 4.0
+* **Version**: 1.0
+* **Released**: April 2025
 ## Use cases
 These models can be used to evaluate the quality of text by estimating how similar it is to text from high-quality sources.
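
The use case described in the README can be illustrated with a minimal sketch. It is not taken from the repository. It assumes the classifiers are loaded with the `fasttext` Python package, that labels carry fastText's default `__label__` prefix, and that `rand` is the label to treat as the low-quality baseline. The helper names `quality_score` and `score_line` and the model path in the comments are hypothetical.

```python
from typing import Iterable, Sequence

LABEL_PREFIX = "__label__"  # fastText's default label prefix (assumed unchanged here)


def quality_score(
    labels: Sequence[str],
    probs: Iterable[float],
    low_quality: Sequence[str] = ("rand",),  # assumption: `rand` is the baseline class
) -> float:
    """Sum the probability assigned to every label outside `low_quality`."""
    score = 0.0
    for label, prob in zip(labels, probs):
        # Strip the prefix so labels compare against plain names like "wiki".
        name = label[len(LABEL_PREFIX):] if label.startswith(LABEL_PREFIX) else label
        if name not in low_quality:
            score += float(prob)
    # Predicted probabilities can sum to slightly more than 1, so clamp.
    return min(score, 1.0)


def score_line(model, line: str) -> float:
    """Score one line with any object exposing a fastText-style `predict`."""
    # fastText's predict rejects newline characters; k=-1 requests all labels.
    labels, probs = model.predict(line.replace("\n", " "), k=-1)
    return quality_score(labels, probs)


# Hypothetical usage (requires the `fasttext` package and a downloaded model file):
#   import fasttext
#   model = fasttext.load_model("path/to/model.bin")  # placeholder path
#   print(score_line(model, "The mitochondrion is the powerhouse of the cell."))
```

Treating everything except `rand` as high quality is one possible reading of the README. A stricter filter could pass a larger `low_quality` tuple or look at a single label's probability instead.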