Update README.md
README.md
CHANGED
@@ -149,7 +149,7 @@ Note that as a general language tagging model, it can potentially get confused f
 The model is trained on a sentence with a minimum of four tokens, so it may not accurately classify very short and ambigous statements. Note that this model is experimental
 and may produce unexpected results compared to generic text classifiers. It is trained on cleaned text, therefore, "messy" text may unexpectedly produce different results.
 
-> Note that Romanized versions of any language
+> Note that Romanized versions of any language may only have minor representation in the training set, such as Romanized Russian, and Hindi.
 
 ### Training and Evaluation Data
 A synthetic training row consists of 1-4 individual and mostly independent sentences extracted from various sources. The actual training and evaluation data, as well as coverage