DerivedFunction committed on
Commit 81b0f9e (verified) · 1 Parent(s): 4e8ff75

Update README.md

Files changed (1)
  1. README.md +0 -8
README.md CHANGED
@@ -154,14 +154,6 @@ and may produce unexpected results compared to generic text classifiers. It is t
  A synthetic training row consists of 1-4 individual and mostly independent sentences extracted from various sources. The actual training and evaluation data, as well as coverage
  is found in `DerivedFunction/language-ner`.
 
- The data composition follows a strategic curriculum:
-
- * **60% Pure Documents:** Single-language sequences to establish strong baseline profiles for each language.
- * **30% Homogenous Mixed:** Documents containing one main language, and clear transitions between two or more languages to train boundary detection.
- * **10% Mixed with Noise:** Integration of "neutral" spans including code snippets, mathematical notation, emojis, symbols, and `rot_13` text tagged as `O` or their respective source to reduce hallucination.
-
-
-
 
  ## Training procedure
 