Update README.md
README.md
CHANGED
@@ -20,6 +20,7 @@ The script is designed for efficient, large-scale data processing. It leverages
 - **Scalable**: Capable of handling millions of documents by processing files sequentially and texts in parallel.
 - **Seamless Integration**: Appends classification results (`quality_ai` and `confidence`) directly to the original data, preserving all existing columns/keys.
 - **User-Friendly Progress**: Displays a `tqdm` progress bar to monitor the analysis in real-time.
+- **Language-Aware Filtering**: Automatically classifies all non-Polish texts as LOW quality, unless a multilingual mix (e.g., Polish-English) is detected, in which case the model’s prediction may vary accordingly.
 
 ## 3. How It Works
 
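
As a rough illustration of the Language-Aware Filtering rule added in this change, the sketch below shows one way such a gate could sit in front of a classifier. It is not the script's actual code. The function name `classify_with_language_gate`, the `detect_languages` and `predict_quality` callables, the `"pl"` language code, and the confidence of `1.0` for forced results are all assumptions made for illustration. Only the `LOW` label and the `quality_ai` and `confidence` keys come from the README. The sketch also assumes that a "multilingual mix" means a text in which Polish is one of the detected languages.

```python
from typing import Callable, Dict, Set, Tuple

# (quality label, confidence) as returned by a hypothetical model wrapper.
Prediction = Tuple[str, float]


def classify_with_language_gate(
    text: str,
    detect_languages: Callable[[str], Set[str]],
    predict_quality: Callable[[str], Prediction],
) -> Dict[str, object]:
    """Return a record with `quality_ai` and `confidence` for one text.

    Texts with no detected Polish are forced to LOW. Texts that contain
    Polish, alone or mixed with other languages, are passed to the model.
    """
    languages = detect_languages(text)

    if "pl" not in languages:
        # Assumption: a forced LOW result is reported with confidence 1.0.
        label, confidence = "LOW", 1.0
    else:
        # Polish-only or mixed (e.g. Polish-English): defer to the model.
        label, confidence = predict_quality(text)

    return {"quality_ai": label, "confidence": confidence}
```

Passing the detector and the model in as callables keeps the rule itself independent of any particular language-identification library or model checkpoint, so either can be swapped without touching the gate.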