adgw/quality_classifier_pl

adgw committed on Jul 10, 2025

Commit 5382e15 · verified · 1 Parent(s): 1e1df41

Update README.md

Files changed (1)

README.md +10 -10

README.md CHANGED

@@ -87,13 +87,13 @@ Ensure your project follows this structure:
 │   └── data1.jsonl
 │   └── data2.jsonl
 ├── models/
-│   ├── model.joblib               # The trained XGBoost model
-│   └── scaler.pkl                 # The scikit-learn scaler
-├── output/                        # Output directory for processed files
-├── dummy.py                       # The interactive testing script
-├── main_parquet_spacy_jsonl.py             # The main processing script jsonl
-├── main_parquet_spacy_parquet.py           # The main processing script parquet
-└── predictor_parquet_spacy.py     # The feature extraction module
+│   ├── model.joblib    # The trained XGBoost model
+│   └── scaler.pkl      # The scikit-learn scaler
+├── output/             # Output directory for processed files
+├── dummy.py            # The interactive testing script
+├── main_jsonl.py       # The main processing script jsonl
+├── main_parquet.py     # The main processing script parquet
+└── predictor.py        # The feature extraction module
 ```
 ## 5. Usage
@@ -110,9 +110,9 @@ The script is configured to run out-of-the-box. Simply place your data files in
 Open your terminal and execute the Python script:
 ```bash
-python -W ignore main_parquet_spacy_jsonl.py
+python -W ignore main_jsonl.py
 or
-python -W ignore main_parquet_spacy_parquet.py
+python -W ignore main_parquet.py
 ```
 ### Step 3: Check the Output
@@ -127,7 +127,7 @@ The script automatically skips files that have already been processed and exist
 ```
-python -W ignore main_parquet_spacy_parquet.py
+python -W ignore main_parquet.py
 Znaleziono 1 plików w folderze wejściowym
 input_parquet\docs.parquet