Joblib
adgw committed
Commit 5382e15 · verified · 1 Parent(s): 1e1df41

Update README.md

Files changed (1)
  1. README.md +10 -10
README.md CHANGED
@@ -87,13 +87,13 @@ Ensure your project follows this structure:
 │   └── data1.jsonl
 │   └── data2.jsonl
 ├── models/
-│   ├── model.joblib # The trained XGBoost model
-│   └── scaler.pkl # The scikit-learn scaler
-├── output/ # Output directory for processed files
-├── dummy.py # The interactive testing script
-├── main_parquet_spacy_jsonl.py # The main processing script jsonl
-├── main_parquet_spacy_parquet.py # The main processing script parquet
-└── predictor_parquet_spacy.py # The feature extraction module
+│   ├── model.joblib # The trained XGBoost model
+│   └── scaler.pkl # The scikit-learn scaler
+├── output/ # Output directory for processed files
+├── dummy.py # The interactive testing script
+├── main_jsonl.py # The main processing script jsonl
+├── main_parquet.py # The main processing script parquet
+└── predictor.py # The feature extraction module
 ```

 ## 5. Usage
@@ -110,9 +110,9 @@ The script is configured to run out-of-the-box. Simply place your data files in
 Open your terminal and execute the Python script:

 ```bash
-python -W ignore main_parquet_spacy_jsonl.py
+python -W ignore main_jsonl.py
 or
-python -W ignore main_parquet_spacy_parquet.py
+python -W ignore main_parquet.py
 ```

 ### Step 3: Check the Output
@@ -127,7 +127,7 @@ The script automatically skips files that have already been processed and exist
 
 ```

-python -W ignore main_parquet_spacy_parquet.py
+python -W ignore main_parquet.py
 Znaleziono 1 plików w folderze wejściowym
 input_parquet\docs.parquet

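
The last hunk header notes that the script skips files that have already been processed, and the sample output begins with a Polish log line meaning "Found 1 files in the input folder". The diff does not show how the scripts implement this, so the following is only a minimal sketch of one way such a skip check could work. The function name `find_pending_files`, the `*.parquet` pattern, and the assumption that each output file keeps the name of its input file are all hypothetical and not taken from the repository.

```python
from pathlib import Path


def find_pending_files(input_dir: Path, output_dir: Path,
                       pattern: str = "*.parquet") -> list[Path]:
    """Return input files that have no same-named file in output_dir yet.

    Assumption (not confirmed by the diff): a processed file is written to
    output_dir under the same file name as its input.
    """
    pending = []
    for src in sorted(input_dir.glob(pattern)):
        if (output_dir / src.name).exists():
            continue  # an output with this name exists, so skip the input
        pending.append(src)
    return pending


if __name__ == "__main__":
    # Folder names follow the README excerpts: "input_parquet" appears in the
    # sample output and "output" in the project tree.
    todo = find_pending_files(Path("input_parquet"), Path("output"))
    print(f"{len(todo)} file(s) left to process")
```

Comparing by file name keeps the check cheap and lets an interrupted run resume, but it would not notice an input file that changed after it was processed.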