Update README.md
README.md
CHANGED
|
@@ -5,9 +5,13 @@ tags:
|
|
| 5 |
- Guide
|
| 6 |
- Feature Engineering
|
| 7 |
---
|
| 8 |
# Feature Engineering & Feature Selection
|
| 9 |
|
| 10 |
-
A comprehensive guide [[pdf]](https://
|
| 11 |
|
| 12 |
## Motivation
|
| 13 |
|
|
@@ -27,11 +31,11 @@ Data and feature has the most impact on a ML project and sets the limit of how w
|
|
| 27 |
|
| 28 |
Download the PDF here:
|
| 29 |
|
| 30 |
-
- [**PDF Download**](https://
|
| 31 |
|
| 32 |
Same, but in markdown:
|
| 33 |
|
| 34 |
-
- [**Markdown Download**](https://
|
| 35 |
|
| 36 |
The PDF is much more readable, while the Markdown version has auto-generated anchor links that can be navigated to from external sources. GitHub renders Markdown with complex syntax poorly, so I suggest reading the PDF, or downloading the repo and reading the Markdown with [Typora](https://typora.io/).
|
| 37 |
|
|
@@ -74,103 +78,103 @@ Below is a list of methods currently implemented in the repo.
|
|
| 74 |
|
| 75 |
- 1.1 Variables
|
| 76 |
- 1.2 Variable Identification
|
| 77 |
-
- Check Data Types [[guide]](https://
|
| 78 |
- 1.3 Univariate Analysis
|
| 79 |
-
- Descriptive Analysis [[guide]](https://
|
| 80 |
-
- Discrete Variable Barplot [[guide]](https://
|
| 81 |
-
- Discrete Variable Countplot [[guide]](https://
|
| 82 |
-
- Discrete Variable Boxplot [[guide]](https://
|
| 83 |
-
- Continuous Variable Distplot [[guide]](https://
|
| 84 |
- 1.4 Bi-variate Analysis
|
| 85 |
-
- Scatter Plot [[guide]](https://
|
| 86 |
-
- Correlation Plot [[guide]](https://
|
| 87 |
-
- Heat Map [[guide]](https://
|
| 88 |
|
| 89 |
**2. Feature Cleaning**
|
| 90 |
|
| 91 |
- 2.1 Missing Values
|
| 92 |
-
- Missing Value Check [[guide]](https://
|
| 93 |
-
- Listwise Deletion [[guide]](https://
|
| 94 |
-
- Mean/Median/Mode Imputation [[guide]](https://
|
| 95 |
-
- End of distribution Imputation [[guide]](https://
|
| 96 |
-
- Random Imputation [[guide]](https://
|
| 97 |
-
- Arbitrary Value Imputation [[guide]](https://
|
| 98 |
-
- Add a variable to denote NA [[guide]](https://
|
| 99 |
- 2.2 Outliers
|
| 100 |
-
- Detect by Arbitrary Boundary [[guide]](https://
|
| 101 |
-
- Detect by Mean & Standard Deviation [[guide]](https://
|
| 102 |
-
- Detect by IQR [[guide]](https://
|
| 103 |
-
- Detect by MAD [[guide]](https://
|
| 104 |
-
- Mean/Median/Mode Imputation [[guide]](https://
|
| 105 |
-
- Discretization [[guide]](https://
|
| 106 |
-
- Imputation with Arbitrary Value [[guide]](https://
|
| 107 |
-
- Winsorization [[guide]](https://
|
| 108 |
-
- Discard Outliers [[guide]](https://
|
| 109 |
- 2.3 Rare Values
|
| 110 |
-
- Mode Imputation [[guide]](https://
|
| 111 |
-
- Grouping into One New Category [[guide]](https://
|
| 112 |
- 2.4 High Cardinality
|
| 113 |
-
- Grouping Labels with Business Understanding [[guide]](https://
|
| 114 |
-
- Grouping Labels with Rare Occurrence into One Category [[guide]](https://
|
| 115 |
-
- Grouping Labels with Decision Tree [[guide]](https://
|
| 116 |
|
| 117 |
**3. Feature Engineering**
|
| 118 |
- 3.1 Feature Scaling
|
| 119 |
-
- Normalization - Standardization [[guide]](https://
|
| 120 |
-
- Min-Max Scaling [[guide]](https://
|
| 121 |
-
- Robust Scaling [[guide]](https://
|
| 122 |
- 3.2 Discretize
|
| 123 |
-
- Equal Width Binning [[guide]](https://
|
| 124 |
-
- Equal Frequency Binning [[guide]](https://
|
| 125 |
-
- K-means Binning [[guide]](https://
|
| 126 |
-
- Discretization by Decision Trees [[guide]](https://
|
| 127 |
-
- ChiMerge [[guide]](https://
|
| 128 |
- 3.3 Feature Encoding
|
| 129 |
-
- One-hot Encoding [[guide]](https://
|
| 130 |
-
- Ordinal-Encoding [[guide]](https://
|
| 131 |
-
- Count/frequency Encoding [[guide]](https://
|
| 132 |
-
- Mean Encoding [[guide]](https://
|
| 133 |
-
- WOE Encoding [[guide]](https://
|
| 134 |
-
- Target Encoding [[guide]](https://
|
| 135 |
- 3.4 Feature Transformation
|
| 136 |
-
- Logarithmic Transformation [[guide]](https://
|
| 137 |
-
- Reciprocal Transformation [[guide]](https://
|
| 138 |
-
- Square Root Transformation [[guide]](https://
|
| 139 |
-
- Exponential Transformation [[guide]](https://
|
| 140 |
-
- Box-cox Transformation [[guide]](https://
|
| 141 |
-
- Quantile Transformation [[guide]](https://
|
| 142 |
- 3.5 Feature Generation
|
| 143 |
-
- Missing Data Derived [[guide]](https://
|
| 144 |
-
- Simple Stats [[guide]](https://
|
| 145 |
-
- Crossing [[guide]](https://
|
| 146 |
-
- Ratio & Proportion [[guide]](https://
|
| 147 |
-
- Cross Product [[guide]](https://
|
| 148 |
-
- Polynomial [[guide]](https://
|
| 149 |
-
- Feature Learning by Tree [[guide]](https://
|
| 150 |
-
- Feature Learning by Deep Network [[guide]](https://
|
| 151 |
|
| 152 |
**4. Feature Selection**
|
| 153 |
|
| 154 |
- 4.1 Filter Method
|
| 155 |
-
- Variance [[guide]](https://
|
| 156 |
-
- Correlation [[guide]](https://
|
| 157 |
-
- Chi-Square [[guide]](https://
|
| 158 |
-
- Mutual Information Filter [[guide]](https://
|
| 159 |
-
- Information Value (IV) [[guide]](https://
|
| 160 |
- 4.2 Wrapper Method
|
| 161 |
-
- Forward Selection [[guide]](https://
|
| 162 |
-
- Backward Elimination [[guide]](https://
|
| 163 |
-
- Exhaustive Feature Selection [[guide]](https://
|
| 164 |
-
- Genetic Algorithm [[guide]](https://
|
| 165 |
- 4.3 Embedded Method
|
| 166 |
-
- Lasso (L1) [[guide]](https://
|
| 167 |
-
- Random Forest Importance [[guide]](https://
|
| 168 |
-
- Gradient Boosted Trees Importance [[guide]](https://
|
| 169 |
- 4.4 Feature Shuffling
|
| 170 |
-
- Random Shuffling [[guide]](https://
|
| 171 |
- 4.5 Hybrid Method
|
| 172 |
-
- Recursive Feature Elimination [[guide]](https://
|
| 173 |
-
- Recursive Feature Addition [[guide]](https://
|
| 174 |
|
| 175 |
|
| 176 |
|
|
|
|
| 5 |
- Guide
|
| 6 |
- Feature Engineering
|
| 7 |
---
|
| 8 |
+
# Disclaimer
|
| 9 |
+
This guide is copied from the GitHub repository [ashishpatel26/Amazing-Feature-Engineering](https://github.com/ashishpatel26/Amazing-Feature-Engineering).
|
| 10 |
+
In this Hugging Face repository, we use it for our students, because their dissertations relate to feature engineering for recommender systems.
|
| 11 |
+
|
| 12 |
# Feature Engineering & Feature Selection
|
| 13 |
|
| 14 |
+
A comprehensive guide [[pdf]](https://huggingface.co/recommender-system/feature-engineering-guide/blob/main/A%20Short%20Guide%20for%20Feature%20Engineering%20and%20Feature%20Selection.pdf) [[markdown]](https://huggingface.co/recommender-system/feature-engineering-guide/blob/main/A%20Short%20Guide%20for%20Feature%20Engineering%20and%20Feature%20Selection.md) for **Feature Engineering** and **Feature Selection**, with implementations and examples in Python.
|
| 15 |
|
| 16 |
## Motivation
|
| 17 |
|
|
|
|
| 31 |
|
| 32 |
Download the PDF here:
|
| 33 |
|
| 34 |
+
- [**PDF Download**](https://huggingface.co/recommender-system/feature-engineering-guide/blob/main/A%20Short%20Guide%20for%20Feature%20Engineering%20and%20Feature%20Selection.pdf)
|
| 35 |
|
| 36 |
Same, but in markdown:
|
| 37 |
|
| 38 |
+
- [**Markdown Download**](https://huggingface.co/recommender-system/feature-engineering-guide/blob/main/A%20Short%20Guide%20for%20Feature%20Engineering%20and%20Feature%20Selection.md)
|
| 39 |
|
| 40 |
The PDF is much more readable, while the Markdown version has auto-generated anchor links that can be navigated to from external sources. GitHub renders Markdown with complex syntax poorly, so I suggest reading the PDF, or downloading the repo and reading the Markdown with [Typora](https://typora.io/).
|
| 41 |
|
|
|
|
| 78 |
|
| 79 |
- 1.1 Variables
|
| 80 |
- 1.2 Variable Identification
|
| 81 |
+
- Check Data Types [[guide]](https://huggingface.co/recommender-system/feature-engineering-guide/blob/main/A%20Short%20Guide%20for%20Feature%20Engineering%20and%20Feature%20Selection.md#12-variable-identification) [[demo]](https://huggingface.co/recommender-system/feature-engineering-guide/tree/main/1_Demo_Data_Explore.ipynb)
|
| 82 |
- 1.3 Univariate Analysis
|
| 83 |
+
- Descriptive Analysis [[guide]](https://huggingface.co/recommender-system/feature-engineering-guide/blob/main/A%20Short%20Guide%20for%20Feature%20Engineering%20and%20Feature%20Selection.md#13-univariate-analysis) [[demo]](https://huggingface.co/recommender-system/feature-engineering-guide/tree/main/1_Demo_Data_Explore.ipynb)
|
| 84 |
+
- Discrete Variable Barplot [[guide]](https://huggingface.co/recommender-system/feature-engineering-guide/blob/main/A%20Short%20Guide%20for%20Feature%20Engineering%20and%20Feature%20Selection.md#13-univariate-analysis) [[demo]](https://huggingface.co/recommender-system/feature-engineering-guide/tree/main/1_Demo_Data_Explore.ipynb)
|
| 85 |
+
- Discrete Variable Countplot [[guide]](https://huggingface.co/recommender-system/feature-engineering-guide/blob/main/A%20Short%20Guide%20for%20Feature%20Engineering%20and%20Feature%20Selection.md#13-univariate-analysis) [[demo]](https://huggingface.co/recommender-system/feature-engineering-guide/tree/main/1_Demo_Data_Explore.ipynb)
|
| 86 |
+
- Discrete Variable Boxplot [[guide]](https://huggingface.co/recommender-system/feature-engineering-guide/blob/main/A%20Short%20Guide%20for%20Feature%20Engineering%20and%20Feature%20Selection.md#13-univariate-analysis) [[demo]](https://huggingface.co/recommender-system/feature-engineering-guide/tree/main/1_Demo_Data_Explore.ipynb)
|
| 87 |
+
- Continuous Variable Distplot [[guide]](https://huggingface.co/recommender-system/feature-engineering-guide/blob/main/A%20Short%20Guide%20for%20Feature%20Engineering%20and%20Feature%20Selection.md#13-univariate-analysis) [[demo]](https://huggingface.co/recommender-system/feature-engineering-guide/tree/main/1_Demo_Data_Explore.ipynb)
|
| 88 |
- 1.4 Bi-variate Analysis
|
| 89 |
+
- Scatter Plot [[guide]](https://huggingface.co/recommender-system/feature-engineering-guide/blob/main/A%20Short%20Guide%20for%20Feature%20Engineering%20and%20Feature%20Selection.md#14-bi-variate-analysis) [[demo]](https://huggingface.co/recommender-system/feature-engineering-guide/tree/main/1_Demo_Data_Explore.ipynb)
|
| 90 |
+
- Correlation Plot [[guide]](https://huggingface.co/recommender-system/feature-engineering-guide/blob/main/A%20Short%20Guide%20for%20Feature%20Engineering%20and%20Feature%20Selection.md#14-bi-variate-analysis) [[demo]](https://huggingface.co/recommender-system/feature-engineering-guide/tree/main/1_Demo_Data_Explore.ipynb)
|
| 91 |
+
- Heat Map [[guide]](https://huggingface.co/recommender-system/feature-engineering-guide/blob/main/A%20Short%20Guide%20for%20Feature%20Engineering%20and%20Feature%20Selection.md#14-bi-variate-analysis) [[demo]](https://huggingface.co/recommender-system/feature-engineering-guide/tree/main/1_Demo_Data_Explore.ipynb)
|
| 92 |
|
| 93 |
**2. Feature Cleaning**
|
| 94 |
|
| 95 |
- 2.1 Missing Values
|
| 96 |
+
- Missing Value Check [[guide]](https://huggingface.co/recommender-system/feature-engineering-guide/blob/main/A%20Short%20Guide%20for%20Feature%20Engineering%20and%20Feature%20Selection.md#214-how-to-handle-missing-data) [[demo]](https://huggingface.co/recommender-system/feature-engineering-guide/tree/main/2.1_Demo_Missing_Data.ipynb)
|
| 97 |
+
- Listwise Deletion [[guide]](https://huggingface.co/recommender-system/feature-engineering-guide/blob/main/A%20Short%20Guide%20for%20Feature%20Engineering%20and%20Feature%20Selection.md#214-how-to-handle-missing-data) [[demo]](https://huggingface.co/recommender-system/feature-engineering-guide/tree/main/2.1_Demo_Missing_Data.ipynb)
|
| 98 |
+
- Mean/Median/Mode Imputation [[guide]](https://huggingface.co/recommender-system/feature-engineering-guide/blob/main/A%20Short%20Guide%20for%20Feature%20Engineering%20and%20Feature%20Selection.md#214-how-to-handle-missing-data) [[demo]](https://huggingface.co/recommender-system/feature-engineering-guide/tree/main/2.1_Demo_Missing_Data.ipynb)
|
| 99 |
+
- End of distribution Imputation [[guide]](https://huggingface.co/recommender-system/feature-engineering-guide/blob/main/A%20Short%20Guide%20for%20Feature%20Engineering%20and%20Feature%20Selection.md#214-how-to-handle-missing-data) [[demo]](https://huggingface.co/recommender-system/feature-engineering-guide/tree/main/2.1_Demo_Missing_Data.ipynb)
|
| 100 |
+
- Random Imputation [[guide]](https://huggingface.co/recommender-system/feature-engineering-guide/blob/main/A%20Short%20Guide%20for%20Feature%20Engineering%20and%20Feature%20Selection.md#214-how-to-handle-missing-data) [[demo]](https://huggingface.co/recommender-system/feature-engineering-guide/tree/main/2.1_Demo_Missing_Data.ipynb)
|
| 101 |
+
- Arbitrary Value Imputation [[guide]](https://huggingface.co/recommender-system/feature-engineering-guide/blob/main/A%20Short%20Guide%20for%20Feature%20Engineering%20and%20Feature%20Selection.md#214-how-to-handle-missing-data) [[demo]](https://huggingface.co/recommender-system/feature-engineering-guide/tree/main/2.1_Demo_Missing_Data.ipynb)
|
| 102 |
+
- Add a variable to denote NA [[guide]](https://huggingface.co/recommender-system/feature-engineering-guide/blob/main/A%20Short%20Guide%20for%20Feature%20Engineering%20and%20Feature%20Selection.md#214-how-to-handle-missing-data) [[demo]](https://huggingface.co/recommender-system/feature-engineering-guide/tree/main/2.1_Demo_Missing_Data.ipynb)
|
| 103 |
- 2.2 Outliers
|
| 104 |
+
- Detect by Arbitrary Boundary [[guide]](https://huggingface.co/recommender-system/feature-engineering-guide/blob/main/A%20Short%20Guide%20for%20Feature%20Engineering%20and%20Feature%20Selection.md#222-outlier-detection) [[demo]](https://huggingface.co/recommender-system/feature-engineering-guide/tree/main/2.2_Demo_Outlier.ipynb)
|
| 105 |
+
- Detect by Mean & Standard Deviation [[guide]](https://huggingface.co/recommender-system/feature-engineering-guide/blob/main/A%20Short%20Guide%20for%20Feature%20Engineering%20and%20Feature%20Selection.md#222-outlier-detection) [[demo]](https://huggingface.co/recommender-system/feature-engineering-guide/tree/main/2.2_Demo_Outlier.ipynb)
|
| 106 |
+
- Detect by IQR [[guide]](https://huggingface.co/recommender-system/feature-engineering-guide/blob/main/A%20Short%20Guide%20for%20Feature%20Engineering%20and%20Feature%20Selection.md#222-outlier-detection) [[demo]](https://huggingface.co/recommender-system/feature-engineering-guide/tree/main/2.2_Demo_Outlier.ipynb)
|
| 107 |
+
- Detect by MAD [[guide]](https://huggingface.co/recommender-system/feature-engineering-guide/blob/main/A%20Short%20Guide%20for%20Feature%20Engineering%20and%20Feature%20Selection.md#222-outlier-detection) [[demo]](https://huggingface.co/recommender-system/feature-engineering-guide/tree/main/2.2_Demo_Outlier.ipynb)
|
| 108 |
+
- Mean/Median/Mode Imputation [[guide]](https://huggingface.co/recommender-system/feature-engineering-guide/blob/main/A%20Short%20Guide%20for%20Feature%20Engineering%20and%20Feature%20Selection.md#223-how-to-handle-outliers) [[demo]](https://huggingface.co/recommender-system/feature-engineering-guide/tree/main/2.2_Demo_Outlier.ipynb)
|
| 109 |
+
- Discretization [[guide]](https://huggingface.co/recommender-system/feature-engineering-guide/blob/main/A%20Short%20Guide%20for%20Feature%20Engineering%20and%20Feature%20Selection.md#223-how-to-handle-outliers) [[demo]](https://huggingface.co/recommender-system/feature-engineering-guide/tree/main/3.2_Demo_Discretisation.ipynb)
|
| 110 |
+
- Imputation with Arbitrary Value [[guide]](https://huggingface.co/recommender-system/feature-engineering-guide/blob/main/A%20Short%20Guide%20for%20Feature%20Engineering%20and%20Feature%20Selection.md#223-how-to-handle-outliers) [[demo]](https://huggingface.co/recommender-system/feature-engineering-guide/tree/main/2.2_Demo_Outlier.ipynb)
|
| 111 |
+
- Winsorization [[guide]](https://huggingface.co/recommender-system/feature-engineering-guide/blob/main/A%20Short%20Guide%20for%20Feature%20Engineering%20and%20Feature%20Selection.md#223-how-to-handle-outliers) [[demo]](https://huggingface.co/recommender-system/feature-engineering-guide/tree/main/2.2_Demo_Outlier.ipynb)
|
| 112 |
+
- Discard Outliers [[guide]](https://huggingface.co/recommender-system/feature-engineering-guide/blob/main/A%20Short%20Guide%20for%20Feature%20Engineering%20and%20Feature%20Selection.md#223-how-to-handle-outliers) [[demo]](https://huggingface.co/recommender-system/feature-engineering-guide/tree/main/2.2_Demo_Outlier.ipynb)
|
| 113 |
- 2.3 Rare Values
|
| 114 |
+
- Mode Imputation [[guide]](https://huggingface.co/recommender-system/feature-engineering-guide/blob/main/A%20Short%20Guide%20for%20Feature%20Engineering%20and%20Feature%20Selection.md#23-rare-values) [[demo]](https://huggingface.co/recommender-system/feature-engineering-guide/tree/main/2.3_Demo_Rare_Values.ipynb)
|
| 115 |
+
- Grouping into One New Category [[guide]](https://huggingface.co/recommender-system/feature-engineering-guide/blob/main/A%20Short%20Guide%20for%20Feature%20Engineering%20and%20Feature%20Selection.md#23-rare-values) [[demo]](https://huggingface.co/recommender-system/feature-engineering-guide/tree/main/2.3_Demo_Rare_Values.ipynb)
|
| 116 |
- 2.4 High Cardinality
|
| 117 |
+
- Grouping Labels with Business Understanding [[guide]](https://huggingface.co/recommender-system/feature-engineering-guide/blob/main/A%20Short%20Guide%20for%20Feature%20Engineering%20and%20Feature%20Selection.md#24-high-cardinality)
|
| 118 |
+
- Grouping Labels with Rare Occurrence into One Category [[guide]](https://huggingface.co/recommender-system/feature-engineering-guide/blob/main/A%20Short%20Guide%20for%20Feature%20Engineering%20and%20Feature%20Selection.md#24-high-cardinality) [[demo]](https://huggingface.co/recommender-system/feature-engineering-guide/tree/main/2.3_Demo_Rare_Values.ipynb)
|
| 119 |
+
- Grouping Labels with Decision Tree [[guide]](https://huggingface.co/recommender-system/feature-engineering-guide/blob/main/A%20Short%20Guide%20for%20Feature%20Engineering%20and%20Feature%20Selection.md#24-high-cardinality) [[demo]](https://huggingface.co/recommender-system/feature-engineering-guide/tree/main/3.2_Demo_Discretisation.ipynb)
|
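As a rough illustration of two of the outlier methods listed above (Detect by IQR and Winsorization), here is a minimal standard-library sketch. It is not the code from the linked demo notebooks: the function names `iqr_bounds` and `winsorize`, the multiplier `k=1.5`, and the sample data are all assumptions made for illustration. Capping at the IQR fences is only one possible choice of caps; percentile-based caps are also common.

```python
import statistics


def iqr_bounds(values, k=1.5):
    """Return (lower, upper) fences: Q1 - k*IQR and Q3 + k*IQR.

    k=1.5 is a conventional default, assumed here for illustration.
    """
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr


def winsorize(values, lower, upper):
    """Cap each value to the range [lower, upper] instead of discarding it."""
    return [min(max(v, lower), upper) for v in values]


# Hypothetical sample with one obviously extreme value.
data = [10, 12, 11, 13, 12, 14, 11, 95]

lower, upper = iqr_bounds(data)
outliers = [v for v in data if v < lower or v > upper]  # detection step
capped = winsorize(data, lower, upper)                  # handling step

print("fences:", lower, upper)
print("outliers:", outliers)
print("capped:", capped)
```

Here the value 95 is flagged as an outlier and replaced by the upper fence, while every other value is left unchanged.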
| 120 |
|
| 121 |
**3. Feature Engineering**
|
| 122 |
- 3.1 Feature Scaling
|
| 123 |
+
- Normalization - Standardization [[guide]](https://huggingface.co/recommender-system/feature-engineering-guide/blob/main/A%20Short%20Guide%20for%20Feature%20Engineering%20and%20Feature%20Selection.md#31-feature-scaling) [[demo]](https://huggingface.co/recommender-system/feature-engineering-guide/tree/main/3.1_Demo_Feature_Scaling.ipynb)
|
| 124 |
+
- Min-Max Scaling [[guide]](https://huggingface.co/recommender-system/feature-engineering-guide/blob/main/A%20Short%20Guide%20for%20Feature%20Engineering%20and%20Feature%20Selection.md#31-feature-scaling) [[demo]](https://huggingface.co/recommender-system/feature-engineering-guide/tree/main/3.1_Demo_Feature_Scaling.ipynb)
|
| 125 |
+
- Robust Scaling [[guide]](https://huggingface.co/recommender-system/feature-engineering-guide/blob/main/A%20Short%20Guide%20for%20Feature%20Engineering%20and%20Feature%20Selection.md#31-feature-scaling) [[demo]](https://huggingface.co/recommender-system/feature-engineering-guide/tree/main/3.1_Demo_Feature_Scaling.ipynb)
|
| 126 |
- 3.2 Discretize
|
| 127 |
+
- Equal Width Binning [[guide]](https://huggingface.co/recommender-system/feature-engineering-guide/blob/main/A%20Short%20Guide%20for%20Feature%20Engineering%20and%20Feature%20Selection.md#32-discretize) [[demo]](https://huggingface.co/recommender-system/feature-engineering-guide/tree/main/3.2_Demo_Discretisation.ipynb)
|
| 128 |
+
- Equal Frequency Binning [[guide]](https://huggingface.co/recommender-system/feature-engineering-guide/blob/main/A%20Short%20Guide%20for%20Feature%20Engineering%20and%20Feature%20Selection.md#32-discretize) [[demo]](https://huggingface.co/recommender-system/feature-engineering-guide/tree/main/3.2_Demo_Discretisation.ipynb)
|
| 129 |
+
- K-means Binning [[guide]](https://huggingface.co/recommender-system/feature-engineering-guide/blob/main/A%20Short%20Guide%20for%20Feature%20Engineering%20and%20Feature%20Selection.md#32-discretize) [[demo]](https://huggingface.co/recommender-system/feature-engineering-guide/tree/main/3.2_Demo_Discretisation.ipynb)
|
| 130 |
+
- Discretization by Decision Trees [[guide]](https://huggingface.co/recommender-system/feature-engineering-guide/blob/main/A%20Short%20Guide%20for%20Feature%20Engineering%20and%20Feature%20Selection.md#32-discretize) [[demo]](https://huggingface.co/recommender-system/feature-engineering-guide/tree/main/3.2_Demo_Discretisation.ipynb)
|
| 131 |
+
- ChiMerge [[guide]](https://huggingface.co/recommender-system/feature-engineering-guide/blob/main/A%20Short%20Guide%20for%20Feature%20Engineering%20and%20Feature%20Selection.md#32-discretize) [[demo]](https://huggingface.co/recommender-system/feature-engineering-guide/tree/main/3.2_Demo_Discretisation.ipynb)
|
| 132 |
- 3.3 Feature Encoding
|
| 133 |
+
- One-hot Encoding [[guide]](https://huggingface.co/recommender-system/feature-engineering-guide/blob/main/A%20Short%20Guide%20for%20Feature%20Engineering%20and%20Feature%20Selection.md#33-feature-encoding) [[demo]](https://huggingface.co/recommender-system/feature-engineering-guide/tree/main/3.3_Demo_Feature_Encoding.ipynb)
|
| 134 |
+
- Ordinal-Encoding [[guide]](https://huggingface.co/recommender-system/feature-engineering-guide/blob/main/A%20Short%20Guide%20for%20Feature%20Engineering%20and%20Feature%20Selection.md#33-feature-encoding) [[demo]](https://huggingface.co/recommender-system/feature-engineering-guide/tree/main/3.3_Demo_Feature_Encoding.ipynb)
|
| 135 |
+
- Count/frequency Encoding [[guide]](https://huggingface.co/recommender-system/feature-engineering-guide/blob/main/A%20Short%20Guide%20for%20Feature%20Engineering%20and%20Feature%20Selection.md#33-feature-encoding)
|
| 136 |
+
- Mean Encoding [[guide]](https://huggingface.co/recommender-system/feature-engineering-guide/blob/main/A%20Short%20Guide%20for%20Feature%20Engineering%20and%20Feature%20Selection.md#33-feature-encoding) [[demo]](https://huggingface.co/recommender-system/feature-engineering-guide/tree/main/3.3_Demo_Feature_Encoding.ipynb)
|
| 137 |
+
- WOE Encoding [[guide]](https://huggingface.co/recommender-system/feature-engineering-guide/blob/main/A%20Short%20Guide%20for%20Feature%20Engineering%20and%20Feature%20Selection.md#33-feature-encoding) [[demo]](https://huggingface.co/recommender-system/feature-engineering-guide/tree/main/3.3_Demo_Feature_Encoding.ipynb)
|
| 138 |
+
- Target Encoding [[guide]](https://huggingface.co/recommender-system/feature-engineering-guide/blob/main/A%20Short%20Guide%20for%20Feature%20Engineering%20and%20Feature%20Selection.md#33-feature-encoding) [[demo]](https://huggingface.co/recommender-system/feature-engineering-guide/tree/main/3.3_Demo_Feature_Encoding.ipynb)
|
| 139 |
- 3.4 Feature Transformation
|
| 140 |
+
- Logarithmic Transformation [[guide]](https://huggingface.co/recommender-system/feature-engineering-guide/blob/main/A%20Short%20Guide%20for%20Feature%20Engineering%20and%20Feature%20Selection.md#34-feature-transformation) [[demo]](https://huggingface.co/recommender-system/feature-engineering-guide/tree/main/3.4_Demo_Feature_Transformation.ipynb)
|
| 141 |
+
- Reciprocal Transformation [[guide]](https://huggingface.co/recommender-system/feature-engineering-guide/blob/main/A%20Short%20Guide%20for%20Feature%20Engineering%20and%20Feature%20Selection.md#34-feature-transformation) [[demo]](https://huggingface.co/recommender-system/feature-engineering-guide/tree/main/3.4_Demo_Feature_Transformation.ipynb)
|
| 142 |
+
- Square Root Transformation [[guide]](https://huggingface.co/recommender-system/feature-engineering-guide/blob/main/A%20Short%20Guide%20for%20Feature%20Engineering%20and%20Feature%20Selection.md#34-feature-transformation) [[demo]](https://huggingface.co/recommender-system/feature-engineering-guide/tree/main/3.4_Demo_Feature_Transformation.ipynb)
|
| 143 |
+
- Exponential Transformation [[guide]](https://huggingface.co/recommender-system/feature-engineering-guide/blob/main/A%20Short%20Guide%20for%20Feature%20Engineering%20and%20Feature%20Selection.md#34-feature-transformation) [[demo]](https://huggingface.co/recommender-system/feature-engineering-guide/tree/main/3.4_Demo_Feature_Transformation.ipynb)
|
| 144 |
+
- Box-cox Transformation [[guide]](https://huggingface.co/recommender-system/feature-engineering-guide/blob/main/A%20Short%20Guide%20for%20Feature%20Engineering%20and%20Feature%20Selection.md#34-feature-transformation) [[demo]](https://huggingface.co/recommender-system/feature-engineering-guide/tree/main/3.4_Demo_Feature_Transformation.ipynb)
|
| 145 |
+
- Quantile Transformation [[guide]](https://huggingface.co/recommender-system/feature-engineering-guide/blob/main/A%20Short%20Guide%20for%20Feature%20Engineering%20and%20Feature%20Selection.md#34-feature-transformation) [[demo]](https://huggingface.co/recommender-system/feature-engineering-guide/tree/main/3.4_Demo_Feature_Transformation.ipynb)
|
| 146 |
- 3.5 Feature Generation
|
| 147 |
+
- Missing Data Derived [[guide]](https://huggingface.co/recommender-system/feature-engineering-guide/blob/main/A%20Short%20Guide%20for%20Feature%20Engineering%20and%20Feature%20Selection.md#35-feature-generation) [[demo]](https://huggingface.co/recommender-system/feature-engineering-guide/tree/main/2.1_Demo_Missing_Data.ipynb)
|
| 148 |
+
- Simple Stats [[guide]](https://huggingface.co/recommender-system/feature-engineering-guide/blob/main/A%20Short%20Guide%20for%20Feature%20Engineering%20and%20Feature%20Selection.md#35-feature-generation)
|
| 149 |
+
- Crossing [[guide]](https://huggingface.co/recommender-system/feature-engineering-guide/blob/main/A%20Short%20Guide%20for%20Feature%20Engineering%20and%20Feature%20Selection.md#35-feature-generation)
|
| 150 |
+
- Ratio & Proportion [[guide]](https://huggingface.co/recommender-system/feature-engineering-guide/blob/main/A%20Short%20Guide%20for%20Feature%20Engineering%20and%20Feature%20Selection.md#35-feature-generation)
|
| 151 |
+
- Cross Product [[guide]](https://huggingface.co/recommender-system/feature-engineering-guide/blob/main/A%20Short%20Guide%20for%20Feature%20Engineering%20and%20Feature%20Selection.md#35-feature-generation)
|
| 152 |
+
- Polynomial [[guide]](https://huggingface.co/recommender-system/feature-engineering-guide/blob/main/A%20Short%20Guide%20for%20Feature%20Engineering%20and%20Feature%20Selection.md#35-feature-generation) [[demo]](https://huggingface.co/recommender-system/feature-engineering-guide/tree/main/3.5_Demo_Feature_Generation.ipynb)
|
| 153 |
+
- Feature Learning by Tree [[guide]](https://huggingface.co/recommender-system/feature-engineering-guide/blob/main/A%20Short%20Guide%20for%20Feature%20Engineering%20and%20Feature%20Selection.md#35-feature-generation) [[demo]](https://huggingface.co/recommender-system/feature-engineering-guide/tree/main/3.5_Demo_Feature_Generation.ipynb)
|
| 154 |
+
- Feature Learning by Deep Network [[guide]](https://huggingface.co/recommender-system/feature-engineering-guide/blob/main/A%20Short%20Guide%20for%20Feature%20Engineering%20and%20Feature%20Selection.md#35-feature-generation)
|
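Count/frequency Encoding is listed above with a guide link only. A minimal sketch of the idea, using only the Python standard library, might look like the following. The function name `frequency_encode` and the sample labels are hypothetical and are not taken from the guide.

```python
from collections import Counter


def frequency_encode(labels):
    """Replace each category label with its relative frequency in the data.

    Returns the encoded list and the label -> frequency mapping, so the same
    mapping can be reused on new data.
    """
    counts = Counter(labels)
    total = len(labels)
    mapping = {label: count / total for label, count in counts.items()}
    return [mapping[label] for label in labels], mapping


# Hypothetical categorical column.
cities = ["NY", "LA", "NY", "SF", "NY", "LA", "SF", "NY"]

encoded, mapping = frequency_encode(cities)
print(mapping)
print(encoded)
```

One caveat of this encoding is that two labels with the same frequency (here `LA` and `SF`) become indistinguishable after encoding.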
| 155 |
|
| 156 |
**4. Feature Selection**
|
| 157 |
|
| 158 |
- 4.1 Filter Method
|
| 159 |
+
- Variance [[guide]](https://huggingface.co/recommender-system/feature-engineering-guide/blob/main/A%20Short%20Guide%20for%20Feature%20Engineering%20and%20Feature%20Selection.md#41-filter-method) [[demo]](https://huggingface.co/recommender-system/feature-engineering-guide/tree/main/4.1_Demo_Feature_Selection_Filter.ipynb)
|
| 160 |
+
- Correlation [[guide]](https://huggingface.co/recommender-system/feature-engineering-guide/blob/main/A%20Short%20Guide%20for%20Feature%20Engineering%20and%20Feature%20Selection.md#41-filter-method) [[demo]](https://huggingface.co/recommender-system/feature-engineering-guide/tree/main/4.1_Demo_Feature_Selection_Filter.ipynb)
|
| 161 |
+
- Chi-Square [[guide]](https://huggingface.co/recommender-system/feature-engineering-guide/blob/main/A%20Short%20Guide%20for%20Feature%20Engineering%20and%20Feature%20Selection.md#41-filter-method) [[demo]](https://huggingface.co/recommender-system/feature-engineering-guide/tree/main/4.1_Demo_Feature_Selection_Filter.ipynb)
|
| 162 |
+
- Mutual Information Filter [[guide]](https://huggingface.co/recommender-system/feature-engineering-guide/blob/main/A%20Short%20Guide%20for%20Feature%20Engineering%20and%20Feature%20Selection.md#41-filter-method) [[demo]](https://huggingface.co/recommender-system/feature-engineering-guide/tree/main/4.1_Demo_Feature_Selection_Filter.ipynb)
|
| 163 |
+
- Information Value (IV) [[guide]](https://huggingface.co/recommender-system/feature-engineering-guide/blob/main/A%20Short%20Guide%20for%20Feature%20Engineering%20and%20Feature%20Selection.md#41-filter-method)
|
| 164 |
- 4.2 Wrapper Method
|
| 165 |
+
- Forward Selection [[guide]](https://huggingface.co/recommender-system/feature-engineering-guide/blob/main/A%20Short%20Guide%20for%20Feature%20Engineering%20and%20Feature%20Selection.md#42-wrapper-method) [[demo]](https://huggingface.co/recommender-system/feature-engineering-guide/tree/main/4.2_Demo_Feature_Selection_Wrapper.ipynb)
|
| 166 |
+
- Backward Elimination [[guide]](https://huggingface.co/recommender-system/feature-engineering-guide/blob/main/A%20Short%20Guide%20for%20Feature%20Engineering%20and%20Feature%20Selection.md#42-wrapper-method) [[demo]](https://huggingface.co/recommender-system/feature-engineering-guide/tree/main/4.2_Demo_Feature_Selection_Wrapper.ipynb)
|
| 167 |
+
- Exhaustive Feature Selection [[guide]](https://huggingface.co/recommender-system/feature-engineering-guide/blob/main/A%20Short%20Guide%20for%20Feature%20Engineering%20and%20Feature%20Selection.md#42-wrapper-method) [[demo]](https://huggingface.co/recommender-system/feature-engineering-guide/tree/main/4.2_Demo_Feature_Selection_Wrapper.ipynb)
|
| 168 |
+
- Genetic Algorithm [[guide]](https://huggingface.co/recommender-system/feature-engineering-guide/blob/main/A%20Short%20Guide%20for%20Feature%20Engineering%20and%20Feature%20Selection.md#42-wrapper-method)
|
| 169 |
- 4.3 Embedded Method
|
| 170 |
+
- Lasso (L1) [[guide]](https://huggingface.co/recommender-system/feature-engineering-guide/blob/main/A%20Short%20Guide%20for%20Feature%20Engineering%20and%20Feature%20Selection.md#43-embedded-method) [[demo]](https://huggingface.co/recommender-system/feature-engineering-guide/tree/main/4.3_Demo_Feature_Selection_Embedded.ipynb)
|
| 171 |
+
- Random Forest Importance [[guide]](https://huggingface.co/recommender-system/feature-engineering-guide/blob/main/A%20Short%20Guide%20for%20Feature%20Engineering%20and%20Feature%20Selection.md#43-embedded-method) [[demo]](https://huggingface.co/recommender-system/feature-engineering-guide/tree/main/4.3_Demo_Feature_Selection_Embedded.ipynb)
|
| 172 |
+
- Gradient Boosted Trees Importance [[guide]](https://huggingface.co/recommender-system/feature-engineering-guide/blob/main/A%20Short%20Guide%20for%20Feature%20Engineering%20and%20Feature%20Selection.md#43-embedded-method) [[demo]](https://huggingface.co/recommender-system/feature-engineering-guide/tree/main/4.3_Demo_Feature_Selection_Embedded.ipynb)
|
| 173 |
- 4.4 Feature Shuffling
|
| 174 |
+
- Random Shuffling [[guide]](https://huggingface.co/recommender-system/feature-engineering-guide/blob/main/A%20Short%20Guide%20for%20Feature%20Engineering%20and%20Feature%20Selection.md#44-feature-shuffling) [[demo]](https://huggingface.co/recommender-system/feature-engineering-guide/tree/main/4.4_Demo_Feature_Selection_Feature_Shuffling.ipynb)
|
| 175 |
- 4.5 Hybrid Method
|
| 176 |
+
- Recursive Feature Elimination [[guide]](https://huggingface.co/recommender-system/feature-engineering-guide/blob/main/A%20Short%20Guide%20for%20Feature%20Engineering%20and%20Feature%20Selection.md#451-recursive-feature-elimination) [[demo]](https://huggingface.co/recommender-system/feature-engineering-guide/tree/main/4.5_Demo_Feature_Selection_Hybrid_method.ipynb)
|
| 177 |
+
- Recursive Feature Addition [[guide]](https://huggingface.co/recommender-system/feature-engineering-guide/blob/main/A%20Short%20Guide%20for%20Feature%20Engineering%20and%20Feature%20Selection.md#452-recursive-feature-addition) [[demo]](https://huggingface.co/recommender-system/feature-engineering-guide/tree/main/4.5_Demo_Feature_Selection_Hybrid_method.ipynb)
|
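To give a feel for the simplest filter method in the list above (Variance), here is a small standard-library sketch that flags constant or near-constant features. It is an illustration under assumptions, not the implementation from the demo notebook: the function name `low_variance_features`, the default `threshold=0.0`, and the toy feature table are all hypothetical.

```python
import statistics


def low_variance_features(columns, threshold=0.0):
    """Return the names of features whose population variance is <= threshold.

    `columns` maps a feature name to its list of numeric values.
    """
    return [
        name
        for name, values in columns.items()
        if statistics.pvariance(values) <= threshold
    ]


# Hypothetical feature table; "is_customer" is constant and carries no signal.
features = {
    "age": [23, 35, 41, 29],
    "is_customer": [1, 1, 1, 1],
    "income": [40.0, 52.5, 61.0, 48.0],
}

to_drop = low_variance_features(features)
kept = {name: vals for name, vals in features.items() if name not in to_drop}

print("dropped:", to_drop)
print("kept:", list(kept))
```

Because variance depends on a feature's scale, a non-zero threshold is only meaningful when features are on comparable scales.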
| 178 |
|
| 179 |
|
| 180 |
|