Target encoding is a powerful technique in feature engineering. It converts categorical values into numbers derived from the target variable, which can improve model performance while keeping the encoded values easy to interpret.
Need for Target Encoding
Target encoding transforms categorical variables into numerical values by replacing each category with a statistic (such as the mean) of the target variable, calculated over the rows that belong to that category.
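To make the idea concrete, here is a minimal sketch of mean-based target encoding done by hand with pandas. The toy DataFrame and its values are made up for illustration and are not the sales dataset used in this article.

```python
import pandas as pd

# Made-up toy data (an assumption for illustration, not the article's dataset).
toy = pd.DataFrame({
    "PRODUCTLINE": ["Motorcycles", "Motorcycles", "Trains", "Trains", "Ships"],
    "SALES": [3000.0, 4000.0, 2000.0, 2500.0, 3600.0],
})

# Mean of the target for each category.
category_means = toy.groupby("PRODUCTLINE")["SALES"].mean()

# Replace each category label with the mean target value of its category.
toy["PRODUCTLINE_encoded"] = toy["PRODUCTLINE"].map(category_means)
print(toy)
```

Every row with the same category gets the same number, so the column becomes numeric without adding new columns.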
Here, SALES is the target (dependent) variable, and the other columns, such as PRODUCTLINE, CITY and COUNTRY, are categorical features. Next, check the dataset for missing values.
The dataset sales_sample does not have any missing values.
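A missing-value check of this kind could look like the sketch below. The small DataFrame is a hypothetical stand-in for the article's data; only the column names are taken from the text.

```python
import pandas as pd

# Hypothetical stand-in for the article's DataFrame (values are made up).
df = pd.DataFrame({
    "PRODUCTLINE": ["Motorcycles", "Trains", "Ships"],
    "CITY": ["NYC", "Paris", "Madrid"],
    "COUNTRY": ["USA", "France", "Spain"],
    "SALES": [3000.0, 2000.0, 3600.0],
})

# Count missing values in each column; all zeros means nothing is missing.
missing_per_column = df.isnull().sum()
print(missing_per_column)
```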
Load the libraries needed for target encoding.
We separate the features (X, all the independent categorical variables) from the target (y, the SALES column) using the drop function.
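The split into features and target might be written as follows. This is a sketch on a made-up stand-in DataFrame; the variable names X and y follow the text.

```python
import pandas as pd

# Hypothetical stand-in for the article's DataFrame (values are made up).
df = pd.DataFrame({
    "PRODUCTLINE": ["Motorcycles", "Trains", "Ships"],
    "CITY": ["NYC", "Paris", "Madrid"],
    "COUNTRY": ["USA", "France", "Spain"],
    "SALES": [3000.0, 2000.0, 3600.0],
})

# X keeps every column except the target; drop returns a new DataFrame
# and leaves df itself unchanged.
X = df.drop(columns=["SALES"])

# y is the target column on its own.
y = df["SALES"]

print(X.columns.tolist())
print(y.name)
```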
The category_encoders package must be installed first, for example with pip install category_encoders.
Import TargetEncoder from category_encoders and assign a TargetEncoder instance to a variable, here called encoder (any name works). Then use the fit_transform function to encode all of X based on y (the target variable).
Then concatenate the original df and the encoded X column-wise.
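The concatenation can be sketched as below. Two things here are assumptions rather than the article's code: the encoded frame is a stand-in built from plain per-category means (so the snippet runs with pandas alone), and the "_encoded" suffix is added so the encoded columns do not share names with the originals.

```python
import pandas as pd

# Hypothetical stand-in for the article's DataFrame (values are made up).
df = pd.DataFrame({
    "PRODUCTLINE": ["Motorcycles", "Motorcycles", "Trains"],
    "COUNTRY": ["USA", "France", "USA"],
    "SALES": [3000.0, 4000.0, 2000.0],
})
X = df.drop(columns=["SALES"])

# Stand-in for the encoder output: each category replaced by its mean SALES.
X_encoded = pd.DataFrame({
    col: df.groupby(col)["SALES"].transform("mean") for col in X.columns
})

# Place the encoded columns next to the original ones (axis=1 means column-wise).
result = pd.concat([df, X_encoded.add_suffix("_encoded")], axis=1)
print(result)
```

Keeping both versions side by side makes it easy to see which number each category was mapped to.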
After concatenation:
The PRODUCTLINE feature contains the following categories. Using a pivot table, we find that the average SALES value for Motorcycles is 3523.831843. Target encoding replaces the category Motorcycles with this number (the average SALES for Motorcycles).
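A pivot-table check of this kind might look like the following sketch. The data is made up, so the averages it prints are not the article's figures.

```python
import pandas as pd

# Hypothetical stand-in for the article's DataFrame (values are made up).
df = pd.DataFrame({
    "PRODUCTLINE": ["Motorcycles", "Motorcycles", "Trains", "Trains", "Ships"],
    "SALES": [3000.0, 4000.0, 2000.0, 2500.0, 3600.0],
})

# Average SALES per product line, one row per category.
pivot = pd.pivot_table(df, index="PRODUCTLINE", values="SALES", aggfunc="mean")
print(pivot)
```

Comparing these averages with the encoded column is a quick way to confirm what the encoder did for each category.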
City:
Country: