Under this category, there are two techniques for normalizing feature data:

  1. Box-Cox Power Transformation
  2. Yeo-Johnson Transformation

1. Box-Cox Transformation Technique

  • It is a parametric power transformation technique.
  • Its main aim is to normalize the feature data, that is, to bring its distribution closer to a normal distribution.
  • It is applicable only when the feature data contains strictly positive values.
  • It is not defined for zero or negative values, so it cannot be applied to data that contains them.
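To make the positivity requirement concrete, here is a minimal sketch of the Box-Cox formula for a fixed lambda, written in plain Python. The helper name `box_cox` and the sample values are assumptions made for illustration; they are not the code used in this walkthrough.

```python
import math

def box_cox(values, lam):
    """Apply the Box-Cox transform with a fixed lambda (illustrative helper)."""
    if any(v <= 0 for v in values):
        # ln(x) and fractional powers are not defined for zero or negative x
        raise ValueError("Box-Cox requires strictly positive values")
    if lam == 0:
        # lambda = 0 is defined as the natural logarithm
        return [math.log(v) for v in values]
    return [(v ** lam - 1) / lam for v in values]

sample = [1.0, 4.0, 9.0]       # made-up positive values
print(box_cox(sample, 0.5))    # lambda = 0.5 behaves like a scaled square root
print(box_cox(sample, 0))      # lambda = 0 is the log transform
```

Passing a list that contains a zero or a negative number raises `ValueError` here, which mirrors the restriction described in the bullets above.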

Load libraries and read data

Descriptive Statistics

The hwy feature contains only positive values

Draw Histogram Plot Using Original Data

Normalize using Power Transformation function (Box-Cox)
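As a rough illustration of what this step does, the sketch below picks lambda by maximizing the normal profile log-likelihood over a grid and then applies the transform. It uses only the standard library, and the data are made-up right-skewed values, not the hwy column. The helper names are hypothetical. In practice a library routine such as `scipy.stats.boxcox` or scikit-learn's `PowerTransformer(method="box-cox")` would typically handle the estimation.

```python
import math

def box_cox(values, lam):
    """Box-Cox transform for strictly positive values and a fixed lambda."""
    if lam == 0:
        return [math.log(v) for v in values]
    return [(v ** lam - 1) / lam for v in values]

def log_likelihood(values, lam):
    """Profile log-likelihood of lambda under a normal model (constants dropped)."""
    y = box_cox(values, lam)
    n = len(y)
    mean = sum(y) / n
    var = sum((v - mean) ** 2 for v in y) / n
    return -0.5 * n * math.log(var) + (lam - 1) * sum(math.log(v) for v in values)

def skewness(values):
    """Moment-based sample skewness (0 for a symmetric sample)."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    return m3 / m2 ** 1.5

# Made-up, right-skewed positive values (an assumption, not the hwy data)
data = [12, 14, 15, 15, 16, 17, 17, 18, 19, 20, 22, 24, 26, 29, 35, 44]

# Simple grid search over lambda in [-3, 3] with step 0.01
grid = [i / 100 for i in range(-300, 301)]
best_lambda = max(grid, key=lambda lam: log_likelihood(data, lam))
transformed = box_cox(data, best_lambda)

print("lambda:", best_lambda)
print("skewness before:", round(skewness(data), 3))
print("skewness after: ", round(skewness(transformed), 3))
```

A grid search is used here only to keep the sketch dependency-free; library implementations generally use a numerical optimizer instead.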

Transformed Data:

Draw Histogram Plot Using Transformed Data

Comparison of Original Histogram and Box-Cox Transformed Histogram

Left: histogram of the original data (all values are positive). Right: histogram of the transformed data (normalized feature data).

2. Yeo-Johnson Transformation Technique

  • This is a generalization of the Box-Cox power transformation.
  • The Box-Cox power transformation cannot handle zero or negative values.
  • Yeo-Johnson can handle both positive and negative values of the feature.
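The way Yeo-Johnson extends Box-Cox to negative values can be sketched as a piecewise formula. The plain-Python helper below applies it for a fixed lambda. The name `yeo_johnson` and the sample values are assumptions for illustration only.

```python
import math

def yeo_johnson(values, lam):
    """Apply the Yeo-Johnson transform with a fixed lambda (illustrative helper)."""
    out = []
    for x in values:
        if x >= 0:
            # Non-negative branch: Box-Cox applied to x + 1
            if lam == 0:
                out.append(math.log(x + 1))
            else:
                out.append(((x + 1) ** lam - 1) / lam)
        else:
            # Negative branch: mirrored Box-Cox on -x + 1 with exponent 2 - lambda
            if lam == 2:
                out.append(-math.log(-x + 1))
            else:
                out.append(-((-x + 1) ** (2 - lam) - 1) / (2 - lam))
    return out

sample = [-3.0, -1.0, 0.0, 2.0, 5.0]   # made-up values with both signs
print(yeo_johnson(sample, 1))          # lambda = 1 leaves the values unchanged
print(yeo_johnson(sample, 0.5))        # lambda < 1 compresses the positive side
```

Because the negative branch shifts by one before taking powers or logs, every real input is valid, which is why no positivity check is needed here.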

Load libraries and read data

Descriptive statistics

Logistic Regression:

Summary:

Load libraries and read data:

The store_visits feature contains both negative and positive values

Draw Histogram plot with original data:

Transform the original store_visits data using the Yeo-Johnson Transformation
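A minimal sketch of this step, using only the standard library: lambda is chosen by a grid search on the normal profile log-likelihood and the transform is then applied. The values below are made-up, left-skewed numbers with mixed signs, not the actual store_visits column, and the helper names are hypothetical. With scikit-learn, `PowerTransformer(method="yeo-johnson")` would typically be used instead.

```python
import math

def yeo_johnson(values, lam):
    """Yeo-Johnson transform for a fixed lambda; accepts any real values."""
    out = []
    for x in values:
        if x >= 0:
            out.append(math.log(x + 1) if lam == 0 else ((x + 1) ** lam - 1) / lam)
        elif lam == 2:
            out.append(-math.log(-x + 1))
        else:
            out.append(-((-x + 1) ** (2 - lam) - 1) / (2 - lam))
    return out

def log_likelihood(values, lam):
    """Profile log-likelihood of lambda under a normal model (constants dropped)."""
    y = yeo_johnson(values, lam)
    n = len(y)
    mean = sum(y) / n
    var = sum((v - mean) ** 2 for v in y) / n
    # Jacobian term: (lambda - 1) * sum(sign(x) * ln(|x| + 1))
    jac = sum(math.copysign(1.0, x) * math.log(abs(x) + 1) for x in values)
    return -0.5 * n * math.log(var) + (lam - 1) * jac

def skewness(values):
    """Moment-based sample skewness (negative means a longer left tail)."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    return m3 / m2 ** 1.5

# Made-up, left-skewed values with both signs (an assumption, not store_visits)
data = [-6, -3, -1, 0, 1, 2, 3, 3, 4, 4, 5, 5, 5, 6, 6, 6]

# Simple grid search over lambda in [-2, 4] with step 0.01
grid = [i / 100 for i in range(-200, 401)]
best_lambda = max(grid, key=lambda lam: log_likelihood(data, lam))
transformed = yeo_johnson(data, best_lambda)

print("lambda:", best_lambda)
print("skewness before:", round(skewness(data), 3))
print("skewness after: ", round(skewness(transformed), 3))
```

For left-skewed data like this, a lambda above 1 is expected, since it stretches the positive side and compresses the negative tail.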

Draw Histogram Plot using transformed data

Comparison of Original Histogram before transformation and Histogram after transformation

Left: histogram before transformation (left-skewed). Right: histogram after transformation (normalized).