Scaling To Shape:
Under this heading there are four techniques for transforming skewed feature data:
- Log Transformation log(X)
- Square Root Transformation sqrt(X)
- Square Transformation X^2
- Exponential Transformation X^0.5
1. Log Transformation
- It is used to reduce the skewness of the data
- Data with a long tail (e.g., exponential growth) is handled well by this transformation
- It is effective for right-skewed (positively skewed) data
Usage:
- Useful for data showing exponential growth
- Useful for data with significant variation in magnitude
- Both of these characteristics can affect the performance of the model
Formula
X' = log_b(X)
Where:
- X' is the transformed value
- X is the original value (must be greater than 0)
- b is the base of the logarithm
Applicable to positively skewed data
Load libraries and read data
HAPPINESS (refer to the Happiness data shown in Skewness Calculation)

Descriptive Statistics
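As a minimal sketch of this step: the Happiness data is not reproduced in this section, so the values below are a hypothetical right-skewed sample, and the column name `Happiness` is an assumption. `describe()` returns the summary statistics and `skew()` the skewness; a positive value indicates a right-skewed feature.

```python
import pandas as pd

# Hypothetical right-skewed sample standing in for the Happiness data
df = pd.DataFrame({"Happiness": [1, 2, 2, 3, 3, 4, 5, 8, 20, 100]})

# Count, mean, standard deviation, min, quartiles, and max
summary = df["Happiness"].describe()
print(summary)

# Skewness > 0 means the long tail is on the right
skewness = df["Happiness"].skew()
print("Skewness:", skewness)
```

In a right-skewed feature the mean is usually pulled above the median by the long tail, which is what the summary shows for this sample.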

Draw Histogram Plot

Histogram using Original Data

Right Skewed
Transform using Log Transformation Technique
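The transformation step might look like the sketch below. The helper name `log_transformation` and the sample values are assumptions made for illustration (the actual Happiness data is not shown here), and the natural logarithm is used as the base.

```python
import numpy as np
import pandas as pd

def log_transformation(series):
    # Natural log of each value; every value must be greater than 0
    return np.log(series)

# Hypothetical right-skewed sample (not the actual Happiness data)
happiness = pd.Series([1, 2, 2, 3, 3, 4, 5, 8, 20, 100], dtype=float)

log_happiness = log_transformation(happiness)

# Skewness should move closer to 0 after the transformation
print("Skewness before:", happiness.skew())
print("Skewness after: ", log_happiness.skew())
```

If the feature contains zeros, `np.log1p` (which computes log(1 + X)) is a common alternative, since log(0) is undefined.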

Draw Histogram using transformed data

Compare Original Histogram and Transformed Histogram
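One way to draw the two histograms side by side is sketched below with matplotlib. The function name `plot_before_after`, the panel titles, and the sample values are assumptions for illustration, not the original plotting code.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; drop this line to open a plot window
import matplotlib.pyplot as plt
import numpy as np

def plot_before_after(original, transformed, bins=10):
    # Two panels side by side: original on the left, transformed on the right
    fig, (ax_before, ax_after) = plt.subplots(1, 2, figsize=(10, 4))
    ax_before.hist(original, bins=bins)
    ax_before.set_title("Before normalization")
    ax_after.hist(transformed, bins=bins)
    ax_after.set_title("After normalization")
    return fig

# Hypothetical right-skewed sample (not the actual Happiness data)
original = np.array([1, 2, 2, 3, 3, 4, 5, 8, 20, 100], dtype=float)
fig = plot_before_after(original, np.log(original))
```

Calling `plt.show()` (with an interactive backend) or `fig.savefig(...)` afterwards displays or stores the comparison.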

Before normalization: right skewed. After normalization: normalized.
2. Square Root Transformation
Uses the same Happiness data

Define the square root function and call it
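A minimal sketch of such a function follows. The name `squareroot_transformation` and the sample values are assumptions, since the original function and the Happiness data are not reproduced here.

```python
import numpy as np
import pandas as pd

def squareroot_transformation(series):
    # Square root of each value; values must be non-negative
    return np.sqrt(series)

# Hypothetical right-skewed sample (not the actual Happiness data)
happiness = pd.Series([1, 2, 2, 3, 3, 4, 5, 8, 20, 100], dtype=float)

sqrt_happiness = squareroot_transformation(happiness)

print("Skewness before:", happiness.skew())
print("Skewness after: ", sqrt_happiness.skew())
```

On this sample the square root reduces right skewness less strongly than the logarithm does, so it is a milder correction.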
Draw Histogram Plot


Before normalization: right skewed. After normalization: normalized.
3. Square Transformation
This is meant for left-skewed (negatively skewed) data. It is not applicable to right-skewed data.
- It is used to reduce left skewness in the data
- Apply it when you want to approach a more symmetrical distribution
- By squaring each value, it can help normalize distributions that are mildly skewed to the left
- Values should be non-negative so that squaring preserves their order
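Formula
X' = X^2
Define the function and call it later
import numpy as np

def Squared_transformation(Series):
    return np.square(Series)  # square each value

# df is the GPA DataFrame loaded in the next step
sqrttransformedgpa = Squared_transformation(df['GPA'])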
Load libraries and read GPA data

Descriptive Statistics

Draw Histogram plot for the original data


Left tail is long
Normalize using Square Transformation Technique
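The effect of this step can be sketched on a small example. The GPA values below are hypothetical (the actual GPA data is not shown in this section) and were chosen so that the left tail is long.

```python
import numpy as np
import pandas as pd

# Hypothetical left-skewed GPA values (not the actual GPA data)
df = pd.DataFrame({"GPA": [1.0, 2.0, 2.8, 3.2, 3.5, 3.7, 3.8, 3.9, 4.0]})

# Squaring stretches the large values more than the small ones,
# which pulls the long left tail in relative to the rest
squared_gpa = np.square(df["GPA"])

print("Skewness before:", df["GPA"].skew())
print("Skewness after: ", squared_gpa.skew())
```

For this sample the skewness stays negative but moves closer to 0, which matches the note above that the technique suits mildly left-skewed data.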


Draw Histogram Plot using Normalized data


4. Exponential Transformation
- Applies an exponential function to each element in the feature
- When the relationship between a feature and the target is exponential in nature, applying an exponential transformation can linearize the data
- This is useful for linear regression models, which assume that the relationship between variables is linear
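Formula
X' = X^exponent (the function below uses a default exponent of 0.5)
import numpy as np

def exponential_transformation(series, exponent=0.5):
    # Raise each value to the given exponent
    return np.power(series, exponent)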
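The function above raises each value to a fixed power. As a separate illustration of the linearizing effect described in the bullets, the sketch below uses `np.exp` instead, on the assumption that "exponential function" means e^X. The feature `x` and target `y` are hypothetical.

```python
import numpy as np

# Hypothetical feature and a target that grows exponentially with it
x = np.linspace(0.0, 3.0, 20)
y = 2.0 * np.exp(x) + 1.0

# Pearson correlation of the target with the raw and the transformed feature
r_raw = np.corrcoef(x, y)[0, 1]
r_exp = np.corrcoef(np.exp(x), y)[0, 1]

print("Correlation with x:     ", r_raw)
print("Correlation with exp(x):", r_exp)
```

Because `y` is an exact linear function of `exp(x)` here, the transformed feature correlates perfectly with the target, whereas the raw feature does not.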
This method is not suitable for any of the features in this dataset.

