When a dataset contains many outliers, you can rescale a feature X using the formula

Robust_scaled_feature = (X - median(X)) / IQR(X)

Robust scaling centers X on its median and divides by its interquartile range (IQR = Q3 - Q1).

The median and the IQR are largely unaffected by outliers, which is what makes this scaling robust.
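As a rough illustration of why the median and IQR are called robust, the sketch below compares them with the mean and standard deviation before and after one value is replaced by an extreme outlier. The array values and the helper name `robust_scale` are made up for this example and are not taken from the churn data.

```python
import numpy as np

def robust_scale(x):
    """Scale a 1-D array as (x - median) / IQR."""
    x = np.asarray(x, dtype=float)
    q1, q3 = np.percentile(x, [25, 75])
    return (x - np.median(x)) / (q3 - q1)

# Illustrative values only (not from the churn data set)
clean = np.array([1, 2, 3, 4, 5, 6, 7, 8, 9], dtype=float)
dirty = clean.copy()
dirty[-1] = 900.0  # replace the largest value with an extreme outlier

for name, x in [("clean", clean), ("with outlier", dirty)]:
    q1, q3 = np.percentile(x, [25, 75])
    print(f"{name:>12}: mean={x.mean():.2f} std={x.std():.2f} "
          f"median={np.median(x):.2f} IQR={q3 - q1:.2f}")

scaled_clean = robust_scale(clean)
scaled_dirty = robust_scale(dirty)
```

In this toy case the mean and standard deviation move a great deal, while the median and IQR stay the same, so every non-outlier value receives the same scaled value in both arrays.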

Usage

Use this method when your data contains many outliers and you do not want them to influence the scale of your data or to weigh heavily on the prediction process.
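For readers who prefer a library routine, scikit-learn provides `RobustScaler`, which with its default settings centers on the median and scales by the 25th to 75th percentile range. The sketch below checks that it agrees with the formula above on synthetic data; the random array is an assumption made for illustration and has nothing to do with the churn data.

```python
import numpy as np
from sklearn.preprocessing import RobustScaler

# Synthetic data: 200 rows, 3 features, with a few extreme values injected
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3))
X[:5, 0] = 50.0

# Manual robust scaling, column by column
q1, med, q3 = np.percentile(X, [25, 50, 75], axis=0)
manual = (X - med) / (q3 - q1)

# Library version with default parameters
sk = RobustScaler().fit_transform(X)

print(np.allclose(manual, sk))
```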

We use the same data that we used in the Z-score transformation.

Load libraries and read data

Separate the features (independent variables) and the dependent variable (churn)

Define Robust Scale Function and convert feature data
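The original code for this step is not shown, so the following is only a minimal sketch of how such a function might look with pandas. The function name `robust_scale`, the column names `call_minutes` and `sms_count`, and all values are hypothetical.

```python
import pandas as pd

def robust_scale(df: pd.DataFrame) -> pd.DataFrame:
    """Apply (X - median(X)) / IQR to every column of df."""
    median = df.median()
    iqr = df.quantile(0.75) - df.quantile(0.25)
    return (df - median) / iqr

# Hypothetical feature columns and values, for illustration only
features = pd.DataFrame({
    "call_minutes": [10.0, 20.0, 30.0, 40.0, 500.0],
    "sms_count": [1.0, 2.0, 3.0, 4.0, 5.0],
})

scaled = robust_scale(features)
print(scaled)
```

After scaling, each column has a median of 0 and an IQR of 1, and the outlier in `call_minutes` stays far from the rest instead of compressing the other values.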

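Remove the features complains, tariffplan and status

The formula cannot be applied to the values of these features because it produces NaN, most likely because their IQR is zero, which makes the formula divide by zero. So remove them before scaling and attach their original values afterwards.

Join the removed features back to the scaled data with the assign function in another dataframe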
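The sketch below shows one way these two steps might fit together, assuming the three columns are near-constant indicators whose IQR is zero. The values and the `call_minutes` column are invented for illustration; only the names complains, tariffplan and status come from the text.

```python
import pandas as pd

def robust_scale(df):
    """Apply (X - median(X)) / IQR to every column of df."""
    return (df - df.median()) / (df.quantile(0.75) - df.quantile(0.25))

# Illustrative values; call_minutes is a hypothetical numeric feature
features = pd.DataFrame({
    "call_minutes": [10.0, 20.0, 30.0, 40.0, 500.0],
    "complains":    [0, 0, 0, 0, 1],
    "tariffplan":   [1, 1, 1, 1, 2],
    "status":       [1, 1, 1, 2, 1],
})
skip = ["complains", "tariffplan", "status"]

# The IQR of each skipped column is zero here, so the formula divides by zero
iqr = features.quantile(0.75) - features.quantile(0.25)
print(iqr)

# Scaling everything naively yields NaN (0/0) and inf (nonzero/0) in those columns
naive = robust_scale(features)
print(naive["complains"].isna().sum())

# Scale only the remaining columns, then reattach the original values
scaled = robust_scale(features.drop(columns=skip)).assign(
    complains=features["complains"],
    tariffplan=features["tariffplan"],
    status=features["status"],
)
print(scaled)
```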

Draw a boxplot using seaborn and matplotlib

Logistic regression model using sklearn

Find Coefficients and intercept
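Since the model code is not reproduced here, the following is a minimal sketch of fitting a logistic regression and reading its coefficients and intercept with sklearn. The data is synthetic, and the feature `call_minutes` and the generating coefficients are assumptions for the example; only churn and complains are names from the text.

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LogisticRegression

# Synthetic stand-in for the scaled churn data
rng = np.random.default_rng(42)
n = 300
df = pd.DataFrame({
    "call_minutes": rng.normal(size=n),       # hypothetical scaled feature
    "complains": rng.integers(0, 2, size=n),  # binary indicator
})
logit = -1.5 + 2.0 * df["complains"] - 0.8 * df["call_minutes"]
df["churn"] = (rng.random(n) < 1 / (1 + np.exp(-logit))).astype(int)

# Separate features and dependent variable
X = df.drop(columns="churn")
y = df["churn"]

model = LogisticRegression(max_iter=1000).fit(X, y)

# coef_ has one row per fitted class boundary; binary problems have a single row
print(dict(zip(X.columns, model.coef_[0])))
print(model.intercept_[0])
```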

Find R-Square
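The text does not say which R-square is computed for the logistic model. One common choice for logistic regression is McFadden's pseudo R-square, 1 - LL_model / LL_null, where LL_null is the log-likelihood of a model that always predicts the overall churn rate. The sketch below assumes that definition; the labels and probabilities are made up.

```python
import numpy as np

def log_likelihood(y, p):
    """Bernoulli log-likelihood of labels y under predicted probabilities p."""
    y = np.asarray(y, dtype=float)
    p = np.clip(np.asarray(p, dtype=float), 1e-12, 1 - 1e-12)
    return float(np.sum(y * np.log(p) + (1 - y) * np.log(1 - p)))

def mcfadden_r2(y, p):
    """McFadden's pseudo R-square: 1 - LL_model / LL_null."""
    y = np.asarray(y, dtype=float)
    ll_model = log_likelihood(y, p)
    ll_null = log_likelihood(y, np.full_like(y, y.mean()))
    return 1 - ll_model / ll_null

# Illustrative labels and predicted churn probabilities
y = np.array([0, 0, 0, 1, 1])
p = np.array([0.1, 0.2, 0.3, 0.7, 0.9])
print(round(mcfadden_r2(y, p), 3))
```

With a fitted sklearn model, `p` would be the second column of `model.predict_proba(X)`.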

Confusion Matrix

Confusion Matrix Heat Map

Find Accuracy of the model:

Find Precision, recall and F1-score

Classification Report

Regular customers are those who do not churn from the telephone company (84.28%).

Default customers are the churning customers of the telephone company (15.72%).

Accuracy works out to 89%.
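The confusion matrix, accuracy, precision, recall, F1-score and classification report steps above can be sketched with sklearn.metrics as follows. The labels are a tiny hand-made example, so the numbers will not match the results quoted above; the class names "regular" and "churn" follow the wording of the text.

```python
from sklearn.metrics import (accuracy_score, classification_report,
                             confusion_matrix, f1_score,
                             precision_score, recall_score)

# Hand-made labels: 0 = regular customer, 1 = churning customer
y_true = [0, 0, 0, 0, 0, 0, 1, 1, 1, 1]
y_pred = [0, 0, 0, 0, 0, 1, 1, 1, 0, 0]

# Rows are actual classes, columns are predicted classes
cm = confusion_matrix(y_true, y_pred)
print(cm)

acc = accuracy_score(y_true, y_pred)
prec = precision_score(y_true, y_pred)  # of predicted churners, share that churn
rec = recall_score(y_true, y_pred)      # of actual churners, share that is found
f1 = f1_score(y_true, y_pred)           # harmonic mean of precision and recall
print(acc, prec, rec, f1)

print(classification_report(y_true, y_pred, target_names=["regular", "churn"]))
```

For the heat map step, `cm` would typically be passed to seaborn's `heatmap` with `annot=True`.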

Find ROC-AUC (area under the ROC curve)
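A minimal sketch of this step with sklearn.metrics is shown below. ROC-AUC is computed from predicted probabilities or scores rather than from hard class predictions; the four labels and scores here are illustrative only.

```python
from sklearn.metrics import roc_auc_score, roc_curve

# Illustrative labels and predicted churn scores
y_true = [0, 0, 1, 1]
y_score = [0.1, 0.4, 0.35, 0.8]

# Area under the ROC curve: the chance that a random churner is scored
# higher than a random regular customer
auc = roc_auc_score(y_true, y_score)

# Points of the ROC curve, useful for plotting
fpr, tpr, thresholds = roc_curve(y_true, y_score)
print(auc)
```

With a fitted sklearn classifier, `y_score` would be `model.predict_proba(X)[:, 1]`.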

Calculation of metrics using a single program

Comparison of transformation techniques

From the above, it is clear that R^2, an important metric, is highest when we use the robust scaling technique to normalize the feature values, although the gain over the original data is modest (19.94% to 20.66%).

Technique        R^2
Original data    19.94%
Z-score          20.42%
Robust scaling   20.66%