As explained earlier, we have to apply different types of feature scaling depending on the model:

Model | Requirement
Tree-based models (Random Forest, Gradient Boosting) | Do not require feature normalization or scaling
Neural networks | Require feature normalization; the model converges more quickly on scaled features

Selecting an appropriate feature scaling / normalization technique is an important step in pre-processing the dataset at hand.

Z-Score Normalization:

  1. It is a data pre-processing technique.
  2. It rescales features to have mean = 0 and standard deviation = 1.
  3. It is sensitive to outliers, since the mean (µ) and standard deviation (σ) are influenced by extreme values.
  4. It is achieved by centering the data around the mean and scaling it by the feature's standard deviation.
  5. It works well when the data distribution is approximately Gaussian (normal).
  6. Effect on models:
    • Beneficial for algorithms assuming normal-like distribution (e.g., linear regression, logistic regression, LDA).
    • Improves gradient descent convergence for models like neural networks.
    • Distance-based models (KNN, SVM) perform better when features are standardized.

Here we convert each normal variate X into the standard normal variate Z using the formula

Z = (X - µ) / σ

where µ is the mean of the feature and σ its standard deviation.

Function used for this normalization technique:

def z_score_standardization(series):
    return (series - series.mean()) / series.std()

Then call this function on your independent features after dropping the dependent variable.
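A minimal sketch of that call, assuming the dataset is already loaded into a pandas DataFrame named df with the dependent variable in a column named Churn (both names are assumptions; the loading step comes below):

X = df.drop(columns=["Churn"])             # independent features only
X_norm = X.apply(z_score_standardization)  # column-wise z-score scaling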

Let us download the Telco Customer Churn dataset from https://www.kaggle.com/datasets/royjafari/customer-churn

It is a freely available dataset.

You can refer to the following publications:

  • Jafari-Marandi, R., Denton, J., Idris, A., Smith, B. K., & Keramati, A. Optimum profit-driven churn decision making: innovative artificial neural networks in telecom industry. Neural Computing and Applications, 1-34.
  • Keramati, A., Jafari-Marandi, R., Aliannejadi, M., Ahmadian, I., Mozaffari, M., & Abbasi, U. (2014). Improved churn prediction in the telecommunication industry using data mining techniques. Applied Soft Computing, 24, 994-1012.
  • Keramati, A., & Ardabili, S. M. (2011). Churn analysis for an Iranian mobile operator. Telecommunications Policy, 35(4), 344-356.

Or you can download it with the following code:

import kagglehub

# Download latest version
path = kagglehub.dataset_download("royjafari/customer-churn")

print("Path to dataset files:", path)

The data contains 3150 rows (records), 13 features, and one dependent variable (Churn).

There were two more features, FP and FN, which I have removed.

Load libraries and read the data
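A minimal sketch of this step; the exact CSV file name inside the downloaded path is an assumption, so list the directory first if the read fails:

import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns

# "customer_churn.csv" is an assumed file name; check the contents of `path`
df = pd.read_csv(path + "/customer_churn.csv")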


View the size and data types of the features you are handling:

Check if any null values are present

See the Descriptive Statistics of your data
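The three inspection steps above can be sketched as follows, continuing with the DataFrame df:

print(df.shape)           # rows and columns, e.g. (3150, 14)
print(df.dtypes)          # data type of each feature
print(df.isnull().sum())  # count of null values per column
print(df.describe())      # descriptive statistics of the data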

Correlation heatmap:
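One way to draw the heatmap, using the df from above:

plt.figure(figsize=(12, 8))
sns.heatmap(df.corr(numeric_only=True), annot=True, fmt=".2f", cmap="coolwarm")
plt.title("Correlation heatmap")
plt.show()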

Boxplot before normalization of the features
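A sketch of the pre-normalization boxplot (the column name "Churn" is carried over from the assumptions above):

df.drop(columns=["Churn"]).boxplot(figsize=(12, 6), rot=90)
plt.title("Boxplot of features before normalization")
plt.show()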

Logistic Regression using the scikit-learn library

Predict yhat using features

Find Coefficients and intercept

Find R^2 (R-Square)

R^2 works out to 19.94% when no normalization technique is used
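A sketch covering the four steps above. The source does not show how R^2 was computed for a classifier, so using sklearn's r2_score on the class predictions is an assumption:

from sklearn.linear_model import LogisticRegression
from sklearn.metrics import r2_score

X = df.drop(columns=["Churn"])   # independent features
y = df["Churn"]                  # dependent variable

model = LogisticRegression(max_iter=1000)
model.fit(X, y)
yhat = model.predict(X)          # predicted churn labels

print("Coefficients:", model.coef_)
print("Intercept:", model.intercept_)
print("R^2:", r2_score(y, yhat))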

Arrive at Confusion Matrix

Confusion Matrix heat map using seaborn and Matplotlib
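A sketch of the confusion matrix and its heatmap, using y and yhat from above:

from sklearn.metrics import confusion_matrix

cm = confusion_matrix(y, yhat)
sns.heatmap(cm, annot=True, fmt="d", cmap="Blues")
plt.xlabel("Predicted")
plt.ylabel("Actual")
plt.title("Confusion matrix")
plt.show()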

Find Accuracy:

Find Precision and Recall

Find F1-score

Classification Report
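These metrics and the classification report can be sketched as follows (this assumes Churn is coded 0/1 with 1 as the positive class):

from sklearn.metrics import (accuracy_score, precision_score, recall_score,
                             f1_score, classification_report)

print("Accuracy :", accuracy_score(y, yhat))
print("Precision:", precision_score(y, yhat))
print("Recall   :", recall_score(y, yhat))
print("F1-score :", f1_score(y, yhat))
print(classification_report(y, yhat))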

Regular denotes customers who are not churning.

Default denotes customers who may churn and leave the telephone company.

Accuracy of the model is 89.40%

Out of 3150 customers, 2655 (84.28%) appear to be regular customers who may not churn from the company.

Out of 3150 customers, 495 (15.72%) appear likely to change their mind and churn away from the company.

ROC-AUC calculation:
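ROC-AUC is computed from the predicted probabilities of the positive (churn) class rather than the hard labels; a sketch using the fitted model from above:

from sklearn.metrics import roc_auc_score

y_prob = model.predict_proba(X)[:, 1]  # probability of the churn class
print("ROC-AUC:", roc_auc_score(y, y_prob))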

Another way to calculate accuracy, precision, recall, f1-score and AUC
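One compact way is a small helper function (evaluate_model is a hypothetical name, not from the source):

def evaluate_model(y_true, y_pred, y_prob):
    # Hypothetical helper: prints all five metrics in one call
    from sklearn.metrics import (accuracy_score, precision_score, recall_score,
                                 f1_score, roc_auc_score)
    print("Accuracy :", accuracy_score(y_true, y_pred))
    print("Precision:", precision_score(y_true, y_pred))
    print("Recall   :", recall_score(y_true, y_pred))
    print("F1-score :", f1_score(y_true, y_pred))
    print("AUC      :", roc_auc_score(y_true, y_prob))

evaluate_model(y, yhat, y_prob)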

Z-Score Normalization Technique

Define a function and normalize features
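Reusing the z_score_standardization function defined earlier, a minimal sketch:

X_norm = X.apply(z_score_standardization)  # column-wise z-score normalization
print(X_norm.mean().round(3))  # each feature's mean is now approximately 0
print(X_norm.std().round(3))   # and its standard deviation approximately 1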

Boxplot after z-score normalization

Logistic Regression on the normalized data using the statsmodels.api library
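A sketch with statsmodels, carrying over X_norm and y from above; its coefficient table yields a fitted equation like the one below:

import statsmodels.api as sm

X_const = sm.add_constant(X_norm)    # add the intercept column
logit = sm.Logit(y, X_const).fit()   # fit the logistic regression
print(logit.summary())               # coefficient table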

Yhat(Churn) = -3.5030 + 0.9551*CallFailure + 1.0746*Complains - 0.2563*SubscriptionLength
              - 0.6307*ChargeAmount + 0.4560*SecondsOfUse - 3.1738*FrequencyOfUse
              - 5.2897*FrequencyOfSMS - 0.1901*DistinctCalledNumbers + 0.0764*AgeGroup
              + 0.0617*TariffPlan + 0.6092*Status + 0.0346*Age + 4.2785*CustomerValue

Logistic Regression using the scikit-learn library

Predict churn

Find Coefficients and intercept

Find R-Square for Z-Score Normalized Data:
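The same scikit-learn steps, repeated on the normalized features:

model_norm = LogisticRegression(max_iter=1000)
model_norm.fit(X_norm, y)
yhat_norm = model_norm.predict(X_norm)

print("Coefficients:", model_norm.coef_)
print("Intercept:", model_norm.intercept_)
print("R^2:", r2_score(y, yhat_norm))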

Confusion Matrix

Confusion Matrix heatmap

Classification Report


Accuracy

Precision, Recall and F1-score

ROC-AUC

Accuracy, precision, recall, f1-score and AUC with a single program

Comparison of metrics between the original data and the normalized data

R^2 has increased when we use normalized data.