Use the same dataset as before, dia.csv (also referred to as diabetic.csv)
Load Libraries and read data
Descriptive Statistics
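The two steps above can be sketched as follows. Since dia.csv is not bundled with these notes, a few stand-in rows are used here (the column names are assumptions based on the Pima-style features referenced later); with the real file you would simply call `pd.read_csv("dia.csv")`.

```python
import io
import pandas as pd

# Stand-in rows for dia.csv (file name and columns are assumptions);
# with the real file: df = pd.read_csv("dia.csv")
csv_text = """Pregnancies,Glucose,BloodPressure,SkinThickness,Insulin,BMI,Pedigree,Age,Outcome
6,148,72,35,0,33.6,0.627,50,1
1,85,66,29,0,26.6,0.351,31,0
8,183,64,0,0,23.3,0.672,32,1
"""
df = pd.read_csv(io.StringIO(csv_text))

# Descriptive statistics: count, mean, std, min, quartiles, max per column
print(df.describe())
```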
Separate the features (X) from the dependent variable, Outcome (y)
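A minimal sketch of the split, again on stand-in rows for dia.csv (the column names are assumed):

```python
import io
import pandas as pd

# Stand-in rows for dia.csv (column names are assumptions)
csv_text = """Pregnancies,Glucose,BloodPressure,SkinThickness,Insulin,BMI,Pedigree,Age,Outcome
6,148,72,35,0,33.6,0.627,50,1
1,85,66,29,0,26.6,0.351,31,0
"""
df = pd.read_csv(io.StringIO(csv_text))

# X holds every predictor column; y is the dependent variable Outcome
X = df.drop(columns=["Outcome"])
y = df["Outcome"]
print(X.columns.tolist())
```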
Ridge Algorithm
Find coefficients:
yhat(Outcome) = 0.021 * Preg + 0.006 * Glucose - 0.002 * BP + 0.0 * SkinThick - 0.0 * Insulin + 0.012 * BMI + 0.121 * Pedigree + 0.002 * Age
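Fitting Ridge and reading off one coefficient per feature can be sketched like this; scikit-learn's built-in diabetes dataset is used as a runnable stand-in for dia.csv, and alpha=1.0 is an assumed penalty strength:

```python
from sklearn.datasets import load_diabetes
from sklearn.linear_model import Ridge

# Built-in diabetes data as a stand-in; with dia.csv you would fit on its predictors
X, y = load_diabetes(return_X_y=True, as_frame=True)

ridge = Ridge(alpha=1.0)  # alpha is the L2 penalty strength (assumed value)
ridge.fit(X, y)

# One fitted coefficient per feature, as in the yhat equation above
for name, coef in zip(X.columns, ridge.coef_):
    print(f"{name}: {coef:.3f}")
```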
Method 2:
The Root Mean Square Error (RMSE) improves when moving from OLS to Ridge
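One way to make that comparison concrete is to fit both models on the same train/test split and compare held-out RMSE (again on the built-in diabetes data as a stand-in; the split and alpha are assumptions):

```python
import numpy as np
from sklearn.datasets import load_diabetes
from sklearn.linear_model import LinearRegression, Ridge
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import train_test_split

X, y = load_diabetes(return_X_y=True)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=42)

# Fit OLS and Ridge on identical data, then compare test-set RMSE
for name, model in [("OLS", LinearRegression()), ("Ridge", Ridge(alpha=1.0))]:
    model.fit(X_tr, y_tr)
    rmse = np.sqrt(mean_squared_error(y_te, model.predict(X_te)))
    print(f"{name} RMSE: {rmse:.2f}")
```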
Plot alphas and coefs
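The coefficient-path plot can be sketched by refitting Ridge over a grid of alphas and plotting each coefficient against the penalty strength (the alpha range is an assumption):

```python
import numpy as np
import matplotlib
matplotlib.use("Agg")  # headless backend so the script runs without a display
import matplotlib.pyplot as plt
from sklearn.datasets import load_diabetes
from sklearn.linear_model import Ridge

X, y = load_diabetes(return_X_y=True)

# Refit Ridge across a log-spaced grid of penalty strengths
alphas = np.logspace(-2, 4, 50)
coefs = [Ridge(alpha=a).fit(X, y).coef_ for a in alphas]

# Each line traces one coefficient shrinking toward zero as alpha grows
plt.plot(alphas, coefs)
plt.xscale("log")
plt.xlabel("alpha")
plt.ylabel("coefficient value")
plt.title("Ridge coefficients vs regularisation strength")
plt.savefig("ridge_path.png")
```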
FEATURE SELECTION UNDER L2 REGULARISATION
Recursive Feature Elimination Technique for L2 regularization
- A popular feature-selection technique in machine learning: it works by iteratively removing the least relevant features based on a model's performance.
- Ultimately selects the most informative subset of features.
- RFE can be applied to various models, such as linear models, support vector machines and decision trees.
- Can improve model accuracy.
- Starts with all predictors, computes an importance score for each, and systematically eliminates the less important ones.
RFE
Selected features using RFE
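A minimal RFE sketch, using Ridge as the L2-regularised estimator inside the eliminator (the stand-in dataset and the choice of 5 features are assumptions):

```python
from sklearn.datasets import load_diabetes
from sklearn.feature_selection import RFE
from sklearn.linear_model import Ridge

X, y = load_diabetes(return_X_y=True, as_frame=True)

# Ridge supplies the coefficients that RFE ranks features by
rfe = RFE(estimator=Ridge(alpha=1.0), n_features_to_select=5)
rfe.fit(X, y)

# support_ marks kept features; ranking_ gives 1 for kept, higher = dropped earlier
selected = X.columns[rfe.support_].tolist()
print("Selected features:", selected)
print("Ranking:", dict(zip(X.columns, rfe.ranking_)))
```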
RFE USING RANDOM FOREST REGRESSOR ESTIMATOR
We set n_features_to_select=3, so the system selects only three features.
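The same step with a RandomForestRegressor estimator can be sketched as follows (the forest size and stand-in data are assumptions); here RFE ranks features by the forest's feature importances instead of linear coefficients:

```python
from sklearn.datasets import load_diabetes
from sklearn.ensemble import RandomForestRegressor
from sklearn.feature_selection import RFE

X, y = load_diabetes(return_X_y=True, as_frame=True)

# Random forest supplies feature_importances_ for RFE to rank by
rf = RandomForestRegressor(n_estimators=50, random_state=42)
rfe = RFE(estimator=rf, n_features_to_select=3)
rfe.fit(X, y)

print("Selected features:", X.columns[rfe.support_].tolist())
```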
RFE USING RFECV (Cross Validation) Technique
The system calculates the optimal number of features as 6, so six features are selected.
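Unlike plain RFE, RFECV cross-validates every feature-subset size and picks the best count itself, so n_features_to_select is not specified. A sketch (stand-in data and alpha assumed; the optimal count will differ from the notes' 6 on other data):

```python
from sklearn.datasets import load_diabetes
from sklearn.feature_selection import RFECV
from sklearn.linear_model import Ridge

X, y = load_diabetes(return_X_y=True, as_frame=True)

# 5-fold CV scores each candidate subset size; the best size is chosen automatically
rfecv = RFECV(estimator=Ridge(alpha=1.0), cv=5)
rfecv.fit(X, y)

print("Optimal number of features:", rfecv.n_features_)
print("Selected:", X.columns[rfecv.support_].tolist())
```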
Feature selection on data scaled with the StandardScaler function
Here we instruct the system to select 7 features.
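Scaling before RFE can be sketched like this: StandardScaler puts every predictor on the same scale so the L2 penalty and the coefficient-based ranking treat features comparably (stand-in data and alpha assumed):

```python
from sklearn.datasets import load_diabetes
from sklearn.feature_selection import RFE
from sklearn.linear_model import Ridge
from sklearn.preprocessing import StandardScaler

X, y = load_diabetes(return_X_y=True, as_frame=True)

# Standardise predictors to zero mean and unit variance before ranking
X_scaled = StandardScaler().fit_transform(X)

rfe = RFE(estimator=Ridge(alpha=1.0), n_features_to_select=7)
rfe.fit(X_scaled, y)

print("Selected features:", X.columns[rfe.support_].tolist())
```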