Using Python-scipy.stats

Step1: Import Libraries like numpy, pandas, matplotlib.pyplot. Step2: Download the data from local or from web. Step3: Using head()- First 5 records of the data downloaded.Step4: Using tail()- Last 5 records of the data downloaded. Step5: Using Shape- View the shape of the data (Rows, columns).Step6: Using plt.plot()- plot the data points

Install Libraries

Install libraries like pandas, numpy, sk.learn, statsmodels.api, matplot.lib, seaborn

Load libraries

3. Read Data: You can read data from csv, excel or database using required functions. Here I have used ‘SalaryData.csv’.

4. Convert data into dataframe:

5.Read data using head and tail functions

6. Containing only two variables 1) Experience and 2) Salary. Let us take Salary as dependent variable/target variable and Experience in years as Independent/Explanatory variable

7.Find the shape of the data: Data contains 30 observations and two columns

8.Find the type of data features/columns

Both features are numerical one is float64 and other is int64

9. View the descriptive statistics of the data being examined

Here we find count of the observations made, mean, std, min, First Quartile, Second Quartile, Third quartile and max of two features.

10.Calculate the number of NaN (Non available Number) /null figure in features

Here all 30 rows and two columns are available. There is no empty cells

You can straight-a-way find the value of parameters like slope, intercept, r_value, p_value, and SSE using stats.linregress function. Arguments are data

Visualization using matplotlib.pyplot library:

If you remove plt.xlim() and plt.ylim you will get the following scatter diagram

MMKTECH CONSULTANTS

B32, F1, C-COLONY, MUTHU SUNDARI APARTMENTS, WATER TANK SOUTH STREET, PERUMALPURAM

+91 9840922213

km301252@gmail.com

Visualization using matplotlib.pyplot library: