Introduction to Logistic Regression:
Logistic regression is a supervised learning method used mainly for classification. It is used when the dependent variable is categorical (binary in the simplest case) and the independent variables are categorical, continuous, or a mix of both. The model itself outputs a probability between 0 and 1; the predicted class is a discrete label, usually coded as an integer.
The outcome could take the form 1/0, True/False, High/Low, or Yes/No, given a set of independent (predictor) variables. Thus, when the output variable has two or more discrete outcomes, logistic regression is a commonly used technique.
Logistic regression is simpler than many modern machine learning algorithms, but simpler does not mean less useful. In many real cases it has significant advantages, such as being fast to train and easy to interpret, and it is often enough to solve a business problem.
Now, you might ask: why is it called logistic regression when it is used for classification?
It is mainly because it builds on the principle of linear regression: it models the log odds of the outcome as a linear combination of the predictors and then converts that value into a probability.
Next, let us look at the types of classification.
1) Binary Classification: The outcome has only two possible values, i.e. yes or no.
Eg: The bank will accept or reject the loan application.
The patient is diabetic or not.
2) Multinomial Classification: As the name suggests, the dependent variable has more than two categories. There are two types of Multinomial Logistic Regression:
Ordered Multinomial Logistic Regression, where the categories have a natural order (for example, low, medium, high)
Nominal Multinomial Logistic Regression, where the categories have no natural order
Let us understand logistic regression through a simple example:
Suppose you have applied for a loan at a bank. There are two possibilities:
the bank will either approve the loan or reject it.
Dependent variable: Whether the bank approves the loan or not (1/0).
Predictors or independent variables: Information about the applicant's relationship with the bank and demographic information.
Here the bank decides whether to sanction the loan based on a number of independent variables such as education, age, income, employment status, EMI-to-income ratio, and credit score. For loan lending decisions, these variables are mostly captured from financial statements, as they reflect the financial health of an individual.
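To make the loan example concrete, here is a minimal sketch in plain Python. Everything in it is an assumption made for illustration: the eight applicants and their values are invented, only two predictors are used (credit score divided by 1000 and EMI-to-income ratio), and the model is fitted with simple batch gradient descent, which the article does not prescribe. The names `train` and `approval_probability` are hypothetical helpers.

```python
import math

# Hypothetical data, invented for illustration only.
# Each applicant: (credit score / 1000, EMI-to-income ratio)
applicants = [
    (0.80, 0.10), (0.75, 0.20), (0.70, 0.15), (0.78, 0.25),
    (0.55, 0.50), (0.60, 0.45), (0.50, 0.40), (0.58, 0.55),
]
# Dependent variable: 1 = loan approved, 0 = loan rejected
approved = [1, 1, 1, 1, 0, 0, 0, 0]


def sigmoid(z):
    """Map a real number to a probability, avoiding overflow in exp()."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    exp_z = math.exp(z)
    return exp_z / (1.0 + exp_z)


def approval_probability(x, weights, bias):
    """Predicted probability that applicant x is approved."""
    return sigmoid(bias + sum(w * xi for w, xi in zip(weights, x)))


def train(features, labels, learning_rate=1.0, epochs=10000):
    """Fit weights and bias by batch gradient descent on the mean log loss."""
    weights = [0.0] * len(features[0])
    bias = 0.0
    n = len(features)
    for _ in range(epochs):
        grad_w = [0.0] * len(weights)
        grad_b = 0.0
        for x, y in zip(features, labels):
            error = approval_probability(x, weights, bias) - y
            for j, xi in enumerate(x):
                grad_w[j] += error * xi
            grad_b += error
        weights = [w - learning_rate * g / n for w, g in zip(weights, grad_w)]
        bias -= learning_rate * grad_b / n
    return weights, bias


weights, bias = train(applicants, approved)

# Score two new, equally hypothetical applicants.
for applicant in [(0.82, 0.12), (0.50, 0.55)]:
    p = approval_probability(applicant, weights, bias)
    print(applicant, "-> probability of approval:", round(p, 3))
```

In practice a library implementation would normally be used instead of a hand-written loop; the sketch only shows how the predictors are combined into a single probability of approval.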
Why Logistic Regression?
Linear regression helps us predict a continuous target variable, whereas logistic regression helps us predict a discrete target variable.
Logistic regression is one of the “white-box” algorithms: its coefficients can be inspected directly, and it helps us determine probability values and the corresponding cut-offs.
The aim of logistic regression is to find a function of the predictor variables that relates them to a binary outcome, i.e. a 0/1 outcome.
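The idea of probability values and cut-offs can be sketched as follows. The probabilities below are invented stand-ins for what a fitted model might output, and `classify` is a hypothetical helper name; the point is only that the same probabilities give different 0/1 labels under different cut-offs.

```python
def classify(probabilities, cutoff=0.5):
    """Turn predicted probabilities into 0/1 labels using a cut-off."""
    return [1 if p >= cutoff else 0 for p in probabilities]


# Hypothetical probabilities, as a fitted model might produce them.
predicted = [0.92, 0.61, 0.48, 0.15]

default_labels = classify(predicted)             # common default cut-off of 0.5
strict_labels = classify(predicted, cutoff=0.7)  # stricter cut-off

print(default_labels)
print(strict_labels)
```

A stricter cut-off labels fewer cases as 1, which may be preferable when a false positive (for example, approving a risky loan) is costly.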
Broad overview of Logistic Regression Model:
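Here “p”, the probability of the outcome of interest, is used as the dependent variable.
The plot shows a model of the relationship between the probability of an outcome and a continuous predictor: an S-shaped curve that stays between 0 and 1.
This is known as the sigmoid function, and it is defined as:
$$p = \frac{1}{1 + e^{-z}}, \qquad z = b_0 + b_1 x_1 + \dots + b_n x_n$$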
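As a minimal sketch, assuming the standard definition sigmoid(z) = 1 / (1 + e^(-z)) with z a linear function of the predictor, the curve can be computed in Python as shown here. The names `sigmoid`, `predict_probability`, `b0`, and `b1` are illustrative assumptions, not taken from the article.

```python
import math


def sigmoid(z: float) -> float:
    """Map any real number z to a probability between 0 and 1."""
    # Two algebraically equal forms are used so exp() never overflows.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    exp_z = math.exp(z)
    return exp_z / (1.0 + exp_z)


def predict_probability(x: float, b0: float, b1: float) -> float:
    """Probability of the outcome for one continuous predictor x.

    b0 (intercept) and b1 (slope) are hypothetical coefficients.
    """
    return sigmoid(b0 + b1 * x)


# The S-shape: values near 0 for very negative z, 0.5 at z = 0, near 1 for large z.
for z in (-4, -2, 0, 2, 4):
    print(f"z = {z:+d}  ->  p = {sigmoid(z):.3f}")
```

Because the output always lies between 0 and 1, it can be read directly as the probability p of the outcome of interest.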
Assumptions of Logistic Regression:
1. The dependent variable is categorical.
2. The log odds of the outcome should be linearly related to the independent variables.
3. In binary logistic regression, the class of interest is coded as 1 and the other class as 0.
4. Attributes are independent of each other (low or no multicollinearity).
5. In multi-class classification, the class of interest is coded as 1 and the rest as 0.
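Two of these assumptions can be illustrated with a short sketch. The coefficients `b0` and `b1`, the predictor values, and the weather labels are all hypothetical, and `log_odds` and `one_vs_rest` are illustrative helper names. The first part shows that equal steps in a predictor produce equal steps in the log odds; the second shows the 1/0 coding of a class of interest against the rest.

```python
import math


def log_odds(p):
    """Log odds (logit) of a probability p, where 0 < p < 1."""
    return math.log(p / (1.0 - p))


def one_vs_rest(labels, class_of_interest):
    """Code the class of interest as 1 and every other class as 0."""
    return [1 if label == class_of_interest else 0 for label in labels]


# Hypothetical coefficients for a single continuous predictor.
b0, b1 = -3.0, 0.5
xs = [0, 2, 4, 6, 8]

probabilities = [1.0 / (1.0 + math.exp(-(b0 + b1 * x))) for x in xs]
logits = [log_odds(p) for p in probabilities]

# The probabilities follow an S-curve, but the log odds change by the same
# amount (b1 times the step in x) between consecutive points.
steps = [later - earlier for earlier, later in zip(logits, logits[1:])]
print([round(s, 6) for s in steps])

# Hypothetical multi-class labels recoded for the class "sunny".
weather = ["sunny", "rainy", "cloudy", "sunny"]
print(one_vs_rest(weather, "sunny"))
```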
Applications of Logistic Regression:
1. Medical diagnosis: Predicting the disease a patient is suffering from on the basis of symptoms.
2. Weather forecasting: Stormy, sunny, cloudy, rainy, and a few more.
3. Marriage portals and websites: Finding the best match.
4. Human Resources: On the basis of individual characteristics, the HR manager of a company can predict the absenteeism pattern of employees.
5. Loan approval: Deciding whether a loan should be given to a particular candidate based on their identity check, account summary, any properties they hold, any previous loans, etc.
Conclusion:
Logistic regression is one of the most widely used tools among researchers, statisticians, and data scientists in predictive analytics. Apart from requiring a discrete dependent variable, logistic regression shares several assumptions with multiple regression, such as low multicollinearity among the predictors. Many data science professionals and students still struggle with the basics of this technique, which is why I am pleased to present this basic introduction to the topic. As the saying goes, “The more that you read, the more things you will know”, so start your learning journey now.
Article By : Nikhil Rampuria