Discriminant Analysis: A Conceptual Understanding

What is Discriminant Analysis?:

Discriminant Analysis is a classification technique for data with a response variable and a set of predictor variables. It is mainly used to assign an observation to a class or category based on the independent variables of the data.

There are two types of Discriminant Analysis: Linear Discriminant Analysis (LDA) and Quadratic Discriminant Analysis (QDA).

Linear Discriminant Analysis (LDA):

It is a supervised technique that predicts the class of the dependent variable using a linear combination of the independent variables. It assumes that the independent variables are normally distributed (continuous and numerical) and that the classes have equal variance/covariance. The technique can be used both for classification and for dimensionality reduction. When these assumptions are satisfied, LDA creates a linear decision boundary.
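A minimal sketch of both uses, classification and dimensionality reduction, assuming scikit-learn is installed and using its built-in Iris dataset purely for illustration:

# Minimal sketch: LDA for classification and dimensionality reduction (assumes scikit-learn).
from sklearn.datasets import load_iris
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.model_selection import train_test_split

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=0)

lda = LinearDiscriminantAnalysis()
lda.fit(X_train, y_train)

# Classification: predict the class of unseen observations.
print("Test accuracy:", lda.score(X_test, y_test))

# Dimensionality reduction: project onto at most (number of classes - 1) discriminant axes.
X_projected = lda.transform(X_train)
print("Reduced shape:", X_projected.shape)  # two discriminant axes for the 3-class Iris data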

Figure: 0s And 1s In The Dataset

This line can clearly discriminate between the 0s and the 1s in the dataset. The objective of LDA is therefore to find the line that best separates the 0s from the 1s. Even when these assumptions are violated, LDA can still perform reasonably well in practice.

Linear Discriminant Analysis Technique:

DS = β0 + β1*X1 + β2*X2 + … + βk*Xk

Where

DS: Discriminant Score

β’s: Discriminant Weights/Coefficients

X’s: Independent Variables

The weights are estimated so that the groups are separated as clearly as possible on the discriminant function. In other words, LDA constructs an equation that minimizes the possibility of misclassifying cases into their respective classes.
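As a toy illustration of how a discriminant score is computed once the weights are known (the coefficient values below are made up purely for the example):

# Toy discriminant score DS = β0 + β1*X1 + β2*X2 with made-up coefficients.
import numpy as np

b0 = -1.5                   # intercept
b = np.array([0.8, -0.4])   # discriminant weights β1, β2
x = np.array([2.0, 3.5])    # one observation with features X1, X2

ds = b0 + np.dot(b, x)      # DS = -1.5 + 0.8*2.0 + (-0.4)*3.5 = -1.3
print("Discriminant score:", ds)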

Assumptions of LDA:

1. Multivariate normality: the independent variables should be normally distributed within each class.

2. Equal variance and covariance for all classes.

3. No multicollinearity; if present, it should be treated when it affects the outcome.

4. All the samples in the data should be independent of each other.
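The sketch below shows one rough, informal way to eyeball these assumptions with NumPy and SciPy on the Iris dataset (purely illustrative, not a formal test procedure):

# Rough checks of the LDA assumptions (illustrative only).
import numpy as np
from scipy.stats import shapiro
from sklearn.datasets import load_iris

X, y = load_iris(return_X_y=True)

for label in np.unique(y):
    X_c = X[y == label]
    # 1. Normality per class and per variable (a Shapiro-Wilk p-value well above 0.05 is reassuring).
    p_values = [shapiro(X_c[:, j]).pvalue for j in range(X_c.shape[1])]
    print(f"Class {label}: smallest Shapiro p-value = {min(p_values):.3f}")
    # 2. Per-class covariances; for LDA these should be roughly similar across classes.
    print(f"Class {label}: covariance diagonal = {np.round(np.diag(np.cov(X_c.T)), 2)}")

# 3. Multicollinearity: large off-diagonal correlations flag redundant predictors.
print("Correlation matrix:\n", np.round(np.corrcoef(X.T), 2))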

Criteria for LDA to Perform Well:

1. It should minimize the possibility of misclassifying cases into their respective classes.

2. The distance of the points from the line, i.e., how far away the data points are from the separating line.

3. The probability of being on the LHS and the probability of being on the RHS of the line; both of these criteria are illustrated in the sketch below.
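A small sketch of the last two criteria on a binary problem, assuming scikit-learn: decision_function gives a signed score that grows with distance from the separating boundary, and predict_proba gives the probability of lying on either side of it.

# Distance from the boundary and class probabilities (assumes scikit-learn, Iris data).
from sklearn.datasets import load_iris
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis

X, y = load_iris(return_X_y=True)
X, y = X[y < 2], y[y < 2]           # keep only classes 0 and 1

lda = LinearDiscriminantAnalysis().fit(X, y)
scores = lda.decision_function(X)   # signed score: larger magnitude = farther from the boundary
probs = lda.predict_proba(X)        # P(class 0) and P(class 1) for each observation

print("Scores of the first three points:", scores[:3].round(3))
print("Their class probabilities:", probs[:3].round(3))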

Standardized, Unstandardized and Structure Coefficients:

The main purpose of standardization is to transform each variable so that its mean becomes 0 and its standard deviation becomes 1 (and covariances become correlations), which brings numerical stability. If the independent variables carry units, the β’s inherit the reciprocals of those units; standardizing the original independent variables makes the β’s unit-free. The higher the absolute value of a β, the more important the corresponding independent variable is for distinguishing between the classes of the dependent variable, so the standardized coefficients also indicate variable importance. (Unstandardized coefficients are the β’s on the original scale; structure coefficients are the correlations between each independent variable and the discriminant function.) Note that, unlike in linear regression, moderate correlation among the independent variables is less of a concern in LDA, although strong multicollinearity should still be treated as noted in the assumptions.
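A short sketch of this idea, assuming scikit-learn: standardize the predictors first, then read the discriminant coefficients (exposed as scalings_) as a rough indicator of variable importance.

# Standardize, fit LDA, and inspect the standardized discriminant coefficients.
from sklearn.datasets import load_iris
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.preprocessing import StandardScaler

X, y = load_iris(return_X_y=True)
X_std = StandardScaler().fit_transform(X)   # mean 0, standard deviation 1

lda = LinearDiscriminantAnalysis().fit(X_std, y)

# One column of weights per discriminant function; larger magnitudes suggest the
# variable contributes more to separating the classes.
print("Standardized discriminant coefficients:\n", lda.scalings_.round(2))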

When is LDA Used?:

1. When the classes are well separated. Logistic regression becomes unstable when the classes are well separated; this is where LDA comes to the rescue.

2. When the dataset is small, LDA is more stable and efficient.

3. When there are more than two classes, LDA is often the better choice. For binary classification, both logistic regression and LDA can be applied; a quick side-by-side comparison is sketched below.
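A quick side-by-side of LDA and logistic regression on a small three-class dataset (scikit-learn's Wine data, chosen here only for illustration):

# Compare LDA and logistic regression with 5-fold cross-validation.
from sklearn.datasets import load_wine
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

X, y = load_wine(return_X_y=True)

lda = LinearDiscriminantAnalysis()
logreg = make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000))

print("LDA accuracy:    ", cross_val_score(lda, X, y, cv=5).mean().round(3))
print("LogReg accuracy: ", cross_val_score(logreg, X, y, cv=5).mean().round(3))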

Steps to Perform Linear Discriminant Analysis:

1. Compute the d-dimensional mean vectors for the different classes of the dataset.

2. Calculate the between-class variance, i.e., the separability between the means of the different classes.

3. Calculate the within-class variance, i.e., the scatter between the mean and the samples of each class.

4. Compute the eigenvectors and the corresponding eigenvalues of the scatter matrices. An eigenvector corresponding to a real non-zero eigenvalue points in a direction that is stretched by the transformation, and the eigenvalue is the factor by which it is stretched; a negative eigenvalue means the direction is reversed.

5. Sort the eigenvectors by decreasing eigenvalue and choose the k eigenvectors with the largest eigenvalues to form a (d × k)-dimensional matrix.

6. Construct the lower-dimensional projection using Fisher’s criterion, which maximizes the between-class variance and minimizes the within-class variance. A from-scratch sketch of these steps is given after this list.
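A from-scratch sketch of these steps with NumPy, keeping k = 2 discriminant axes and again using the Iris data only as an example:

# Steps 1-6 implemented directly with NumPy.
import numpy as np
from sklearn.datasets import load_iris

X, y = load_iris(return_X_y=True)
n_features = X.shape[1]
overall_mean = X.mean(axis=0)

S_W = np.zeros((n_features, n_features))   # within-class scatter
S_B = np.zeros((n_features, n_features))   # between-class scatter
for label in np.unique(y):
    X_c = X[y == label]
    mean_c = X_c.mean(axis=0)                          # step 1: class mean vectors
    S_W += (X_c - mean_c).T @ (X_c - mean_c)           # step 3: within-class scatter
    diff = (mean_c - overall_mean).reshape(-1, 1)
    S_B += X_c.shape[0] * (diff @ diff.T)              # step 2: between-class scatter

# Step 4: eigenvectors and eigenvalues of inv(S_W) @ S_B.
eigvals, eigvecs = np.linalg.eig(np.linalg.inv(S_W) @ S_B)

# Step 5: sort by decreasing eigenvalue and keep the top k = 2 eigenvectors (d x k matrix).
order = np.argsort(eigvals.real)[::-1]
W = eigvecs[:, order[:2]].real

# Step 6: project the data onto the lower-dimensional discriminant space.
X_lda = X @ W
print("Projected shape:", X_lda.shape)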

How Does LDA Make Predictions?:

The LDA model uses Bayes’ theorem to estimate probabilities. It predicts the class of a new observation based on the probability that the observation belongs to each class, and the class with the highest probability becomes the prediction. The prediction is made simply by applying Bayes’ theorem, which estimates the probability of the output class given the input, using the prior probability of each class together with the distribution of the data within that class.
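A sketch of this Bayes-rule view, assuming scikit-learn and SciPy: each class is modeled as a Gaussian sharing one covariance matrix, the posterior is proportional to the likelihood times the prior, and the result should closely agree with predict_proba.

# Manual Bayes-rule posterior for one observation versus scikit-learn's predict_proba.
import numpy as np
from scipy.stats import multivariate_normal
from sklearn.datasets import load_iris
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis

X, y = load_iris(return_X_y=True)
lda = LinearDiscriminantAnalysis(store_covariance=True).fit(X, y)

x_new = X[0]
likelihoods = np.array([
    multivariate_normal.pdf(x_new, mean=m, cov=lda.covariance_)   # P(x | class)
    for m in lda.means_
])
posterior = likelihoods * lda.priors_    # P(x | class) * P(class)
posterior /= posterior.sum()             # normalize to get P(class | x)

print("Manual Bayes posterior:  ", posterior.round(4))
print("predict_proba posterior: ", lda.predict_proba([x_new])[0].round(4))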

Comparison of Linear Discriminant Analysis with Other Techniques:

1. LDA, ANOVA and regression analysis all express the dependent variable as a linear combination of the independent variables.

2. However, the dependent variable in LDA is categorical and the independent variables are continuous, whereas ANOVA uses categorical independent variables and a continuous dependent variable.

3. LDA is closely related to PCA and Factor Analysis, as all of them are linear transformation techniques, i.e., they look for linear combinations of variables that best explain the data.

4. However, LDA is a supervised technique, while PCA is an unsupervised technique that ignores class labels; a small projection example with both is sketched below.
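A small projection example with both techniques, assuming scikit-learn: LDA uses the class labels while PCA ignores them, and both reduce the data to two components.

# LDA (supervised) versus PCA (unsupervised) for dimensionality reduction.
from sklearn.datasets import load_iris
from sklearn.decomposition import PCA
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis

X, y = load_iris(return_X_y=True)

X_lda = LinearDiscriminantAnalysis(n_components=2).fit_transform(X, y)   # needs y
X_pca = PCA(n_components=2).fit_transform(X)                             # ignores y

print("LDA projection shape:", X_lda.shape)
print("PCA projection shape:", X_pca.shape)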

Applications of LDA:

1. Separation of 0s and 1s.

2. Recognition of objects.

3. Pattern recognition tasks.

Quadratic Discriminant Analysis:

This is a variant of LDA that uses quadratic combinations of the independent variables to predict the class of the dependent variable.

It does not assume equal covariance across the classes, but the assumption of normal distribution still holds. As a result, QDA creates a quadratic decision boundary.

DS = β1*X1 + β2*X2 + β3*X1² + β4*X2² + β5*X1*X2

However, QDA cannot be used for dimensionality reduction.
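A minimal QDA sketch, assuming scikit-learn: the fit/predict interface is the same as for LDA, but each class gets its own covariance matrix, giving a quadratic decision boundary.

# Minimal QDA example (assumes scikit-learn and its Iris dataset).
from sklearn.datasets import load_iris
from sklearn.discriminant_analysis import QuadraticDiscriminantAnalysis
from sklearn.model_selection import train_test_split

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=0)

qda = QuadraticDiscriminantAnalysis()
qda.fit(X_train, y_train)
print("QDA test accuracy:", qda.score(X_test, y_test))

# Note: unlike LDA, QDA has no transform() method, so it cannot be used for
# dimensionality reduction.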

Written By: Srinidhi Devan

Reviewed By: Viswanadh

