Why is visualising data important?
Data visualisation is an important technique in machine learning for understanding data through various plotting techniques. It helps us gain insight into the trends, patterns and outliers in a dataset, which in turn helps us develop a better-performing machine learning model.
With a high volume of data, it is difficult for machine learning engineers to understand the structure, patterns and distribution of the data, so data visualisation methods are employed to get a statistical picture of the dataset.
Contents:
- What is Matplotlib?
- Types of plots in Matplotlib
- How to install Matplotlib
- How to import Matplotlib
- Basic functions in Matplotlib
What is Matplotlib?
Matplotlib is a plotting library for Python that is widely used in machine learning applications, together with its numerical mathematics extension NumPy, to create static, animated and interactive visualisations.
TYPES OF PLOTS
We can create various plots in Python with the help of this library. Let us look at the plots most commonly used in machine learning.
- Bar Graph
- Histogram
- Scatter plot
- Pie chart
- Box plot
HOW TO INSTALL MATPLOTLIB:
To install matplotlib using pip (the Python package manager):
pip install matplotlib
To install matplotlib in Anaconda:
conda install matplotlib
To install matplotlib from inside a Jupyter notebook:
!pip install matplotlib
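Once one of the install commands above has finished, a quick way to check that it worked is to import the library and print its version. This is only a minimal sketch; the variable name `installed_version` is made up for illustration.

```python
import matplotlib

# If the import succeeds, matplotlib is installed in this environment
installed_version = matplotlib.__version__
print("matplotlib version:", installed_version)
```

If the import raises ModuleNotFoundError instead, the package was most likely installed into a different Python environment than the one running the script.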
IMPORTING MATPLOTLIB
# You can import pyplot using the following code
from matplotlib import pyplot as plt
Before getting into the types of chart, let us look at some of the basic functions in the library.
BASIC FUNCTIONS IN MATPLOTLIB
plt.show()
This function in pyplot is used to display all the figures created in your code.
plt.xlabel()
This function is used to add a label to the x axis for reference.
plt.ylabel()
This function is used to add a label to the y axis for reference.
plt.title()
This function is used to add a title to the figure.
plt.annotate()
This function is used to add a text note at a given point inside the figure, which helps to make the figure meaningful.
plt.grid()
This function is used to display a grid inside the figure.
plt.figure()
This function creates a new figure. Call it more than once to work with several figures or charts.
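The functions listed above are usually used together on one chart. The sketch below is one possible way to combine them. The data and label texts are made up for illustration, and plt.plot() (a simple line chart, not covered in this blog) is only there to give the other functions something to decorate. The "Agg" backend and the file name are assumptions so that the script also runs on a machine without a display.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend, writes to a file
import matplotlib.pyplot as plt

# Made-up example data: marks scored in five tests
tests = [1, 2, 3, 4, 5]
marks = [52, 61, 58, 74, 80]

fig = plt.figure(figsize=(6, 4))       # create a new figure
plt.plot(tests, marks, marker="o")     # simple line chart to decorate
plt.xlabel("Test number")              # label on the x axis
plt.ylabel("Marks")                    # label on the y axis
plt.title("Marks per test")            # title of the figure
plt.annotate("highest score", xy=(5, 80), xytext=(3, 78),
             arrowprops={"arrowstyle": "->"})  # note with an arrow to a point
plt.grid(True)                         # show the grid
fig.savefig("basic_functions.png")     # hypothetical output file name
```

In a notebook or an interactive session, you can drop the matplotlib.use("Agg") line and call plt.show() at the end instead of saving the file.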
Bar Graph:
Bar graphs are used to compare different groups of data, or to track how a value changes over time. Also called bar charts, they represent categorical data with rectangular bars whose heights are proportional to the values they represent. Each bar represents one category.
Syntax:
matplotlib.pyplot.bar(x, height, width=0.8)
# importing the pyplot and numpy libraries
from matplotlib import pyplot as plt
import numpy as np
# creating the x axis and y axis arrays
x = np.array(["A", "B", "C", "D"])
y = np.array([10, 20, 30, 10])
# plotting the bar chart
plt.bar(x, y)  # the default bar width is 0.8; pass width=0.5 for narrower bars
plt.show()
Histogram:
Histograms are used to display the distribution of numerical data. A histogram shows the frequency distribution of a dataset by counting how many values fall into each interval (bin). It is used for summarising discrete or continuous numerical data, and is widely used in statistics to check the normality of the data. We can use the hist() function in pyplot to create a histogram. Let us create a simple histogram plot.
Syntax:
plt.hist(x,optional parameters)
import numpy as np
import matplotlib.pyplot as plt
x = np.random.randn(1000)
plt.hist(x)
plt.show()
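One of the optional parameters of hist() is bins, which sets how many intervals the data is split into. The sketch below assumes 20 bins and a fixed random seed, both chosen only for illustration. It also shows that hist() returns the counts and the bin edges, so you can inspect the numbers behind the picture. The "Agg" backend and file name are assumptions so the script runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend, writes to a file
import matplotlib.pyplot as plt
import numpy as np

rng = np.random.default_rng(seed=0)    # fixed seed so the run is repeatable
x = rng.standard_normal(1000)          # 1000 samples from a normal distribution

fig = plt.figure()
# hist() returns the count per bin, the bin edges and the drawn bars
counts, bin_edges, bars = plt.hist(x, bins=20, edgecolor="black")
plt.xlabel("Value")
plt.ylabel("Frequency")
plt.title("Histogram with 20 bins")
fig.savefig("histogram_bins.png")      # hypothetical output file name

print("values counted:", int(counts.sum()))
print("number of bin edges:", len(bin_edges))
```

With 20 bins there are 21 bin edges, and the counts add up to the number of samples.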
SCATTER PLOT:
A scatter plot is a graph used to find the relationship between variables, in which each data point in the dataset is represented as a dot.
This plot is used to visualise the effect of one variable on another (the relationship between two variables). We can create a simple scatter plot with the built-in function scatter().
Syntax:
plt.scatter(x,y,optional parameters)
import numpy as np
import matplotlib.pyplot as plt
x = np.array([5, 7, 8, 7, 2, 17, 2, 9, 4, 12, 21, 6, 5])
y = np.array([89, 55, 47, 95, 110, 56, 88, 69, 62, 99, 65, 36, 64])
plt.scatter(x, y)
plt.show()
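As a rough sketch of the optional parameters, the example below reuses the same x and y arrays and sets the marker size, colour, shape and transparency. It also computes the Pearson correlation coefficient with np.corrcoef() to put a number on the relationship that the plot shows. The styling values, the "Agg" backend and the file name are assumptions chosen for illustration.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend, writes to a file
import matplotlib.pyplot as plt
import numpy as np

x = np.array([5, 7, 8, 7, 2, 17, 2, 9, 4, 12, 21, 6, 5])
y = np.array([89, 55, 47, 95, 110, 56, 88, 69, 62, 99, 65, 36, 64])

# Pearson correlation coefficient between x and y (always between -1 and 1)
r = np.corrcoef(x, y)[0, 1]

fig = plt.figure()
# s = marker size, c = colour, marker = shape, alpha = transparency
points = plt.scatter(x, y, s=40, c="tab:red", marker="x", alpha=0.8)
plt.xlabel("x")
plt.ylabel("y")
plt.title(f"Scatter plot (Pearson r = {r:.2f})")
fig.savefig("scatter_styled.png")      # hypothetical output file name
```

A value of r close to 1 or -1 suggests a strong linear relationship, while a value near 0 suggests a weak one.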
PIE CHART:
A pie chart is used to visualise the numerical proportions of data, usually as percentages.
Syntax:
plt.pie(x,optional parameters)
Let us consider an example: the marks obtained by a person in each section of a test are given as x, and the section names as section. Let us plot a pie chart for this data.
import numpy as np
import matplotlib.pyplot as plt
x = np.array([25, 35, 25, 15])
section = np.array(['Part-A', 'Part-B', 'Part-C', 'Part-D'])
plt.pie(x, labels=section)
plt.show()
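Since pie charts are usually read as percentages, it can help to print the percentage on each slice. The sketch below does this with the autopct parameter of pie(), using the same marks data. The format string, the "Agg" backend and the file name are assumptions chosen for illustration.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend, writes to a file
import matplotlib.pyplot as plt
import numpy as np

x = np.array([25, 35, 25, 15])
section = np.array(["Part-A", "Part-B", "Part-C", "Part-D"])

fig = plt.figure()
# autopct formats each slice's share of the total as a percentage label
wedges, label_texts, pct_texts = plt.pie(x, labels=section, autopct="%1.0f%%")
plt.title("Marks per section")
fig.savefig("pie_percentages.png")     # hypothetical output file name

percentages = [t.get_text() for t in pct_texts]
print(percentages)
```

Here the marks already add up to 100, so each percentage label matches the mark itself. For other data, pie() divides each value by the total before formatting it.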
BOX PLOT:
A box plot, or box-and-whisker plot, is used to display a summary of a set of data as a box. The plot displays the minimum value, first quartile, median, third quartile and maximum value of a dataset. This method is widely used in statistics to understand the distribution of data.
Syntax:
plt.boxplot(x,optional parameters)
import matplotlib.pyplot as plt
import numpy as np
x = np.random.normal(100, 10, 200)
plt.boxplot(x)
plt.show()
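Here the centre line represents the median value. The box spans from the first quartile to the third quartile. By default, matplotlib draws the whiskers out to the most extreme data points that lie within 1.5 times the interquartile range from the box, and plots any points beyond them individually as outliers.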
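To connect the picture to the numbers, the sketch below computes the quartiles with np.percentile() and compares the median with the line that boxplot() actually draws. The fixed seed, variable names, "Agg" backend and file name are assumptions chosen for illustration.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend, writes to a file
import matplotlib.pyplot as plt
import numpy as np

rng = np.random.default_rng(seed=0)    # fixed seed so the run is repeatable
x = rng.normal(100, 10, 200)           # mean 100, standard deviation 10

# Quartiles that define the box and its centre line
q1, median, q3 = np.percentile(x, [25, 50, 75])
iqr = q3 - q1                          # interquartile range (height of the box)

fig = plt.figure()
result = plt.boxplot(x)                # dictionary of the drawn line objects
plt.ylabel("Value")
fig.savefig("boxplot_summary.png")     # hypothetical output file name

# y position of the median line that matplotlib drew
drawn_median = result["medians"][0].get_ydata()[0]
print("first quartile:", round(q1, 2))
print("median:", round(median, 2), "drawn at:", round(drawn_median, 2))
print("third quartile:", round(q3, 2))
print("interquartile range:", round(iqr, 2))
```

The dictionary returned by boxplot() also has entries such as "boxes", "whiskers" and "fliers", which hold the other parts of the plot.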
CONCLUSION:
This blog should help you understand the different types of plots commonly used in machine learning. Hope you enjoyed the blog.
Happy learning!!!!!
Written By: Nikesh Joseph
Reviewed By: Vikas Bhardwaj