What is scikit-learn?
Scikit-learn, also known as sklearn, is a free and widely used machine learning library for Python. It comes with algorithms for classification, regression, clustering and dimensionality reduction, including k-means, k-nearest neighbors, Support Vector Machines (SVM) and Decision Trees, and it is built to work with the Python numerical and scientific libraries NumPy and SciPy.
Sklearn is used to build machine learning models. It should not be used for reading, manipulating or summarizing data.
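Installation of the library :
There are multiple ways to install the sklearn library. Note that the package is published under the name scikit-learn, while the name used in import statements is sklearn.
pip command :
    pip install scikit-learn
conda command :
    conda install scikit-learn
Using Google Colab and Jupyter Notebook :
If you want to install the sklearn library directly from a Jupyter Notebook or Google Colab cell, prefix the pip command with an exclamation mark :
    !pip install scikit-learn
Install sklearn with a required version
You can also install a specific version of sklearn, where <version> is a placeholder for the release number you need :
    pip install scikit-learn==<version>
Upgrading the library
To upgrade an already installed library, use the command :
    pip install --upgrade scikit-learn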
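After installing or upgrading, one way to confirm that the library is available is to import it and print its version. This is only a minimal check, and the version string you see depends on your environment.

```python
import sklearn

# The package is installed as "scikit-learn" but imported as "sklearn".
installed_version = sklearn.__version__
print("scikit-learn version:", installed_version)
```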
Importing a library
To use the functions in a library, you first need to import it with an import statement, which consists of the import keyword followed by the name of the library.
Importing sklearn library :
You can also import a Python library under an alias.
Here, the “sklearn” library is imported under the alias “sk”.
If you need only a specific part of the library rather than the whole thing, you can use the from – import form.
Suppose you want to import the datasets module from the sklearn library; you can use the following command :
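The three import forms described in this section can be sketched together as follows. The two print statements are there only to show that the alias and the full name refer to the same module.

```python
# Import the whole library.
import sklearn

# Import the library under the alias "sk".
import sklearn as sk

# Import only the datasets module with the from - import form.
from sklearn import datasets

print(sk is sklearn)       # the alias and the full name are the same module
print(datasets.__name__)   # the fully qualified name of the imported module
```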
Loading dataset
The sklearn.datasets package bundles a few small toy datasets that do not require downloading any file from an external website, and it also provides helpers for fetching larger datasets.
Listing all datasets in sklearn
Code :
Output :
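One way to get such a list, shown here as a sketch rather than the original listing, is to filter the names in sklearn.datasets by prefix. It assumes that functions starting with "load_" return bundled datasets and functions starting with "fetch_" download larger ones; the exact names printed depend on the installed version.

```python
from sklearn import datasets

# Keep only the public helpers that load or fetch a dataset.
loaders = [
    name for name in dir(datasets)
    if name.startswith(("load_", "fetch_"))
]

for name in loaders:
    print(name)
```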
Loading a dataset :
Here we load the iris dataset that is bundled with sklearn.
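A minimal sketch of loading the iris dataset with load_iris is shown below. The variable names X and y are a common convention for features and labels, not something the library requires.

```python
from sklearn.datasets import load_iris

# load_iris returns a dictionary-like object with the data and its description.
iris = load_iris()
X, y = iris.data, iris.target

print(X.shape)             # (150, 4): 150 flowers, 4 measurements each
print(iris.feature_names)  # names of the four measurement columns
print(iris.target_names)   # names of the three iris species
```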
Features of scikit-learn
The sklearn library covers the main machine learning paradigms described below :
Supervised Learning :
In supervised learning, the training data must be labeled. A supervised learning algorithm analyzes the training data and produces an inferred function, which can then be used to map new examples to outputs.
There are two main types of problems in supervised learning :
- Classification :
Sorting people into groups according to their age is an example of classification. The model is trained on many example ages along with their class (child, young adult, etc.), and it must learn how to classify new people according to their age.
- Regression :
In regression, the task is typically to predict a numeric target value, such as the price of a house, given a set of features (size, number of bedrooms, etc.) called predictors. To train such a model, we give it many examples of houses, including both their predictors and their prices.
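The two problem types above can be sketched with tiny made-up datasets. All ages, group labels, house sizes and prices below are hypothetical values chosen for illustration, and the choice of a decision tree and a linear regression is just one reasonable option.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.tree import DecisionTreeClassifier

# Classification: hypothetical ages labeled with hypothetical age groups.
ages = np.array([[4], [9], [12], [19], [24], [29], [45], [60], [72]])
groups = np.array([
    "child", "child", "child",
    "young adult", "young adult", "young adult",
    "adult", "adult", "adult",
])
clf = DecisionTreeClassifier(random_state=0)
clf.fit(ages, groups)
age_predictions = clf.predict([[7], [22], [65]])
print(age_predictions)  # one predicted class label per new age

# Regression: hypothetical houses described by [size, number of bedrooms].
houses = np.array([[50, 1], [70, 2], [90, 2], [120, 3], [150, 4]])
prices = np.array([80_000, 110_000, 130_000, 170_000, 210_000])
reg = LinearRegression()
reg.fit(houses, prices)
price_prediction = reg.predict([[100, 3]])[0]
print(round(price_prediction))  # a predicted numeric price
```

The classifier returns a label for each new person, while the regressor returns a number, which is the practical difference between the two problem types.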
Some supervised learning algorithms:
- k-Nearest Neighbors
- Linear Regression
- Logistic Regression
- Support Vector Machines (SVM)
- Decision Trees
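As a rough guide, the algorithms listed above correspond to the following scikit-learn classes. The sketch fits each one on the iris dataset with mostly default settings and reports the score on the training data only, so the numbers say nothing about how well the models generalize.

```python
from sklearn.datasets import load_iris
from sklearn.linear_model import LinearRegression, LogisticRegression
from sklearn.neighbors import KNeighborsClassifier
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)

classifiers = {
    "k-Nearest Neighbors": KNeighborsClassifier(n_neighbors=5),
    "Logistic Regression": LogisticRegression(max_iter=1000),
    "Support Vector Machine": SVC(),
    "Decision Tree": DecisionTreeClassifier(random_state=0),
}

scores = {}
for name, model in classifiers.items():
    model.fit(X, y)
    scores[name] = model.score(X, y)  # accuracy on the training data
    print(f"{name}: {scores[name]:.2f}")

# Linear regression predicts a number: here the last column (petal width)
# is predicted from the other three measurements.
reg = LinearRegression().fit(X[:, :3], X[:, 3])
r2 = reg.score(X[:, :3], X[:, 3])  # R^2 on the training data
print(f"Linear Regression R^2: {r2:.2f}")
```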
Unsupervised Learning :
Unsupervised learning is a type of machine learning that works with unlabeled data. It is used to draw conclusions from datasets consisting of input data without labeled responses. The most popular method for unsupervised learning is clustering, which is used in exploratory data analysis (EDA) to find hidden patterns in a given dataset.
- Clustering :
Clustering is the task of grouping a set of objects in such a way that objects in the same group are more similar to each other than to those in other groups. Suppose you have data about your LinkedIn profile visitors and want to run a clustering algorithm to detect groups of similar visitors. At no point do you tell the algorithm which group a visitor belongs to; it finds those connections without your help. For example, it might notice that 65% of your visitors are students from your domain who generally see your posts in the evening, while 23% are recruiters who visit during the weekends.
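The visitor example can be sketched with k-means on synthetic data. The two features (hour of the visit and number of pages viewed), the group sizes and all the numbers are hypothetical; the point is only that the algorithm receives no group labels.

```python
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(0)

# Hypothetical visitors described by [hour of visit, pages viewed].
evening_visitors = np.column_stack(
    [rng.normal(20, 1, 65), rng.normal(2, 0.5, 65)]
)
daytime_visitors = np.column_stack(
    [rng.normal(11, 1, 35), rng.normal(8, 0.5, 35)]
)
visitors = np.vstack([evening_visitors, daytime_visitors])

# Ask for two clusters; no labels are passed to the algorithm.
kmeans = KMeans(n_clusters=2, n_init=10, random_state=0)
labels = kmeans.fit_predict(visitors)

print(np.bincount(labels))        # number of visitors in each cluster
print(kmeans.cluster_centers_)    # the "typical" visitor of each cluster
```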
Some unsupervised learning algorithms:
- K-means
- Fuzzy K-means
- Hierarchical clustering
- Mixture of Gaussians
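Three of these algorithms have counterparts in scikit-learn: KMeans, AgglomerativeClustering (one form of hierarchical clustering) and GaussianMixture. Fuzzy k-means is not part of scikit-learn, so it is left out of this sketch. The data is synthetic and deliberately easy to separate.

```python
import numpy as np
from sklearn.cluster import AgglomerativeClustering, KMeans
from sklearn.mixture import GaussianMixture

rng = np.random.default_rng(0)

# Two well-separated synthetic groups of 50 points each.
points = np.vstack([
    rng.normal(0, 0.5, (50, 2)),
    rng.normal(5, 0.5, (50, 2)),
])

models = {
    "K-means": KMeans(n_clusters=2, n_init=10, random_state=0),
    "Hierarchical clustering": AgglomerativeClustering(n_clusters=2),
    "Mixture of Gaussians": GaussianMixture(n_components=2, random_state=0),
}

cluster_sizes = {}
for name, model in models.items():
    labels = model.fit_predict(points)
    cluster_sizes[name] = sorted(np.bincount(labels).tolist())
    print(name, cluster_sizes[name])
```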
Reinforcement Learning :
Reinforcement learning is one of the basic machine learning paradigms, alongside supervised learning and unsupervised learning. It is all about making decisions sequentially: put simply, each decision depends on the situation produced by the earlier decisions, and the learner improves by receiving rewards or penalties for its actions. For example, many robots use reinforcement learning algorithms to learn how to walk, and game-playing programs use them to learn games such as chess. Note that scikit-learn itself does not provide reinforcement learning algorithms.
Example :
Step 1:
Import the required libraries
Step 2:
Load the dataset
Step 3:
Data Preparation and EDA
Step 4:
Building a Machine Learning Model
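The four steps can be sketched end to end as follows. The choice of the iris dataset, the 80/20 split and the k-nearest neighbors model are assumptions made for illustration; the original example may have used a different dataset and model.

```python
# Step 1: import the required libraries
import numpy as np
from sklearn.datasets import load_iris
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

# Step 2: load the dataset
iris = load_iris()
X, y = iris.data, iris.target

# Step 3: data preparation and a small amount of EDA
print("shape:", X.shape)
print("missing values:", int(np.isnan(X).sum()))
print("samples per class:", np.bincount(y))
print("feature means:", X.mean(axis=0).round(2))

# Hold out 20% of the rows for testing, keeping the class balance.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42, stratify=y
)

# Step 4: build a machine learning model and evaluate it
model = KNeighborsClassifier(n_neighbors=5)
model.fit(X_train, y_train)
accuracy = accuracy_score(y_test, model.predict(X_test))
print(f"test accuracy: {accuracy:.2f}")
```

Evaluating on rows the model has not seen during training gives a more honest picture of its quality than scoring it on the training data.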
Summary :
After reading this article, you should understand how to install and import the scikit-learn library in various ways and how to load its built-in datasets. You should also understand what supervised learning, unsupervised learning and reinforcement learning are, their basic problem types such as classification, regression and clustering, and the names of specific algorithms.
article by: Rushikesh Lavate