Introduction to PyCaret
PyCaret is an open-source, low-code machine learning library in Python that takes you from preparing your data to deploying your model with only a few lines of code, in the notebook environment of your choice. It suits data scientists who want to be more productive when working on business problems, and it is just as useful for anyone who wants to add machine learning to their workflow, with or without a coding background.
PyCaret's main aim is to shorten the time from hypothesis to insight. It is also a deployment-ready Python library.
Installation of PyCaret
Before building a model with PyCaret, you need to install the library. This is a basic step that every beginner should get used to.
Install PyCaret with the command below:
# install for the first time
!pip install pycaret
# if PyCaret is already installed, check the version with the code below
from pycaret.utils import version
version()
Before installing PyCaret or any other library, creating a virtual environment is highly recommended, for example one based on Python 3.6 or later.
If you have Anaconda installed, follow these steps:
# create a virtual environment (replace envname with a name of your choice)
conda create --name envname python=3.6
# activate the conda environment
conda activate envname
Then install PyCaret inside the environment using the installation command above.
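Once the environment is active, a quick check from Python can confirm that the interpreter is recent enough and which conda environment is in use. This is only a small sketch: the helper name `describe_environment` is made up for illustration, and it assumes the `CONDA_DEFAULT_ENV` variable that conda sets when an environment is activated.

```python
import os
import sys


def describe_environment(min_version=(3, 6)):
    """Report the Python version, whether it meets the minimum, and the conda env name."""
    version = sys.version_info[:3]
    return {
        "python": ".".join(str(part) for part in version),
        "meets_minimum": version >= min_version,
        # Set by `conda activate`; None when no conda environment is active
        "conda_env": os.environ.get("CONDA_DEFAULT_ENV"),
    }


info = describe_environment()
print(info)
```

If `meets_minimum` is False or `conda_env` is not the name you expected, activate the right environment before running the pip install.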
Pre-processing steps | PyCaret
As mentioned above, PyCaret is a deployment-ready Python library: as you proceed with an ML experiment, every step is saved automatically in a pipeline, which makes later deployment to production easier. A nice thing about PyCaret is that it orchestrates all the dependencies within that pipeline.
Some steps in pre-processing
- Getting the data
- Splitting the data and sampling the data
- Data preparation or cleaning
- Scaling and transforming
- Feature engineering
- Feature selection
- Modelling
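To make the splitting step concrete, here is a minimal standard-library sketch of a shuffled train/hold-out split. It is not PyCaret's internal code; the function name `holdout_split`, the 70/30 ratio and the seed are assumptions chosen for illustration.

```python
import random


def holdout_split(rows, train_size=0.7, seed=123):
    """Shuffle a copy of rows and split it into (train, holdout)."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)  # seeded so the split is reproducible
    cut = int(round(len(shuffled) * train_size))
    return shuffled[:cut], shuffled[cut:]


train, holdout = holdout_split(range(10))
print(len(train), len(holdout))
```

The hold-out part is never used for training, which is what makes the scores reported on it later a fair estimate.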
The full details are in the PyCaret documentation.
Next, we will get hands-on with PyCaret by solving a regression problem.
What prerequisites are needed?
- Python 3.x (preferably 3.6+)
- The latest version of PyCaret
- An internet connection to load the data through the library
- Basic knowledge of regression problems
What is Regression?
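Regression is a supervised machine learning technique where the goal is to estimate the relationship between a dependent variable (the target, which is a continuous number) and one or more independent variables (the features), so that the target can be predicted for new data.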
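The simplest case, one feature and a straight line, can be sketched with ordinary least squares in plain Python. The function name `fit_line` and the toy numbers are illustrative assumptions and have nothing to do with the dataset used later.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Covariance-like and variance-like sums (the 1/n factors cancel)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


# Toy data generated from y = 1 + 2x (illustrative values only)
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.0, 3.0, 5.0, 7.0, 9.0]
intercept, slope = fit_line(xs, ys)
print(intercept, slope)
```

Libraries such as PyCaret fit far more flexible models, but the goal is the same: learn a mapping from features to a numeric target.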
Techniques covered in this PyCaret-based regression tutorial:
- Normalization
- Transformation
- Target transformation
- Combine rare levels
- Bin numeric variables
- Model ensembling and techniques
- Tuning Hyperparameters of ensembles
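To give a feel for what some of these pre-processing techniques do, the sketch below implements simplified stand-ins in plain Python: z-score normalization, a log target transformation, combining rare category levels, and equal-width binning. These are not the algorithms PyCaret applies internally, and every function name and threshold here is an assumption made for illustration.

```python
import math
from collections import Counter


def zscore(values):
    """Normalization: rescale to zero mean and unit (population) standard deviation."""
    mean = sum(values) / len(values)
    std = math.sqrt(sum((v - mean) ** 2 for v in values) / len(values))
    return [(v - mean) / std for v in values]


def log_target(ys):
    """Target transformation: log1p compresses a right-skewed, non-negative target."""
    return [math.log1p(y) for y in ys]


def combine_rare_levels(levels, min_share=0.10, other="rare"):
    """Replace categories whose share of rows is below min_share with one label."""
    counts = Counter(levels)
    total = len(levels)
    return [lv if counts[lv] / total >= min_share else other for lv in levels]


def bin_numeric(values, n_bins=3):
    """Equal-width binning: map each value to a bin index in 0..n_bins-1."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / n_bins  # assumes the values are not all identical
    return [min(int((v - lo) / width), n_bins - 1) for v in values]


scaled = zscore([1.0, 2.0, 3.0])
merged = combine_rare_levels(["a"] * 9 + ["b"] * 9 + ["c"])
binned = bin_numeric([0, 1, 5, 9, 10])
print(scaled, merged[-1], binned)
```

In PyCaret these transformations are switched on through options passed to `setup`, so they become part of the saved pipeline rather than separate manual steps.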
1: Getting the data
#shape of the data
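The loading code for this step did not survive in the article. A plausible reconstruction is sketched below, assuming the data is the `gold` dataset bundled with PyCaret (inferred from the target column `Gold_T+22` used in the next step). The helper name `load_and_split`, the 90/10 split and the seed are assumptions; the `loader` argument exists only so that the function can be tried out without PyCaret installed.

```python
def load_and_split(name="gold", frac=0.9, seed=786, loader=None):
    """Load a dataset and hold back a slice as 'unseen' data.

    Returns (data, data_unseen); with the default loader both are pandas DataFrames.
    """
    if loader is None:
        # Imported lazily; get_data downloads the dataset, so it needs internet access
        from pycaret.datasets import get_data as loader
    dataset = loader(name)
    # Rows used for modelling
    data = dataset.sample(frac=frac, random_state=seed)
    # Remaining rows, kept aside to mimic data the model has never seen
    data_unseen = dataset.drop(data.index)
    return data.reset_index(drop=True), data_unseen.reset_index(drop=True)
```

In a notebook with PyCaret installed, `data, data_unseen = load_and_split()` followed by `data.shape` and `data_unseen.shape` would show the size of each part. The names `data` and `data_unseen` match the ones used in the later steps.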
2: Setting up the environment in PyCaret
from pycaret.regression import *
exp_reg101 = setup(data = data, target = 'Gold_T+22', session_id=123)
3: Comparing all models
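In PyCaret this step is a single call, `compare_models()`, which trains the available regressors with cross-validation and returns the best one. The ranking idea behind it can be sketched in plain Python as below; the function name `rank_models` and the per-fold scores are invented for illustration and are not results from this experiment.

```python
from statistics import mean


def rank_models(cv_scores, higher_is_better=True):
    """Sort model names by their mean cross-validation score."""
    averaged = {name: mean(scores) for name, scores in cv_scores.items()}
    return sorted(averaged.items(), key=lambda item: item[1], reverse=higher_is_better)


# Hypothetical per-fold R2 scores, made up purely to show the ranking
cv_scores = {
    "ada": [0.50, 0.54, 0.52],
    "lightgbm": [0.62, 0.66, 0.64],
    "dt": [0.40, 0.44, 0.42],
}
ranking = rank_models(cv_scores)
print(ranking[0][0])
```

For error metrics such as MAE or RMSE, lower is better, so the same helper would be called with `higher_is_better=False`.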
4: Create a model
- AdaBoost regressor
- Light gradient boosting machine
- Decision tree
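The code for this step is also missing from the article. A sketch of how the three models could be created is shown below, using PyCaret's `create_model` with the model IDs `'ada'`, `'lightgbm'` and `'dt'`. The wrapper `create_candidates` and its `create_fn` argument are assumptions added so the snippet can be exercised without PyCaret; in a notebook you would normally call `create_model` directly after `setup`.

```python
def create_candidates(model_ids=("ada", "lightgbm", "dt"), create_fn=None):
    """Train one model per ID and return them keyed by ID."""
    if create_fn is None:
        # Lazy import: requires PyCaret and a prior call to setup()
        from pycaret.regression import create_model as create_fn
    return {model_id: create_fn(model_id) for model_id in model_ids}
```

After `setup(...)` has run, `models = create_candidates()` would train all three, and `models['lightgbm']` would be the Light Gradient Boosting Machine used in the following steps.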
5: Tune a model
- AdaBoost regressor
- Light gradient boosting machine
- Decision tree
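The tuning code is not shown in the article either. Assuming PyCaret 2.0 or later, where `tune_model` takes a trained model object, the call would look like `tuned_lightgbm = tune_model(lightgbm)`, which is the name the later steps rely on. By default `tune_model` runs a random search over a predefined grid, and that idea can be sketched in plain Python as follows. The objective function and search space are toy stand-ins, not a real LightGBM run.

```python
import random


def random_search(objective, space, n_iter=10, seed=123):
    """Sample n_iter parameter sets from space and keep the one with the lowest objective."""
    rng = random.Random(seed)
    best_params, best_score = None, float("inf")
    for _ in range(n_iter):
        params = {name: rng.choice(options) for name, options in space.items()}
        score = objective(params)
        if score < best_score:
            best_params, best_score = params, score
    return best_params, best_score


def toy_error(params):
    """Hypothetical stand-in for a cross-validated error; lowest at 0.1 and 31."""
    return (params["learning_rate"] - 0.1) ** 2 + (params["num_leaves"] - 31) ** 2 / 1000


space = {"learning_rate": [0.01, 0.05, 0.1, 0.2], "num_leaves": [15, 31, 63]}
best_params, best_score = random_search(toy_error, space, n_iter=10)
print(best_params, best_score)
```

Because only a sample of the grid is tried, a tuned model is not guaranteed to beat the default one, so it is worth comparing the scores before and after tuning.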
6: Plot a model
- Residual plot
plot_model(tuned_lightgbm)
- Prediction error plot
plot_model(tuned_lightgbm, plot = 'error')
- Feature importance plot
plot_model(tuned_lightgbm, plot='feature')
7: Predict on the hold-out sample
predict_model(tuned_lightgbm);
| | Model | MAE | MSE | RMSE | R2 | RMSLE | MAPE |
| 0 | Light Gradient Boosting Machine | 0.0191 | 0.0007 | 0.0258 | 0.6433 | 0.0221 | -0.1316 |
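For readers unfamiliar with the column names, the sketch below computes the same six metrics from their textbook definitions in plain Python. It is an illustration on toy numbers, not PyCaret's implementation, and the function name `regression_metrics` is an assumption.

```python
import math


def regression_metrics(y_true, y_pred):
    """Textbook definitions of the metrics in the table above."""
    n = len(y_true)
    errors = [t - p for t, p in zip(y_true, y_pred)]
    mae = sum(abs(e) for e in errors) / n
    mse = sum(e * e for e in errors) / n
    mean_true = sum(y_true) / n
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    # RMSLE and MAPE as written here assume positive, non-zero targets
    rmsle = math.sqrt(
        sum((math.log1p(p) - math.log1p(t)) ** 2 for t, p in zip(y_true, y_pred)) / n
    )
    mape = sum(abs(e / t) for e, t in zip(errors, y_true)) / n
    return {
        "MAE": mae,
        "MSE": mse,
        "RMSE": math.sqrt(mse),
        "R2": 1 - (mse * n) / ss_tot,
        "RMSLE": rmsle,
        "MAPE": mape,
    }


# Toy values for illustration only
metrics = regression_metrics([1.0, 2.0, 3.0, 4.0], [1.5, 2.0, 2.5, 4.0])
print(metrics)
```

MAE, MSE, RMSE, RMSLE and MAPE are errors, so lower is better; R2 is the share of variance explained, so values closer to 1 are better.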
8: Finalize model for deployment
final_lightgbm = finalize_model(tuned_lightgbm)
predict_model(final_lightgbm);
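| | Model | MAE | MSE | RMSE | R2 | RMSLE | MAPE |
| 0 | Light Gradient Boosting Machine | 0.0025 | 0.0 | 0.0034 | 0.9938 | 0.0032 | -0.0803 |
Note that finalize_model retrains the model on the complete dataset, including the hold-out sample. The scores above are therefore computed on data the model has already seen, which is why they look so much better than the hold-out scores in step 7; they should not be read as an estimate of real-world performance.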
9: Predict on unseen data
unseen_predictions = predict_model(final_lightgbm, data=data_unseen)
unseen_predictions.head()
10: Saving the model
save_model(final_lightgbm,'Final Lightgbm Model 08Feb2020')
11: Loading the saved model
saved_final_lightgbm = load_model('Final Lightgbm Model 08Feb2020')
new_prediction = predict_model(saved_final_lightgbm, data=data_unseen)
new_prediction.head()
Conclusion
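In this article, we walked through the core PyCaret regression workflow, from setting up the experiment to saving and reloading the final model. Because it automates so much of this process, PyCaret is often described as an AutoML-style tool. I hope this article gives you some insight for your learning.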
Thanks for reading!
written by: Krishna Heroor
reviewed by: Umamah