Natural Language Processing (NLP)

First of all, we will discuss natural language. A natural language is a language that has developed naturally through human use, such as English or Hindi. Natural Language Processing (NLP) is a set of techniques that lets a machine or computer read, interpret, and produce such language. It is a branch of Artificial Intelligence that draws heavily on Machine Learning. In effect, NLP acts as an interpreter between the user and the machine.

Advantages

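NLP is time efficient: it can process large volumes of text much faster than a person can read them.
NLP can filter out unnecessary and unwanted information, keeping only what is relevant to the task.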

Disadvantages

NLP can be unpredictable.
An NLP system usually has a limited set of functions, so it can perform only a limited range of tasks.
It may miss the context in which a word or sentence is used.

Components of NLP

NLP has two components.

Natural Language Understanding (NLU).
Natural Language Generation (NLG).

Natural Language Understanding:-

NLU deals with understanding the input that the user gives in human language.

Natural Language Generation:-

NLG deals with producing written or spoken language from raw data.

Applications of NLP

NLP has the following applications.

Question Answering:- Here we build a system that answers questions asked by humans.
Spelling Correction:- With the help of NLP we can detect spelling mistakes and correct them.
Spam Detection:- Spam detection is used to detect unwanted or fake emails.
Sentiment Analysis:- Here we identify the opinion or emotion expressed in a text. The data is not always plain words; it can also come in forms such as emojis, which the system has to interpret.
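
To make the spelling correction idea concrete, here is a minimal sketch that uses Python's standard `difflib` module to pick the closest known word. The tiny `VOCABULARY` list, the `correct` function name, and the 0.7 similarity cutoff are assumptions made for this illustration; a real spell checker would use a full dictionary and usually word frequencies as well.

```python
import difflib

# Hypothetical mini-vocabulary, just for this illustration.
VOCABULARY = ["language", "processing", "machine", "natural", "computer"]

def correct(word):
    """Return the closest vocabulary word, or the word itself if none is close."""
    word = word.lower()
    if word in VOCABULARY:
        return word
    # get_close_matches ranks candidates by a similarity ratio between 0 and 1.
    matches = difflib.get_close_matches(word, VOCABULARY, n=1, cutoff=0.7)
    return matches[0] if matches else word

print(correct("langauge"))  # language
print(correct("xyz"))       # xyz (nothing similar enough, left unchanged)
```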

Building the NLP Pipeline

An NLP pipeline is built with the following steps-

Step 1:- Sentence Segmentation

It is the first step in building the NLP pipeline. It splits a paragraph into separate sentences.

Example:-

If you tell me I will forget. If you show me I will remember. If you involve me I will understand.

Sentence segmentation gives the following result.

“If you tell me I will forget.”
“If you show me I will remember.”
“If you involve me I will understand.”
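
As a rough sketch, the segmentation above can be reproduced with a regular expression that splits after a full stop, question mark, or exclamation mark. The `split_sentences` name is an assumption made for this illustration, and the rule is deliberately simple: it would wrongly split abbreviations such as "Dr. Smith", which is why real segmenters use more careful rules or trained models.

```python
import re

def split_sentences(text):
    """Split text wherever '.', '!' or '?' is followed by whitespace."""
    # The lookbehind keeps the punctuation attached to its sentence.
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [p for p in parts if p]

paragraph = (
    "If you tell me I will forget. "
    "If you show me I will remember. "
    "If you involve me I will understand."
)

sentences = split_sentences(paragraph)
for sentence in sentences:
    print(sentence)
```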

Step 2:- Words Tokenization

It is used to split a sentence into separate words. Each word is a token, and punctuation marks are tokens too.

Example:-

Python is a programming language.

The word tokenizer gives the following result.

“Python”, “is”, “a”, “programming”, “language”, “.”
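
A minimal sketch of such a tokenizer, assuming that a token is either a run of letters and digits or a single punctuation mark. The `tokenize` name is chosen for this illustration only; library tokenizers handle many more cases, such as contractions and URLs.

```python
import re

def tokenize(sentence):
    """Return the words and punctuation marks of a sentence as a list."""
    # \w+ matches a run of letters or digits; [^\w\s] matches one punctuation mark.
    return re.findall(r"\w+|[^\w\s]", sentence)

tokens = tokenize("Python is a programming language.")
print(tokens)  # ['Python', 'is', 'a', 'programming', 'language', '.']
```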

Step 3:- Stemming

Stemming is the process of normalizing words by reducing them to a root form, usually by stripping suffixes. For example, likes, liked, and likely all originate from the single root word “like.”

Sometimes stemming generates a root that is not a real word and has no specific meaning. This is its biggest problem.
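
The idea can be sketched with a very crude suffix stripper. The suffix list and the `crude_stem` name are assumptions made for this illustration and are not a standard algorithm; real stemmers such as the Porter stemmer apply many more rules. Note that this sketch reduces all three example words to "lik", which is not a real word, so it also shows the problem described above.

```python
# Hypothetical suffix list, longest suffixes first.
SUFFIXES = ("ely", "ing", "ed", "es", "ly", "s")

def crude_stem(word, min_stem=3):
    """Strip the first matching suffix, keeping at least min_stem letters."""
    word = word.lower()
    for suffix in SUFFIXES:
        if word.endswith(suffix) and len(word) - len(suffix) >= min_stem:
            return word[: -len(suffix)]
    return word

stems = {w: crude_stem(w) for w in ["likes", "liked", "likely"]}
print(stems)  # {'likes': 'lik', 'liked': 'lik', 'likely': 'lik'}
```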

Step 4:- Lemmatization

Lemmatization is a text preprocessing technique used in Natural Language Processing. It is similar to stemming: in stemming the root word is called the stem, and in lemmatization the root word is called the lemma. Unlike a stem, a lemma is always a valid word with a specific meaning.
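
One simple way to picture lemmatization is as a dictionary lookup. The sketch below assumes a tiny hand-written table (`LEMMA_TABLE`) and a hypothetical `lemmatize` function; real lemmatizers rely on a full vocabulary and usually on the part of speech of each word.

```python
# Hypothetical lookup table, just for this illustration.
LEMMA_TABLE = {
    "likes": "like",
    "liked": "like",
    "liking": "like",
    "mice": "mouse",
    "was": "be",
}

def lemmatize(word):
    """Return the lemma from the table, or the lowercased word if unknown."""
    word = word.lower()
    return LEMMA_TABLE.get(word, word)

for w in ["likes", "Liked", "mice", "python"]:
    print(w, "->", lemmatize(w))
```

Every result here is a real word, which is the main difference from a stem.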

Step 5:- Identifying Stop Words

Stop words are words that are filtered out before or after the natural language data (text) is processed, because they occur very often and carry little meaning on their own. Words such as “is”, “am”, “and”, “the”, and “a” fall into the stop words category.
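
A minimal sketch of stop word removal, using only the five example words from this step as the stop list. The `remove_stop_words` name is an assumption made for this illustration; NLP libraries ship much longer stop lists.

```python
# Stop list taken from the examples in this step.
STOP_WORDS = {"is", "am", "and", "the", "a"}

def remove_stop_words(tokens):
    """Keep only the tokens that are not in the stop list (case-insensitive)."""
    return [t for t in tokens if t.lower() not in STOP_WORDS]

tokens = ["Python", "is", "a", "programming", "language", "."]
filtered = remove_stop_words(tokens)
print(filtered)  # ['Python', 'programming', 'language', '.']
```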

Step 6:- Dependency Parsing

Dependency parsing is used to find how the words in a sentence depend on each other, for example which word is the subject of which verb.
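
Writing a parser is beyond a short example, but the result of dependency parsing can be sketched as a list of (head, relation, dependent) arcs. The arcs below are hand-written for the sentence "Python is a programming language" with Universal Dependencies style labels; they are an assumption made for this illustration, and a real parser may label some of them differently.

```python
from collections import namedtuple

Arc = namedtuple("Arc", ["head", "relation", "dependent"])

# Hand-annotated arcs (illustrative, not the output of a parser).
arcs = [
    Arc("ROOT", "root", "language"),
    Arc("language", "nsubj", "Python"),          # subject
    Arc("language", "cop", "is"),                # linking verb
    Arc("language", "det", "a"),                 # determiner
    Arc("language", "compound", "programming"),  # modifier
]

def head_of(word, arcs):
    """Return the word that the given word depends on, or None."""
    for arc in arcs:
        if arc.dependent == word:
            return arc.head
    return None

def dependents_of(head, arcs):
    """Return (relation, dependent) pairs attached to the given head."""
    return [(a.relation, a.dependent) for a in arcs if a.head == head]

print(head_of("Python", arcs))  # language
print(dependents_of("language", arcs))
```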

Step 7:- POS Tag

POS tags are Part-of-Speech tags, which are used to label each word in a sentence. Noun, verb, adverb, adjective, pronoun, conjunction, and their sub-categories can be used as labels.

They are also known as grammatical tags.
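
As a toy illustration, tagging can be sketched as a lookup in a small word-to-tag lexicon. The `LEXICON` table, the coarse tag names, and the "UNK" label for unknown words are assumptions made for this example; real taggers are trained on annotated text and use the surrounding words to choose a tag.

```python
# Hypothetical lexicon with coarse tags, just for this illustration.
LEXICON = {
    "python": "NOUN",
    "is": "VERB",
    "a": "DET",
    "programming": "NOUN",
    "language": "NOUN",
    ".": "PUNCT",
}

def pos_tag(tokens):
    """Pair each token with its tag, or 'UNK' if it is not in the lexicon."""
    return [(t, LEXICON.get(t.lower(), "UNK")) for t in tokens]

tagged = pos_tag(["Python", "is", "a", "programming", "language", "."])
for word, tag in tagged:
    print(word, tag)
```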

Step 8:- Named Entity Recognition (NER)

Named Entity Recognition is the process of recognising named entities in text and categorizing them, for example as a person name, an organization name, or a location.
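
The simplest form of this is a lookup against a list of known names, often called a gazetteer. The sketch below assumes a made-up `GAZETTEER` and example sentence; real NER systems use trained models so that they can also recognise names they have never seen.

```python
# Made-up names and labels, just for this illustration.
GAZETTEER = {
    "Alice Smith": "PERSON",
    "Example Corp": "ORGANIZATION",
    "Paris": "LOCATION",
}

def find_entities(text):
    """Return (name, label) pairs for gazetteer names found in the text."""
    found = []
    for name, label in GAZETTEER.items():
        start = text.find(name)
        if start != -1:
            found.append((start, name, label))
    # Sort by position so entities come out in reading order.
    return [(name, label) for _, name, label in sorted(found)]

entities = find_entities("Alice Smith works at Example Corp in Paris.")
print(entities)
```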

Step 9:- Chunking

Chunking is the process of extracting phrases from unstructured text by grouping individual words into bigger pieces of the sentence, such as noun phrases.
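
A minimal sketch of noun phrase chunking, assuming the words already carry POS tags. The tags below are assigned by hand, and the rule (group consecutive determiners, adjectives, and nouns, and keep the group if it contains a noun) is a simplification chosen for this illustration.

```python
NP_TAGS = {"DET", "ADJ", "NOUN"}

def noun_phrase_chunks(tagged):
    """Group consecutive DET/ADJ/NOUN words into noun phrase chunks."""
    chunks, current = [], []

    def flush():
        # Keep the group only if it contains at least one noun.
        if any(tag == "NOUN" for _, tag in current):
            chunks.append(" ".join(word for word, _ in current))
        current.clear()

    for word, tag in tagged:
        if tag in NP_TAGS:
            current.append((word, tag))
        else:
            flush()
    flush()
    return chunks

# Hand-tagged tokens (illustrative).
tagged = [
    ("Python", "NOUN"), ("is", "VERB"), ("a", "DET"),
    ("programming", "NOUN"), ("language", "NOUN"), (".", "PUNCT"),
]
chunks = noun_phrase_chunks(tagged)
print(chunks)  # ['Python', 'a programming language']
```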

Conclusion | NLP

So that’s how NLP works. I hope you find this article insightful!

Written by: Chandra Shekhar Tiwari

Reviewed by: Shivani Yadav
