First, let us define natural language. A natural language is a language that has developed naturally among humans, such as English. Natural Language Processing (NLP) is a set of techniques that lets a machine or computer read, interpret, and produce human language. It is a branch of artificial intelligence that relies heavily on machine learning. In effect, NLP acts as an interpreter between the user and the machine.
Advantages
- NLP saves time, because a machine can process large volumes of text much faster than a person can.
- NLP can filter out unnecessary and unwanted information.
Disadvantages
- NLP systems can be unpredictable.
- An NLP system is usually built for a limited set of tasks and can perform only those tasks.
- It may fail to capture context.
Components of NLP
NLP has two main components:
- Natural Language Understanding (NLU)
- Natural Language Generation (NLG)
Natural Language Understanding:-
NLU deals with understanding the input that the user provides in human language.
Natural Language Generation:-
NLG deals with producing written or spoken language from raw data.
Applications of NLP
Some common applications of NLP are:
- Question Answering:- Here we build a system that answers questions asked by humans.
- Spelling Correction:- With the help of NLP we can detect spelling mistakes and correct them.
- Spam Detection:- Spam detection is used to detect unwanted or fake emails.
- Sentiment Analysis:- Sentiment analysis identifies the opinion or emotion expressed in text. The input is not always plain words; it can also contain emojis, which the system has to interpret as well.
Building an NLP Pipeline
An NLP pipeline is built with the following steps:
Step 1:- Sentence Segmentation
Sentence segmentation is the first step in building the NLP pipeline. It splits a paragraph into separate sentences.
Example:-
If you tell me I will forget. If you show me I will remember. If you involve me I will understand.
Sentence segmentation produces the following result:
- “If you tell me I will forget.”
- “If you show me I will remember.”
- “If you involve me I will understand.”
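As a rough illustration of sentence segmentation, the sketch below splits text after a full stop, exclamation mark, or question mark that is followed by whitespace. The function name `split_sentences` is made up for this example, and the rule is a simplifying assumption: it would wrongly split on abbreviations such as "Dr." or "e.g.", which real segmenters handle with extra rules or trained models.

```python
import re


def split_sentences(text):
    """Split text into sentences at whitespace that follows '.', '!' or '?'."""
    # The lookbehind keeps the punctuation mark attached to its sentence.
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [part for part in parts if part]


paragraph = (
    "If you tell me I will forget. If you show me I will remember. "
    "If you involve me I will understand."
)

for sentence in split_sentences(paragraph):
    print(sentence)
```

Run on the example paragraph, this prints the three sentences on separate lines.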
Step 2:- Word Tokenization
Word tokenization splits a sentence into separate words. Each word is a token, and punctuation marks are tokens too.
Example:-
Python is a programming language. A word tokenizer gives us the following result:
“Python”, “is”, “a”, “programming”, “language”, “.”
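A minimal sketch of a word tokenizer, using only a regular expression, might look like the following. The `tokenize` helper is a hypothetical name, and the pattern is an assumption that works for simple English sentences; it does not handle contractions, URLs, or hyphenated words the way a full tokenizer would.

```python
import re


def tokenize(sentence):
    """Return words and punctuation marks as separate tokens."""
    # \w+ matches a run of letters or digits; [^\w\s] matches one punctuation mark.
    return re.findall(r"\w+|[^\w\s]", sentence)


tokens = tokenize("Python is a programming language.")
print(tokens)
# ['Python', 'is', 'a', 'programming', 'language', '.']
```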
Step 3:- Stemming
Stemming normalizes words by reducing them to their root form. For example, likes, liked, and likely all derive from the single root word “like.”
Sometimes stemming produces a root that has no meaning of its own; this is its biggest drawback.
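To make the idea concrete, here is a toy suffix-stripping stemmer. It is only a sketch: the suffix list and the `naive_stem` function are invented for this example and are far cruder than established algorithms such as the Porter stemmer. It strips at most one suffix and then drops a trailing "e".

```python
# Illustrative suffix list (an assumption for this sketch, not a real algorithm's rules).
SUFFIXES = ("ing", "ed", "ly", "es", "s")


def naive_stem(word):
    """Strip one known suffix, then a trailing 'e', keeping at least 3 letters."""
    stem = word.lower()
    for suffix in SUFFIXES:
        if stem.endswith(suffix) and len(stem) - len(suffix) >= 3:
            stem = stem[: -len(suffix)]
            break
    if stem.endswith("e") and len(stem) > 3:
        stem = stem[:-1]
    return stem


for word in ["like", "likes", "liked", "likely"]:
    print(word, "->", naive_stem(word))
```

With these rules all four words map to the same stem, "lik". That groups the related words together, but the stem itself is not a dictionary word, which is exactly the drawback described above.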
Step 4:- Lemmatization
Lemmatization is a text preprocessing technique used in Natural Language Processing, and it is similar to stemming. In stemming the root word is called a stem, and in lemmatization it is called a lemma. Unlike a stem, a lemma is always a meaningful word.
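One simple way to picture lemmatization is as a dictionary lookup from an inflected form to its lemma. The sketch below assumes a tiny hand-written table (`LEMMAS`) and a hypothetical `lemmatize` helper; real lemmatizers rely on large lexicons and usually on the word's part of speech as well.

```python
# Hand-written lookup table, purely for illustration.
LEMMAS = {
    "likes": "like",
    "liked": "like",
    "went": "go",
    "mice": "mouse",
    "is": "be",
}


def lemmatize(word):
    """Return the lemma from the table, or the lowercased word if it is unknown."""
    key = word.lower()
    return LEMMAS.get(key, key)


for word in ["liked", "went", "Mice", "python"]:
    print(word, "->", lemmatize(word))
```

Every value in the table is a real word, which reflects the difference from stemming: the output is a dictionary form rather than a truncated string.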
Step 5:- Identifying Stop Words
Stop words are very common words that are filtered out before or after natural language data (text) is processed, because they carry little meaning on their own. Words such as “is”, “am”, “and”, “the”, and “a” fall into the stop-word category.
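Filtering stop words can be sketched as a simple membership test. The set below contains only the five words mentioned in this article; it is an assumption for the example, and practical stop-word lists are much longer and vary by library and task.

```python
# Only the stop words named in this article; real lists are far longer.
STOP_WORDS = {"is", "am", "and", "the", "a"}


def remove_stop_words(tokens):
    """Keep every token whose lowercase form is not in STOP_WORDS."""
    return [token for token in tokens if token.lower() not in STOP_WORDS]


tokens = ["Python", "is", "a", "programming", "language", "."]
filtered = remove_stop_words(tokens)
print(filtered)
# ['Python', 'programming', 'language', '.']
```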
Step 6:- Dependency Parsing
Dependency parsing is used to find how the words in a sentence depend on one another.
Step 7:- POS Tags
POS tags are part-of-speech tags, which are used to label each word in a sentence. Noun, verb, adverb, adjective, pronoun, conjunction, and their sub-categories can be used as labels.
They are also known as grammatical tags.
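The following sketch shows what tagged output looks like, using a lookup table in place of a real tagger. The lexicon, the tag names, and the `pos_tag` function are all assumptions made for this example. Actual taggers decide the tag from context, because the same word can be a noun in one sentence and a verb in another.

```python
# Hand-written word-to-tag table, for illustration only.
TAG_LEXICON = {
    "she": "PRONOUN",
    "quickly": "ADVERB",
    "reads": "VERB",
    "long": "ADJECTIVE",
    "books": "NOUN",
    "and": "CONJUNCTION",
    "magazines": "NOUN",
}


def pos_tag(tokens):
    """Pair each token with its tag from the table, or 'UNKNOWN' if missing."""
    return [(token, TAG_LEXICON.get(token.lower(), "UNKNOWN")) for token in tokens]


tagged = pos_tag(["She", "quickly", "reads", "long", "books", "and", "magazines"])
for token, tag in tagged:
    print(token, tag)
```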
Step 8:- Named Entity Recognition (NER)
Named entity recognition is the process of recognizing named entities in text and categorizing them, for example as a person name, an organization name, or a location.
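A very simple form of NER is a gazetteer lookup: a list of known names, each with a category. The sketch below assumes a three-entry gazetteer and a hypothetical `find_entities` function. It only finds exact matches of names it already knows, whereas trained NER models can recognize names they have never seen.

```python
# Tiny hand-written gazetteer, for illustration only.
GAZETTEER = {
    "Alan Turing": "PERSON",
    "University of Manchester": "ORGANIZATION",
    "England": "LOCATION",
}


def find_entities(text):
    """Return (name, label) pairs for known names, in order of first appearance."""
    found = []
    for name, label in GAZETTEER.items():
        start = text.find(name)
        if start != -1:
            found.append((start, name, label))
    return [(name, label) for _, name, label in sorted(found)]


text = "Alan Turing worked at the University of Manchester in England."
entities = find_entities(text)
for name, label in entities:
    print(name, "->", label)
```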
Step 9:- Chunking
Chunking is the process of extracting phrases from unstructured text by grouping individual words into larger pieces, such as noun phrases.
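As a sketch of chunking, the code below groups tokens that are already POS-tagged into noun-phrase-like chunks. The rule is a deliberate simplification and an assumption of this example: any unbroken run of determiners, adjectives, and nouns counts as one chunk. The tag names and the `chunk_noun_phrases` function are made up for illustration.

```python
# Tags that may appear inside a noun-phrase chunk in this simplified rule.
CHUNK_TAGS = {"DET", "ADJ", "NOUN"}


def chunk_noun_phrases(tagged_tokens):
    """Join each unbroken run of DET/ADJ/NOUN tokens into one chunk string."""
    chunks, current = [], []
    for word, tag in tagged_tokens:
        if tag in CHUNK_TAGS:
            current.append(word)
        elif current:
            # A token outside the chunk closes the chunk built so far.
            chunks.append(" ".join(current))
            current = []
    if current:
        chunks.append(" ".join(current))
    return chunks


tagged = [
    ("The", "DET"), ("quick", "ADJ"), ("fox", "NOUN"), ("chased", "VERB"),
    ("a", "DET"), ("small", "ADJ"), ("rabbit", "NOUN"), (".", "PUNCT"),
]
chunks = chunk_noun_phrases(tagged)
print(chunks)
# ['The quick fox', 'a small rabbit']
```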
Conclusion
So that’s how NLP works. I hope you find this article insightful!
Written by: Chandra Shekhar Tiwari
Reviewed by: Shivani Yadav