Activation Functions in Neural Networks

Activation functions are among the most important components of a neural network for accurate classification. The network's output, its classification accuracy, and the efficiency of the computation all depend on the choice of activation function.

Introduction to Activation Functions

Activation functions are mathematical equations that determine a neural network's output. Each neuron in the network applies an activation function, which decides whether that neuron should fire, depending on whether its input is relevant to the final prediction. The activation function also helps normalize a neuron's output, typically to the range 0 to 1 or -1 to 1.

Because the activation function is evaluated across thousands or millions of neurons for every data sample, it must be computationally efficient. Backpropagation in modern neural networks adds a further demand: the function's derivative must also be cheap to compute.
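As a concrete illustration, here is a minimal NumPy sketch of a single neuron: it computes a weighted sum of its inputs and passes the result through an activation function. The neuron helper and its arguments are purely illustrative, not part of any particular library.

```python
import numpy as np

def neuron(inputs, weights, bias, activation):
    """A single neuron: weighted sum of inputs passed through an activation."""
    z = np.dot(weights, inputs) + bias  # pre-activation value
    return activation(z)                # the activation decides what the neuron outputs

# Squash the pre-activation value into (0, 1) with a sigmoid
sigmoid = lambda z: 1.0 / (1.0 + np.exp(-z))
out = neuron(np.array([0.5, -1.2]), np.array([0.8, 0.3]), 0.1, sigmoid)
print(out)  # a normalized value between 0 and 1
```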

Linear activation function

Range: -infinity to +infinity

Advantages:
  • It is typically used in only one place, the output layer. It takes the inputs, multiplies them by the weights, and produces an output signal proportional to the input.
  • It is better than a step function because it allows a continuous range of outputs, not just a yes or a no.
Disadvantages:
  • Differentiating a linear function does not introduce nonlinearity: its derivative is a constant that no longer depends on the input x, so it adds no useful behavior to the algorithm. It is therefore not possible to use backpropagation to work out which input weights would give a better prediction, as the sketch below illustrates.
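A minimal sketch of why this happens: the linear function's derivative is the same constant for every input, so the gradient carries no information about x. (NumPy is assumed; the function names are illustrative.)

```python
import numpy as np

def linear(z):
    return z                   # f(z) = z: output proportional to the input

def linear_derivative(z):
    return np.ones_like(z)     # f'(z) = 1 everywhere: constant, independent of z

z = np.array([-2.0, 0.0, 3.0])
print(linear(z))               # [-2.  0.  3.]
print(linear_derivative(z))    # [1. 1. 1.] -- the gradient tells us nothing about z
```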

Sigmoid activation function

Range: 0 to 1

Advantages:
  • It is well suited to the output layer of a neural network for binary classification problems: the sigmoid's value lies between 0 and 1, so the predicted class can be obtained by applying a threshold within that range.
  • It has a smooth gradient, preventing sudden jumps in output values.
Disadvantages:
  • For very high or very low values of x, the prediction barely changes, causing a vanishing gradient problem. Because of this, the network may stop learning or become too slow to reach an accurate output, as the sketch below shows.
  • It is computationally expensive, since it requires evaluating an exponential.
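The following NumPy sketch shows both properties at once: outputs squashed into (0, 1) and gradients that shrink toward zero for large |x|. The function names are illustrative.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def sigmoid_derivative(z):
    s = sigmoid(z)
    return s * (1.0 - s)          # peaks at 0.25 when z = 0

z = np.array([-10.0, 0.0, 10.0])
print(sigmoid(z))                 # [~0.00005, 0.5, ~0.99995] -- all within (0, 1)
print(sigmoid_derivative(z))      # gradients near 0 for large |z|: vanishing gradient
```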

Tanh function

Range: -1 to +1

Advantages:
  • It is usually used in the hidden layers of a neural network. Its outputs lie between -1 and 1, which keeps the mean of each hidden layer at or very close to 0; this centers the data and makes learning in the next layer much easier.
  • It is efficient for modeling inputs that have strongly negative, neutral, and strongly positive values.
Disadvantages:
  • For very high or very low values of x, the prediction barely changes, causing a vanishing gradient problem. Because of this, the network may stop learning or become too slow to reach an accurate output (see the sketch below).
  • It is computationally expensive, like the sigmoid, since it requires evaluating exponentials.
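A short NumPy sketch of tanh's zero-centered outputs and its vanishing gradient at the extremes (the derivative helper is illustrative):

```python
import numpy as np

def tanh_derivative(z):
    return 1.0 - np.tanh(z) ** 2   # 1 at z = 0, near 0 for large |z|

z = np.array([-3.0, 0.0, 3.0])
print(np.tanh(z))                  # outputs in (-1, 1), centered on 0
print(tanh_derivative(z))          # gradient vanishes at the extremes
```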

ReLU (Rectified Linear Unit):

Range: 0 to +infinity

Advantages:
  • Its computational cost is lower than that of sigmoid and tanh, since it uses only simple operations. It also activates only some of the neurons at a time, which makes the network sparse and computationally efficient.
  • It is faster to compute than the sigmoid and tanh activation functions.
  • ReLU has a simple derivative (except at 0), which allows backpropagation.
Disadvantages:
  • For zero or negative input values the gradient of the function is zero, so the network cannot backpropagate through those neurons and they stop learning (the "dying ReLU" problem); see the sketch below.
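A minimal NumPy sketch of ReLU and its gradient, showing both the sparsity (negative inputs are zeroed) and the dead zone where learning stops:

```python
import numpy as np

def relu(z):
    return np.maximum(0.0, z)        # cheap: one comparison per element

def relu_derivative(z):
    return (z > 0).astype(float)     # 1 for positive inputs, 0 otherwise

z = np.array([-2.0, 0.0, 3.0])
print(relu(z))                       # [0. 0. 3.] -- negative inputs zeroed out (sparsity)
print(relu_derivative(z))            # [0. 0. 1.] -- zero gradient for z <= 0 blocks learning
```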

Leaky ReLU:

Range: -infinity to +infinity

Advantage:
  • Leaky ReLU enables backpropagation even for negative input values, since it provides a small positive slope in the negative region (see the sketch below).
Disadvantage:
  • It may give inconsistent predictions for negative input values.
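A NumPy sketch of Leaky ReLU; the slope alpha = 0.01 is a common default, but it is a tunable assumption here, not a fixed constant:

```python
import numpy as np

def leaky_relu(z, alpha=0.01):           # alpha: small slope for negative inputs
    return np.where(z > 0, z, alpha * z)

def leaky_relu_derivative(z, alpha=0.01):
    return np.where(z > 0, 1.0, alpha)   # gradient is never exactly zero

z = np.array([-2.0, 0.0, 3.0])
print(leaky_relu(z))                     # [-0.02  0.    3.  ]
print(leaky_relu_derivative(z))          # [0.01 0.01 1.  ] -- negative side still learns
```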

Softmax:

Range: 0 to 1

Advantages:
  • It can handle multiple classes, making it suitable for multiclass classification.
  • It gives output values in the range 0 to 1 that sum to 1, so they can be read as the probability of the input belonging to each class (see the sketch below).
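A minimal NumPy sketch of softmax over three class scores; subtracting the maximum before exponentiating is a standard trick for numerical stability:

```python
import numpy as np

def softmax(z):
    e = np.exp(z - np.max(z))   # subtract the max for numerical stability
    return e / e.sum()

scores = np.array([2.0, 1.0, 0.1])   # raw scores for three classes
probs = softmax(scores)
print(probs)        # approx [0.659 0.242 0.099] -- each value in (0, 1)
print(probs.sum())  # 1.0 -- outputs can be read as class probabilities
```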

Conclusion:

In this article, we studied what an activation function is, the different activation functions used in neural networks, their output ranges, and their advantages and disadvantages.

Written By: Priyanka Shahane

Reviewed By: Rushikesh Lavate

