Predicting the Chaotic Trajectory of a Double Pendulum Using LSTM

Have you ever wondered how a real-world machine, such as a robot, learns to balance itself or to climb unknown terrain with heavy arms driven by servo motors? Deep neural networks make this kind of training possible. A robot part such as an arm can be described as a function of its geometrical arrangement of mass, its orientation, its initial velocity, and the driving force from the motor.

Example: We assume the arm of a robot has two centers of mass, one for the upper arm and one for the forearm, and simplify it to a double pendulum. Videos on the internet show how quickly a double pendulum can be balanced in an upright position.

Watch this: https://www.youtube.com/watch?v=FFW52FuUODQ

Here is the Google Colab link for the implementation (this article does not include all of the code): https://colab.research.google.com/drive/1AFmRmU2PIIRNp2-g8Y4Otzbd0jU-xtdW?usp=sharing or https://github.com/himanshu230998

In this article, we will predict the trajectory of a double pendulum using a neural network. The article is divided into the following sections:

  • Introduction
  • Aim
  • Data generation
  • Model training
  • Baseline model
  • Results
  • Conclusion
  • Reference

Introduction | Trajectory Of The Double Pendulum

You might think it is enough to write down Newton's laws of motion, obtain the governing equations, and feed in the initial conditions to get the position of the pendulum. It is not that simple. A double pendulum is a chaotic system when the angle of the pendulum from the vertical is greater than 10 degrees; for angles less than 10 degrees, it is not chaotic.

So what is a chaotic system? A chaotic system is one whose behavior deviates greatly in response to even very small changes in its initial conditions.

Fig: 1 Double Pendulum

The trajectories of the double pendulum for small and large angle cases are shown below. The chaotic phenomena, on the left, can clearly be seen in the large-angle case whereas the small-angle path, on the right, clearly shows a path that repeats itself over the course of time.

Fig: 2 Left Figure Shows The Chaotic System And Right Figure Shows The Non-Chaotic System
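To make the idea of sensitivity to initial conditions concrete, here is a minimal sketch that integrates the double pendulum twice from starting angles that differ by one millionth of a radian, and reports how far apart the two runs drift. It uses the same equations of motion as the data generation step later in this article, but everything else is an assumption made for illustration: unit masses and lengths, zero initial velocities, a hand-written fixed-step Runge-Kutta integrator instead of odeint, and the hypothetical helper names rk4 and max_separation.

```python
import numpy as np

# Assumed parameters (not taken from the article): unit masses and lengths.
G, L1, L2, M1, M2 = 9.8, 1.0, 1.0, 1.0, 1.0


def derivs(state):
    """Time derivative of the state [theta1, omega1, theta2, omega2]."""
    th1, w1, th2, w2 = state
    d = th2 - th1
    den1 = (M1 + M2) * L1 - M2 * L1 * np.cos(d) ** 2
    den2 = (L2 / L1) * den1
    a1 = (M2 * L1 * w1**2 * np.sin(d) * np.cos(d)
          + M2 * G * np.sin(th2) * np.cos(d)
          + M2 * L2 * w2**2 * np.sin(d)
          - (M1 + M2) * G * np.sin(th1)) / den1
    a2 = (-M2 * L2 * w2**2 * np.sin(d) * np.cos(d)
          + (M1 + M2) * (G * np.sin(th1) * np.cos(d)
                         - L1 * w1**2 * np.sin(d)
                         - G * np.sin(th2))) / den2
    return np.array([w1, a1, w2, a2])


def rk4(state, dt, steps):
    """Fixed-step fourth-order Runge-Kutta; returns the full trajectory."""
    out = np.empty((steps + 1, 4))
    out[0] = state
    for k in range(steps):
        s = out[k]
        k1 = derivs(s)
        k2 = derivs(s + 0.5 * dt * k1)
        k3 = derivs(s + 0.5 * dt * k2)
        k4 = derivs(s + dt * k3)
        out[k + 1] = s + dt / 6.0 * (k1 + 2 * k2 + 2 * k3 + k4)
    return out


def max_separation(angle_deg, eps=1e-6, dt=0.01, t_end=20.0):
    """Largest gap in theta2 between two runs whose theta1 differs by eps."""
    steps = int(round(t_end / dt))
    a = np.radians(angle_deg)
    run_a = rk4(np.array([a, 0.0, a, 0.0]), dt, steps)
    run_b = rk4(np.array([a + eps, 0.0, a, 0.0]), dt, steps)
    return np.max(np.abs(run_a[:, 2] - run_b[:, 2]))


sep_small = max_separation(5.0)     # both arms released at 5 degrees
sep_large = max_separation(120.0)   # both arms released at 120 degrees
print(f"5 deg: {sep_small:.2e} rad, 120 deg: {sep_large:.2e} rad")
```

In the small-angle run the two trajectories should stay about as close as they started, while in the large-angle run the gap grows by many orders of magnitude over the same 20 seconds.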

Aim

We will compare the chaotic case of a double pendulum (angles greater than 10 degrees) with the non-chaotic case. We will also compare the LSTM model with our baseline models: Linear Regression With Feature Map, an autoregressive model, and a Feed Forward Neural Network with Markovian Assumption. The baseline models are implemented and available on my GitHub account, so we will implement the LSTM here.

Data generation | Trajectory Of The Double Pendulum

To generate data for a double pendulum, we first find the governing differential equations and then generate the data using Python.
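For reference, the equations of motion can be written out as follows. This is a transcription of the derivs function used in this article rather than an independent derivation, with the assumed notation θ1 = Z1, θ2 = Z2, m1 = M1, m2 = M2, g = G, and ω for the angular velocities.

```latex
\begin{aligned}
\Delta &= \theta_2 - \theta_1, \qquad \omega_1 = \dot{\theta}_1, \qquad \omega_2 = \dot{\theta}_2 \\[4pt]
\dot{\omega}_1 &= \frac{m_2 L_1 \omega_1^2 \sin\Delta\cos\Delta
  + m_2 g \sin\theta_2 \cos\Delta
  + m_2 L_2 \omega_2^2 \sin\Delta
  - (m_1 + m_2)\, g \sin\theta_1}
  {(m_1 + m_2) L_1 - m_2 L_1 \cos^2\Delta} \\[4pt]
\dot{\omega}_2 &= \frac{-m_2 L_2 \omega_2^2 \sin\Delta\cos\Delta
  + (m_1 + m_2)\left(g \sin\theta_1 \cos\Delta
  - L_1 \omega_1^2 \sin\Delta
  - g \sin\theta_2\right)}
  {\dfrac{L_2}{L_1}\left[(m_1 + m_2) L_1 - m_2 L_1 \cos^2\Delta\right]}
\end{aligned}
```

The state vector [Z1, Z1_dot, Z2, Z2_dot] therefore evolves as [ω1, ω̇1, ω2, ω̇2], which is exactly what the integrator needs.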

The governing differential equations can be implemented in Python as follows.

(Note that some lines of code are omitted because they are not important for conceptual understanding. Refer to the Google Colab or GitHub link for the full code.)

Step: 1

Find the state [Z1, Z1_dot, Z2, Z2_dot] at each time step.

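    import numpy as np
    from numpy import sin, cos
    from scipy import integrate

    # Physical constants (assumed values; see the notebook for the ones actually used)
    G = 9.8             # gravitational acceleration, m/s^2
    L1, L2 = 1.0, 1.0   # pendulum lengths, m
    M1, M2 = 1.0, 1.0   # pendulum masses, kg

    # derivs(state, t) returns the time derivative of the state [Z1, Z1_dot, Z2, Z2_dot]
    def derivs(state, t):
        dydx = np.zeros_like(state)
        dydx[0] = state[1]

        del_ = state[2] - state[0]
        den1 = (M1 + M2)*L1 - M2*L1*cos(del_)*cos(del_)
        dydx[1] = (M2*L1*state[1]*state[1]*sin(del_)*cos(del_) +
                   M2*G*sin(state[2])*cos(del_) +
                   M2*L2*state[3]*state[3]*sin(del_) -
                   (M1 + M2)*G*sin(state[0]))/den1

        dydx[2] = state[3]

        den2 = (L2/L1)*den1
        dydx[3] = (-M2*L2*state[3]*state[3]*sin(del_)*cos(del_) +
                   (M1 + M2)*G*sin(state[0])*cos(del_) -
                   (M1 + M2)*L1*state[1]*state[1]*sin(del_) -
                   (M1 + M2)*G*sin(state[2]))/den2

        return dydx

    dt = 0.01
    t = np.arange(0.0, 20, dt)
    # candidate initial angles in degrees
    theta1 = np.random.randint(-150, 150, size=100)
    theta2 = np.random.randint(-150, 150, size=100)
    # initial angular velocities: hard-coded here, can be changed to random
    initial_velo_1 = 5.0
    initial_velo_2 = 2.0
    i = 0  # index of the sampled initial condition; loop over i for more trajectories
    # note: np.radians converts the velocities too, so they are read as degrees per second
    state = np.radians([theta1[i], initial_velo_1, theta2[i], initial_velo_2])
    y = integrate.odeint(derivs, state, t)
    # each row of y is the state [Z1, Z1_dot, Z2, Z2_dot] at one time step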

Step: 2

From the array [Z1, Z1_dot, Z2, Z2_dot], compute the dataset [x1, x2, y1, y2, v_x1, v_y1, v_x2, v_y2].

    # positions of the two bobs, where theta_1 = z1 and theta_2 = z2
    x1 = L1*sin(y[:, 0])        # x1 = L1*sin(theta_1)
    y1 = -L1*cos(y[:, 0])       # y1 = -L1*cos(theta_1)
    x2 = L2*sin(y[:, 2]) + x1   # x2 = L2*sin(theta_2) + x1
    y2 = -L2*cos(y[:, 2]) + y1  # y2 = -L2*cos(theta_2) + y1

    # velocities by finite differences; the first sample is padded with 0
    v_x1 = np.diff(x1)
    v_x1 = np.insert(v_x1, obj=0, values=0)
    v_x1 = v_x1/dt              # v_x1 = difference in x1 / dt
    v_x2 = np.diff(x2)
    v_x2 = np.insert(v_x2, obj=0, values=0)
    v_x2 = v_x2/dt              # v_x2 = difference in x2 / dt
    v_y1 = np.diff(y1)
    v_y1 = np.insert(v_y1, obj=0, values=0)
    v_y1 = v_y1/dt              # v_y1 = difference in y1 / dt
    v_y2 = np.diff(y2)
    v_y2 = np.insert(v_y2, obj=0, values=0)
    v_y2 = v_y2/dt              # v_y2 = difference in y2 / dt
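The finite differences above are approximations, and the first velocity sample is set to 0. Because the integrator already returns the angular velocities Z1_dot and Z2_dot, the Cartesian velocities can also be obtained from the chain rule. The sketch below shows one way to do that. The function name cartesian_velocities, the unit lengths, and the synthetic test signal are assumptions made for illustration and are not part of the original notebook.

```python
import numpy as np


def cartesian_velocities(y, L1=1.0, L2=1.0):
    """Velocities of both bobs from states [theta1, omega1, theta2, omega2].

    Applies the chain rule to x1 = L1*sin(theta1), y1 = -L1*cos(theta1),
    x2 = x1 + L2*sin(theta2), y2 = y1 - L2*cos(theta2).
    L1 and L2 default to assumed unit lengths.
    """
    th1, w1, th2, w2 = y[:, 0], y[:, 1], y[:, 2], y[:, 3]
    v_x1 = L1 * np.cos(th1) * w1
    v_y1 = L1 * np.sin(th1) * w1
    v_x2 = v_x1 + L2 * np.cos(th2) * w2
    v_y2 = v_y1 + L2 * np.sin(th2) * w2
    return v_x1, v_y1, v_x2, v_y2


# Synthetic check: theta1 = sin(t) and theta2 = 0.5*cos(t), so the exact
# angular velocities cos(t) and -0.5*sin(t) are known in closed form.
dt = 0.01
t = np.arange(0.0, 2.0, dt)
y = np.column_stack([np.sin(t), np.cos(t), 0.5 * np.cos(t), -0.5 * np.sin(t)])
v_x1, v_y1, v_x2, v_y2 = cartesian_velocities(y)

# Finite-difference estimate in the style used above, for comparison.
x2 = np.sin(y[:, 0]) + np.sin(y[:, 2])
v_x2_fd = np.insert(np.diff(x2), 0, 0.0) / dt
max_gap = np.max(np.abs(v_x2[1:] - v_x2_fd[1:]))
print("largest gap between analytic and finite-difference v_x2:", max_gap)
```

The two estimates should agree to within an error that shrinks with dt, and the analytic version also gives a meaningful value for the first sample.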

Step: 3

Store the dataset in a .csv or .txt file for later use.

    import csv

    # write the time stamps and the x-position of the second bob
    with open('/content/drive/MyDrive/CS504 project/CSV.csv', 'w', newline='') as csvfile:
        filewriter = csv.writer(csvfile, quoting=csv.QUOTE_ALL)
        for i in range(len(t)):
            filewriter.writerow([t[i], x2[i]])

Step: 4

Perform feature scaling to normalise the data.

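    from sklearn.preprocessing import MinMaxScaler

    # dataset is assumed to be a 2-D float array with one column,
    # e.g. the x2 values loaded back from the CSV file (loading code is in the notebook)
    # normalize the dataset
    scaler = MinMaxScaler(feature_range=(0, 1))
    dataset = scaler.fit_transform(dataset)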
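To show what this scaling step does numerically, here is a small NumPy sketch of min-max scaling and its inverse. It is an illustration of the formula (x - min) / (max - min), not the scikit-learn implementation, and the helper names min_max_scale and min_max_invert are hypothetical.

```python
import numpy as np


def min_max_scale(x, lo=None, hi=None):
    """Scale each column to [0, 1]; lo/hi default to the column min/max of x."""
    x = np.asarray(x, dtype=float)
    lo = x.min(axis=0) if lo is None else lo
    hi = x.max(axis=0) if hi is None else hi
    return (x - lo) / (hi - lo), lo, hi


def min_max_invert(x_scaled, lo, hi):
    """Undo min_max_scale, mapping scaled values back to the original units."""
    return np.asarray(x_scaled, dtype=float) * (hi - lo) + lo


data = np.array([[-2.0], [0.0], [1.0], [2.0]])
scaled, lo, hi = min_max_scale(data)
restored = min_max_invert(scaled, lo, hi)
print(scaled.ravel())
```

Passing lo and hi explicitly makes it possible to compute them on the training portion only and reuse them on the test portion, which is a common way to keep test data from influencing the scaling.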

Step: 5

Finally, split the dataset for training and testing.

    # split into train and test sets (first 67% for training, the rest for testing)
    train_size = int(len(dataset) * 0.67)
    test_size = len(dataset) - train_size
    train, test = dataset[0:train_size, :], dataset[train_size:len(dataset), :]
    # reshape into X=t and Y=t+1
    # (create_dataset is a helper defined in the notebook, not shown in this article)
    look_back = 1
    trainX, trainY = create_dataset(train, look_back)
    testX, testY = create_dataset(test, look_back)
    # reshape input to be [samples, time steps, features]
    trainX = np.reshape(trainX, (trainX.shape[0], 1, trainX.shape[1]))
    testX = np.reshape(testX, (testX.shape[0], 1, testX.shape[1]))
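The create_dataset helper is not shown in this article. The sketch below is an assumed version based on a widely used sliding-window pattern: each input row holds look_back consecutive values, and the target is the value that follows them. The extra "- 1" in the loop bound is a guess that happens to match the index offsets used when the predictions are plotted later; the notebook remains the authoritative source.

```python
import numpy as np


def create_dataset(dataset, look_back=1):
    """Turn an (n, 1) series into supervised pairs.

    X[i] holds the values at steps i .. i+look_back-1 and
    Y[i] is the value at step i+look_back.
    """
    data_x, data_y = [], []
    for i in range(len(dataset) - look_back - 1):
        data_x.append(dataset[i:i + look_back, 0])
        data_y.append(dataset[i + look_back, 0])
    return np.array(data_x), np.array(data_y)


series = np.arange(10, dtype=float).reshape(-1, 1)
X, Y = create_dataset(series, look_back=1)
print(X.shape, Y.shape)

# The LSTM layer expects [samples, time steps, features].
X3 = np.reshape(X, (X.shape[0], 1, X.shape[1]))
```

With look_back = 1, each sample is a single previous value, so the network sees only the current position when predicting the next one.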

Model Training:

Step: 6

First, create a sequential model.

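    import tensorflow as tf
    from tensorflow.keras.models import Sequential
    from tensorflow.keras.layers import LSTM, Dense

    # create the LSTM network: 4 LSTM units followed by a single output neuron
    model = Sequential()
    model.add(LSTM(4, input_shape=(1, look_back)))
    model.add(Dense(1))
    tf.keras.utils.plot_model(model, show_shapes=True)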
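As a sanity check on the size of this network, the number of trainable parameters can be worked out by hand. An LSTM layer has four gates, and each gate has input weights, recurrent weights, and a bias. The sketch below does that arithmetic for 4 units and 1 input feature (look_back = 1); the function names are hypothetical and exist only for this example.

```python
def lstm_param_count(units, input_dim):
    """Trainable parameters of a standard LSTM layer with biases."""
    per_gate = input_dim * units + units * units + units
    return 4 * per_gate


def dense_param_count(inputs, outputs):
    """Trainable parameters of a fully connected layer with biases."""
    return inputs * outputs + outputs


lstm_params = lstm_param_count(units=4, input_dim=1)
dense_params = dense_param_count(inputs=4, outputs=1)
total_params = lstm_params + dense_params
print(lstm_params, dense_params, total_params)
```

A model this small trains quickly, which matters because the fit step below uses a batch size of 1.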

Step: 7

Next, compile and fit the model.

    model.compile(loss='mean_squared_error', optimizer='adam')
    model.fit(trainX, trainY, epochs=100, batch_size=1, verbose=2)

Step: 8

Make predictions and evaluate them.

    import math
    import matplotlib.pyplot as plt
    from sklearn.metrics import mean_squared_error

    # make predictions
    trainPredict = model.predict(trainX)
    testPredict = model.predict(testX)
    # invert predictions back to the original units
    trainPredict = scaler.inverse_transform(trainPredict)
    trainY = scaler.inverse_transform([trainY])
    testPredict = scaler.inverse_transform(testPredict)
    testY = scaler.inverse_transform([testY])
    # calculate root mean squared error
    trainScore = math.sqrt(mean_squared_error(trainY[0], trainPredict[:, 0]))
    print('Train Score: %.2f RMSE' % (trainScore))
    testScore = math.sqrt(mean_squared_error(testY[0], testPredict[:, 0]))
    print('Test Score: %.2f RMSE' % (testScore))
    # shift train predictions for plotting
    trainPredictPlot = np.empty_like(dataset)
    trainPredictPlot[:, :] = np.nan
    trainPredictPlot[look_back:len(trainPredict) + look_back, :] = trainPredict
    # shift test predictions for plotting
    testPredictPlot = np.empty_like(dataset)
    testPredictPlot[:, :] = np.nan
    testPredictPlot[len(trainPredict) + (look_back * 2) + 1:len(dataset) - 1, :] = testPredict
    # plot baseline and predictions
    plt.plot(scaler.inverse_transform(dataset))
    plt.plot(trainPredictPlot)
    plt.plot(testPredictPlot)
    plt.show()
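The score printed above is the root mean squared error (RMSE). Because it is computed after inverse_transform, it is expressed in the same units as the original data. A minimal NumPy version of the metric, with made-up numbers purely for illustration, looks like this:

```python
import numpy as np


def rmse(y_true, y_pred):
    """Root mean squared error between two equally long sequences."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return float(np.sqrt(np.mean((y_true - y_pred) ** 2)))


# Two exact predictions and one that is off by 2: the squared errors are 0, 0, 4.
value = rmse([0.0, 1.0, 2.0], [0.0, 1.0, 4.0])
print(value)
```

Since the errors are squared before averaging, a few large misses raise the RMSE more than many small ones.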

Baseline model:

Linear Regression With Feature Map: This model uses a hypothesis of the form θᵀφ(x), where φ(x) is the feature map applied to the input x and θ is the parameter vector, and it seeks the θ that minimizes the cost function.

Fig: 3 R2 Scores For Linear Regression
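As a rough illustration of this baseline, the sketch below fits a linear model on top of a feature map by least squares and scores it with R2. The polynomial feature map, the toy data, and the function names are all assumptions made for this example; the features used in the actual baseline are in the GitHub repository.

```python
import numpy as np


def feature_map(x, degree=3):
    """Polynomial feature map phi(x) = [1, x, x^2, ..., x^degree] (an assumed choice)."""
    x = np.asarray(x, dtype=float).reshape(-1)
    return np.vander(x, degree + 1, increasing=True)


def fit_linear(phi, y):
    """Least-squares parameters theta minimising ||phi @ theta - y||^2."""
    theta, *_ = np.linalg.lstsq(phi, y, rcond=None)
    return theta


def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    ss_res = np.sum((y_true - y_pred) ** 2)
    ss_tot = np.sum((y_true - np.mean(y_true)) ** 2)
    return 1.0 - ss_res / ss_tot


# Toy data: a cubic in x, which a degree-3 feature map can represent exactly.
x = np.linspace(-1.0, 1.0, 50)
y = 2.0 - x + 0.5 * x**3
theta = fit_linear(feature_map(x), y)
score = r2_score(y, feature_map(x) @ theta)
print(theta, score)
```

The model stays linear in θ even though the features are non-linear in x, which is why ordinary least squares is enough to fit it.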

Autoregressive model: Autoregression is a regression predictor geared toward time series data. It is commonly used to predict time-dependent parameters.

Fig: 4 Trajectory Predicted From Auto-Regressive Model
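To make the autoregressive idea concrete, here is a minimal sketch that fits an AR(p) model by least squares, predicting each value as a weighted sum of the p values before it. The function names and the sinusoidal test signal are assumptions for illustration and are not the baseline code from the repository.

```python
import numpy as np


def fit_ar(series, p):
    """Fit x[t] = c + a1*x[t-1] + ... + ap*x[t-p] by least squares."""
    series = np.asarray(series, dtype=float)
    # Each row holds the p previous values, most recent first.
    rows = [series[t - p:t][::-1] for t in range(p, len(series))]
    design = np.column_stack([np.ones(len(rows)), np.array(rows)])
    coef, *_ = np.linalg.lstsq(design, series[p:], rcond=None)
    return coef  # [c, a1, ..., ap]


def predict_next(history, coef):
    """One-step-ahead prediction from the last p observed values."""
    p = len(coef) - 1
    lags = np.asarray(history[-p:], dtype=float)[::-1]
    return coef[0] + lags @ coef[1:]


# A sampled sinusoid obeys x[t] = 2*cos(w)*x[t-1] - x[t-2] exactly,
# so an AR(2) model can reproduce it.
w = 0.1
x = np.sin(w * np.arange(200))
coef = fit_ar(x, p=2)
pred = predict_next(x[:150], coef)
print(coef, pred, x[150])
```

A purely periodic signal like this one is easy for an autoregressive model. A chaotic trajectory does not follow a fixed linear recurrence, which is one reason such a baseline can struggle in the large-angle case.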

Feed forward neural network: A neural network in which the nodes do not form a cycle and whose loss is minimised through backpropagation.

Fig: 5 Feedforward Neural Network Containing 8 Nodes For Dataset
Fig: 6 NN Prediction For First 20 Seconds Of Chaotic Case

Results | Trajectory Of The Double Pendulum

We observe that the LSTM's predictions closely match our simulated data on the test set, whereas the baseline models fail to do so. As Fig. 6 shows, the feed forward neural network's prediction is very far from the simulated data.

Fig: 7 LSTM's Predictions Vs. Simulated Data For Highly Chaotic Cases
Fig: 8 LSTM's Predictions (Blue Line) Vs. Simulated Data (Orange Line) For Periodic Case

Conclusion

We conclude that the LSTM performs better than the baseline models at predicting the chaotic trajectory of a double pendulum.

Reference:
  1. http://cs229.stanford.edu/proj2019spr/report/38.pdf

Written By: Himanshu Kumar Singh
Reviewed By: Rushikesh Lavate
