DEGREE PROJECT IN TECHNOLOGY, FIRST CYCLE, 15 CREDITS
STOCKHOLM, SWEDEN 2021
Data-Driven Learning for
Approximating Dynamical Systems Using Deep Neural Networks
MAX BERG WAHLSTRÖM
AXEL DERNSJÖ
Abstract
In this thesis, a one-step approximation method has been used to produce approximations of two dynamical systems. The two systems considered are a pendulum and a damped dual-mass-spring system.
Using the one-step approximation method proposed by [15], it is first shown that the state variables of a general dynamical system one time-step ahead can be expressed using a concept called the effective increment.
The state of the system one time-step ahead then only depends on the previous state and the effective increment, and this effective increment in turn only depends on the previous state and the governing equation of the dynamical system.
By introducing the concept of neural networks and surrounding concepts, it is shown that a neural network can be trained to approximate this effective increment, thereby removing the need for a known governing equation when determining the system state. The solution to a general dynamical system can then be approximated using only the trained neural network operator and a state variable to produce the state variable one discrete time-step ahead.
When training the neural network operator to approximate the effective increment, the analytical solutions to two dynamical systems are used to produce large amounts of training data on which the network can be trained. Using the optimization algorithm Adam [8] and the collected training data, the network parameters are adjusted to make the difference between the network output and a target value small, where the target value in this case is the correct state variable one time-step ahead.
The results show that training a neural network to produce approximations of a dynamical system is possible. However, to produce more accurate approximations of systems more complex than those considered in this thesis, greater care has to be taken when choosing the parameters of the network as well as when tuning the hyper-parameters of the optimizer Adam. Furthermore, the structure of the network could be adjusted by changing the number of hidden layers and the number of nodes in them.
Sammanfattning
In this report, a one-step approximation method has been used to approximate two dynamical systems. The two systems approximated are a pendulum and a damped mass-spring system with two masses.
Using the one-step approximation method proposed in [15], it is first shown that the state variables of a general dynamical system one time-step ahead can be expressed using the concept of the effective increment. The state of the system one step ahead in time depends only on the previous state and the effective increment. The effective increment in turn depends only on the previous state and the governing equation of the system.
By introducing the concept of neural networks and the surrounding concepts, it is shown that a neural network can be trained to approximate the effective increment. Hence, there is no need for a known governing equation. The solution to a general dynamical system can then be approximated by using the trained neural network and the state variables to produce the state at the next discrete point in time.
To train the neural network to approximate the effective increment, the analytical solutions to the two dynamical systems are used to generate large amounts of data on which the network can train. Using the optimization algorithm Adam [8] and the collected training data, the network parameters can be adjusted to make the error between the network output and the desired value as small as possible, where the desired value is the state variable one step ahead in time.
The results show that it is possible to train a neural network to approximate the dynamical systems. To achieve high accuracy in the approximations for systems more complex than those presented in the report, greater care is needed when choosing the parameters of the network and the hyper-parameters of the optimization algorithm Adam.
Acknowledgments
The authors would like to thank their supervisors, David Krantz and Xin Huang, for the great support and engaging discussions throughout this project.
Contents

1 Introduction
  1.1 Background
  1.2 Problem Statement
  1.3 Overview
2 Dynamical Systems
  2.1 Pendulum
  2.2 Damped Dual-Mass-Spring System
3 Deep Neural Network
  3.1 Architecture
  3.2 Activation Function
  3.3 Output and Loss Function
  3.4 Residual Neural Network
  3.5 Training the Network
    3.5.1 Gradient-Based Learning
    3.5.2 Back-Propagation
    3.5.3 Optimization Algorithm
4 Approximating Dynamical Systems With Deep Neural Networks
  4.1 One-Step Solution Using Effective Increment
  4.2 One-Step Approximation Using ResNet
  4.3 Data Sets
5 Implementation
  5.1 Data Collection
    5.1.1 Pendulum
    5.1.2 Damped Dual-Mass-Spring System
  5.2 Network Setup
  5.3 Training Process
  5.4 Validation of the Trained Model
6 Result
  6.1 Pendulum
    6.1.1 No External Force and Constant Time-Step
    6.1.2 No External Force and Varying Time-Step
    6.1.3 External Force Input
    6.1.4 Different Stopping Criteria
  6.2 Damped Dual-Mass-Spring System
    6.2.1 No External Force and Constant Time-Step
    6.2.2 Force Input
    6.2.3 Overshooting
7 Discussion
8 Conclusion
1 Introduction
Models of dynamical systems are used extensively in industry, from modeling the suspension of a car to modeling the complex weather systems found in meteorology.
For some problems, such as modeling the suspension of a car or describing the motion of a planetary body, an analytical solution describing the state of the system can be found. For many dynamical systems of interest, however, an analytical solution does not exist or is hard to find, and the solution has to be simulated or approximated instead. These simulations are often costly in terms of both time and processing power. As an alternative, an approximation method based on data-driven learning can be implemented.
This thesis will investigate one particular approximation method based on data-driven learning with a deep neural network.
1.1 Background
Data-driven learning is the process of learning a specific task by using large amounts of data. One example that utilizes data-driven learning is Google’s translation system, the Google Neural Machine Translation system (GNMT) [19]. The process works by using large amounts of training examples (data) on which the translation method can be trained, hence the term data-driven. By utilizing this data-driven approach, the translation process can overcome many of the weaknesses of conventional translation.
The data-driven learning method that will be utilized is based on an artificial neural network that uses large amounts of training data to learn a specific task. The comparison between an artificial neural network, as illustrated in figure 3, and a biological neural network, the human brain, has been made several times [5]. Each node in the network represents a neuron in the brain, and each connection between nodes corresponds to a synapse. These connections have the ability to transmit a signal to the next node (neuron).
The learning process of an artificial neural network can also be seen to mimic a human brain, which learns by examples.
The ability of an artificial neural network to learn a specific task demands training. For this process to be successful the network needs large amounts of data to train on and to have the correct setup of nodes and connections between them.
The training of a neural network is a costly affair that can take days to complete and can require large amounts of processing power and memory. The benefit of using a neural network is that once this training is complete, it can produce some output near instantly. This means that the network only has to be trained once, and even though the data needed for training might need to be collected through a costly simulation, this simulation will only need to be done once.
Popular numerical methods, such as the fourth-order Runge-Kutta method and adaptive finite element methods, can produce accurate approximations in many cases. These methods also have restrictions, however; one area where ordinary numerical methods have difficulty predicting the system state is when a system has a discontinuous motion [3]. The proposed benefit of using a deep neural network for approximation in this regard is that a neural network has the ability to learn trends.
This is useful in cases where one does not seek an extremely accurate solution but is instead interested in the underlying dynamics, and wants to see how the system would react if some parameters were changed or a particular external force acted upon the system. If the network obtains the ability to capture a system’s underlying dynamics, the question of how the system would react to some change could be answered easily with such a model, instead of making costly simulations with other methods.
The data-driven approximation method that this thesis utilizes is based on previous work by [15] and [16]. The method is based on a concept called the effective increment, which describes the exact change of a dynamical system from one state in time to another. The method also uses a special type of deep neural network, called a residual neural network, which is well-suited to approximate a general dynamical system by approximating the effective increment. Conceptually, there are similarities between a residual neural network and a first-order approximation method such as Euler’s method, in that both take a step by adding to the previous state a change that depends on the previous state [2].
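The analogy above can be sketched in code: a forward-Euler step and a residual block both add a state-dependent increment to the previous state. This is an illustrative sketch, not the thesis implementation; `toy_network` merely stands in for a trained sub-network.

```python
import numpy as np

def euler_step(x, f, h):
    """Forward-Euler step: next state = previous state + h * f(previous state)."""
    return x + h * f(x)

def resnet_step(x, network):
    """Residual-block step: output = input + learned correction N(input)."""
    return x + network(x)

# Both share the form "previous value plus a state-dependent change".
f = lambda x: -x                     # simple linear dynamics dx/dt = -x
toy_network = lambda x: -0.1 * x     # stand-in for a trained sub-network

x0 = np.array([1.0])
print(euler_step(x0, f, 0.1))        # -> [0.9]
print(resnet_step(x0, toy_network))  # -> [0.9]
```

The structural similarity is the point here: a residual network that learns the increment plays the role that h·f(x) plays in Euler’s method.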
1.2 Problem Statement
The purpose of this thesis is to use a data-driven method to predict the behavior of a dynamical system. In particular, this thesis will look into a pendulum and a damped dual-mass-spring system. These dynamical systems can be described by differential equations. Let us consider a general non-autonomous dynamical system
$$\frac{dx}{dt} = f(x(t), u(t)), \qquad x(t_0) = x_0 \qquad (1)$$
where $f : \mathbb{R}^n \to \mathbb{R}^n$ is the function describing the governing equation of the system, which depends on the time-dependent state variable $x \in \mathbb{R}^n$ and some time-dependent input signal $u \in \mathbb{R}^n$. The goal is to utilize a data-driven numerical model that takes an initial condition $x_0$ at time $t_0$ and produces a prediction $\hat{x}$ of the actual state $x$ such that
$$\hat{x}(t; x_0) \approx x(t; x_0) \qquad (2)$$
where t is the time at which the prediction should be made. This model will then be used to make multiple consecutive one-step predictions using the previous prediction as an input for the next prediction.
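The chained one-step predictions described above can be sketched as a simple rollout loop. A minimal sketch, where `model` is a hypothetical placeholder (here the exact one-step map of dx/dt = −x) standing in for the trained network:

```python
import numpy as np

def rollout(model, x0, n_steps):
    """Chain one-step predictions: each output becomes the next input."""
    trajectory = [np.asarray(x0, dtype=float)]
    for _ in range(n_steps):
        trajectory.append(model(trajectory[-1]))
    return np.stack(trajectory)

# Placeholder "model": the exact one-step map of dx/dt = -x with step h,
# standing in for a trained network that advances the state one time-step.
h = 0.01
exact_step = lambda x: x * np.exp(-h)

traj = rollout(exact_step, [1.0], 100)
print(traj[-1])   # ≈ exp(-1) ≈ 0.368
```

Because each prediction is fed back as the next input, any one-step error can accumulate over the rollout, which is why the accuracy of the one-step model matters.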
1.3 Overview
In section 2 we introduce the two dynamical systems on which we seek to evaluate the proposed data-driven method, and the analytical solutions to these systems are derived. In section 3 the fundamentals of deep neural networks are introduced. In section 4 we show that the exact solution to a dynamical system can be written in a form expressed with the effective increment, and then motivate the use of a deep neural network with the residual neural network architecture to approximate the effective increment. Section 5 explains in more detail the process of implementing the theory and producing results, as well as the implementation of a validation model on which the network will be tested. Section 6 then presents the results from the validation model, and section 7 discusses these results.
2 Dynamical Systems
A dynamical system is a system in which the state of the system can be described as a function of time. The state of the system can consist of one or several physical quantities, such as position, velocity, temperature et cetera. In this thesis, two dynamical systems will be approximated using a data-driven approximation method. This section therefore derives an analytical solution to each of two dynamical systems, namely a pendulum and a damped dual-mass-spring system. These two systems are chosen in order to test the approximation method on systems of different complexity. The pendulum has relatively low complexity, while the damped dual-mass-spring system has higher complexity, largely due to the damping in the system.
2.1 Pendulum
Figure 1: Pendulum
The differential equation describing the motion of the pendulum in figure 1 can be formulated using Newton's second law,
$$mg\sin(\theta) = -ml\frac{d^2\theta}{dt^2} + F(t) \qquad (3)$$
where $m$ is the mass attached to the massless rod of length $l$, and $m$ is considered to be a point mass. The gravitational acceleration is denoted $g$, $\theta(t)$ is the angle as defined in figure 1, and $F(t) \in \mathbb{R}$ is an external time-variant force that acts tangentially on the mass. The differential equation (3) can be rewritten as
$$\frac{d^2\theta}{dt^2} + \frac{g}{l}\sin(\theta) = \frac{F(t)}{ml}. \qquad (4)$$
To simplify the differential equation further, the small-angle approximation $\sin(\theta) \approx \theta$ is used, which means that equation (4) can be written
$$\frac{d^2\theta}{dt^2} + \frac{g}{l}\theta = \frac{F(t)}{ml}. \qquad (5)$$
To find a homogeneous solution to this differential equation, a characteristic equation can be formed,
$$r^2 + \frac{g}{l} = 0. \qquad (6)$$
The characteristic equation (6) has the roots $r = \pm\sqrt{g/l}\,i$. Because these roots are complex, we can formulate a homogeneous solution on the following form
$$\theta_h(t) = A\cos\!\left(\sqrt{\frac{g}{l}}\,t\right) + B\sin\!\left(\sqrt{\frac{g}{l}}\,t\right), \qquad (7)$$
where $A, B \in \mathbb{R}$ are constants. Setting the initial condition to $\theta_0$ and rewriting the angular frequency as $\omega = \sqrt{g/l}$ to further simplify, the constants can be found and the final homogeneous solution to the differential equation (5) can be formed,
$$\theta_h(t) = \theta_0\cos(\omega t). \qquad (8)$$
The particular solution to the differential equation of the pendulum depends on $F(t)$. In the example below, the equation is solved for the force $F(t) = F_0\sin(\alpha t)$, where $F_0$ is the amplitude of the force and $\alpha$ is the angular frequency. To find the particular solution, we use the ansatz $\theta_p(t) = K_1\sin(\alpha t) + K_2\cos(\alpha t)$, where $K_1, K_2 \in \mathbb{R}$ are constants. Solving for the constants gives the particular solution,
$$\theta_p(t) = \frac{F_0}{ml(\omega^2 - \alpha^2)}\sin(\alpha t). \qquad (9)$$
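As a numerical sanity check, one can verify that the particular solution (9) satisfies equation (5) by approximating its second derivative with a central finite difference. The parameter values below are illustrative only, not taken from the thesis.

```python
import numpy as np

# Illustrative parameter values (not taken from the thesis)
m, l, F0, alpha = 1.0, 1.0, 2.0, 1.5
g = 9.82
omega = np.sqrt(g / l)

# Particular solution, equation (9)
theta_p = lambda t: F0 / (m * l * (omega**2 - alpha**2)) * np.sin(alpha * t)

# Approximate the second derivative with a central finite difference
t = np.linspace(0.5, 5.0, 200)
h = 1e-4
d2 = (theta_p(t + h) - 2 * theta_p(t) + theta_p(t - h)) / h**2

# Residual of equation (5): should be close to zero everywhere
residual = d2 + (g / l) * theta_p(t) - F0 * np.sin(alpha * t) / (m * l)
print(np.max(np.abs(residual)))
```

The residual is dominated by finite-difference and round-off error, so it is small but not exactly zero.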
The solution to the differential equation is then
$$\theta(t) = \theta_0\cos(\omega t) + \frac{F_0}{ml(\omega^2 - \alpha^2)}\sin(\alpha t). \qquad (10)$$
Differentiating equation (10) with respect to time, the angular velocity is obtained as
$$\dot{\theta}(t) = -\omega\theta_0\sin(\omega t) + \frac{\alpha F_0}{ml(\omega^2 - \alpha^2)}\cos(\alpha t), \qquad (11)$$
where $\dot{\theta}$ denotes the time derivative of $\theta$.
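Equations (10) and (11) give the state in closed form, and sampling them over time is one way the analytical solution could be used to generate training data. A minimal sketch, assuming illustrative parameter values and a hypothetical function name:

```python
import numpy as np

def pendulum_state(t, theta0, omega, F0, alpha, m, l):
    """Angle and angular velocity from the analytical solutions (10) and (11)."""
    A = F0 / (m * l * (omega**2 - alpha**2))  # forced-response amplitude
    theta = theta0 * np.cos(omega * t) + A * np.sin(alpha * t)
    theta_dot = -omega * theta0 * np.sin(omega * t) + alpha * A * np.cos(alpha * t)
    return theta, theta_dot

# Sampling at many times yields (state, next-state) pairs one time-step apart;
# all parameter values below are illustrative only.
t = np.linspace(0.0, 10.0, 1001)
theta, theta_dot = pendulum_state(t, theta0=0.1, omega=3.13, F0=0.5,
                                  alpha=1.0, m=1.0, l=1.0)
print(theta[0], theta_dot[0])
```

Pairs of consecutive samples, (θ(tₖ), θ̇(tₖ)) and (θ(tₖ₊₁), θ̇(tₖ₊₁)), then serve as input and target for the one-step network.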
2.2 Damped Dual-Mass-Spring System
Figure 2: The damped dual-mass-spring system.
Given the dynamical system presented in figure 2, the differential equations describing the system can be written as
$$\begin{cases} m_1\ddot{x}_1 - c(\dot{x}_2 - \dot{x}_1) - k_2 x_2 + (k_1 + k_2)x_1 = 0, \\ m_2\ddot{x}_2 + c(\dot{x}_2 - \dot{x}_1) + k_2(x_2 - x_1) = F(t), \end{cases} \qquad (12)$$
where $m_1$ and $m_2$ are the two masses, here considered as point masses, $x_1$ and $x_2$ are the positions of the two masses, $c$ is a damping constant, $k_1$ and $k_2$ are the spring constants of the two springs, and $F(t)$ is a time-variant force acting on the mass $m_2$. The system of differential equations (12) can be rewritten in matrix form
$$\underbrace{\begin{pmatrix} \dot{x}_1 \\ \ddot{x}_1 \\ \dot{x}_2 \\ \ddot{x}_2 \end{pmatrix}}_{\dot{x}} = \begin{pmatrix} 0 & 1 & 0 & 0 \\ -\frac{k_1 + k_2}{m_1} & -\frac{c}{m_1} & \frac{k_2}{m_1} & \frac{c}{m_1} \\ 0 & 0 & 0 & 1 \\ \frac{k_2}{m_2} & \frac{c}{m_2} & -\frac{k_2}{m_2} & -\frac{c}{m_2} \end{pmatrix}$$