Royal Institute of Technology
Bachelor Thesis
Weight Matrix Adaptation for increased Memory Storage Capacity in a Spiking Hopfield Network
Authors:
Ciwan Ceylan & Albin Sunesson
Supervisor:
AP Pawel Herman
Submitted to
the School of Engineering Sciences
May 2015
ROYAL INSTITUTE OF TECHNOLOGY
Abstract
School of Computer Science and Communication, Department of Computational Biology
Bachelor Thesis
Weight Matrix Adaptation for increased Memory Storage Capacity in a Spiking Hopfield Network
by Ceylan C. & Sunesson A.
The storage capacity of a small spiking Hopfield network is investigated in terms of two parameters governing the conductance and inhibition of the synaptic connections. This is motivated by the possibility of constructing larger associative networks from small network clusters. These kinds of network architectures have been observed in nature and could possibly be a foundation for future applications.
The investigation is conducted using simulations of integrate-and-fire neuron models and static synapses. Several different types of binary patterns are used to provide a detailed analysis of the storage capacity.
The investigated parameters influence the storage capacity of the network, and the capacity may be improved with the right choice of parameters. Differences in capacity between different pattern types are also observed.
Acknowledgements
First, we wish to thank our supervisor Pawel Herman, without whom this project would not have been possible. We are sincerely grateful for your interest, insightful thoughts and your guidance throughout the project. Your expertise has been invaluable for the outcome of this project. Second, we have very much appreciated Emil Wärnberg's well-written thesis and motivating conversations, which inspired us to start this project. We have also had informative and helpful e-mail conversations with both Georgios Iatropoulos and Susanne Kunkel, who answered our sometimes elementary questions and to whom we are very thankful.
Contents

Abstract
Acknowledgements
1 Introduction
2 Background
    2.1 Biological Neural Networks
    2.2 The original Hopfield network
    2.3 Spiking Neural Network Models
    2.4 Previous research into spiking Hopfield networks
3 Method
    3.1 Models
        3.1.1 Neuron Model
        3.1.2 Synapse Model
    3.2 Simulations
        3.2.1 NEST Simulator
        3.2.2 Encoding and decoding
        3.2.3 Simulation scenarios
4 Results
    4.1 Simulation Results
        4.1.1 Pattern Size
        4.1.2 Pattern Overlap
        4.1.3 Pattern Stimulation
    4.2 Possible explanations of results
        4.2.1 The capacity's dependence on the K-value and overlap
        4.2.2 Capacity convergence for large Q-values
        4.2.3 Relation between Q and the number of active neurons
5 Conclusion
6 Discussion
    6.1 Reliability of Results
    6.2 Further Research
1 Introduction
The Human Brain Project pilot report describes understanding the human brain as one of the greatest challenges facing 21st century science [1]. Detailed simulations of biological neural networks have been made possible, and have been developed since the mid 1990s, thanks to advances in neuroscience and computer science and to vast improvements in access to computational power. These simulations are one important tool used by the scientists working to understand the complexity of the brain.
Furthermore, new approaches to machine learning problems have been inspired by the advances in neuroscience. Spiking neural networks have their roots in biological modeling and have been described by Wolfgang Maass as the third generation of applied neural networks and the next step in increasing the computational power of networks [2]. This statement is reinforced by Paugam-Moisy and Bohte [3], who present a summary of state-of-the-art methods involving spiking neural networks as well as some interesting applications, e.g. speech processing, active vision for computers and autonomous robotics.
The creation of the artificial Hopfield network was also inspired by advancements in brain research, specifically the findings suggesting that parts of the human brain, e.g. the hippocampus, are able to function as an associative memory [4, 5], and the experimental evidence suggesting that the underlying mechanics of hippocampal memory are governed by Hebbian learning [6]. The Hopfield model is based on this learning paradigm, which states that the efficiency of the synaptic connection between two neurons should be dictated by their mutual spiking activity [7]. However, while inspired by biology, the Hopfield network is considered an artificial neural network since it makes some highly artificial assumptions [8]. Worth mentioning are the use of binary neurons, on or off, instead of spiking neurons, as well as the assumption of full connectivity, every neuron connected to every other neuron [9].
The binary neurons of the Hopfield network have successfully been replaced with spiking neurons in several cases [8, 10]. Furthermore, Lansner proposed a biologically realistic architecture tackling the unrealistic all-to-all connectivity [11]. He suggests that small groups of neurons should form densely connected sub-nets functioning like Hopfield networks. These sub-nets would then be connected to other sub-nets to form a complete associative memory network.
The modularity of Lansner's idea suggests the possibility of designing associative networks for application purposes. Perhaps such a network could consist of several clusters of small spiking Hopfield networks, each specialized in the storage of certain patterns. A first step towards such an application would be to investigate the functionality and capacity of these small networks.
Emil Wärnberg implemented and investigated such a small spiking Hopfield network in his bachelor thesis [12]. He successfully stored 3 patterns in a 64-neuron network using a weight matrix set with the Hebbian learning rule provided in Hopfield's original article [9]. However, Wärnberg covered a wide range of subjects and his focus was not on storage capacity. Thus he neither investigated the total storage capacity of his implementation nor whether different types of pattern sets affect the capacity. Furthermore, Wärnberg explicitly provided inhibitory input to the non-stimulated neurons. This surely stabilizes the network, yet it might be possible to provide the inhibition via the network's weight matrix. This thesis will attempt to further improve on Wärnberg's work by investigating storage capacity and by implementing inhibition via the network's weight matrix.
The aim of this thesis is to provide a thorough analysis of the weight matrix's influence on the storage capacity of a small spiking Hopfield network and, if possible, to determine whether the network can be designed to specialize in different pattern types. The adjustments are made using two parameters: a value Q, which affects the conductance of all connections, and a value K, which affects the inhibition levels in the system. Furthermore, the investigation will cover whether these parameters have a different influence for varying degrees of pattern stimulation, different numbers of active neurons per pattern or varying degrees of overlap between patterns. The aspiration is to uncover whether different weight matrices are able to specialize in storing different pattern types.
Each neuron will implement an integrate-and-fire model and use similar model parameters. The effect of tuning these will not be investigated. The patterns to be stored will be binary patterns consisting of ones and zeros. A one will correspond to an active neuron and a zero will correspond to an inactive neuron. The neuron's firing rate will be used to determine whether a neuron should be considered active or inactive. An external current will be provided to excite the part of the neurons corresponding to the ones in a stored pattern. Neurons corresponding to zeros will never be stimulated.
2 Background
2.1 Biological Neural Networks
The underlying structure of the brain consists of networks of brain cells called neurons. Idealized, a neuron is divided into three parts: the dendrites, the soma and the axon. The dendrites can be considered the "input device" of the neuron, the soma is the "central processing unit" and the axon is the "output device". The neuron receives inputs from other neurons through the dendrites in the form of electric pulses, spikes. These inputs are transmitted to the soma, raising its voltage. When the voltage reaches a threshold the soma fires its own electric pulse, called an action potential, through the axon, which outputs the signal to other neurons via synapses. These synapses are chemical devices which connect the axon with the dendrites through a complex chain of bio-chemical processes. This synaptic connection motivates the terms presynaptic and postsynaptic neurons, which are used extensively in both biological and artificial neural network contexts. Depending on the substances released in the synapse, the voltage of the postsynaptic neuron may be decreased or increased. This is called inhibition and excitation, respectively [7].
Several neural networks of different architecture can be found in various parts of the brain, each believed to be associated with different functions. One such structure is the hippocampus, which is believed to play an important part in long-term memory storage [13]. Through in vivo stimulation of neurons in a rat's hippocampus, Kelso et al. found experimental evidence that mutual spiking activity increases the efficiency of the connections between two neurons, in accordance with the Hebbian learning paradigm [6]. It has been further suggested that these synaptic efficiency changes are the foundation of the hippocampus' associative memory functionality [4]. These ideas gained increased recognition when Hopfield presented his Hopfield network, which is capable of associative memory and is based on the Hebbian learning paradigm.
2.2 The original Hopfield network
The Hopfield network is an example of a stable recurrently connected attractor network.
This signifies that the network will settle into a stable state as it dynamically evolves in its state space. These stable states are called attractors. This property provides the Hopfield network with an associative memory functionality. Each pattern to be stored in the network memory will be an attractor into which the network can settle [14]. Thus, provided that the network is stable, initialization of the network with an input pattern will lead to the network settling into the attractor closest in state space. However, as the state space fills up with attractors, the likelihood of recall errors increases. J. J. Hopfield investigates this in his original article and concludes that the Hopfield network is able to store about 0.15N patterns, where N is the number of neurons in the network, before the errors in recall become severe [9].
The original Hopfield network contains only a single layer of neurons. These are all connected symmetrically, mathematically $w_{ij} = w_{ji}$, where $w_{ij}$ is the weight (efficiency) of the connection from neuron $i$ onto neuron $j$. Furthermore, the Hopfield network does not allow neurons to be connected onto themselves, $w_{ii} = 0$. The fact that the Hopfield network is single layered implies that the input and output layer are the same [15]. The neurons in the original Hopfield model use binary states. These are represented by the values 1 and 0, on or off [9].
To store $p$ patterns in the network, Hopfield proposes that the weights are set according to the following Hebbian learning rule:

$$w_{ij} = \sum_{s=1}^{p} \begin{cases} 1.0 & \text{if } V_i^s = 1 \text{ and } V_j^s = 1 \\ 1.0 & \text{if } V_i^s = 0 \text{ and } V_j^s = 0 \\ -1.0 & \text{if } V_i^s \neq V_j^s \end{cases}$$

Here $V_i^s$ is the state of neuron $i$ in pattern $s$.
The state of a neuron is updated by comparing a threshold value to the sum of all inputs from connected neurons. In mathematical terms ($V_i$ being the state and $U_i$ the threshold of neuron $i$):

$$V_i \rightarrow 1 \;\text{ if } \sum_{j \neq i} w_{ij} V_j > U_i, \qquad V_i \rightarrow 0 \;\text{ if } \sum_{j \neq i} w_{ij} V_j < U_i$$

Finally, there is a convergence proof for the Hopfield network stating that the system will always settle in a steady state, an attractor. This proof is carried out by Rojas [15] on a slightly modified version of the original Hopfield network described above.
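To make the storage and recall procedure concrete, the following minimal Python/NumPy sketch implements the learning rule and the asynchronous update rule given above for the binary Hopfield network. It is an illustration only, not the spiking implementation investigated in this thesis; the toy patterns, the zero threshold and the helper names are arbitrary choices for the example.

```python
import numpy as np

def hopfield_weights(patterns):
    """Hopfield's Hebbian rule: +1 when two neurons agree in a pattern
    (both 1 or both 0), -1 when they disagree, summed over patterns; w_ii = 0."""
    n = patterns.shape[1]
    w = np.zeros((n, n))
    for v in patterns:
        signed = 2 * v - 1              # map {0, 1} -> {-1, +1}
        w += np.outer(signed, signed)   # +1 if the states agree, -1 otherwise
    np.fill_diagonal(w, 0.0)            # no self-connections
    return w

def recall(w, state, threshold=0.0, max_sweeps=100, rng=None):
    """Asynchronous updates until the network settles into an attractor."""
    if rng is None:
        rng = np.random.default_rng(0)
    state = state.copy()
    for _ in range(max_sweeps):
        changed = False
        for i in rng.permutation(len(state)):
            new = 1 if w[i] @ state > threshold else 0
            if new != state[i]:
                state[i], changed = new, True
        if not changed:                 # fixed point (attractor) reached
            break
    return state

# Store two toy patterns in an 8-neuron network and recall from a corrupted cue.
patterns = np.array([[1, 1, 1, 1, 0, 0, 0, 0],
                     [0, 0, 0, 0, 1, 1, 1, 1]])
w = hopfield_weights(patterns)
cue = np.array([1, 1, 1, 0, 0, 0, 0, 1])    # noisy version of the first pattern
print(recall(w, cue))                        # settles back to [1 1 1 1 0 0 0 0]
```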
2.3 Spiking Neural Network Models
As mentioned in section 2.1, biological neurons can be considered small electrical devices which communicate via short electrical pulses, spikes. A detailed model of the electrodynamics of a neuron was presented by A. L. Hodgkin and A. F. Huxley in 1952 [16]. This model forms a basis for simplified formal models of spiking neurons, such as the integrate-and-fire model [7].
There are two lines of thought on how spiking neurons in nature encode and decode information. The prevailing thought throughout much of the 20th century was that neurons relay information via firing rates, i.e. the number of spikes during a certain time interval. However, experimental data has been gathered for which rate codes cannot account. An example is the fast reaction times of some animals [7]. Thus, temporal coding (or spike coding) has gained interest. This hypothesis suggests that neurons can encode and decode information using the relative timing of individual spikes. This type of code would allow for the fast computation observed in nature [7].
Wolfgang Maass describes spiking neural networks as the third generation of applied neural networks [2]. He argues that the first generation was characterized by neurons with binary states, such as the Hopfield neurons. The second generation allowed for neurons with a continuous set of states. These states would be analogous to firing rates for spiking neurons. Due to the use of a continuous set of states, the second generation supports analog input as well as learning algorithms based on gradient descent. However, the fast computation observed in nature cannot be accounted for by rate codes. Herein lies the potential of temporal codes. Maass concludes [2] that spiking neurons using temporal codes have at least the same computational power as the previous two generations, with the potential of being significantly more powerful.
An essential feature of all neural networks is their ability to learn. For spiking neural networks, the Hebbian learning paradigm is well merited [7]. This paradigm originates from a postulate formulated by Donald O. Hebb [17]. The postulate states that the synaptic efficiency between two neurons should be strengthened if the neurons display mutual activity. These synaptic changes allow the network to be molded by external stimulation, such as speech or pain. This is called plasticity [7].
Hebb's postulate has been generalized in several ways, and the paradigm now includes learning rules incorporating efficiency decay in the absence of stimuli, efficiency saturation, as well as anti-Hebbian learning, a weakening of efficiency upon activity [7]. Furthermore, Hebbian learning generalizes from rate codes to temporal codes through spike-timing dependent plasticity (STDP). The fundamental concept of STDP is that the synaptic efficiency should be modified by the relative timing of presynaptic and postsynaptic spikes on a scale of 10 ms. These fast learning dynamics have now been observed experimentally in several cases [18]. There exists a plethora of different STDP models and the details of these will have to be omitted here.
2.4 Previous research into spiking Hopfield networks
A theoretical analysis of a Hopfield network with spiking neurons was provided by Gerstner and van Hemmen [8]. Their motivation was to increase the biological plausibility of the model. The binary neurons of the original Hopfield model are replaced by more biologically realistic integrate-and-fire neurons, though the unrealistically high connectivity of the Hopfield network is left unmodified. In this setup Gerstner and van Hemmen derive two interesting results. Their first conclusion is that for a stationary input pattern, the stationary solutions of the system can be fully described by the neurons' mean firing rates and the timing of individual spikes can be neglected. However, their second conclusion states that this is not true in oscillatory states. In these cases, several synapse parameters determine the existence of stable solutions. These solutions are also dependent on the internal spiking dynamics of the neurons. Thus the solutions display more detailed dynamics than is possible for binary or rate-state neurons.
Lansner further investigates the possibility of biologically realistic attractor memory networks. He suggests that the unrealistically high connection density of these networks could be provided locally in neuron clusters while the connection density in the network as a whole remains sparse [11]. These kinds of densely connected neuron cluster structures have been observed in the neocortex of several species [19].
Maass and Natschläger were able to emulate a Hopfield network using compartmental spiking neuron models and temporal coding [10]. Through theoretical derivations and simulation results they conclude that, with the use of three pairs of synaptic connections between each neuron, several patterns can be stored and recalled in a system using the Hebbian learning rule from Hopfield's original article. They also find this system to be robust to noise and that the temporal coding allows for fast recall of patterns.
Finally, Wärnberg's thesis [12] deserves some attention since it is closely related to this thesis. Wärnberg also investigates memory storage in a small spiking Hopfield network. He divides his investigation into three premises. In the first premise he uses static synapses to investigate how the recall error is affected by the Q-value (though Wärnberg uses a different notation), the axonal delay and some noise parameters. Wärnberg provides useful data on how the recall error varies with these parameters. One result is that the error appears to depend significantly on the Q-value. Thus, this value will be given attention in this thesis as well.
In the second premise Wärnberg attempts to implement STDP and in the third premise he explores the possibility of using temporal code. He concludes that his implementation of these premises is unsuccessful. However, he also notes that there is considerable room for improvement on his implementation.
3 Method
3.1 Models
3.1.1 Neuron Model
The Hodgkin-Huxley neuron model provides a detailed description of biological spiking neurons [16]. However, this model is computationally demanding since a system of four coupled differential equations has to be solved for each time step. Therefore, simplified formal models are commonly used in simulations. Both Maass and Wärnberg implement the leaky integrate-and-fire model in their investigations of spiking Hopfield networks [10, 12].
The leaky integrate-and-fire model describes the neuron as a simple resistor-capacitor circuit without spatial components. The circuit is driven by a current $I(t)$. Mathematically this means that the membrane potential, $u(t)$, is described by:
$$\frac{du(t)}{dt} = -\frac{u(t) - u_0}{\tau_m} + \frac{I(t)}{C}$$

$\tau_m = RC$ is called the membrane time constant, $R$ the membrane resistance and $C$ the membrane capacitance. The current consists of input from other neurons in the network, input from external stimulation as well as noise input, $I(t) = I_{\mathrm{net}}(t) + I_{\mathrm{ext}}(t) + I_{\mathrm{noise}}(t)$.

The firing time of the neuron, $t^{(f)}$, is defined as the time the membrane potential crosses a threshold value, $\vartheta$. Immediately following a fire-event the membrane potential is reset to a value $u_r < \vartheta$. After reset the membrane potential remains unchanged during an absolute refractory time $\Delta_{\mathrm{abs}}$, which may be 0, and then the process is repeated. Since the membrane potential cannot increase during the absolute refractory time, this introduces a theoretical maximum of the firing frequency, $f_{\max} = 1/\Delta_{\mathrm{abs}}$.
The model parameters were drawn from a Gaussian distribution with the expected values and standard deviations listed in table 3.1. This is done to avoid any unexpected behavior from using a completely homogeneous network. The expected values are the default values in our simulation tool of choice [20].

Parameter | Value | StD | Description
$u_0$ | −70.0 | 3.5 | Resting membrane potential in mV
$C$ | 250.0 | 0 | Capacitance of the membrane in pF
$\tau_m$ | 10.0 | 0.5 | Membrane time constant in ms
$\Delta_{\mathrm{abs}}$ | 2.0 | 0.1 | Duration of refractory period in ms
$\vartheta$ | −55.0 | 2.75 | Spike threshold in mV
$u_r$ | −70.0 | 3.5 | Reset potential of the membrane in mV

Table 3.1: Table of parameters used for the neuron model. The value for each neuron was drawn from a Gaussian distribution with the shown expected values and standard deviations.
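As an illustration of the dynamics described above, the following Python sketch integrates the leaky integrate-and-fire equation with a simple forward-Euler scheme, drawing the neuron parameters from Gaussian distributions with the means and standard deviations of table 3.1. It is not the NEST implementation used in the thesis; the time step, simulation length and the constant input current are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(42)

# Per-neuron parameters drawn from Gaussians (means and SDs from table 3.1).
u_0   = rng.normal(-70.0, 3.5)    # resting membrane potential (mV)
u_r   = rng.normal(-70.0, 3.5)    # reset potential (mV)
theta = rng.normal(-55.0, 2.75)   # spike threshold (mV)
tau_m = rng.normal(10.0, 0.5)     # membrane time constant (ms)
t_ref = rng.normal(2.0, 0.1)      # absolute refractory period (ms)
C     = 250.0                     # membrane capacitance (pF), no spread

dt, T = 0.1, 200.0                # integration step and simulated time (ms)
I_ext = 600.0                     # constant driving current (pA), illustrative value

u = u_0
refractory_until = -1.0
spike_times = []

for step in range(int(T / dt)):
    t = step * dt
    if t < refractory_until:
        continue                                  # potential held fixed after a spike
    # Forward-Euler step of du/dt = -(u - u_0)/tau_m + I/C
    u += dt * (-(u - u_0) / tau_m + I_ext / C)
    if u >= theta:                                # threshold crossing -> fire-event
        spike_times.append(t)
        u = u_r                                   # reset below threshold
        refractory_until = t + t_ref              # absolute refractory period

print(f"{len(spike_times)} spikes, mean rate ≈ {1000.0 * len(spike_times) / T:.1f} Hz")
```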
3.1.2 Synapse Model
The network consisted of 64 neurons. Each neuron was connected to every other neuron in the network and no neuron was connected onto itself, in accordance with the original Hopfield network [9]. The synaptic connections of the network are modeled using current pulses. Mathematically, the current input to a neuron $i$ is modeled as:
$$I_i^{\mathrm{net}}(t) = \sum_j w_{ij} \sum_f Q\,\alpha(t - t_j^{(f)})$$

The first sum is taken over all neurons in the network and the second sum is taken over all firing times. The Q-value has the interpretation of the maximum value of the current pulse. Wärnberg investigated this parameter in his thesis [12]. He concluded that this parameter has considerable influence on the recall error and that a value of ≈ 225 pA provides reasonable results.
$w_{ij}$ is the efficacy of the connection between neuron $i$ and neuron $j$. The connections are set using a modified version of the Hebbian learning rule used by Hopfield [9]:

$$w_{ij} = \sum_{s} \begin{cases} 1.0 & \text{if } V_i^s = 1 \text{ and } V_j^s = 1 \\ K & \text{if } V_i^s = 0 \text{ and } V_j^s = 0 \\ -1.0 & \text{if } V_i^s \neq V_j^s \end{cases}$$
Figure 3.1: The alpha kernel α(t), plotted against time in ms, with the axonal delay set to zero.
$V_i^s$ is the state of neuron $i$ in pattern $s$. The sum is taken over all patterns to be stored.
The parameter K governs the total amount of inhibition in the network. Setting K = 1 gives the original Hopfield learning rule.
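A possible NumPy formulation of this modified rule is sketched below; it is illustrative only, and the function name is ours rather than part of the thesis code. Passing K = 1.0 reproduces the original Hopfield rule, as noted above.

```python
import numpy as np

def weight_matrix(patterns, K=1.0):
    """Modified Hebbian rule: +1 for pairs active together, K for pairs
    inactive together, -1 for mismatched pairs, summed over all stored
    patterns; self-connections are set to zero."""
    n = patterns.shape[1]
    w = np.zeros((n, n))
    for v in patterns:
        both_on  = np.outer(v, v)            # 1 where V_i^s = V_j^s = 1
        both_off = np.outer(1 - v, 1 - v)    # 1 where V_i^s = V_j^s = 0
        mismatch = 1 - both_on - both_off    # 1 where V_i^s != V_j^s
        w += 1.0 * both_on + K * both_off - 1.0 * mismatch
    np.fill_diagonal(w, 0.0)
    return w
```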
$\alpha(s)$ is a kernel describing the synaptic current pulse. A mathematical formulation for this current kernel can be found in Gerstner and Kistler [7]. A slightly modified version has been used:

$$\alpha(s) = e \cdot \frac{s - \Delta_{\mathrm{ax}}}{\tau_s} \cdot \exp\!\left(-\frac{s - \Delta_{\mathrm{ax}}}{\tau_s}\right) \cdot \Theta(s - \Delta_{\mathrm{ax}})$$

$\Delta_{\mathrm{ax}}$ is a constant modeling the axonal delay observed in biological neurons. In this modified version the maximum value of $\alpha(s)$ is 1, see figure 3.1. The values used for the kernel parameters are found in table 3.2. These were the default parameters in the simulation tool of choice [20].
Parameter | Value | Description
$\tau_s$ | 2.0 | Rise time of the synaptic alpha function in ms
$\Delta_{\mathrm{ax}}$ | 1.0 | Axonal delay in ms

Table 3.2: Table of synapse parameters used.
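For reference, a small Python sketch of the modified alpha kernel and the resulting network current is given below, using the parameter values of table 3.2 and the ≈ 225 pA Q-value mentioned above as defaults. It is an illustration under those assumptions, not the NEST synapse model itself, and the helper names are ours.

```python
import numpy as np

TAU_S    = 2.0    # rise time of the synaptic alpha function (ms), table 3.2
DELTA_AX = 1.0    # axonal delay (ms), table 3.2

def alpha(s, tau_s=TAU_S, delta_ax=DELTA_AX):
    """Alpha kernel scaled to a maximum value of 1, zero before the axonal delay."""
    x = np.clip((np.asarray(s, dtype=float) - delta_ax) / tau_s, 0.0, None)
    return np.e * x * np.exp(-x)

def net_current(i, t, w, spike_times, Q=225.0):
    """I_i^net(t) = sum_j w_ij * sum_f Q * alpha(t - t_j^(f)); spike_times[j]
    holds the firing times (in ms) of neuron j, w is the weight matrix."""
    total = 0.0
    for j, times in enumerate(spike_times):
        if len(times) > 0:
            total += w[i, j] * Q * np.sum(alpha(t - np.asarray(times)))
    return total

print(alpha(DELTA_AX + TAU_S))   # 1.0: the kernel peaks one rise time after the delay
```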
The noise input was unique to each neuron and modeled as:

I_i^noise(t) = Σ_f