Royal Institute of Technology
Bachelor Thesis
Weight Matrix Adaptation for increased Memory Storage Capacity in a Spiking Hopfield Network
Authors:
Ciwan Ceylan & Albin Sunesson
Supervisor:
AP Pawel Herman
Submitted to
the School of Engineering Sciences
May 2015
ROYAL INSTITUTE OF TECHNOLOGY
Abstract
School of Computer Science and Communication, Department of Computational Biology
Bachelor Thesis
Weight Matrix Adaptation for increased Memory Storage Capacity in a Spiking Hopfield Network
by Ceylan C. & Sunesson A.
The storage capacity of a small spiking Hopfield network is investigated in terms of two parameters governing the conductance and inhibition of the synaptic connections. This is motivated by the possibility of constructing larger associative networks from small network clusters. These kinds of network architectures have been observed in nature and could possibly be a foundation for future applications.
The investigation is conducted using simulations of integrate-and-fire neuron models and static synapses. Several different types of binary patterns are used to provide a detailed analysis of the storage capacity.
The investigated parameters influence the storage capacity of the network, and the capacity may be improved with the right choice of parameters. Differences in capacity between different pattern types are also observed.
Acknowledgements
First, we wish to thank our supervisor Pawel Herman, without whom this project would not have been possible. We are sincerely grateful for your interest, insightful thoughts and your guidance throughout the project. Your expertise has been invaluable for the outcome of this project. Second, we have very much appreciated Emil Wärnberg's well-written thesis and motivating conversations, which inspired us to start this project. We have also had informative and helpful e-mail conversations with both Georgios Iatropoulos and Susanne Kunkel, who answered our sometimes elementary questions and to whom we are very thankful.
Contents

Abstract
Acknowledgements
1 Introduction
2 Background
    2.1 Biological Neural Networks
    2.2 The original Hopfield network
    2.3 Spiking Neural Network Models
    2.4 Previous research into spiking Hopfield networks
3 Method
    3.1 Models
        3.1.1 Neuron Model
        3.1.2 Synapse Model
    3.2 Simulations
        3.2.1 NEST Simulator
        3.2.2 Encoding and decoding
        3.2.3 Simulation scenarios
4 Results
    4.1 Simulation Results
        4.1.1 Pattern Size
        4.1.2 Pattern Overlap
        4.1.3 Pattern Stimulation
    4.2 Possible explanations of results
        4.2.1 The capacity's dependence on the K-value and overlap
        4.2.2 Capacity convergence for large Q-values
        4.2.3 Relation between Q and the number of active neurons
5 Conclusion
6 Discussion
    6.1 Reliability of Results
    6.2 Further Research
1 Introduction
The Human Brain Project pilot report describes understanding the human brain as one of the greatest challenges facing 21st century science [1]. Detailed simulations of biological neural networks have been made possible, and have been developed since the mid 1990s, thanks to advances in neuroscience and computer science and to vast improvements in access to computational power. These simulations are one important tool used by the scientists working to understand the complexity of the brain.
Furthermore, new approaches to machine learning problems have been inspired by the advances in neuroscience. Spiking neural networks have their roots in biological modeling and have been described by Wolfgang Maass as the third generation of applied neural networks and the next step in increasing the computational power of networks [2]. This statement is reinforced by Paugam-Moisy and Bohte [3], who present a summary of state-of-the-art methods involving spiking neural networks as well as some interesting applications, e.g. speech processing, active vision for computers and autonomous robotics.
The creation of the artificial Hopfield network was also inspired by advancements in brain research, specifically the findings suggesting that parts of the human brain, e.g. the hippocampus, are able to function as an associative memory [4, 5], and the experimental evidence suggesting that the underlying mechanics of hippocampal memory are governed by Hebbian learning [6]. The Hopfield model is based on this learning paradigm, which states that the efficiency of the synaptic connection between two neurons should be dictated by their mutual spiking activity [7]. However, while inspired by biology, the Hopfield network is considered an artificial neural network since it makes some highly artificial assumptions [8]. Worth mentioning are the use of binary neurons, on or off, instead of spiking neurons, as well as the assumption of full connectivity, every neuron connected to every other neuron [9].
The binary neurons of the Hopfield network have successfully been replaced with spiking neurons in several cases [8, 10]. Furthermore, Lansner proposed a biologically realistic architecture tackling the unrealistic all-to-all connectivity [11]. He suggests that small groups of neurons should form densely connected sub-nets functioning like Hopfield networks. These sub-nets would then be connected to other sub-nets to form a complete associative memory network.
The modularity of Lansner's idea suggests the possibility of designing associative networks for application purposes. Perhaps such a network could consist of several clusters of small spiking Hopfield networks, each specialized in the storage of certain patterns. A first step towards such an application would be to investigate the functionality and capacity of these small networks.
Emil Wärnberg implemented and investigated such a small spiking Hopfield network in his bachelor thesis [12]. He successfully stored 3 patterns in a 64-neuron network using a weight matrix set with the Hebbian learning rule provided in Hopfield's original article [9]. However, Wärnberg covered a wide range of subjects and his focus was not on storage capacity. Thus he neither investigated the total storage capacity of his implementation nor whether different types of pattern sets affect the capacity. Furthermore, Wärnberg explicitly provided inhibitory input to the non-stimulated neurons. This surely stabilizes the network, yet it might be possible to provide the inhibition via the network's weight matrix. This thesis will attempt to further improve on Wärnberg's work by investigating storage capacity and by implementing inhibition via the network's weight matrix.
The aim of this thesis is to provide a thorough analysis of the weight matrix's influence on the storage capacity of a small spiking Hopfield network and, if possible, to determine whether the network can be designed to specialize in different pattern types. The adjustments are made using two parameters: a value Q, which affects the conductance of all connections, and a value K, which affects the inhibition levels in the system. Furthermore, the investigation will cover whether these parameters have a different influence for varying degrees of pattern stimulation, different numbers of active neurons per pattern or varying degrees of overlap between patterns. The aspiration is to uncover whether different weight matrices are able to specialize in storing different pattern types.
Each neuron will implement an integrate-and-fire model and use similar model parameters. The effect of tuning these will not be investigated. The patterns to be stored will be binary patterns consisting of ones and zeros. A one will correspond to an active neuron and a zero will correspond to an inactive neuron. The neuron's firing rate will be used to determine whether a neuron should be considered active or inactive. An external current will be provided to excite the part of the neurons corresponding to the ones in a stored pattern. Neurons corresponding to zeros will never be stimulated.
2 Background
2.1 Biological Neural Networks
The underlying structure of the brain consists of networks of brain cells called neurons. Idealized, a neuron is divided into three parts: the dendrites, the soma and the axon. The dendrites can be considered the "input device" of the neuron, the soma is the "central processing unit" and the axon is the "output device". The neuron receives inputs from other neurons through the dendrites in the form of electric pulses, spikes. These inputs are transmitted to the soma, raising its voltage. When the voltage reaches a threshold the soma fires its own electric pulse, called an action potential, through the axon, which outputs the signal to other neurons via synapses. These synapses are chemical devices which connect the axon with the dendrites through a complex chain of bio-chemical processes. This synaptic connection motivates the terms presynaptic and postsynaptic neurons, which are used extensively in both biological and artificial neural network contexts. Depending on the substances released in the synapse, the voltage of the postsynaptic neuron may be decreased or increased. This is called inhibition and excitation, respectively [7].
Several neural networks of different architecture can be found in various parts of the brain, each believed to be associated with different functions. One such structure is the hippocampus, which is believed to play an important part in long-term memory storage [13]. Through in vivo stimulation of neurons in a rat's hippocampus, Kelso et al. found experimental evidence that mutual spiking activity increases the efficiency of the connections between two neurons, in accordance with the Hebbian learning paradigm [6]. It has been further suggested that these synaptic efficiency changes are the foundation of the hippocampus' associative memory functionality [4]. These ideas gained increased recognition when Hopfield presented his Hopfield network, which is capable of associative memory and is based on the Hebbian learning paradigm.
2.2 The original Hopfield network
The Hopfield network is an example of a stable recurrently connected attractor network.
This signifies that the network will settle into a stable state as it dynamically evolves in its state space. These stable states are called attractors. This property provides the Hopfield network with an associative memory functionality. Each pattern to be stored in the network memory will be an attractor into which the network can settle [14]. Thus, provided that the network is stable, initialization of the network with an input pattern will lead to the network settling into the attractor closest in state space. However, as the state space fills up with attractors, the likelihood of recall errors increases. J. J. Hopfield investigates this in his original article and concludes that the Hopfield network is able to store about 0.15N patterns, where N is the number of neurons in the network, before the errors in recall become severe [9].
The original Hopfield network contains only a single layer of neurons. These are all connected symmetrically, mathematically $w_{ij} = w_{ji}$, where $w_{ij}$ is the weight (efficiency) of the connection from neuron $i$ onto neuron $j$. Furthermore, the Hopfield network does not allow neurons to be connected onto themselves, $w_{ii} = 0$. The fact that the Hopfield network is single layered implies that the input and output layer are the same [15]. The neurons in the original Hopfield model use binary states. These are represented by the values 1 and 0, on or off [9].
To store $p$ patterns in the network, Hopfield proposes that the weights are set according to the following Hebbian learning rule:

$$w_{ij} = \sum_{s=1}^{p} \begin{cases} 1.0 & \text{if } V_i^s = 1 \text{ and } V_j^s = 1 \\ 1.0 & \text{if } V_i^s = 0 \text{ and } V_j^s = 0 \\ -1.0 & \text{if } V_i^s \neq V_j^s \end{cases}$$

Here $V_i^s$ is the state of neuron $i$ in pattern $s$.
The state of a neuron is updated by comparing a threshold value to the sum of all inputs from connected neurons. In mathematical terms ($V_i$ being the state and $U_i$ the threshold of neuron $i$):

$$V_i \rightarrow 1 \;\text{ if } \sum_{j \neq i} w_{ij} V_j > U_i, \qquad V_i \rightarrow 0 \;\text{ if } \sum_{j \neq i} w_{ij} V_j < U_i$$

Finally, there is a convergence proof for the Hopfield network stating that the system will always settle in a steady state, an attractor. This proof is carried out by Rojas [15] on a slightly modified version of the original Hopfield network described above.
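To make the storage and recall procedure concrete, the following minimal Python/NumPy sketch implements the learning rule and the asynchronous update rule given above for the binary Hopfield network. It is an illustration only, not the spiking implementation investigated in this thesis; the toy patterns, the zero threshold and the helper names are arbitrary choices for the example.

```python
import numpy as np

def hopfield_weights(patterns):
    """Hopfield's Hebbian rule: +1 when two neurons agree in a pattern
    (both 1 or both 0), -1 when they disagree, summed over patterns; w_ii = 0."""
    n = patterns.shape[1]
    w = np.zeros((n, n))
    for v in patterns:
        signed = 2 * v - 1              # map {0, 1} -> {-1, +1}
        w += np.outer(signed, signed)   # +1 if the states agree, -1 otherwise
    np.fill_diagonal(w, 0.0)            # no self-connections
    return w

def recall(w, state, threshold=0.0, max_sweeps=100, rng=None):
    """Asynchronous updates until the network settles into an attractor."""
    if rng is None:
        rng = np.random.default_rng(0)
    state = state.copy()
    for _ in range(max_sweeps):
        changed = False
        for i in rng.permutation(len(state)):
            new = 1 if w[i] @ state > threshold else 0
            if new != state[i]:
                state[i], changed = new, True
        if not changed:                 # fixed point (attractor) reached
            break
    return state

# Store two toy patterns in an 8-neuron network and recall from a corrupted cue.
patterns = np.array([[1, 1, 1, 1, 0, 0, 0, 0],
                     [0, 0, 0, 0, 1, 1, 1, 1]])
w = hopfield_weights(patterns)
cue = np.array([1, 1, 1, 0, 0, 0, 0, 1])    # noisy version of the first pattern
print(recall(w, cue))                        # settles back to [1 1 1 1 0 0 0 0]
```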
2.3 Spiking Neural Network Models
As mentioned in section 2.1, biological neurons can be considered small electrical devices which communicate via short electrical pulses, spikes. A detailed model of the electrodynamics of a neuron was presented by A. L. Hodgkin and A. F. Huxley in 1952 [16]. This model forms a basis for simplified formal models of spiking neurons, such as the integrate-and-fire model [7].
There are two lines of thought on how spiking neurons in nature encode and decode information. The prevailing thought throughout much of the 20th century was that neurons relay information via firing rates, i.e. the number of spikes during a certain time interval. However, experimental data has been gathered for which rate codes cannot account. An example is the fast reaction times of some animals [7]. Thus, temporal coding (or spike coding) has gained interest. This hypothesis suggests that neurons can encode and decode information using the relative timing of individual spikes. This type of code would allow for the fast computation observed in nature [7].
Wolfgang Maass describes spiking neural networks as the third generation of applied neural networks [2]. He argues that the first generation was characterized by neurons with binary states, such as the Hopfield neurons. The second generation allowed for neurons with a continuous set of states. These states would be analogous to firing rates for spiking neurons. Due to the use of a continuous set of states, the second generation supports analog input as well as learning algorithms based on gradient descent. However, the fast computation observed in nature cannot be accounted for by rate codes. Herein lies the potential of temporal codes. Maass concludes [2] that spiking neurons using temporal codes have at least the same computational power as the previous two generations, with the potential of being significantly more powerful.
An essential feature of all neural networks is their ability to learn. For spiking neural networks, the Hebbian learning paradigm is well merited [7]. This paradigm originates from a postulate formulated by Donald O. Hebb [17]. The postulate states that the synaptic efficiency between two neurons should be strengthened if the neurons display mutual activity. These synaptic changes allow the network to be molded by external stimulation, such as speech or pain. This is called plasticity [7].
Hebb's postulate has been generalized in several ways, and the paradigm now includes learning rules incorporating efficiency decay in the absence of stimuli, efficiency saturation, as well as anti-Hebbian learning, a weakening of efficiency upon activity [7]. Furthermore, Hebbian learning generalizes from rate codes to temporal codes through spike-timing dependent plasticity (STDP). The fundamental concept of STDP is that the synaptic efficiency should be modified by the relative timing of presynaptic and postsynaptic spikes on a scale of 10 ms. These fast learning dynamics have now been observed experimentally in several cases [18]. There exists a plethora of different STDP models and the details of these will have to be omitted here.
2.4 Previous research into spiking Hopfield networks
A theoretical analysis of a Hopfield network with spiking neurons was provided by Gerstner and van Hemmen [8]. Their motivation was to increase the biological plausibility of the model. The binary neurons of the original Hopfield model are replaced by more biologically realistic integrate-and-fire neurons, though the unrealistically high connectivity of the Hopfield network is left unmodified. In this setup Gerstner and van Hemmen derive two interesting results. Their first conclusion is that for a stationary input pattern, the stationary solutions of the system can be fully described by the neurons' mean firing rates and the timing of individual spikes can be neglected. However, their second conclusion states that this is not true in oscillatory states. In these cases, several synapse parameters determine the existence of stable solutions. These solutions are also dependent on the internal spiking dynamics of the neurons. Thus the solutions display more detailed dynamics than is possible for binary or rate-state neurons.
Lansner further investigates the possibility of biologically realistic attractor memory networks. He suggests that the unrealistically high connection density of these networks could be provided locally in neuron clusters while the connection density in the network as a whole remains sparse [11]. These kinds of densely connected neuron cluster structures have been observed in the neocortex of several species [19].
Maass and Natschläger were able to emulate a Hopfield network using compartmental spiking neuron models and temporal coding [10]. Through theoretical derivations and simulation results they conclude that, with the use of three pairs of synaptic connections between each neuron, several patterns can be stored and recalled in a system using the Hebbian learning rule from Hopfield's original article. They also find this system to be robust to noise and that the temporal coding allows for fast recall of patterns.
Finally, Wärnberg's thesis [12] deserves some attention since it is closely related to this thesis. Wärnberg also investigates memory storage in a small spiking Hopfield network. He divides his investigation into three premises. In the first premise he uses static synapses to investigate how the recall error is affected by the Q-value (though Wärnberg uses a different notation), the axonal delay and some noise parameters. Wärnberg provides useful data on how the recall error varies with these parameters. One result is that the error appears to depend significantly on the Q-value. Thus, this value will be given attention in this thesis as well.
In the second premise Wärnberg attempts to implement STDP and in the third premise he explores the possibility of using temporal code. He concludes that his implementation of these premises is unsuccessful. However, he also notes that there is considerable room for improvement on his implementation.
3 Method
3.1 Models
3.1.1 Neuron Model
The Hodgkin-Huxley neuron model provides a detailed description of biological spiking neurons [16]. However, this model is computationally demanding since a system of four coupled differential equations has to be solved for each time step. Therefore, simplified formal models are commonly used in simulations. Both Maass and Wärnberg implement the leaky integrate-and-fire model in their investigations of spiking Hopfield networks [10, 12].
The leaky integrate-and-fire model describes the neuron as a simple resistor-capacitor circuit without spatial components. The circuit is driven by a current $I(t)$. Mathematically this means that the membrane potential, $u(t)$, is described by:
$$\frac{du(t)}{dt} = -\frac{u(t) - u_0}{\tau_m} + \frac{I(t)}{C}$$

$\tau_m = RC$ is called the membrane time constant, $R$ the membrane resistance and $C$ the membrane capacitance. The current consists of input from other neurons in the network, input from external stimulation as well as noise input, $I(t) = I_{\mathrm{net}}(t) + I_{\mathrm{ext}}(t) + I_{\mathrm{noise}}(t)$.

The firing time of the neuron, $t^{(f)}$, is defined as the time the membrane potential crosses a threshold value, $\vartheta$. Immediately following a fire-event the membrane potential is reset to a value $u_r < \vartheta$. After reset the membrane potential remains unchanged during an absolute refractory time $\Delta_{\mathrm{abs}}$, which may be 0, and then the process is repeated. Since the membrane potential cannot increase during the absolute refractory time, this introduces a theoretical maximum of the firing frequency, $f_{\max} = 1/\Delta_{\mathrm{abs}}$.
The model parameters were drawn from a Gaussian distribution with the expected values and standard deviations listed in table 3.1. This is done to avoid any unexpected behavior from using a completely homogeneous network. The expected values are the default values in our simulation tool of choice [20].

Parameter | Value | StD | Description
$u_0$ | −70.0 | 3.5 | Resting membrane potential in mV
$C$ | 250.0 | 0 | Capacitance of the membrane in pF
$\tau_m$ | 10.0 | 0.5 | Membrane time constant in ms
$\Delta_{\mathrm{abs}}$ | 2.0 | 0.1 | Duration of refractory period in ms
$\vartheta$ | −55.0 | 2.75 | Spike threshold in mV
$u_r$ | −70.0 | 3.5 | Reset potential of the membrane in mV

Table 3.1: Table of parameters used for the neuron model. The value for each neuron was drawn from a Gaussian distribution with the shown expected values and standard deviations.
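As an illustration of the dynamics described above, the following Python sketch integrates the leaky integrate-and-fire equation with a simple forward-Euler scheme, drawing the neuron parameters from Gaussian distributions with the means and standard deviations of table 3.1. It is not the NEST implementation used in the thesis; the time step, simulation length and the constant input current are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(42)

# Per-neuron parameters drawn from Gaussians (means and SDs from table 3.1).
u_0   = rng.normal(-70.0, 3.5)    # resting membrane potential (mV)
u_r   = rng.normal(-70.0, 3.5)    # reset potential (mV)
theta = rng.normal(-55.0, 2.75)   # spike threshold (mV)
tau_m = rng.normal(10.0, 0.5)     # membrane time constant (ms)
t_ref = rng.normal(2.0, 0.1)      # absolute refractory period (ms)
C     = 250.0                     # membrane capacitance (pF), no spread

dt, T = 0.1, 200.0                # integration step and simulated time (ms)
I_ext = 600.0                     # constant driving current (pA), illustrative value

u = u_0
refractory_until = -1.0
spike_times = []

for step in range(int(T / dt)):
    t = step * dt
    if t < refractory_until:
        continue                                  # potential held fixed after a spike
    # Forward-Euler step of du/dt = -(u - u_0)/tau_m + I/C
    u += dt * (-(u - u_0) / tau_m + I_ext / C)
    if u >= theta:                                # threshold crossing -> fire-event
        spike_times.append(t)
        u = u_r                                   # reset below threshold
        refractory_until = t + t_ref              # absolute refractory period

print(f"{len(spike_times)} spikes, mean rate ≈ {1000.0 * len(spike_times) / T:.1f} Hz")
```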
3.1.2 Synapse Model
The network consisted of 64 neurons. Each neuron was connected to every other neuron in the network and no neuron was connected onto itself, in accordance with the original Hopfield network [9]. The synaptic connections of the network are modeled using current pulses. Mathematically, the current input to a neuron $i$ is modeled as:
$$I_i^{\mathrm{net}}(t) = \sum_j w_{ij} \sum_f Q\,\alpha(t - t_j^{(f)})$$

The first sum is taken over all neurons in the network and the second sum is taken over all firing times. The Q-value has the interpretation of the maximum value of the current pulse. Wärnberg investigated this parameter in his thesis [12]. He concluded that this parameter has considerable influence on the recall error and that a value of ≈ 225 pA provides reasonable results.
$w_{ij}$ is the efficacy of the connection between neuron $i$ and neuron $j$. The connections are set using a modified version of the Hebbian learning rule used by Hopfield [9]:

$$w_{ij} = \sum_{s} \begin{cases} 1.0 & \text{if } V_i^s = 1 \text{ and } V_j^s = 1 \\ K & \text{if } V_i^s = 0 \text{ and } V_j^s = 0 \\ -1.0 & \text{if } V_i^s \neq V_j^s \end{cases}$$
Figure 3.1: The alpha kernel α(t), plotted against time in ms, with the axonal delay set to zero.
$V_i^s$ is the state of neuron $i$ in pattern $s$. The sum is taken over all patterns to be stored.
The parameter K governs the total amount of inhibition in the network. Setting K = 1 gives the original Hopfield learning rule.
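A possible NumPy formulation of this modified rule is sketched below; it is illustrative only, and the function name is ours rather than part of the thesis code. Passing K = 1.0 reproduces the original Hopfield rule, as noted above.

```python
import numpy as np

def weight_matrix(patterns, K=1.0):
    """Modified Hebbian rule: +1 for pairs active together, K for pairs
    inactive together, -1 for mismatched pairs, summed over all stored
    patterns; self-connections are set to zero."""
    n = patterns.shape[1]
    w = np.zeros((n, n))
    for v in patterns:
        both_on  = np.outer(v, v)            # 1 where V_i^s = V_j^s = 1
        both_off = np.outer(1 - v, 1 - v)    # 1 where V_i^s = V_j^s = 0
        mismatch = 1 - both_on - both_off    # 1 where V_i^s != V_j^s
        w += 1.0 * both_on + K * both_off - 1.0 * mismatch
    np.fill_diagonal(w, 0.0)
    return w
```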
$\alpha(s)$ is a kernel describing the synaptic current pulse. A mathematical formulation for this current kernel can be found in Gerstner and Kistler [7]. A slightly modified version has been used:

$$\alpha(s) = e \cdot \frac{s - \Delta_{\mathrm{ax}}}{\tau_s} \cdot \exp\!\left(-\frac{s - \Delta_{\mathrm{ax}}}{\tau_s}\right) \cdot \Theta(s - \Delta_{\mathrm{ax}})$$

$\Delta_{\mathrm{ax}}$ is a constant modeling the axonal delay observed in biological neurons. In this modified version the maximum value of $\alpha(s)$ is 1, see figure 3.1. The values used for the kernel parameters are found in table 3.2. These were the default parameters in the simulation tool of choice [20].
Parameter | Value | Description
$\tau_s$ | 2.0 | Rise time of the synaptic alpha function in ms
$\Delta_{\mathrm{ax}}$ | 1.0 | Axonal delay in ms

Table 3.2: Table of synapse parameters used.
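For reference, a small Python sketch of the modified alpha kernel and the resulting network current is given below, using the parameter values of table 3.2 and the ≈ 225 pA Q-value mentioned above as defaults. It is an illustration under those assumptions, not the NEST synapse model itself, and the helper names are ours.

```python
import numpy as np

TAU_S    = 2.0    # rise time of the synaptic alpha function (ms), table 3.2
DELTA_AX = 1.0    # axonal delay (ms), table 3.2

def alpha(s, tau_s=TAU_S, delta_ax=DELTA_AX):
    """Alpha kernel scaled to a maximum value of 1, zero before the axonal delay."""
    x = np.clip((np.asarray(s, dtype=float) - delta_ax) / tau_s, 0.0, None)
    return np.e * x * np.exp(-x)

def net_current(i, t, w, spike_times, Q=225.0):
    """I_i^net(t) = sum_j w_ij * sum_f Q * alpha(t - t_j^(f)); spike_times[j]
    holds the firing times (in ms) of neuron j, w is the weight matrix."""
    total = 0.0
    for j, times in enumerate(spike_times):
        if len(times) > 0:
            total += w[i, j] * Q * np.sum(alpha(t - np.asarray(times)))
    return total

print(alpha(DELTA_AX + TAU_S))   # 1.0: the kernel peaks one rise time after the delay
```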
The noise input was unique to each neuron and modeled as:

I_i^noise(t) = Σ_f