AI Verification Application
Applikation för AI-verifiering

Utveckling av en applikation för verifiering av Artificiella Neurala Nätverk samt undersökning av hyperparametrar till modeller för objektdetektering i bilder

JOEL VIK
FREDRIK LINDGREN

KTH ROYAL INSTITUTE OF TECHNOLOGY
SCHOOL OF ENGINEERING SCIENCES IN CHEMISTRY, BIOTECHNOLOGY AND HEALTH
STOCKHOLM, SWEDEN 2020


Development of an Application for Artificial Neural Network Verification and Investigation of Hyperparameters for Single Shot Detection Models

Applikation för AI-verifiering
Utveckling av en applikation för verifiering av Artificiella Neurala Nätverk samt undersökning av hyperparametrar till modeller för objektdetektering i bilder

Fredrik Lindgren
Joel Vik

Degree Project in Computer Engineering, First Cycle, 15 credits
Supervisor at KTH: Maksims Kornevs
Examiner: Ibrahim Orhan
TRITA-CBH-GRU-2020:049
KTH


Abstract

Hyperparameter tuning for Artificial Neural Network models is an important part of the process of producing the best model for a given task. The process relies heavily on the availability of time and computing resources. Artificial Neural Network models intended to run on embedded systems are tuned and trained on powerful computers with far more computing resources, prior to deployment on the embedded system itself. Consequently, the performance of these models requires validation on the target embedded system, since it might be insufficient when the models are executed on an embedded system with limited computing resources. This validation process can be a time-consuming and tedious task. This thesis covers the development of an application prototype which eases the process of validating pre-compiled Artificial Neural Network models on a target embedded system. In addition, the thesis covers an analysis of hyperparameter tuning algorithms, along with an investigation of which parameters are important to tune in Convolutional Neural Networks for Single Shot Detection. An application prototype was developed, and its functionality was validated. An analysis of tuning algorithms was conducted; however, no definite conclusions could be drawn, since the results were conflicting and the sample size was too small. Through the investigation of hyperparameters, the optimizer learning rate was determined to be important to tune in Convolutional Neural Networks for Single Shot Detection. Further work, with a larger sample size, might make it possible to identify additional important parameters.

Keywords


Sammanfattning

Tuning of hyperparameters for Artificial Neural Network models is an important part of the process of producing the best model for a given task. The process relies on time and computing power being available. For models that are meant to be executed on embedded systems, this process is carried out on powerful computers with high computing power before the models are deployed on the embedded system. The performance of these models then needs to be validated on the embedded system they are to run on, since a model's performance may be insufficient when it is limited to the much lower computing power an embedded system possesses. The validation process can be both time-consuming and tedious. This thesis covers the development of an application prototype that simplifies the validation process for artificial neural network models on embedded systems. In addition, the thesis covers an analysis of algorithms for hyperparameter tuning, as well as an investigation of which hyperparameters are important to tune in Convolutional Neural Networks for object detection in images. An application prototype was developed and its functionality was validated. An analysis of the tuning algorithms was carried out, but no conclusions could be drawn, since the test results were conflicting and the size of the tests was too small. Through the investigation of hyperparameters, the optimizer learning rate could be identified as an important parameter to tune. Further work generating more test data could make it possible to identify additional important hyperparameters.

Nyckelord


Acknowledgements

We would like to thank Stoneridge Electronics AB for the opportunity to conduct this bachelor thesis, especially Urban Wadelius and Anisse Taleb, for their guidance.


1 Introduction ... 1

1.1 Problem ... 1

1.2 Goals ... 1

1.3 Delimitations ... 2

2 Background ... 3

2.1 Artificial Neural Networks ... 3

2.2 Artificial Neural Network Hyperparameters ... 3

2.3 Hyperparameter Tuning ... 3

2.3.1 Tuning Objective ... 4

2.3.2 Cross-Entropy Loss ... 4

2.3.3 Automatic Hyperparameter Tuning ... 5

2.3.4 Tuning Challenges ... 7

2.4 Tools and frameworks ... 8

2.4.1 gRPC ... 8

2.4.2 Kivy ... 9

2.4.3 Xilinx ZCU102 ... 9

2.4.4 Xilinx Vitis AI Development Kit... 9

2.4.5 Keras-tuner ... 9

2.4.6 Hyperopt and Hyperas ... 9

2.5 Previous Work ... 10

3 Methodology ... 13

3.1 Neural Network Tuning Algorithms ... 13

3.1.1 Candidate Algorithm Selection ... 13

3.1.2 Neural Network Model ... 13

3.1.3 Dataset ... 14

3.1.4 Algorithm Implementation ... 15

3.1.5 Algorithm Evaluation ... 16

3.1.6 Hyperparameters ... 16

3.2 Application Development ... 17

3.2.1 Evaluation Board Configuration ... 17

3.2.2 Development ... 17


4 Results ... 19

4.1 Candidate Algorithms ... 19

4.2 Tuning Trial Results ... 19

4.2.1 Trial Model Performances ... 19

4.2.2 Hyperparameters ... 22

4.3 Application Prototype ... 24

4.3.1 General Architecture ... 24

4.3.2 Client ... 26

4.3.3 Server ... 27

4.3.4 Client-server Communication ... 27

5 Analysis and Discussion ... 29

5.1 Tuning Algorithms Selection ... 29

5.1.1 Manual and Grid Search ... 29

5.1.2 Random Search ... 29

5.1.3 Bayesian Optimization ... 29

5.1.4 Selected Candidates Algorithms ... 30

5.2 Trial Results Evaluation ... 30

5.2.1 Basis ... 30

5.2.2 Trial A ... 30

5.2.3 Trial B ... 31

5.2.4 Algorithm Performance ... 31

5.3 Hyperparameters ... 31

5.3.1 Hyperparameter Evaluation ... 31

5.4 Application Prototype ... 32

5.4.1 Feedback ... 32

5.4.2 GUI ... 32

5.5 Choice of Programming Languages ... 33

5.5.1 Language for Hyperparameter Tuning ... 33

5.5.2 Languages for the Application ... 33

5.6 Economic, Societal, Ethical and Environmental Aspects ... 34

6 Conclusion ... 35

6.1 Future Work ... 35

6.1.1 Tuning Algorithms and Hyperparameters ... 35


Appendix A – CIFAR-10 Example Image ... 41

Appendix B - FileChooser Basic Configuration ... 41

Appendix C - RecyclerView Basic Configuration ... 41



1 Introduction

1.1 Problem

Machine learning can be an excellent tool for solving complicated tasks regarding statistics, classification, computer vision and pattern recognition. Generally described, this is done by feeding labeled data into an algorithm in order to teach it rules that help with solving a specific task.

Model hyperparameter tuning is an important step in improving algorithm performance for the task at hand. Machine learning models are parameterized in a way that enables their behavior to be tuned in order to find the solution for a given problem. Models can have many hyperparameters and determining the best combination of hyperparameters is, even with deep technical knowledge about neural networks, a hard and time-consuming task.

Machine learning models can be implemented into embedded systems. In these cases, the resource heavy model training and optimization cannot be done on the embedded system itself since it does not have the processing resources required. This part is instead done on more powerful computers, often utilizing GPUs, before the model is deployed onto the embedded systems. A problem is validation of the performance of these models. While a model might showcase splendid performance on a powerful computer, it might not perform as well, or even well enough, on an embedded system. Therefore, it is important to validate the model performance on the target embedded system, and this can be a tedious and time-consuming task.

The intention with this thesis is to determine which parameters are useful to tune in neural networks. This will be done by implementing candidate neural network tuning algorithms selected by studying past work. The results of the implementation will then be analyzed. In addition, this paper covers the development and testing of a software prototype in the form of a client/server application. The application is intended to ease the process of interacting with and validating a neural network's performance by loading pre-compiled model binaries onto an embedded system with a Deep Processing Unit, executing the binaries and returning the validation result.

1.2 Goals

The main goal of this thesis is to analyze machine learning hyperparameters and determine which may be important to tune, combined with the development of an application to ease the neural network interaction and validation process on embedded systems. To achieve this, the following sub-goals were constructed:

1. Select candidate tuning algorithms to implement, based on previous work.

2. Develop an application for remote execution and performance validation of a neural network on an embedded system.

3. Validate the behavior of the application.

4. Analyze the tuning algorithms based on model performance.

5. Produce a list of important parameters.

1.3 Delimitations

The following delimitations are set for this thesis:

1. The choice of algorithms to implement (candidate algorithms) will be based on previous work, to limit time spent by avoiding redundant work, i.e. avoiding inefficient algorithms that have previously been deemed obsolete.

2. Functionality of the application will be prioritized over the appearance of the graphical user interface due to time constraints.


2 Background

This chapter presents related work within neural network hyperparameter and model parameter tuning. In addition, it gives an introduction to hyperparameters in Artificial Neural Networks.

2.1 Artificial Neural Networks

As described by Hagan et al. [1], an Artificial Neural Network (ANN) is a computing system designed the same way as a biological neural network in a real-life brain. Yang and Qian furthermore describe an ANN as a structure with a collection of connected nodes called artificial neurons. Through these connections the artificial neurons can transmit signals between themselves. An artificial neuron which receives a signal can process it and then signal other connected artificial neurons. These neurons are usually organized in layers, and different layers can perform different transformations on their respective inputs. Signals travel from the first layer, which is referred to as the input layer, to the last layer, which is referred to as the output layer. There might be any number of layers in between the input and output layers. According to Nunes da Silva [2], the key features of an ANN are the following:

• Adaptation and learning from experience

• Generalization of acquired knowledge which enables solutions to be estimated

• Organization of data via clustering of patterns

• Very high fault tolerance due to the numerous connections between artificial neurons

• Distributed storage of knowledge in the artificial neuron

• Trivial to implement in a prototype since the end result is usually a mathematical function

2.2 Artificial Neural Network Hyperparameters

Hyperparameters are parameters set before the training of a model. They define the structure of the model by setting parameters such as the number of hidden nodes or the learning rate. Different machine learning problems call for different optimization of the hyperparameters, giving varying results and performance [3][4].

2.3 Hyperparameter Tuning


2.3.1 Tuning Objective

Hyperparameter tuning is done in order to find the best hyperparameter configuration for a given algorithm [5]. Martin Wistuba et al. [6] state that the goal of hyperparameter tuning is finding the optimal configuration of hyperparameters λ* that creates a model M_λ with a minimized predefined loss function ℒ(D^(valid); M_λ), where ℒ denotes the loss function and D^(valid) represents given test data used for validation. Loss functions generally measure quantities such as the mean squared error or the mean absolute error, which the parameter tuning aims to minimize. M is established by a learning algorithm A, using a training dataset D^(train). A can be parameterized by a set of hyperparameters, which means a model can be defined as M_λ = A_λ(D^(train)). Finding the best hyperparameter configuration for a model using a validation set can then be described as follows:

\[ \lambda^* := \arg\min_{\lambda} \mathcal{L}\left(M_\lambda, D^{(valid)}\right) =: f_D(\lambda) \tag{1} \]

The response function f_D is the loss. Note that if unsupervised learning is used, D^(train) and D^(valid) might be identical and unlabeled.

Manual search is one approach for hyperparameter tuning. The approach is simply what it sounds like: it consists of manually testing combinations of hyperparameters and manually evaluating their performance.

2.3.2 Cross-Entropy Loss


2.3.3 Automatic Hyperparameter Tuning

The following three methods can be used to automatically tune hyperparameters.

2.3.3.1 Grid Method

Grid search is an approach that simply tests all available combinations of hyperparameters and returns the best performing λ*. The Grid search method is relatively easy to implement as there exist libraries with support for it. As described by James Bergstra and Yoshua Bengio in [9], Grid search suffers from the fact that the number of required evaluations grows exponentially with the number of available hyperparameters.

Furthermore, Bergstra and Bengio state that Grid search should be paired with Manual search. Initially using Manual search will reduce the number of parameters that have to be evaluated by Grid search. With a reduced dimension, where Manual search only deems a few parameters important, Grid search will reliably return the best performing λ*.

2.3.3.2 Random Search Method

The Random search method is also discussed by Bergstra and Bengio. It simply uses random combinations of hyperparameters in a specified range to find the best λ* for a given model. Contrary to the Grid method, this method does not try every possible combination of hyperparameters. In other words, it does not suffer from exponential growth of evaluations with the number of available hyperparameters and is equally easy to implement.
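As an illustration of the idea, the sketch below samples configurations at random from a small search space and keeps the best one; the space and the scoring function are invented for the example and are not taken from the thesis.

```python
import math
import random

# Illustrative random search: sample configurations uniformly from the search
# space and keep the best one. The space and the toy objective are examples
# only, not the thesis implementation.

SPACE = {
    "units": range(32, 513, 32),
    "activation": ["relu", "tanh", "sigmoid"],
    "learning_rate": (1e-4, 1e-2),           # log-uniform range
}

def sample(rng):
    lo, hi = SPACE["learning_rate"]
    return {
        "units": rng.choice(list(SPACE["units"])),
        "activation": rng.choice(SPACE["activation"]),
        "learning_rate": 10 ** rng.uniform(math.log10(lo), math.log10(hi)),
    }

def toy_objective(cfg):
    # Stand-in for training and validating a model with configuration cfg.
    return abs(cfg["learning_rate"] - 1e-3) + (0 if cfg["activation"] == "relu" else 0.1)

rng = random.Random(0)
best = min((sample(rng) for _ in range(50)), key=toy_objective)
print(best)
```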

2.3.3.3 Bayesian Optimization Techniques

Based on Bayes' theorem, from which the name derives, Bayesian optimization (Bay. opt.) determines the next location x_{t+1} ∈ A through an acquisition function [10]. Compared to other techniques it requires fewer function evaluations. When evaluating a function f(x), this is accomplished by using all available information from previous evaluations of f(x) to find the minimum of the function [11]. The fewer evaluations come at the cost of performing more computations to find the next point to try. In cases where evaluations of f(x) are expensive, as they often are when training a machine learning algorithm, increasing computations and decreasing evaluations provide a more efficient way of training. As described by James Bergstra et al. [12], for the step of approximating f(x) from previous evaluations, approaches such as Gaussian Process (GP) and Tree-structured Parzen Estimator (TPE) are often used. If the prior distribution of a function f is believed to be a GP with mean 0 and kernel k, the conditional distribution of f given the sample of its observed values

\[ \mathcal{H} = \left( x_i, f(x_i) \right)_{i=1}^{n} \tag{2} \]

is also a Gaussian process.

Whereas the Gaussian process-based approach modeled p(y|x) directly, TPE models p(x|y) and p(y). The TPE models p(x|y) by transforming that generative process, replacing the distributions of the configuration prior with non-parametric densities.

Using different observations {x^(1), …, x^(k)} in the non-parametric densities, these substitutions represent a learning algorithm that can produce a variety of densities over the configuration space 𝒳. The TPE defines p(x|y) using two such densities:

\[ p(x|y) = \begin{cases} \ell(x) & \text{if } y < y^* \\ g(x) & \text{if } y \geq y^* \end{cases} \tag{3} \]

where ℓ(x) is the density formed by using the observations {x^(i)} whose corresponding loss f(x^(i)) was less than y^*, and g(x) is the density formed by using the remaining observations. Whereas the GP-based approach favoured quite an aggressive y^* (typically less than the best observed loss), the TPE algorithm depends on a y^* that is larger than the best observed f(x), so that some points can be used to form ℓ(x). The TPE algorithm chooses y^* to be some quantile γ of the observed y values, so that p(y < y^*) = γ, but no specific model for p(y) is necessary. By maintaining sorted lists of observed variables in ℋ, the runtime of each iteration of the TPE algorithm can scale linearly in |ℋ| and linearly in the number of variables (dimensions) being optimized.

As briefly stated previously, an acquisition function is used to determine the next point to evaluate. Jasper Snoek et al. [11] state that an acquisition function determines what point in 𝒳 should be evaluated next via a proxy optimization x_next = argmax_x a(x), where several different functions have been proposed. Furthermore, three of the most popular acquisition functions are described. Their dependence on previous observations, as well as the GP hyperparameters, is denoted as a(x; {x_n, y_n}, θ). Under the Gaussian process prior, these functions depend on the model solely through its predictive mean function μ(x; {x_n, y_n}, θ) and predictive variance function σ²(x; {x_n, y_n}, θ). The best current value is denoted as

\[ x_{best} = \arg\min_{x_n} f(x_n) \tag{4} \]

and the cumulative distribution function of the standard normal as Φ(·).


The Probability of Improvement strategy maximizes the probability of improving over the best current value. If the GP approach is used, it can be analytically computed as

\[ a_{PI}(x; \{x_n, y_n\}, \theta) = \Phi(\gamma(x)), \qquad \gamma(x) = \frac{f(x_{best}) - \mu(x; \{x_n, y_n\}, \theta)}{\sigma(x; \{x_n, y_n\}, \theta)} \tag{5} \]

Expected Improvement (EI) maximizes the expected improvement over the current best. Under a GP this is computed as

\[ a_{EI}(x; \{x_n, y_n\}, \theta) = \sigma(x; \{x_n, y_n\}, \theta)\left( \gamma(x)\,\Phi(\gamma(x)) + \mathcal{N}(\gamma(x); 0, 1) \right) \tag{6} \]

A newer strategy called GP Upper Confidence Bound utilizes the lower confidence bounds (upper, when considering maximization) to construct acquisition functions that minimize regret over the course of their optimization. It can be computed as

\[ a_{LCB}(x; \{x_n, y_n\}, \theta) = \mu(x; \{x_n, y_n\}, \theta) - \kappa\,\sigma(x; \{x_n, y_n\}, \theta) \tag{7} \]

with a tunable κ to balance exploitation against exploration.
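The sketch below implements equations (5) to (7) for a minimization problem, assuming the predictive mean and standard deviation have already been obtained from a GP posterior; it uses scipy.stats.norm for Φ and the standard normal density, and the example arrays are invented.

```python
import numpy as np
from scipy.stats import norm

# Sketch of the acquisition functions in equations (5)-(7) for a minimization
# problem, assuming the predictive mean mu and standard deviation sigma have
# already been computed from a Gaussian process posterior.

def gamma(mu, sigma, f_best):
    return (f_best - mu) / sigma

def probability_of_improvement(mu, sigma, f_best):
    return norm.cdf(gamma(mu, sigma, f_best))                      # eq. (5)

def expected_improvement(mu, sigma, f_best):
    g = gamma(mu, sigma, f_best)
    return sigma * (g * norm.cdf(g) + norm.pdf(g))                 # eq. (6)

def lower_confidence_bound(mu, sigma, kappa=1.0):
    return mu - kappa * sigma                                      # eq. (7)

# Example: choose the next candidate (by index) via expected improvement.
mu = np.array([0.80, 0.70, 0.75])
sigma = np.array([0.05, 0.10, 0.20])
next_index = int(np.argmax(expected_improvement(mu, sigma, f_best=0.70)))
print(next_index)
```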

2.3.4 Tuning Challenges

Marc Claesen and Bart De Moor [4] describe challenging aspects of the process of tuning and optimizing hyperparameters.

Firstly, the evaluation aspect is assessed. For each combination of hyperparameters λ which is to be tested, an evaluation of the performance of a trained model is required. Depending on the complexity of the learning algorithm A and the size of the problem, i.e. the dataset, this might take a very long time. Times ranging up to weeks are not unheard of.

In addition, the aspect of randomness in an evaluation is mentioned. The evaluation often exhibits a stochastic component, which for example could be due to inherent randomness of A used to solve the given problem. This might affect the evaluation in such a way that the set of hyperparameters λ* deemed best by the process is not truly the best set. In many cases this is countered by methods designed to produce many sets of parameters which are close to the best set. This can help with determining whether the produced best set λ* is an outlier or not post evaluation.


Furthermore, the evaluation can be very costly if there are many hyperparameters. Usually the number of parameters is small, i.e. not more than five. If there are many hyperparameters, there is usually only a handful that have a significant impact on performance. Selecting the relevant ones in advance is a hard task though, as there might even be hyperparameters that exist conditionally upon the value of others.

2.4 Tools and frameworks

To develop the AI Verification Application, the following tools and frameworks were used. As stated below in each sub-chapter, some were recommended to us and others supplied to us by the employer, and were therefore used.

2.4.1 gRPC

gRPC, developed by Google, is used for Remote Procedure Calls [13] between clients and the server on the Xilinx ZCU102. gRPC is an open source RPC framework and is capable of high performance running in any environment. It supports 10 different programming languages, making it applicable for a wide variety of devices.

Figure 1. Server-client communication through gRPC with multiple client libraries. [14]


2.4.2 Kivy

The Kivy framework is an open source Python library for developing Graphical User Interfaces (GUIs). Kivy enables cross-platform GUIs to be written in Python code, effectively making the same code runnable on Linux, Windows, OS X, Android, iOS, and Raspberry Pi [16]. Apart from the open source aspect, the library is developed and maintained by developers using Kivy in their professional products [17]. As Kivy supports multi-touch and has a wide variety of highly customizable widgets, it enables the writing of modular cross-platform GUIs. Apart from Python, Kivy uses its own language called the KV language [18]. This can be represented in either Python strings or .kv files, making it possible to separate the definition and implementation of widgets, giving the code a clearer structure by further separating logic and UI.
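As a rough illustration of the framework, the minimal pure-Python Kivy sketch below builds a small GUI with a FileChooser and a status label; it is an example of the kind of widgets mentioned here, not the application's actual GUI or the appendix configuration.

```python
from kivy.app import App
from kivy.uix.boxlayout import BoxLayout
from kivy.uix.filechooser import FileChooserListView
from kivy.uix.label import Label

# Minimal Kivy sketch: a file picker with a status label above it.
# Layout and widget choices are illustrative only.

class PickerLayout(BoxLayout):
    def __init__(self, **kwargs):
        super().__init__(orientation="vertical", **kwargs)
        self.status = Label(text="Select model files", size_hint_y=0.1)
        chooser = FileChooserListView(dirselect=True)
        chooser.bind(selection=self.on_selection)   # update the label on selection
        self.add_widget(self.status)
        self.add_widget(chooser)

    def on_selection(self, chooser, selection):
        self.status.text = f"Selected: {selection}"

class DemoApp(App):
    def build(self):
        return PickerLayout()

if __name__ == "__main__":
    DemoApp().run()
```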

2.4.3 Xilinx ZCU102

Xilinx ZCU102 is a general-purpose evaluation board for rapid prototyping based on the Zynq® UltraScale+™ XCZU9EG-2FFVB1156E MPSoC (multiprocessor system-on-chip) [19]. It is equipped with a Deep Learning Processor Unit (DPU), a configurable computation engine dedicated to convolutional neural networks, supporting most common networks such as VGG, ResNet, GoogLeNet, YOLO, SSD, MobileNet and FPN [20].

2.4.4 Xilinx Vitis AI Development Kit

Xilinx Vitis AI Development Kit is a deep learning Software Development Kit (SDK) for a Deep Learning Processing Unit (DPU). It provides solutions for deep learning inference application development through optimized tool chains and lightweight Python and C/C++ APIs, and is compatible with Tensorflow Keras models. [21]

2.4.5 Keras-tuner

Keras-tuner is a hyperparameter tuning framework that simplifies the implementation of hyperparameter tuning algorithms on Keras models. The framework provides easy solutions for defining a search space. Both Random search and Bayesian optimization with a Gaussian process are supported by Keras-tuner [22].
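A hedged sketch of how a search space can be defined with Keras-tuner is shown below. The stand-in model, directory and project names are invented for the example, and the imports reflect the keras-tuner API around version 1.0 (newer releases import from keras_tuner); they are not taken from the thesis code.

```python
import tensorflow as tf
from kerastuner.tuners import RandomSearch   # newer releases: from keras_tuner import RandomSearch

# Hedged sketch: define a search space via a build function and let a tuner
# search it. The tiny model below is a stand-in, not the thesis model.

def build_model(hp):
    model = tf.keras.Sequential([
        tf.keras.layers.Flatten(input_shape=(32, 32, 3)),
        tf.keras.layers.Dense(
            units=hp.Int("units", min_value=32, max_value=512, step=32),
            activation=hp.Choice("activation", ["relu", "tanh", "sigmoid"]),
        ),
        tf.keras.layers.Dense(10, activation="softmax"),
    ])
    model.compile(
        optimizer=tf.keras.optimizers.Adam(
            hp.Float("learning_rate", 1e-4, 1e-2, sampling="log")),
        loss="sparse_categorical_crossentropy",
        metrics=["accuracy"],
    )
    return model

tuner = RandomSearch(build_model, objective="val_accuracy",
                     max_trials=50, directory="tuning", project_name="demo")
# tuner.search(x_train, y_train, epochs=40, validation_data=(x_val, y_val))
```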

2.4.6 Hyperopt and Hyperas


2.5 Previous Work

In a study performed by Zhao Yang and Hongbing Qian [24], four different binary classification models in an ANN were trained using an automated parameter tuning technique called Caret. Two of the models were trained using default parameters and two of them were trained with parameters decided by Caret. The parameters tuned were Coupling Between Methods (CBM), Average Method Complexity (AMC) and Afferent Couplings (CA) in the first of the tuned models, and CBM, AMC and Response For a Class (RFC) in the second. The parameters were selected using the Caret R data analysis tool and by studying the resulting correlation graph. The first tuned model had three hidden units and a decay value of 0.1 in the neural network, the second had nine hidden units and a decay value of 0.01, while the two models with default settings had one hidden unit and a decay value of 0. The models' confusion matrices were analyzed, and it was determined that both the tuned models had higher accuracy than their default valued counterparts.

James Bergstra and Yoshua Bengio presented a study [9] on how Random search compares to Grid search. This was done by tuning and evaluating models with Random search on the same datasets Larochelle et al. [25] used in their study on Grid search coupled with Manual search. Three different hyperparameters were tuned for all datasets and the accuracy and efficiency were compared to the results in the mentioned study on Grid search. Bergstra and Bengio concluded that Random search is a more efficient method since not all hyperparameters are equally important to tune. This means that Grid search allocates too many trials in order to tune hyperparameters of very low to non-existent importance. In most cases the Random search method found better models and required less computational time compared to Larochelle et al. Furthermore, they pointed out that random experiments are easier to carry out practically. This is due to the ability to stop trials at any time, conduct trials on different machines, conduct trials asynchronously and to recover from failed trials without ruining the experiment.



3 Methodology

This chapter presents and explains how the study and the application development were conducted. The candidate algorithm selection, implementation and evaluation are covered in section 3.1. The application's development and its validation are covered in section 3.2. Figure 2 displays the workflow in chronological order.

Figure 2. Visualization of the chronological workflow.

3.1 Neural Network Tuning Algorithms

This section covers the selection of candidate algorithms as well as the implementation and evaluation of the selected candidates. The resulting candidate algorithms are presented in section 4.1 and the reasoning behind the selections is presented in section 5.1.

3.1.1 Candidate Algorithm Selection

The investigation of neural network tuning algorithms for model hyperparameters was done by analyzing previous work in the area. The goal was to find at least two candidate algorithms to implement and evaluate. The selection itself was done by comparing three key points: execution time, lowest model loss value and highest model evaluation accuracy from different algorithms in previous work.

3.1.2 Neural Network Model


Table 1. The layers and corresponding default parameters of the model.

| Layer # | Type of layer | Hyperparameters |
|---|---|---|
| 1 | 2D Convolutional (Input layer) | filters=16, kernel_size=3, activation="relu", input_shape=(32,32,3) |
| 2 | 2D Convolutional | filters=16, kernel_size=3, activation="relu" |
| 3 | 2D Max Pooling | pool_size=2 |
| 4 | Dropout | rate=0.25 |
| 5 | 2D Convolutional | filters=32, kernel_size=3, activation="relu" |
| 6 | 2D Convolutional | filters=32, kernel_size=3, activation="relu" |
| 7 | 2D Max Pooling | pool_size=2 |
| 8 | Dropout | rate=0.25 |
| 9 | Flatten | - |
| 10 | Dense | units=128, activation="relu" |
| 11 | Dropout | rate=0.25 |
| 12 | Dense (Output layer) | num_classes=10, activation="softmax" |

3.1.3 Dataset


3.1.4 Algorithm Implementation

The implementation of the candidate algorithms was done using Keras-tuner (described in section 2.4.5) and Hyperopt along with the wrapper Hyperas (described in section 2.4.6). The implementation was done in two trials, Trial A and Trial B, tuning two different sets of seven hyperparameters of the model. For both trials, the learning rate of the Adam optimizer was tuned with a maximum value set at 1e-2 and a minimum value set at 1e-4. Table 2 displays the search space for Trial A and Table 3 displays the search space for Trial B. During each trial, the algorithms were allowed to run for 50 and 200 iterations each. Every iteration included 40 epochs, which means the model was trained on the dataset 40 times. In addition, the default model was trained on the dataset for 40 epochs as a reference point.

Table 2. Search space for Trial A.

| Layer # | Type of layer | Hyperparameter | Value range |
|---|---|---|---|
| 4 | Dropout | rate | Min=0.00, Max=0.5, steps of 0.05 |
| 6 | 2D Convolutional | filters | Choice between 32 and 64 |
| 8 | Dropout | rate | Min=0.00, Max=0.5, steps of 0.05 |
| 10 | Dense | units | Min=32, Max=512, steps of 32 |
| 10 | Dense | activation | Choice between relu, tanh and sigmoid |
| 11 | Dropout | rate | Min=0.00, Max=0.5, steps of 0.05 |
| - | - | optimizer learning rate | Logarithmic sampling between 0.01 and 0.0001 |

Table 3. Search space for Trial B.

| Layer # | Type of layer | Hyperparameter | Value range |
|---|---|---|---|
| 1 | 2D Convolutional (Input layer) | activation | Choice between relu, tanh and sigmoid |
| 2 | 2D Convolutional | filters | Choice between 16 and 32 |
| 4 | Dropout | rate | Min=0.00, Max=0.5, steps of 0.05 |
| 5 | 2D Convolutional | filters | Choice between 32 and 64 |
| 6 | 2D Convolutional | filters | Choice between 32, 64 and 96 |
| 10 | Dense | units | Min=32, Max=512, steps of 32 |
| - | - | optimizer learning rate | Logarithmic sampling between 0.01 and 0.0001 |
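For illustration, a Hyperopt/TPE search space loosely mirroring Table 3 could be written as in the sketch below; the dictionary keys and the objective function are placeholders rather than the actual training code used in the trials.

```python
import math
from hyperopt import Trials, fmin, hp, tpe

# Hedged sketch of a Hyperopt search space loosely mirroring Table 3.
# The key names and the trivial objective are illustrative only.

space = {
    "input_activation": hp.choice("input_activation", ["relu", "tanh", "sigmoid"]),
    "conv2_filters": hp.choice("conv2_filters", [16, 32]),
    "dropout4_rate": hp.quniform("dropout4_rate", 0.0, 0.5, 0.05),
    "dense_units": hp.quniform("dense_units", 32, 512, 32),
    "learning_rate": hp.loguniform("learning_rate", math.log(1e-4), math.log(1e-2)),
}

def objective(params):
    # Placeholder: a real objective would build the model, train it for
    # 40 epochs and return the validation loss.
    return params["learning_rate"]

trials = Trials()
best = fmin(fn=objective, space=space, algo=tpe.suggest,
            max_evals=50, trials=trials)
print(best)
```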

3.1.5 Algorithm Evaluation

The best models from each algorithm, in each trial, were evaluated on execution time, lowest model loss value and highest model evaluation accuracy. The best model from each algorithm was the one with the highest evaluation accuracy. The results of Trial A, Trial B and the default reference were compared. The confusion matrices of the default reference and the best model of each trial were analyzed. In addition, the hyperparameter deviance between the default model and the best model from both trials was analyzed.

3.1.6 Hyperparameters


3.2 Application Development

This section covers the development process of the AI Verification Application and the different tools and frameworks used during development. Furthermore, it covers how the validation of the application was conducted.

3.2.1 Evaluation Board Configuration

The Xilinx ZCU102 evaluation board (described in 2.4.3) was set up with Vitis AI and the Xilinx PetaLinux 2019.2 Target Reference Design (TRD). The board was connected to a host computer through an ethernet connection and set up to masquerade the host's IP, enabling it to utilize tools such as Git and apt-get for a smoother development process.

The Xilinx ZCU102 was used to host a gRPC (described in 2.4.1) server for the AI Verification Application, through which the models were executed and evaluated. The evaluation board and gRPC were used upon request from the employer. The ZCU series is used in other products developed by said employer, and gRPC is used for similar RPCs in those products.

3.2.2 Development

The development process incorporated an agile procedure with demonstrations on a weekly basis. This way the employer could provide feedback throughout the development process, making the application more customized to their needs. With the evaluation board set up to masquerade the host's IP, the code for the application was written on desktops in graphical IDEs such as Visual Studio Code, and remotely pulled to the evaluation board through a Git repository.

The development aimed at ensuring that each of the components was loosely coupled and could be replaced, and that additional components could be added easily, if deemed necessary. The Python package abc, containing Abstract Base Classes [29], was used to write interfaces to enforce this.
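A minimal example of this pattern with abc is sketched below; the interface name and method are hypothetical, not the application's actual classes.

```python
from abc import ABC, abstractmethod

# Illustrative use of abc to define a replaceable component interface.
# The names are hypothetical stand-ins, not the application's code.

class Transmitter(ABC):
    """Anything able to send model files to the evaluation board."""

    @abstractmethod
    def send_files(self, paths):
        """Transmit the given files and return a status message."""

class GrpcTransmitter(Transmitter):
    def send_files(self, paths):
        # A real implementation would stream the files over gRPC.
        return f"sent {len(paths)} file(s)"

print(GrpcTransmitter().send_files(["model.elf"]))
```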

To interact with the application, a Graphical User Interface (GUI) was designed with the Kivy framework, described in section 2.4.2. A FileChooser [30] was used to let the user select files or folders. Appendix B shows the basic configuration of the FileChooser. A RecyclerView is used for displaying messages, with a list of strings where each item in the list is its own message. Appendix C shows the basic configuration of the RecyclerView.


3.2.3 Application Validation


4 Results

This chapter presents the results of the conducted study and the application development. Section 4.1 presents the results of the candidate algorithm selection. Section 4.2 presents the results of the tuning trials as well as the results of the default reference. Section 4.3 presents the developed application.

4.1 Candidate Algorithms

The algorithm selection process (described in section 3.1) resulted in three candidate algorithms: i) Random search (described in section 2.3.3.2), ii) Bayesian optimization with a Gaussian process (Bay. opt. GP), and iii) Bayesian optimization with TPE (Bay. opt. TPE). Both variations of Bayesian optimization were described in section 2.3.3.3. For the implementation of two of the three algorithms, Random search and Bayesian optimization with a Gaussian process, Keras-tuner was used. For the third algorithm, Bayesian optimization with TPE, Hyperopt was used. To use Hyperopt with Keras models, the wrapper Hyperas was used.

4.2 Tuning Trial Results

4.2.1 Trial Model Performances

The default reference trained during 40 epochs resulted in a model with 69.5% accuracy and 0.868 loss, after an execution time of 10.5 minutes. The performance of the best models, produced by the different tuning algorithms during Trial A and Trial B, is displayed in Table 4.


Table 4. Results of the tuning trials.

| Trial | Algorithm | Iterations | Execution time | Accuracy | Loss |
|---|---|---|---|---|---|
| A | Random search | 200 | ≈33.4 h | ≈78.3% | ≈0.65 |
| A | Random search | 50 | ≈7.5 h | ≈77.5% | ≈0.707 |
| A | Bay. opt. GP | 200 | ≈34.37 h | ≈76.7% | ≈0.72 |
| A | Bay. opt. GP | 50 | ≈7.8 h | ≈74.5% | ≈0.75 |
| A | Bay. opt. TPE | 200 | ≈34.73 h | ≈78.3% | ≈0.67 |
| A | Bay. opt. TPE | 50 | ≈8.1 h | ≈75.7% | ≈0.69 |
| B | Random search | 50 | ≈10 h | ≈78.1% | ≈0.75 |
| B | Bay. opt. GP | 50 | ≈10.3 h | ≈78.3% | ≈0.66 |
| B | Bay. opt. TPE | 50 | ≈10.87 h | ≈79% | ≈0.76 |

Table 5. Results of the tuning trials, discarding the runs made with 200 iterations in Trial A.

| Trial | Algorithm | Iterations | Execution time | Accuracy | Loss |
|---|---|---|---|---|---|
| A | Random search | 50 | ≈7.5 h | ≈77.5% | ≈0.707 |
| A | Bay. opt. GP | 50 | ≈7.8 h | ≈74.5% | ≈0.75 |
| A | Bay. opt. TPE | 50 | ≈8.1 h | ≈75.7% | ≈0.69 |
| B | Random search | 50 | ≈10 h | ≈78.1% | ≈0.75 |
| B | Bay. opt. GP | 50 | ≈10.3 h | ≈78.3% | ≈0.66 |
| B | Bay. opt. TPE | 50 | ≈10.87 h | ≈79% | ≈0.76 |


Comparison between the best, i.e. most accurate, model confusion matrix, displayed in Figure 3, and the default model confusion matrix, displayed in Figure 4, shows a significant improvement in classification across all ten categories post tuning. A confusion matrix displays the distribution of predicted labels compared to the true labels. This visualizes the accuracy on a per-category basis instead of the overall accuracy. In addition, it shows wrong predictions, which can reveal categories the model has a hard time separating.
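For reference, a confusion matrix of this kind can be computed as in the short sketch below; the labels are made up, and scikit-learn is used purely for illustration, not because it was the tool used in the thesis.

```python
import numpy as np
from sklearn.metrics import confusion_matrix

# Small example of a confusion matrix: rows are true labels, columns are
# predicted labels. The toy labels below are invented for the example.

y_true = [0, 0, 1, 1, 2, 2, 2]
y_pred = [0, 1, 1, 1, 2, 0, 2]
cm = confusion_matrix(y_true, y_pred)
print(cm)

# Row-normalised version (per-category accuracy on the diagonal),
# comparable to the decimal numbers shown in the figures.
print(cm / cm.sum(axis=1, keepdims=True))
```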


Figure 4. Default model confusion matrix. The sidebar represents the number of images and the decimal numbers display percentages (0.0 - 1.0).

4.2.2 Hyperparameters

The two tuning trials both resulted in models with better performance, i.e. accuracy and loss, than the default model. The hyperparameters of the best model from Trial A are displayed in Table 6 and the hyperparameters of the best model from Trial B are displayed in Table 7.


Table 6. Hyperparameter values of the best model from Trial A and their deviance from the default settings.

| Layer # | Type of layer | Hyperparameter | Value | Default value | Deviance |
|---|---|---|---|---|---|
| 4 | Dropout | rate | 0.35 | 0.25 | 0.1 |
| 6 | 2D Convolutional | filters | 64 | 32 | 32 |
| 8 | Dropout | rate | 0.35 | 0.25 | 0.1 |
| 10 | Dense | units | 352 | 128 | 224 |
| 10 | Dense | activation | tanh | relu | N/A |
| 11 | Dropout | rate | 0.35 | 0.25 | 0.1 |
| - | - | optimizer learning rate | ≈0.00071 | 0.002 | ≈0.00129 |

Table 7. Hyperparameter values of the best model from Trial B and their deviance from the default settings.

| Layer # | Type of layer | Hyperparameter | Value | Default value | Deviance |
|---|---|---|---|---|---|
| 1 | 2D Convolutional (Input layer) | activation | tanh | relu | N/A |


Table 8. Hyperparameters of a high performing (78.5% accuracy) model from Trial B and its deviance from the best performing model (79% accuracy).

| Layer # | Type of layer | Hyperparameter | Value | Deviance from best model |
|---|---|---|---|---|
| 1 | 2D Convolutional (Input layer) | activation | relu | N/A |
| 2 | 2D Convolutional | filters | 32 | 0 |
| 4 | Dropout | rate | 0.5 | 0.15 |
| 5 | 2D Convolutional | filters | 64 | 0 |
| 6 | 2D Convolutional | filters | 32 | 64 |
| 10 | Dense | units | 512 | 224 |
| - | - | optimizer learning rate | ≈0.00047 | ≈0.00006 |

4.3 Application Prototype

This section describes the architecture of the developed AI Verification Application and presents the resulting prototype.

4.3.1 General Architecture


As the application consists of several different modules, the application was developed in a way that enables any part of it, such as the server, client, method of communication or GUI, to be easily replaced if the requirements were to change. This was enforced on the client side by having the components separated by either a Controller and/or an Interface.

On the server side the different components are divided into scripts. The gRPC classes and methods follow the Model-View-Controller (MVC) pattern, while the model execution and evaluation is its own script. The gRPC method simply executes another Python script, making the script easy to replace if other approaches to execution or evaluation are necessary.

The GUI, Controller and gRPC method calls are all run on separate threads, making sure the GUI remains responsive and continuously provides the user with feedback during transmissions, executions, and evaluations. With the GUI being on a separate thread, any actions, such as additional file transmissions initiated by the user while gRPC is already busy, will be queued up and initiated when gRPC becomes available.
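The queuing behaviour described above can be illustrated with the following sketch, where a worker thread consumes user actions from a queue one at a time so the GUI thread never blocks; the function and action names are illustrative, not the application's actual API.

```python
import queue
import threading
import time

# Sketch of the queuing idea: the GUI thread puts actions on a queue,
# a worker thread executes them one at a time. Names are illustrative.

jobs = queue.Queue()

def grpc_worker():
    while True:
        action = jobs.get()            # blocks until the next queued action
        if action is None:
            break                      # sentinel: shut the worker down
        print(f"executing {action}")
        time.sleep(0.1)                # stand-in for a gRPC call
        jobs.task_done()

worker = threading.Thread(target=grpc_worker, daemon=True)
worker.start()

# A GUI button handler would effectively do this:
for action in ("send files", "run model", "fetch results"):
    jobs.put(action)

jobs.join()                            # wait until all queued actions finished
jobs.put(None)
```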


Figure 5. UML diagram of the application's architecture.

4.3.2 Client


4.3.3 Server

The server consists of two parts: the first part being the gRPC classes, methods, and utility functions, and the second being the script for executing and evaluating the model. The gRPC methods provide the server with the higher-level functionality to receive the streams of binary files. The utility functions provide the lower-level functionality of saving the files in a predefined file hierarchy, enabling the model execution script to locate and use the necessary files. Any exception raised during the file saving process is considered a potential risk for the model execution and evaluation, as missing files could lead to an inaccurate or misleading evaluation. For this reason, any exception caught will lead to interruption of the file transmission, the erasure of any files already received, and a response being sent to the client with an error message. If no exception is raised during the transmission, a response is sent when all files are received, telling the user that the server is ready for execution. When the script for executing and evaluating the model is started, the server echoes useful output from said script back to the client.
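A hedged sketch of this error-handling pattern is shown below. The method, directory and response format are hypothetical stand-ins; a real servicer would inherit the class generated from the .proto file and return generated response messages instead of dictionaries.

```python
import os
import shutil

# Hedged sketch of the "wipe on failure" pattern for a client-streaming
# upload handler. All names and paths here are illustrative placeholders.

UPLOAD_DIR = "/tmp/ai_verification/uploads"

class VerifierServicer:
    def SendFiles(self, request_iterator, context=None):
        os.makedirs(UPLOAD_DIR, exist_ok=True)
        try:
            for chunk in request_iterator:          # stream of (filename, bytes)
                filename, data = chunk
                target = os.path.join(UPLOAD_DIR, os.path.basename(filename))
                with open(target, "ab") as f:
                    f.write(data)
        except Exception as error:
            # A partial upload could make the evaluation misleading, so remove
            # everything already received and report the failure to the client.
            shutil.rmtree(UPLOAD_DIR, ignore_errors=True)
            return {"ok": False, "message": str(error)}
        return {"ok": True, "message": "all files received, ready for execution"}

# Example use with an in-memory "stream":
reply = VerifierServicer().SendFiles(iter([("model.elf", b"\x00\x01")]))
print(reply)
```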

4.3.4 Client-server Communication

Communication between the client, host and the hardware server was performed through gRPC, which is described in section 2.4.1. This section goes through how gRPC was structured and how the requests and responses were defined.

4.3.4.1 gRPC Security

With the employer intending to use the application on a private local area network, the application itself provides little to no security for the gRPC communication and relies solely on the security provided by other layers in the network and on the host computer. In addition, as the evaluation board is masked behind the host's IP address, the gRPC server is only accessible from the host computer, and thus no one other than the intended client on the host computer can access the server. With several layers of security provided elsewhere, the gRPC client-server connection uses an insecure channel for the communication.

4.3.4.2 Request and Response Messages


5 Analysis and Discussion

This chapter analyzes and evaluates the results presented in chapter 4, in relation to the goals set in section 1.2 and the methodologies used throughout the thesis. Furthermore, social, economic, environmental, and ethical aspects of the thesis are discussed.

5.1 Tuning Algorithms Selection

This section presents and motivates which algorithms were selected and which were deselected as the study continued. The subsections below go through each of the methods researched in section 2.3 and motivate why some were implemented and others left out.

5.1.1 Manual and Grid Search

As the application aimed at an automated process of tuning hyperparameters, any manual method of doing so was discarded. Grid search is often used when tuning manually, as the user first manually tries a number of hyperparameter combinations and thereafter specifies a hyperparameter search space to iterate over. Since Grid search can be automated to iterate over every hyperparameter, it was a viable option to consider. Still, as stated by James Bergstra and Yoshua Bengio [9], Grid search proved inefficient as the number of hyperparameters and their possible values increased, in comparison to both Random search and Bayesian optimization. Granted, Grid search is a valid choice with a limited number of iterations, but its use cases were too few to include the algorithm in the investigation.

5.1.2 Random Search

Random search proved to be a valid contestant, as the efficiency of the algorithm comes down to the number of iterations done, regardless of the number of hyperparameters to iterate over. As more possible combinations of hyperparameters were added, more iterations were required for Random search to provide a reliable optimization. Whereas Grid search finds the best possible optimization, Random search often provides a good enough alternative given enough iterations, while being significantly less time consuming. While this makes it a viable method for tuning, the number of iterations required for a reliable result still makes it a time-consuming process. Still, if implemented correctly, Random search will outperform Grid search in terms of efficiency for most, if not all, applicable use cases.

5.1.3 Bayesian Optimization

Bayesian optimization relies on additional computations to make calculated guesses on possible good hyperparameter combinations to try. These combinations are selected from a defined hyperparameter search space. This can make Bayesian optimization time saving while still providing better results than Random search. Bayesian optimization can use either a Gaussian process or a TPE for the computations of the calculated guesses.

5.1.4 Selected Candidates Algorithms

Bayesian optimization with a Gaussian process, Bayesian optimization with TPE and Random search ended up being the selected candidate algorithms. Consequently, Manual search and Grid search were deselected. The reason for this was that a manual process was not desired, and that previous work (presented in section 2.5) had proven Grid search to be inefficient in comparison to the selected candidates.

5.2 Trial Results Evaluation

5.2.1 Basis

As described in section 3.1.4, the initial thought was to let each trial run the three candidate algorithms for 200 and 50 iterations each. However, after running the entirety of Trial A, we discovered that the difference in performance between the best model produced after 200 iterations and the best model produced after 50 iterations was minimal (≈0.8% accuracy and ≈0.05 loss). The amount of time it took to run 200 iterations, around 33 to 35 hours depending on the algorithm, was deemed too long for such a small improvement. Therefore, Trial B only ran each algorithm for 50 iterations. Since only Trial A included runs with 200 iterations, these runs are discarded in order to perform an impartial analysis, and the evaluation is based on the results shown in Table 5.

5.2.2 Trial A

Surprisingly, the most accurate model in Trial A, with an accuracy of 77.5%, was produced using Random search. Since both Bay. opt. with GP and Bay. opt. with TPE utilize acquisition functions in order to produce a candidate set of hyperparameters from the previous iterations, the initial thought was that one of these algorithms would produce the most accurate model in both trials.

One possible reason for the result in Trial A might simply be pure luck. Since Random search randomly tries combinations of hyperparameters from the search space, it is possible that the surprisingly good result might have been a fluke.


The goal of the Bayesian optimization techniques is to minimize the loss function, in this case cross-entropy loss. Furthermore, section 2.3.2 explains that the model with the lowest loss does not have to be the model with the highest accuracy, even though there is certainly a strong relationship between high accuracy and low loss. This could mean that these techniques might be inferior when applied to models using cross-entropy loss.

5.2.3 Trial B

The results of Trial B were more in line with the expected result. Bay. opt. with TPE produced the most accurate model with 79% accuracy, which was also the most accurate model across both trials, followed by Bay. opt. with GP in second place and Random search in last place. However, the difference in accuracy between the model produced by Bay. opt. with TPE and the model produced by Random search was a mere 0.9%.

5.2.4 Algorithm Performance

Firstly, Random search produced the most accurate model in Trial A and the least accurate model in Trial B. For this reason, it might seem like Random search is an unpredictable algorithm, which would not be surprising given its random nature. However, when ignoring placement and solely examining pure accuracy, Random search produced a model with an accuracy very close to the best model from Trial B. Therefore, Random search still seems like a viable algorithm for hyperparameter tuning.

Secondly, Bay. opt. with GP produced the least accurate model in Trial A and the second most accurate model in Trial B. The model produced in Trial A was the least accurate of all the models across both trials, while the model produced in Trial B was the second most accurate.

Lastly, Bay. opt. with TPE produced the second most accurate model in Trial A and the most accurate model in Trial B. The model produced in Trial B was the most accurate across both trials. To summarize, the results from the two trials are somewhat in conflict with each other, and not clear enough to draw any definite conclusions regarding which algorithm is best.

5.3 Hyperparameters

5.3.1 Hyperparameter Evaluation


It was observed that in all these models, a majority of the other hyperparameters had changed from the default settings. These changes created different combinations of hyperparameters, and they seemed to be dependent on each other; i.e. if hyperparameter A had a value X, then a good result was generated if hyperparameter B had value Y.

To summarize, the only hyperparameter definitely deemed important was the optimizer learning rate. It was however concluded that the combinations of the other hyperparameters had some level of importance. The list of important hyperparameters thereby ended up with one entry: optimizer learning rate.

5.4 Application Prototype

This section goes through some of the debatable aspects of the application. As the expected behavior of some components in the application can vary from user to user, this section discusses the thought process when developing them and how the users are expected to utilize them.

5.4.1 Feedback

The amount and type of messages presented to the user is something debatable for this type of application. The model execution and evaluation can itself be regulated to output more specific or more abstract feedback than is currently done. This comes down to what the user expects the application to present, and is one of the important reasons for showcasing a demo of the application and getting feedback from the employer. Still, with the application functioning as expected, the intended users of this application have more than enough knowledge and experience to tune the feedback given from model execution and evaluation to their own satisfaction. Even if the user deems some of the feedback unnecessary or finds it missing, adding or changing a few rows of code in the execution script should be a minor adjustment to an overall well-functioning application.

5.4.2 GUI


5.5 Choice of Programming Languages

This section motivates the programming languages chosen for the hyperparameter tuning as well as the application. Along with the motivation of the choices made, the section goes through some of the other options available and why they were ruled out.

5.5.1 Language for Hyperparameter Tuning

For machine learning, AI in general and hyperparameter tuning, Python was the given language to use. As Python contains a significant number of libraries and APIs provided by Tensorflow, Keras and Hyperopt, to name a few, along with its extensive machine learning community, Python stood out as the language for machine learning and hyperparameter tuning. Guides, tutorials, and documentation in Python, by either the authors of the libraries and APIs or developers with sufficient knowledge and experience in the area, proved more than sufficient for writing the hyperparameter tuning code.

5.5.2 Languages for the Application

As stated in 3.2.2, the only programming language the application was written in was Python. With the architecture of the application, it would be possible to write the client, gRPC communication and server in different languages if necessary. Still, while there were other options than Python available, the application was developed in Python for several reasons.

5.5.2.1 Server Side

For the server, both Python and C++ were valid options. Since Xilinx provided tutorials and guides in Python as well as C++, both languages were considered for model execution and therefore potentially the whole server. During development, gRPC was written in Python and different versions of model execution were written and experimented with in C++ and Python respectively. As the development progressed, C++ was ruled out in favor of Python. No real benefits were gained from using C++, while switching to Python provided consistency throughout the whole application and enabled faster and smoother coding, as the developers were more experienced with Python than C++.

5.5.2.2 Client Side

With the gRPC communication already written in said language, the rest of the client, along with the GUI, was written in the same language.

5.6 Economic, Societal, Ethical and Environmental Aspects


6 Conclusion

Firstly, this thesis presented an investigation of which hyperparameters are important to tune in Convolutional Neural Networks for Single Shot Detection. The investigation was conducted by selecting candidate hyperparameter tuning algorithms through investigating the results of previous work. The selected candidate algorithms were then implemented and used to tune the hyperparameters of a default model. Subsequently, the hyperparameters of high-accuracy models produced in the implementation were analyzed, in order to find correlations between high accuracy and certain hyperparameters. In addition, the implemented tuning algorithms were analyzed based on the performance of their produced best model. Secondly, this thesis covers the development and validation of an application for remote validation of neural networks on embedded systems.

The main goal of the thesis was reached since the hyperparameter optimizer learning rate was deemed important to tune and an application for remote validation was developed and had its functionality validated. The analysis of tuning algorithms did not reach a sufficient conclusion, probably because of the small sample size.

6.1 Future Work

This section goes through additional work and aspects that could be included to further improve the results of this thesis.

6.1.1 Tuning Algorithms and Hyperparameters

As mentioned in 5.2.4, the tuning algorithm evaluation could not generate a definite conclusion. This could be because the sample size was too small. In order to reach a conclusion, more trials could be conducted to obtain a larger sample size. A larger sample size could help determine whether the results from the trials conducted during this thesis were anomalies. Moreover, a larger sample size might make it possible to detect additional correlations between certain hyperparameters and high accuracy.

6.1.2 Application Prototype


6.1.2.1 Securing gRPC Connections

As previously mentioned in 4.3.4.1, the application is designed for internal use between a single host and the evaluation board on a private LAN. If this were to change, and the evaluation board were made accessible to other computers in the network, e.g. if given its own public IP address, the lack of security for the server could become an issue. As gRPC provides secure alternatives to the insecure channel used in this application, replacing the channel would be an efficient way of providing more security and effectively restricting the accessibility of the server.
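A possible way to harden the connection is sketched below, assuming a server certificate were made available; the address and certificate path are placeholders, not values from the application.

```python
import os
import grpc

# Hedged sketch of swapping the insecure channel used today for a TLS-secured
# one, should the board ever be reachable from other machines.
# The target address and certificate path are placeholders.

TARGET = "192.168.1.2:50051"

# Current setup (board masqueraded behind the host on a private LAN):
channel = grpc.insecure_channel(TARGET)

# Possible hardened setup: authenticate the server with TLS instead.
if os.path.exists("ca.pem"):
    with open("ca.pem", "rb") as f:
        credentials = grpc.ssl_channel_credentials(root_certificates=f.read())
    channel = grpc.secure_channel(TARGET, credentials)
```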

6.1.2.2 Improving User Experience

With the intention of making an easily tunable application, the user experience can be improved if the current design no longer seems fit. The tuning of feedback is expected to be altered, as it is difficult to predict exactly what the user does and does not require.

With the architecture of the application being loosely coupled, the GUI is easily replaced, overhauled, or improved without affecting the rest of the application. As discussed in 5.4.2, the lack of functionality in the GUI leaves it with room for both improvements and additional widgets. If no additional widgets are necessary, an overhaul of it could make it more appealing.


7 References

[1] Beale, Hagan Demuth, Howard B. Demuth, and M. T. Hagan. "Neural network design." PWS, Boston (1996).

[2] Nunes da Silva I. Artificial Neural Networks, A Practical Course. 1. Switzerland: Springer, Cham; 2017, ISBN: 3-319-43162-5.

[3] H. Osman, M. Ghafari and O. Nierstrasz, "Hyperparameter optimization to improve bug prediction accuracy," 2017 IEEE Workshop on Machine Learning Techniques for Software Quality Evaluation (MaLTeSQuE), Klagenfurt, 2017, pp. 33-38.

[4] Claesen, Marc; Bart De Moor. 2015. "Hyperparameter Search in Machine Learning". arXiv:1502.02127 [cs.LG].

[5] Kuhn M, Johnson K. Applied Predictive Modeling [Internet]. ISBN 978-1-4614-6849-3. Springer Science+Business Media: New York; 2013. Available from https://vuquangnguyen2016.files.wordpress.com/2018/03/applied-predictive-modeling-max-kuhn-kjell-johnson_1518.pdf

[6] M. Wistuba, N. Schilling and L. Schmidt-Thieme, "Hyperparameter Optimization Machines," 2016 IEEE International Conference on Data Science and Advanced Analytics (DSAA), Montreal, QC, 2016, pp. 41-50.

[7] Weisstein, Eric W. "Laplace Transform." From MathWorld--A Wolfram Web Resource. Cited date 2020-04-05. Available from https://mathworld.wolfram.com/LaplaceTransform.html

[8] J. Cao, Z. Su, L. Yu, D. Chang, X. Li and Z. Ma, "Softmax Cross Entropy Loss with Unbiased Decision Boundary for Image Classification," 2018 Chinese Automation Congress (CAC), Xi'an, China, 2018, pp. 2028-2032, doi: 10.1109/CAC.2018.8623242.

[9] Bergstra, James, and Yoshua Bengio. "Random search for hyper-parameter optimization." Journal of Machine Learning Research 13, no. Feb (2012): 281–305.


[12] James Bergstra, Rémi Bardenet, Yoshua Bengio, and Balázs Kégl. 2011. Algorithms for hyper-parameter optimization. In Proceedings of the 24th International Conference on Neural Information Processing Systems (NIPS'11). Curran Associates Inc., Red Hook, NY, USA, 2546–2554.

[13] gRPC. About gRPC. Cited date 2020-05-15. Available from https://grpc.io/about/

[14] gRPC. Guides. Cited date 2020-05-15. Available from https://grpc.io/docs/guides/

[15] gRPC. Protocol Buffers. Cited date 2020-05-15. Available from https://developers.google.com/protocol-buffers

[16] Kivy. Kivy framework. ver. 1.11.0. Cited date 2020-05-15. Available from https://kivy.org/doc/stable/api-kivy.html

[17] Kivy. About us. ver. 1.11.0. Cited date 2020-05-15. Available from https://kivy.org/#aboutus

[18] Kivy. Kv language. ver. 1.11.0. Cited date 2020-05-15. Available from https://kivy.org/doc/stable/guide/lang.html

[19] Xilinx. ZCU102 Evaluation Board, User Guide. UG1182 (v1.6). 12 June 2019. Cited date 2020-05-06. Available from https://www.xilinx.com/support/documentation/boards_and_kits/zcu102/ug1182-zcu102-eval-bd.pdf

[20] Xilinx. DPU for Convolutional Neural Network v3.0, DPU IP Product Guide. PG338 (v3.0). 13 August 2019. Cited date 2020-05-06. Available from https://www.xilinx.com/support/documentation/ip_documentation/dpu/v3_0/pg338-dpu.pdf

[21] Xilinx. Vitis AI User Guide. UG1414 (v1.0). 18 December 2019. Cited date 2020-05-06. Available from https://www.xilinx.com/support/documentation/sw_manuals/vitis_ai/1_0/ug1414-vitis-ai.pdf

[22] Keras. Keras Tuner documentation. Cited date 2020-05-15. Available from https://keras-team.github.io/keras-tuner/

[23] Bergstra, J., Yamins, D., Cox, D. D. (2013) Making a Science of Model Search: Hyperparameter Optimization in Hundreds of Dimensions for Vision Architectures. To appear in Proc. of the 30th International Conference on Machine Learning (ICML 2013).


Association for Computing Machinery, New York, NY, USA, 203–209. DOI: https://doi-org.focus.lib.kth.se/10.1145/3239576.3239622

[25] Larochelle, D. Erhan, A. Courville, J. Bergstra, and Y. Bengio. An empirical evaluation of deep architectures on problems with many factors of variation. In Z. Ghahramani, editor, Proceedings of the Twenty-fourth International Conference on Machine Learning (ICML'07), pages 473–480. ACM, 2007.

[26] Andrew Quitadadmo, James Johnson, and Xinghua Shi. 2017. Bayesian Hyperparameter Optimization for Machine Learning Based eQTL Analysis. In Proceedings of the 8th ACM International Conference on Bioinformatics, Computational Biology, and Health Informatics (ACM-BCB '17). Association for Computing Machinery, New York, NY, USA, 98–106. DOI: https://doi-org.focus.lib.kth.se/10.1145/3107411.3107434

[27] Harvard Intelligent Probabilistic Systems Group. Spearmint. Cited date 2020-05-12. Available from https://github.com/HIPS/Spearmint

[28] Nica, A. C., & Dermitzakis, E. T. (2013). Expression quantitative trait loci: present and future. Philosophical Transactions of the Royal Society of London. Series B, Biological Sciences, 368(1620), 20120362. https://doi.org/10.1098/rstb.2012.0362

[29] Python. abc — Abstract Base Classes. ver 3.6.10. Cited date 2020-05-14. Available from https://docs.python.org/3.6/library/abc.html

[30] Kivy. FileChooser. ver. 1.11.0. Cited date 2020-05-15. Available from https://kivy.org/doc/stable/api-kivy.uix.filechooser.html


8 Appendix

Appendix A – CIFAR-10 Example Image

Appendix B - FileChooser Basic Configuration

Appendix C - RecyclerView Basic Configuration

