DEGREE PROJECT IN TECHNOLOGY, FIRST CYCLE, 15 CREDITS
STOCKHOLM, SWEDEN 2019

Investigating Skin Cancer with Unsupervised Learning

KEIVAN MATINZADEH
RAFAEL DOLFE

KTH ROYAL INSTITUTE OF TECHNOLOGY

SCHOOL OF ELECTRICAL ENGINEERING AND COMPUTER SCIENCE


Investigating Skin Cancer with Unsupervised Learning

KEIVAN MATINZADEH
RAFAEL DOLFE

Degree Project in Computer Science
Date: June 3, 2019

Supervisor: Pawel Herman
Examiner: Örjan Ekeberg

School of Electrical Engineering and Computer Science

Swedish title: Undersökande av hudcancer med oövervakat lärande


Abstract

Skin cancer is one of the most commonly diagnosed cancers in the world.

Diagnosis of skin cancer is commonly performed by analysing skin lesions on the patient's body. Today's medical diagnostics use an established set of labels for different types of skin lesions. Another way of categorising skin lesions could be to let a computer perform the analysis without any prior knowledge of the data, where the data is a set of skin lesion images. This categorisation could then be compared to the already existing medical labels assigned to each image, which could provide insight into underlying structures of skin lesion data.

To investigate this, three unsupervised learning algorithms (K-means, agglomerative clustering, and spectral clustering) have been used to produce cluster partitionings of a data set of skin lesion images. We found no clear cluster partitionings and no connection to the already existing medical labels. The highest scoring partitioning was produced by spectral clustering when the number of clusters was set to two. Further investigation into the structure of this partitioning revealed that one cluster contained essentially every image.

Although relatively low, the score does indicate that the underlying structure may be best represented by a single cluster.


Sammanfattning

Skin cancer is one of the most common types of cancer in the world. The most common way to diagnose skin cancer is for a dermatologist to analyse skin lesions on a patient's body. Today's medical diagnostics use an established set of labels for different types of skin lesions. An alternative to this type of diagnosis could be to let a computer, without prior knowledge of the data (images of skin lesions), perform the analysis. This categorisation could then be compared with the existing medical categories assigned to each image.

To investigate this, three unsupervised learning algorithms were used to produce cluster partitionings of a data set containing images of skin lesions: K-means, agglomerative clustering, and spectral clustering. We found no obvious cluster partitionings and no connection to the current medical labels. The partitioning that scored highest under internal evaluation was the one generated by spectral clustering, which occurred when the number of clusters was set to two. A deeper investigation into the structure of this partitioning showed that one of the clusters contained essentially every image. Although the Silhouette score for this partitioning was low, it indicates that the underlying structure may best be represented by a single cluster.


Acknowledgements

We would first like to thank our supervisor, associate professor Pawel Herman at KTH, for his helpful feedback and suggestions. He helped us formulate meaningful research questions and always pointed us in the right direction throughout the course of the project. We would also like to thank the creators of the HAM10000 data set. Without the data set, this project would not have been possible.


Contents

1 Introduction
  1.1 Problem Statement & Research Questions
  1.2 Scope

2 Background
  2.1 Machine learning - An Unsupervised Learning Approach
    2.1.1 Unsupervised Learning
    2.1.2 Clustering
  2.2 Clustering Algorithms
    2.2.1 K-Means
    2.2.2 Agglomerative Hierarchical Clustering
    2.2.3 Spectral Clustering
  2.3 Silhouette Index
  2.4 Rand Index
  2.5 Dimensionality
    2.5.1 The Curse of Dimensionality
    2.5.2 Principal Component Analysis (PCA)
  2.6 Related Work
    2.6.1 Data Clustering
    2.6.2 Image Segmentation

3 Method
  3.1 Data set
  3.2 Implementation
    3.2.1 Approach
    3.2.2 Scikit-learn
    3.2.3 OpenCV
    3.2.4 Setup
    3.2.5 Downloading and Reading
    3.2.6 Preprocessing
    3.2.7 Clustering Algorithms
    3.2.8 Silhouette and Rand Index Scores

4 Result
  4.1 Preprocessing the data
    4.1.1 Reading images as grayscale and resizing
    4.1.2 Applying PCA
  4.2 Clustering the Data
  4.3 Computing Silhouette Scores
  4.4 Computing Rand Index Scores
  4.5 Analysing the Data
    4.5.1 Investigating the Findings of the Rand Index Score
    4.5.2 Investigating the Findings of the Silhouette Index Score

5 Discussion
  5.1 Influence of Feature Extraction
  5.2 Source of Errors
  5.3 Retrospective

6 Conclusion

Bibliography

Chapter 1 Introduction

Skin cancer is one of the most commonly diagnosed cancers in the US, and it is most common amongst non-Hispanic whites, with a rate of 27 per 100 000 [1]. This makes it important to conduct further research in order to gain a better understanding of the disease. The most common way of diagnosis is for a dermatologist to analyse lesions on the patient's body. Dermatologists can then analyse large amounts of independent lesions to gain further understanding of the lesions themselves. Large image sets of lesions can also be used in computer aided diagnostics (CAD), meaning that machine learning algorithms intelligently analyse image sets to e.g. gain a deeper understanding of the data.

Generally, many studies have been conducted for different types of cancers for the purpose of disease prognosis and prediction of treatment outcomes [2]. Most of these studies involve the use of machine learning algorithms belonging to a subset of machine learning called supervised learning. A supervised learning model learns to generalise from labeled input-output pairs in order to respond correctly to all possible inputs [3]. Categorising data based on prior knowledge is called classification. A different approach would then be to use a group of machine learning algorithms belonging to the subset unsupervised learning.

These types of algorithms are not trained, and thus do not know beforehand what to look for. Instead, the algorithms draw their own conclusions from the data and categorise it in whatever way they deem appropriate. This type of categorising, without any prior knowledge, is called clustering. "The aim of unsupervised learning is to find clusters of similar inputs in the data without being explicitly told that these datapoints belong to one class and those to a different class. Instead, the algorithm has to discover the similarities for itself." [3]. One of the main purposes of clustering is to gain insight about data, finding underlying structures and anomalies, and to generate hypotheses [4].

1.1 Problem Statement & Research Questions

Artificial intelligence has been used to create models capable of diagnosing skin cancer with an accuracy on par with trained dermatologists [5], and in some cases even outperforming them [6]. This demonstrates the importance and the potential of using artificial intelligence in the field of skin cancer. At the same time, a whole subfield of machine learning seems to have been largely ignored in the case of skin cancer, namely unsupervised learning.

Some research applying unsupervised learning to other types of cancer has been conducted, e.g. computing the survival rate of lung cancer patients [7]. While studies like this are somewhat relevant because of their use of unsupervised learning, the data used is numerical. An interesting approach is to instead use images and apply unsupervised learning techniques to that data. This works well with skin cancer, since the tumours are visible on the skin as lesions, and because many different types of tumours fall under the same umbrella of skin cancer. This opens up the possibility of investigating whether the different types of skin cancer have an underlying "natural" and potentially unknown structure.

In our thesis we compare the results of categorising skin cancer images carried out by various clustering algorithms, and also compare those results to the true labels of the data set. The aim of the thesis is to answer the following questions:

• "Do the classes generated through the different clustering algorithms resemble the ground truth labels of the data set?"

• "Does the result of the clustering provide any insight on any underlying structure of the data?"

Our hypothesis is that K-means will perform well with regard to recovering the true structure of the data, as was observed in previous research conducted by Souto et al. [8] on gene expression data.

Our research differs from the work presented in [8] in the type of data used: we use images of lesions from the Tschandl, Rosendahl, and Kittler [9] data set, while the data used in [8] is gene expression data.

1.2 Scope

The scope is limited by the data set and by the number of algorithms and validation measures. Three algorithms were chosen so as to increase the likelihood of obtaining a meaningful result. However, there are many clustering algorithms available, and research on new ones is continuously published.

To determine the quality of a cluster partitioning, what is called an internal validation index is used. In some instances, a specific index is chosen depending on what clustering algorithm is used. In this thesis only one is used, the Silhouette index. While many different ones exist, the literature supports the idea of using only the Silhouette index to carry out the validation for different types of clustering algorithms, since the Silhouette index depends only on the partitioning of the data, and not on the clustering algorithm that was used to obtain said partitioning [10]. The same reasoning motivates the choice of having only one external validation index. An external validation index is used to compare two cluster partitionings with each other.

Because of these limitations, our scope is constrained to the data set described in 3.1 and the algorithms and validation methods shown in table 3.1.


Chapter 2 Background

2.1 Machine learning - An Unsupervised Learning Approach

Machine learning is a branch of artificial intelligence in which computational methods are used to improve performance or to make accurate predictions [11]; accuracy is measured by how well the chosen actions reflect the correct ones [3].

2.1.1 Unsupervised Learning

Unsupervised learning is a subset of machine learning. Contrary to the more well known branch of machine learning, supervised learning, where the right answer is already labeled in the data, unsupervised learning uses no labeling and instead tries to categorise input on common features. Without being told which data points belong to which class, the algorithm has to cluster the points together based on their similarities. The aim of unsupervised learning is for an algorithm, without being explicitly told that some data points belong to different classes, to find clusters of similar inputs in the data [3].


2.1.2 Clustering

Clustering, or cluster analysis, is the formal study of algorithms and methods for grouping objects [12]. In cluster analysis the groups are of interest in themselves, and their assessment is intrinsic. The groups do not need to reflect some reference set of classes (which is a requirement in classification). The goal of cluster analysis is thus not to establish rules for separating future data into categories, but to find a valid organisation of the data.

The clustering structure is formally represented as a set of subsets $C = C_1, \ldots, C_k$ of $S$ such that $S = \bigcup_{i=1}^{k} C_i$ and $C_i \cap C_j = \emptyset$ for $i \neq j$ [13].

To determine how similar two objects are, two main types of measurements are used: distance measures and similarity measures, where the distance measure between two instances $x_i$ and $x_j$ is commonly denoted $d(x_i, x_j)$.

Figure 2.1 shows a set of data points before and after being clustered. This partitioning contains three different clusters (denoted by the different colours).

Figure 2.1: A before and after image of data points that have been clustered by the K-means algorithm [14].

2.2 Clustering Algorithms

This section outlines the theory behind the three clustering algorithms that will be used.


2.2.1 K-Means

The K-means algorithm is a popular partitioning algorithm that was first proposed over 60 years ago, in 1955 [4]. Given a set of data points $x_1, \ldots, x_n$ and a set of $k$ cluster centres $c_1, \ldots, c_k$, the algorithm tries to allocate the cluster centres in the input space such that there is one cluster centre in the middle of each cluster of data points.

The cluster centres are initially placed at random in the input space. Then, for each data point $x_i$, the distance (e.g. the Euclidean distance) between $x_i$ and each cluster centre $c_j$, $D(x_i, c_j)$, is calculated. The data point is assigned to whichever cluster centre $c_j$ minimises the distance function $D(x_i, c_j)$. When each data point has been assigned to a cluster centre, the algorithm calculates the mean of all the data points assigned to each cluster centre and updates the position of the cluster centre to be that point in space, thus moving each cluster centre to the centre of its assigned data points. The algorithm is iterated until the cluster centres stop moving [3].
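As an illustration, the following is a minimal NumPy sketch of this loop; the initialisation and convergence test are simplified, and the function is an assumption for illustration rather than the implementation used in this thesis.

import numpy as np

def kmeans(X, k, n_iter=100, seed=0):
    """Minimal K-means sketch. X is an (n_samples, n_features) array."""
    rng = np.random.default_rng(seed)
    centres = X[rng.choice(len(X), size=k, replace=False)]  # random initial centres
    for _ in range(n_iter):
        # Assignment step: each point goes to its nearest centre (Euclidean distance).
        dists = np.linalg.norm(X[:, None, :] - centres[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        # Update step: move each centre to the mean of its assigned points.
        new_centres = np.array([X[labels == j].mean(axis=0) if np.any(labels == j)
                                else centres[j] for j in range(k)])
        if np.allclose(new_centres, centres):  # the centres stopped moving
            break
        centres = new_centres
    return labels, centres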

2.2.2 Agglomerative Hierarchical Clustering

Agglomerative clustering is a hierarchical clustering method [13]. Hierarchical clustering is a type of cluster analysis which constructs clusters by recursively partitioning instances in either a top-down or bottom-up fashion, based on some similarity measure such as the sum of squares. The agglomerative method is a bottom-up approach: each object initially forms its own cluster, and clusters are then merged together until the desired structure is obtained.

Representing the nested grouping of objects and the similarity levels at which groupings change results in a dendrogram. Cutting the dendrogram at the desired similarity level yields a clustering of the data objects. This is one of the strengths of hierarchical clustering: the user has the option to choose different partitions according to the desired similarity level.

One of the weaknesses of hierarchical clustering is its time complexity, which is at least $O(n^2)$, where $n$ is the total number of instances.
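As a brief illustration of the dendrogram cut described above, the following sketch uses SciPy's hierarchical clustering with Ward linkage (a sum-of-squares criterion); SciPy is an illustrative choice here, while the thesis itself uses scikit-learn.

import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster

X = np.random.default_rng(0).normal(size=(20, 2))  # toy data standing in for real features
Z = linkage(X, method="ward")                      # bottom-up merges, sum-of-squares criterion
labels = fcluster(Z, t=3, criterion="maxclust")    # "cut" the dendrogram into 3 clusters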


2.2.3 Spectral Clustering

Spectral clustering is a method of clustering where the dimensionality of the data is first reduced before some clustering method is applied to it, e.g. a partitioning algorithm such as K-means [15].

The data points are interpreted as vertices and the similarities between points as edges. Through the spectrum (the eigenvalues) of the similarity matrix A, the dimensionality is reduced, and a classical clustering algorithm is then applied in this lower dimension. Spectral clustering can yield better results in cases where algorithms such as K-means, which look for "round blobs" in the data, would fail [16].
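A minimal sketch of this idea, in the normalised-Laplacian style of Ng, Jordan, and Weiss [15], is shown below; the RBF affinity and its gamma parameter are illustrative assumptions, not the settings used in this thesis.

import numpy as np
from sklearn.cluster import KMeans
from sklearn.metrics.pairwise import rbf_kernel

def spectral_clustering(X, n_clusters, gamma=1.0, seed=0):
    """Sketch: embed the data via Laplacian eigenvectors, then run K-means."""
    A = rbf_kernel(X, gamma=gamma)                   # similarity matrix (graph edge weights)
    d_inv_sqrt = 1.0 / np.sqrt(A.sum(axis=1))
    L = np.eye(len(X)) - (A * d_inv_sqrt[:, None]) * d_inv_sqrt[None, :]  # normalised Laplacian
    _, eigvecs = np.linalg.eigh(L)
    U = eigvecs[:, :n_clusters]                      # eigenvectors of the smallest eigenvalues
    U /= np.linalg.norm(U, axis=1, keepdims=True)    # row-normalise (Ng et al. step)
    return KMeans(n_clusters=n_clusters, random_state=seed).fit_predict(U)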

2.3 Silhouette Index

The Silhouette index is a way to measure how accurate the clustering of an object is [10]. The index ranges from -1 to +1, where a low value indicates that the object would potentially fit better in another cluster, while a high value indicates that the current cluster is a good match. An index close to 0 means that the object could be assigned to another cluster without making the accuracy of the clustering any worse [17].

To create the silhouette, what is needed is a set of data clustered into k clusters through a clustering technique, and the collection of proximities between all objects in the clusters. For each object, a value s(i) is then calculated and plotted in a graph.

Take any object $i$ in the data set and denote by $A$ the cluster it is assigned to. Then $a(i)$ is defined as the average dissimilarity from $i$ to all the other points in $A$, and $b(i)$ is the minimum of the average dissimilarities from $i$ to all clusters to which $i$ is not assigned. The Silhouette index $s(i)$ of object $i$ is then calculated as

$$s(i) = \begin{cases} 1 - a(i)/b(i) & \text{if } a(i) < b(i) \\ 0 & \text{if } a(i) = b(i) \\ b(i)/a(i) - 1 & \text{if } a(i) > b(i) \end{cases} \quad (2.1)$$

The Silhouette index of a whole cluster $C$, $\bar{s}(C)$, is the mean of $s(i)$ over all objects $i$ in cluster $C$. The Silhouette index of a whole cluster partitioning is then the mean of each cluster's Silhouette index.

Figure 2.2: "An illustration of the elements involved in the computation of s(i), where the object i belongs to cluster A." [10]
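To make the definition concrete, the following is a small sketch that computes s(i) directly from equation 2.1 (each cluster is assumed to have at least two members; scikit-learn's silhouette_samples provides the same values).

import numpy as np

def silhouette_values(X, labels):
    """Per-object s(i), computed directly from the definition."""
    n = len(X)
    D = np.linalg.norm(X[:, None] - X[None, :], axis=2)   # pairwise dissimilarities
    s = np.zeros(n)
    for i in range(n):
        same = (labels == labels[i]) & (np.arange(n) != i)
        a = D[i, same].mean()                             # mean distance within own cluster
        b = min(D[i, labels == c].mean()                  # nearest other cluster
                for c in np.unique(labels) if c != labels[i])
        s[i] = 0.0 if a == b else (b - a) / max(a, b)     # equivalent to the piecewise form
    return s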

2.4 Rand Index

The Rand index is an external validation index named after William M. Rand [18]. An external validation index measures performance by matching a clustering structure to a priori information [12].

Suppose $U = u_1, \ldots, u_R$ and $V = v_1, \ldots, v_C$ represent two different partitionings of the object set $S = O_1, \ldots, O_n$, so that the classes are subsets of $S$ and $\bigcup_{i=1}^{R} u_i = S = \bigcup_{j=1}^{C} v_j$. A contingency table is created where $n_{ij}$ denotes the number of objects common to classes $u_i$ and $v_j$. The table represents the class overlap between the two partitions U and V.

R x C contingency table

Class    v_1    v_2    ...   v_C    Sums
u_1      n_11   n_12   ...   n_1C   n_1.
u_2      n_21   n_22   ...   n_2C   n_2.
...      ...    ...    ...   ...    ...
u_R      n_R1   n_R2   ...   n_RC   n_R.
Sums     n_.1   n_.2   ...   n_.C   n_.. = n

Object pairs in the table are classified as four different types:

(i) a, Objects in the pair are placed in the same class in U and in the same class in V

(ii) b, Objects in the pair are placed in different classes in U and in different classes in V

(iii) c, Objects in the pair are placed in different classes in U and in the same class in V

(iv) d, Objects in the pair are placed in the same class in U and in different classes in V

where types (i) and (ii) are interpreted as agreements in the classification of the objects from a pair while types (iii) and (iv) represent disagreements.

The Rand index is then calculated as
$$R = \frac{a+b}{a+b+c+d} = \frac{a+b}{\binom{n}{2}}.$$

The Rand index thus represents the frequency of agreements over the total number of pairs.
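A direct sketch of this computation over all object pairs (quadratic in the number of objects, but faithful to the definition above):

from itertools import combinations

def rand_index(U, V):
    """Plain Rand index: the fraction of object pairs on which the two
    labelings U and V agree (types (i) and (ii) above)."""
    pairs = list(combinations(range(len(U)), 2))
    agreements = sum((U[i] == U[j]) == (V[i] == V[j]) for i, j in pairs)
    return agreements / len(pairs)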

2.5 Dimensionality

Machine learning algorithms take a set of input values and produce an output for that input vector. The vector contains several real numbers, written as e.g. $\vec{v} = (0.1, 0.2, -0.3, 0.4, -0.5)$. The size of the vector is what is called the dimensionality of the input. Plotting $\vec{v}$ would require one dimension in the data space for each element of the vector; in this example, the vector $\vec{v}$ has five dimensions [3].

2.5.1 The Curse of Dimensionality

As the number of input dimensions gets larger, more data will be needed for the algorithm to be able to generalise in an accurate manner [3].

One can imagine covering a line with 10 points in 1D space, where the distance $d(x_i, x_j)$ between two neighbouring points is 1/10 of the length of the line. Representing the data in 2D space while keeping the same distance between points entails having to fill out the plane with $10^2 = 100$ points. Taking it further and representing it in 3D means having to cover a cube, resulting in $10^3$ points.


2.5.2 Principal Component Analysis (PCA)

Principal components [19] allow a data set to be summarised with a smaller number of representative variables that collectively explain most of the variability in the original set.

Principal Component Analysis (PCA) is a technique for finding principal components: a lower-dimensional representation of the data that captures as much information as possible.

Given a vector $\vec{x}$ of $p$ random variables, we try to reduce the number of dimensions to a smaller value $q$. To do so, PCA finds linear combinations $a_1'x, a_2'x, \ldots, a_q'x$, which are the principal components of the data. These components have the maximum variance for the data. The vectors $a_1, a_2, \ldots, a_q$ are the eigenvectors of the covariance matrix $S$ corresponding to the $q$ largest eigenvalues, and the variance of each principal component is given by its eigenvalue. The proportion of the total variance in the original data set accounted for by the first $q$ principal components is then the ratio of the sum of the first $q$ eigenvalues to the sum of the variances of all $p$ original variables [20].

There exist several methods for choosing the appropriate number of principal components for a data set. One intuitive way is the scree test method [21]. Given a scree plot (see figure 2.3 below), a curve representing the eigenvalues versus their rank, an "elbow" in the curve is sought. This elbow corresponds to an inflexion point. It is then sufficient to determine the point where the sign of the second-order derivative changes and use that as a cut-off point for choosing an appropriate number of principal components. The reason is that as the sign changes, the curve gets less steep, meaning the following eigenvalues will account for less of the variance in the data.

Figure 2.3: A scree plot [22].
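One rough way to automate this reading of the scree plot is to take the description above literally and look for the first sign change of the discrete second-order derivative of the eigenvalue curve. The following is a heuristic sketch, not a definitive rule:

import numpy as np

def scree_elbow(eigenvalues):
    """Return a cut-off index where the sign of the second-order
    difference of the eigenvalue curve first changes."""
    d2 = np.diff(eigenvalues, n=2)                    # discrete second-order derivative
    changes = np.where(np.diff(np.sign(d2)) != 0)[0]
    # +2 re-aligns indices in d2 with the original eigenvalue sequence
    return int(changes[0]) + 2 if len(changes) else len(eigenvalues)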


2.6 Related Work

The two main approaches to using clustering in medical diagnostics have been clustering numerical data and image segmentation.

2.6.1 Data Clustering

As mentioned before, clustering is used to find underlying structures in data. Souto et al. [8] conducted a study comparing different clustering methods and proximity measures on cancer gene expression data. They found that a mixture of multivariate Gaussians and K-means produced the best results in terms of the recovery of the actual structure of the data sets. The Rand index was used for external validation.

Bhattacharjee et al. [23] were able to find subclasses of lung cancer by applying hierarchical and probabilistic clustering to expression data.

Most of the relevant research uses data in the form of gene expression rather than image data, which would be more relevant for this study. However, the procedure of Souto et al. [8] for evaluating different cluster partitionings generated by different algorithms is similar to our approach, explained in 3.2.1.

2.6.2 Image Segmentation

Even if our work focuses on clustering images for the sake of finding underlying structures, image segmentation is worth mentioning as clustering is widely used in that field.

Multiple studies have explored different ways of segmenting images using clustering algorithms. An early study by Coleman and Andrews [24] used a K-means algorithm to segment images. They concluded that while segmentation performed by a human or a trained segmenter would produce a more satisfying result, the unsupervised approach had its own advantages: "the supervised method is incapable of satisfactory performance in situations where the statistics of the scene vary substantially". This means that clustering has an advantage in cases where the images look substantially different from one another. Also mentioned was the fact that no training phase is needed, omitting a tedious step in the procedure.


More recent studies have looked at using segmentation for medical images. Li et al. [25] proposed a new image segmentation algorithm based on spatial fuzzy clustering. The data used consisted of medical images of different modalities, including a CT scan of liver tumours and an MRI slice of cerebral tissues. Ng et al. [26] proposed a new method combining K-means with the watershed algorithm for segmentation of medical images, producing segmentation maps with 92% fewer partitions than those produced by the conventional watershed algorithm.


Chapter 3 Method

This section details the methods used to perform the experiment. It aims to describe the implementation in enough detail for the results to be reproducible.

3.1 Data set

A single data set containing 10015 images of skin lesions was used. The data set is called the HAM10000 training set, and the images it contains were collected over a period of 20 years from the Department of Dermatology at the Medical University of Vienna and the skin cancer practice of Cliff Rosendahl in Queensland, Australia.

The images are organised into seven categories: akiec - Actinic keratoses, bcc - Basal cell carcinoma, bkl - Benign keratosis, df - Dermatofibroma, mel - Melanoma, nv - Melanocytic nevi, and vasc - Vascular skin lesions. These are regarded as the ground truth labels of the data.


3.2 Implementation

3.2.1 Approach

Each algorithm is run once for each value of the number of desired clusters K, ranging between 2 and 14. The reason for having K range between these values is the a priori knowledge of the number of true labels in the data set: 7. For each algorithm, each cluster partitioning from the different values of K is evaluated by an internal validation index, which evaluates the validity of the generated clusters (see 2.3 for more information). The cluster partitioning that scores best for each algorithm is then chosen. Only one internal validation index is used, namely the Silhouette index.

Since the data set being used has already been labeled, the cluster partitioning scoring the highest for the Silhouette Index for each algorithm is compared to the true labels with the external validation method Rand Index.
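A sketch of this selection rule is shown below; partitionings is an assumed dictionary mapping each algorithm name to its labelings per value of K, and X and y_true are assumed to hold the preprocessed data and ground truth labels.

from sklearn.metrics import silhouette_score, adjusted_rand_score

for alg, by_k in partitionings.items():
    # Pick the value of K whose partitioning scores best internally...
    best_k = max(by_k, key=lambda k: silhouette_score(X, by_k[k]))
    # ...then compare that partitioning to the ground truth externally.
    print(alg, best_k, adjusted_rand_score(y_true, by_k[best_k]))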

Algorithms and evaluation methods used

Algorithms             Spectral clustering, Agglomerative clustering, K-means
Internal evaluation    Silhouette index
External evaluation    Rand index

Table 3.1: All algorithms and validation measures used in this paper.

A comparison between the best clusters of each algorithm, based on the internal validation, is then conducted.

For each algorithm, the quality of the generated cluster partitionings is evaluated by the internal validation index. E.g. all thirteen cluster partitionings generated through K-means (one per value of K) are evaluated with the Silhouette index, and the partitioning that scores the highest is chosen. The highest scoring partitioning of each algorithm is selected and then compared based on the same criterion as before, namely picking out the partitioning with the highest Silhouette index score. The highest scoring partitioning is then compared to the true labels based on an external validation criterion. The result may be a cluster partitioning that differs from the original classification of the data set. This can yield information about potentially underlying structures in the data that the human eye might have missed.

Figure 3.1 below is a flowchart describing the whole process, from downloading the data set, applying the algorithms on the data and measuring the results, to analysing the data.


Download and read data set → Preprocess data set (resize and PCA) → Apply clustering algorithms on data → Compute Silhouette and Rand index scores → Analyse the data

Figure 3.1: A flowchart over the process of applying unsupervised learning on our data set.

3.2.2 Scikit-learn

The library used for the experiments is the open source library Scikit-learn [27].

Scikit-learn was chosen for its wide range of state-of-the-art algorithms for unsupervised learning and the fact that it is implemented through a high-level language, Python.

3.2.3 OpenCV

The library used for reading and resizing the images is the open source library OpenCV [28]. The library is written in C and C++, with interfaces for Python and other languages.

3.2.4 Setup

All code is in Python. The following code snippet shows the libraries imported for the project.


import os
import keras
import numpy as np
import pandas as pd
import cv2
import matplotlib.pyplot as plt
import sklearn.cluster
import sklearn.metrics
import sklearn.decomposition

The libraries and sublibraries were installed with the package manager pip 19.1, and the Python code was written for version 3.5.

3.2.5 Downloading and Reading

The data set is 3 GB in size and was downloaded from the website Kaggle. The Kaggle data set is public and can be found by searching for "Skin Cancer MNIST: HAM10000" on Kaggle.

The data set was read as grayscale using cv2.imread.
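For example (the folder and filename below are illustrative, not the project's actual layout):

import cv2

# Read one HAM10000 image directly as a single-channel grayscale array.
img = cv2.imread("HAM10000/ISIC_0024306.jpg", cv2.IMREAD_GRAYSCALE)  # shape (450, 600)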

3.2.6 Preprocessing

The first feature reduction is reading the images as grayscale. This allows each pixel to be processed as a single grayscale value instead of the three values that would be necessary to retain the RGB colour space. An image read as grayscale has 450x600 pixels, which is equivalent to 270000 features.

Each image is then resized using cv2.resize with linear interpolation.

Lastly, each image is reduced to its principal components using the sklearn.decomposition.PCA implementation. Before applying the analysis, we must first normalise the pixels. Pixels in the HAM10000 data set have a value between 0 and 255; we therefore divide each pixel value by 255, normalising our data to the range 0 to 1. Secondly, we analyse the variance explained by each principal component and pick a number of principal components to keep. This choice is made using the scree test method, eyeing where the cumulative explained variance begins to saturate (see 2.5.2).


The number of principal components to keep is then provided as the n_components parameter of the sklearn.decomposition.PCA implementation.
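Putting the steps of this section together, a minimal sketch of the preprocessing pipeline could look as follows. The function name and the paths argument are illustrative assumptions; note that cv2.resize takes its target size as (width, height).

import cv2
import numpy as np
from sklearn.decomposition import PCA

def preprocess(paths, n_components=18):
    """Grayscale read, resize to 75x100, scale pixels to [0, 1], then PCA."""
    imgs = [cv2.resize(cv2.imread(p, cv2.IMREAD_GRAYSCALE), (100, 75),
                       interpolation=cv2.INTER_LINEAR) for p in paths]
    X = np.stack([im.ravel() for im in imgs]) / 255.0   # (n_images, 7500), normalised
    return PCA(n_components=n_components).fit_transform(X)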

3.2.7 Clustering Algorithms

The three clustering algorithms used are K-means, hierarchical agglomerative clustering and spectral clustering. The Python library scikit-learn provides implementations of the algorithms as part of the sklearn.cluster sublibrary: sklearn.cluster.KMeans, sklearn.cluster.AgglomerativeClustering and sklearn.cluster.SpectralClustering. See 2.2 for a detailed overview of how each algorithm works.

Default settings have been used for all algorithms, except for the parameter n_clusters, which lets us vary the number of clusters. The numbers of clusters explored are 2 to 14: 2 because it is the lowest number of clusters that the Silhouette index can produce a score for, and 14 because it is twice the number of classes in the ground truth labeling. The parameter n_jobs is set to -1 for K-means; this parameter lets the algorithm run on multiple cores, lowering computation time.
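A sketch of this sweep, assuming X_pca holds the PCA-reduced data from 3.2.6 (the n_jobs parameter of KMeans existed in scikit-learn versions of this period but has since been removed):

from sklearn.cluster import KMeans, AgglomerativeClustering, SpectralClustering

labelings = {}
for k in range(2, 15):                                # n_clusters from 2 to 14
    labelings[("kmeans", k)] = KMeans(n_clusters=k, n_jobs=-1).fit_predict(X_pca)
    labelings[("agglomerative", k)] = AgglomerativeClustering(n_clusters=k).fit_predict(X_pca)
    labelings[("spectral", k)] = SpectralClustering(n_clusters=k).fit_predict(X_pca)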

3.2.8 Silhouette and Rand Index Scores

The two indices used for evaluation are the Silhouette and Rand index validation measures. sklearn.metrics is the sublibrary used, and the specific implementations are sklearn.metrics.silhouette_score for the Silhouette index and sklearn.metrics.adjusted_rand_score for the Rand index. Only default settings are used.
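For one labeling, the two calls look as follows (labels and y_true are assumed to hold a clustering result and the ground truth labels, respectively):

from sklearn.metrics import silhouette_score, adjusted_rand_score

sil = silhouette_score(X_pca, labels)        # internal: Silhouette index
rand = adjusted_rand_score(y_true, labels)   # external: (adjusted) Rand index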


Chapter 4 Result

This section details the results of the preprocessing step, the clustering step, the score computation step and finally the analysis step. This order follows the flowchart in figure 3.1, presented in the method.

4.1 Preprocessing the data

The goal of preprocessing is to reduce the number of features as much as possible while still retaining image cohesion, in order to combat the curse of dimensionality.

4.1.1 Reading images as grayscale and resizing

After executing the first feature reduction, which is to read the images as grayscale, the images are also resized. Resizing the images from 450x600 to 75x100 reduces the number of features drastically, by a factor of 36 to be exact. Doing both, reading as grayscale as well as resizing, reduces the number of features from 810000 to 7500 (see figure 4.1).


Figure 4.1: The picture on the left is the raw, coloured 450x600 image (810000 features). The picture on the right is the same picture grayscaled and resized (7500 features).

4.1.2 Applying PCA

The next step is to apply PCA. Figure 4.2 shows the cumulative sum of the explained variance of the first 140 principal components. The blue dot marks the point where the explained variance begins to saturate; the x-value of that point becomes the number of principal components kept.

Figure 4.2: The x-axis is the number of components, while the y-axis is the cumulative sum up to and including that component. The blue circle shows the place where the explained variance begins to saturate.

Most of the variance is explained by the first two components (58%). Nevertheless, the scree test method suggests keeping the first 18 principal components, which results in keeping 87% of the explained variance. This is the number of principal components that our algorithms have been run on.

The result of the preprocessing is a reduction of 810000 features down to 18 principal components.

4.2 Clustering the Data

For K-means and agglomerative clustering, the full data set could be utilised. However, because of spectral clustering's high time complexity, running spectral clustering on the entire data set of 10000 images has not been possible with the hardware we have. Therefore, we have resorted to running spectral clustering on a random sample of 1500 images.

In limited capacities, however, we have run it on other data set sizes too, such as 1000, 2000 and even 10000. For the size of 2000, we have only run our algorithms for a small number of clusters. For 10000, we have only run spectral clustering when keeping 2 principal components; in that case, there were few enough components that the algorithm did not get stuck. The Silhouette index score for spectral clustering was at its highest then, but the explained variance reached only 58%. The reason this is mentioned is that all of these configurations did in the end yield reasonably similar Silhouette index scores, indicating that the results for 1500 images are not an anomaly.

Moreover, this holds for all algorithms. Although K-means and agglomerative clustering have been successfully run on 10000 images, they have also been run on many different sizes, and have always yielded Silhouette and Rand index values very close to those produced on other data sizes.

4.3 Computing Silhouette Scores

Silhouette and Rand index scores have been computed for every number of clusters for each algorithm. Using the Silhouette index scores, the number of clusters (n_clusters) that achieved the highest score has been picked as the best clustering of that algorithm.

For the Silhouette scores, the results are as follows:


Silhouette scores for each algorithm

n_clusters   K-means   Agglomerative clustering   Spectral clustering
2            0.228     0.192                      0.423
3            0.209     0.134                      0.423
4            0.165     0.128                      0.343
5            0.163     0.123                      0.423
6            0.140     0.090                      0.327
7            0.133     0.089                      0.423
8            0.126     0.079                      0.423
9            0.124     0.080                      0.327
10           0.116     0.072                      0.327
11           0.115     0.063                      0.343
12           0.115     0.063                      0.327
13           0.107     0.062                      0.423
14           0.108     0.064                      0.316

Table 4.1: The Silhouette index score for each algorithm and number of clusters.

The number of clusters that consistently achieved the best score was n_clusters=2. However, as seen in table 4.1, none of the algorithms produce a clustering that achieves a high Silhouette index score. Spectral clustering does achieve a considerably higher score than K-means and agglomerative clustering, but that score is still low. An investigation into these results is conducted in section 4.5.

4.4 Computing Rand Index Scores

Moving on to the Rand index scores, the results are as follows.


Rand index scores for each algorithm (PCA = 87% explained variance)

n_clusters   K-means   Agglomerative clustering   Spectral clustering
2            0.005     0.040                      -0.001
3            0.030     0.026                      -0.001
4            0.034     0.039                      -0.002
5            0.008     0.044                      -0.001
6            0.004     0.054                      -0.000
7            0.017     0.024                      -0.001
8            0.021     0.028                      -0.001
9            0.019     0.028                      -0.000
10           0.015     0.007                      -0.000
11           0.011     0.019                      -0.002
12           0.006     0.016                      -0.000
13           0.006     0.015                      -0.000
14           0.006     0.013                      -0.001

Table 4.2: The Rand index scores for each algorithm and number of clusters.

No algorithm achieved more than a few percent of agreement between the ground truth labels and its labelings. The Rand index scores were very close to 0 irrespective of algorithm or number of clusters, which implies that none of the cluster partitionings have any connection to the ground truth labels. In other words, the different lesion types given by the ground truth labels do not correspond to any of the partitionings found by any of the algorithms used. An analysis of these results is conducted in section 4.5.

4.5 Analysing the Data

In order to get a deeper understanding of the structure of our data, and of why the results manifested as they did, it is important to visualise them. Since high-dimensional data is difficult to visualise, we have, for analysing the Rand index scores, resorted to visualising only the most significant principal components, and for analysing the Silhouette index scores we have decided to look at the distribution of the labelings in our results.


4.5.1 Investigating the Findings of the Rand Index Score

The Rand index scores in table 4.2 indicated no resemblance to the ground truth labels already provided for our data. To investigate this result further, we decided to plot the most significant principal components and observe their distribution with regard to their corresponding ground truth labels.

In figure 4.3, each principal component (up to the 8th) is paired with its consecutive principal component and plotted on a 2D chart. The colour of each data point represents the ground truth label of the corresponding image. These mappings are explained in table 4.3.

Figure 4.3: Each pair of principal components plotted on a 2D chart, up to the 8th. The colours represent the ground truth labels the principal components correspond to.


Ground truth label to colour correspondence table

Ground truth label   Colour
nv                   blue
mel                  red
bkl                  yellow
bcc                  orange
df                   pink
vasc                 green
akiec                brown

Table 4.3: Each ground truth label (skin lesion type) maps to one colour. These mappings are used in figure 4.3.

No clear clusters manifest as a result of this analysis of the principal components' ground truth labels. However, it is worth noting that the concentration of nv-labeled data is not completely uniform. The first subplot shows a higher concentration of blue points in the upper regions of the chart, while there is a mix of all colours in the lower regions. Similar differences manifest in the other subplots.

In summary, no clear clusters were found. This fact, supported by the low Rand index scores shown in table 4.2, indicates that the cluster partitionings generated by the algorithms do not have any meaningful resemblance to the ground truth labels of the data set.

4.5.2 Investigating the Findings of the Silhouette Index Score

K-Means

The Silhouette index score for the K-means algorithm indicates that the data is not well represented by two or more clusters. A Silhouette index value equal to 0 means, as explained in 2.3, that the objects in a partitioning could be assigned to their neighbouring cluster without loss in accuracy. Thus, a Silhouette index score of 0.2 means that most images could be assigned to their neighbouring cluster without significant loss in accuracy. This indicates that the clusters produced by K-means are not substantially different from each other.


Agglomerative Clustering

Agglomerative clustering produced essentially the same score as the K-means algorithm, so the same conclusion about the partitioning of clusters can be drawn here: a score of around 0.2 does not indicate that the clusters are partitioned in a meaningful way.

Spectral Clustering

Spectral clustering scored approximately 0.4, which is about double the score of K-means and agglomerative clustering. The way the algorithm achieved this score was by grouping the overwhelming majority of the images into one cluster. Figure 4.4 shows this discrepancy in the number of images assigned to cluster 1 versus cluster 2 for all algorithms.

Figure 4.4: Distribution of labels when n_clusters=2 for K-means, agglomerative and spectral clustering, order preserved. Spectral clustering assigns essentially all images to cluster 1.

K-means clustering results in a roughly equal distribution, while agglomerative clustering results in a slight preponderance of images assigned to cluster 1. Spectral clustering, however, assigns virtually no images to cluster 2.


Chapter 5 Discussion

In the experiments, spectral clustering, agglomerative hierarchical clustering and K-means clustering were used to create partitionings of images of skin lesions. Out of all the generated partitionings, the cluster partitioning with the most cohesive clusters was selected, and this partitioning was then compared to the ground truth labels of the data set. The validation methods used in the evaluation of cluster partitionings were the Silhouette index for internal evaluation and the Rand index for external evaluation.

Using our method, the results show that applying unsupervised learning algo- rithms on the HAM10000 data set does not produce meaningful clusters, nor do the produced clusters indicate any resemblance to the ground truth labels of the data set. While spectral clustering did achieve a decent Silhouette index score, it did so by putting essentially all data points into a single cluster. This is not a very meaningful cluster partitioning and indicates that the data may be best viewed as a single, sparse cluster of images. These findings suggest the answer no to both research questions.

5.1 Influence of Feature Extraction

Feature extraction did not significantly affect the Silhouette or Rand index scores produced by K-means or agglomerative clustering. The only notable difference was produced by spectral clustering, which, unless PCA was applied, hovered around 0.0 in both Rand index and Silhouette index scores. This weakens the idea that our cluster partitioning results were the product of insufficient feature extraction: if there existed actual clusters not yet discovered due to faulty feature extraction, gradual application of feature extraction should reasonably have yielded gradually higher scores as well.

Furthermore, feature extraction allowed larger amounts of data to be funnelled into spectral clustering: if there are too many features, spectral clustering is not fast enough on our hardware to handle 1500 images. This was a major factor in choosing which feature extraction methods to apply. Generally speaking, the fewer features present when running a clustering algorithm, the better its performance. At the same time, however, it is important not to lose too much information in the process. When choosing an appropriate number of principal components with PCA, the amount of preserved variance has to be taken into consideration. Clustering algorithms generally produce higher Silhouette index scores the fewer the features, but if the cluster partitioning is done on data that retains very little of the explained variance, the results may be misleading.

5.2 Source of Errors

One source of error is the HAM10000 data set itself. The images do not have consistent patterns in terms of how much of each image is comprised of damaged versus healthy skin. Furthermore, some pictures contain hair strands or other irrelevant features that do not help the algorithms.

Additionally, as mentioned in the results section, hardware limitations did not allow us to experiment with spectral clustering on larger sizes of data as freely as would be optimal. Since spectral clustering was the most successful algorithm in terms of index scores, the likelihood that some interesting finding may have been missed is higher.

The greatest source of error, however, is likely to be the composition of the skin lesion images. Most images consist mostly of skin, and pixels have equal weight whether they belong to the actual skin lesion or the surrounding healthy skin. This may cause significant confusion for the unsupervised algorithms, since most images, viewed from this perspective, have similar features. A segmentation algorithm that ensures that the unsupervised algorithms focus only on the damaged skin may therefore have been necessary to obtain good cluster partitionings. Due to a lack of time, this potential solution was not tested.

Lastly, it is worth reflecting on some of the assumptions made about reading and processing the images. There was no theoretical basis behind the 75x100 resizing resolution; it was picked somewhat arbitrarily, as a value that deviated from 450x600 to a great degree but not too much. Furthermore, reading the images as grayscale does reduce the number of features by a factor of 3, but it might have been more suitable to read the images with their colours and instead resize to a smaller resolution, such as 45x60. These questions are left unanswered, as no study of the optimal balance between these two feature reductions was conducted.

5.3 Retrospective

The report investigates two research questions. The second research question asks whether new underlying structures can be found using clustering algorithms; the first explores whether these structures ultimately refer back to the already existing ground truth labels provided by dermatologists. In retrospect, it may have been better to focus on one or the other to allow for a deeper analysis. While they are linked, they also call for different focuses. When investigating the ground truth labels, for example, it may have been useful to examine more closely the characteristics of the medical labels. On the other hand, when attempting to uncover new underlying structures, more research on image segmentation would have been beneficial.

For future work, our recommendation is to focus on segmenting the images into a skin lesion part and a healthy part. This would decrease the dimensionality and remove the healthy skin from consideration in the clustering. The best approach may be to train a neural network to segment the skin lesions. A less complex approach may be to use a thresholding algorithm to differentiate the mostly similar healthy skin from the lesion, which is usually centered in the image, and then perform some kind of cutting algorithm to retrieve the lesion part. Alternatively, it may be possible to weight the skin lesion pixels higher than the surrounding skin rather than extract them.

If it is also possible to prepare more capable hardware, that would significantly decrease the time spent waiting on algorithms, allowing the researchers to spend more time testing different approaches and algorithms.


Chapter 6 Conclusion

Using the methods in this report, no resemblance could be found between the ground truth labels of the data set and the classes generated by the clustering algorithms. This is supported by both the low Rand index scores and the lack of discernible clusters when visualising the principal components of the images.

However, if the sources of error were rectified, it might have been possible to produce more representative cluster partitionings, which may have yielded a higher Rand index score.

Furthermore, the low Silhouette index scores do not indicate any underlying structure in the data. While the spectral clustering algorithm achieved better Silhouette index scores than the other algorithms, it did so by essentially assigning every image to a single cluster. This Silhouette index score implies that the underlying structure may be best viewed as a single cluster. Applying a segmentation algorithm of some kind could, however, have helped eliminate homogeneity in the data, thus yielding a more meaningful cluster partitioning. For future work, such an implementation is worthy of consideration.


Bibliography

[1] American Cancer Society. Cancer Facts & Figures. 2019.

[2] Konstantina Kourou et al. "Machine learning applications in cancer prognosis and prediction". In: Computational and Structural Biotechnology Journal 13 (2015), pp. 8-17. doi: 10.1016/j.csbj.2014.11.005.

[3] Stephen Marsland. Machine Learning: An Algorithmic Perspective. Chapman and Hall/CRC, 2009.

[4] Anil K. Jain. "Data clustering: 50 years beyond K-means". In: Pattern Recognition Letters 31.8 (2010), pp. 651-666. doi: 10.1016/j.patrec.2009.09.011.

[5] Andre Esteva et al. "Dermatologist-level classification of skin cancer with deep neural networks". In: Nature 542 (Jan. 2017). doi: 10.1038/nature21056.

[6] H. A. Haenssle et al. "Man against machine: diagnostic performance of a deep learning convolutional neural network for dermoscopic melanoma recognition in comparison to 58 dermatologists". In: Annals of Oncology 29.8 (May 2018), pp. 1836-1842.

[7] Chip M. Lynch, Victor H. van Berkel, and Hermann B. Frieboes. "Application of unsupervised analysis techniques to lung cancer patient data". In: PLOS ONE 12 (Sept. 2017), pp. 1-18. doi: 10.1371/journal.pone.0184370.

[8] Marcilio C. P. de Souto et al. "Clustering cancer gene expression data: a comparative study". In: BMC Bioinformatics 9.1 (Nov. 2008), p. 497. doi: 10.1186/1471-2105-9-497.

[9] Philipp Tschandl, Cliff Rosendahl, and Harald Kittler. "The HAM10000 Dataset: A Large Collection of Multi-Source Dermatoscopic Images of Common Pigmented Skin Lesions". In: Scientific Data 5 (Mar. 2018). doi: 10.1038/sdata.2018.161.

[10] Peter J. Rousseeuw. "Silhouettes: A graphical aid to the interpretation and validation of cluster analysis". In: Journal of Computational and Applied Mathematics 20 (1987), pp. 53-65. doi: 10.1016/0377-0427(87)90125-7.

[11] Mehryar Mohri, Afshin Rostamizadeh, and Ameet Talwalkar. Foundations of Machine Learning. The MIT Press, 2012.

[12] Anil K. Jain and Richard C. Dubes. Algorithms for Clustering Data. Upper Saddle River, NJ, USA: Prentice-Hall, 1988. isbn: 0-13-022278-X.

[13] Oded Maimon and Lior Rokach. Data Mining and Knowledge Discovery Handbook. 2nd ed. Springer, 2010. isbn: 9780387098227.

[14] Niruhan Viswarupan. K-Means Data Clustering. 2017. url: https://towardsdatascience.com/k-means-data-clustering-bce3335d2203.

[15] Andrew Y. Ng, Michael I. Jordan, and Yair Weiss. "On Spectral Clustering: Analysis and an Algorithm". In: Proceedings of the 14th International Conference on Neural Information Processing Systems: Natural and Synthetic. NIPS'01. Vancouver, British Columbia, Canada: MIT Press, 2001, pp. 849-856.

[16] Yoshua Bengio et al. "Out-of-sample Extensions for LLE, Isomap, MDS, Eigenmaps, and Spectral Clustering". In: Proceedings of the 16th International Conference on Neural Information Processing Systems. NIPS'03. Whistler, British Columbia, Canada: MIT Press, 2003, pp. 177-184.

[17] Malay K. Pakhira. "Finding Number of Clusters before Finding Clusters". In: Procedia Technology 4 (2012), pp. 27-37. doi: 10.1016/j.protcy.2012.05.004.

[18] William M. Rand. "Objective Criteria for the Evaluation of Clustering Methods". In: Journal of the American Statistical Association 66.336 (1971), pp. 846-850. url: http://www.jstor.org/stable/2284239.

[19] Gareth James et al. "Unsupervised Learning". In: An Introduction to Statistical Learning: with Applications in R. New York, NY: Springer New York, 2013, pp. 373-418. doi: 10.1007/978-1-4614-7138-7_10.

[20] Ian Jolliffe. "Principal Component Analysis". In: International Encyclopedia of Statistical Science. Ed. by Miodrag Lovric. Berlin, Heidelberg: Springer, 2011, pp. 1094-1096. doi: 10.1007/978-3-642-04898-2_455.

[21] Louis Ferré. "Selection of components in principal component analysis: A comparison of methods". In: Computational Statistics & Data Analysis 19.6 (1995), pp. 669-682. doi: 10.1016/0167-9473(94)00020-J.

[22] Wikipedia. Scree plot. url: https://en.wikipedia.org/wiki/Scree_plot.

[23] Arindam Bhattacharjee et al. "Classification of human lung carcinomas by mRNA expression profiling reveals distinct adenocarcinoma subclasses". In: Proceedings of the National Academy of Sciences 98.24 (2001), pp. 13790-13795. doi: 10.1073/pnas.191502998.

[24] G. B. Coleman and H. C. Andrews. "Image segmentation by clustering". In: Proceedings of the IEEE 67.5 (May 1979), pp. 773-785. doi: 10.1109/PROC.1979.11327.

[25] Bing Nan Li et al. "Integrating spatial fuzzy clustering with level set methods for automated medical image segmentation". In: Computers in Biology and Medicine 41.1 (2011), pp. 1-10. doi: 10.1016/j.compbiomed.2010.10.007.

[26] H. P. Ng et al. "Medical Image Segmentation Using K-Means Clustering and Improved Watershed Algorithm". In: 2006 IEEE Southwest Symposium on Image Analysis and Interpretation. Mar. 2006, pp. 61-65. doi: 10.1109/SSIAI.2006.1633722.

[27] Fabian Pedregosa et al. "Scikit-learn: Machine Learning in Python". In: Journal of Machine Learning Research 12 (Nov. 2011), pp. 2825-2830.

[28] Gary Bradski and Adrian Kaehler. Learning OpenCV: Computer Vision with the OpenCV Library. O'Reilly Media, 2008.


TRITA-EECS-EX-2019:397
