
DEGREE PROJECT IN MEDICAL ENGINEERING, SECOND CYCLE, 30 CREDITS
STOCKHOLM, SWEDEN 2020

Segmentation of cancer epithelium using nuclei morphology with Deep Neural Network

OSHEEN SHARMA

KTH ROYAL INSTITUTE OF TECHNOLOGY

SCHOOL OF ENGINEERING SCIENCES IN CHEMISTRY, BIOTECHNOLOGY AND HEALTH


Segmentation of cancer epithelium using nuclei morphology with Deep Neural Network

OSHEEN SHARMA

Master in Medical Engineering
Date: June 3, 2020

Supervisor: Riku Turkki
Examiner: Matilda Larsson

School of Engineering Sciences in Chemistry, Biotechnology and Health

Host company: Science for Life Laboratory (SciLifeLab)

Swedish title: Segmentering av cancerepitel utifrån kärnmorfologi med djupinlärning


Abstract

Bladder cancer (BCa) is the fourth most commonly diagnosed cancer in men and the eighth most common in women. It is an abnormal growth of tissue that develops in the bladder lining. Histological analysis of bladder tissue facilitates diagnosis and serves as an important tool for research. To better understand the molecular profile of bladder cancer and to detect predictive and prognostic features, microscopy methods, such as immunofluorescence (IF), are used to investigate the characteristics of bladder cancer tissue.

In this project, a new method is proposed to segment cancer epithelium using nuclei morphology captured with IF staining. The method is implemented using deep learning algorithms, and the performance achieved is compared with the literature. The dataset is stained for nuclei (DAPI) and with a marker for cancer epithelium (panEPI), which was used to create the ground truth. Three popular Convolutional Neural Networks (CNNs), namely U-Net, Residual U-Net and VGG16, were implemented to perform the segmentation task on the tissue microarray dataset. In addition, a transfer learning approach was tested with the VGG16 network pre-trained on the ImageNet dataset.

Further, the performance of the three networks was compared using 3-fold cross-validation. The Dice accuracies achieved were 83.32% for U-Net, 88.05% for Residual U-Net and 82.73% for VGG16. These findings suggest that segmentation of cancerous tissue regions using only the nuclear morphology is feasible with high accuracy. Computer vision methods that better utilize the nuclear morphology captured by the nuclear stain are promising approaches to digitally augment conventional IF marker panels, and therefore offer improved resolution of the molecular characteristics in research settings.

Keywords: histological, morphology, Convolutional Neural Network, segmentation, bladder cancer


Sammanfattning

Urinblåsecancer (BCa) är den fjärde vanligaste cancerdiagnosen hos män och den åttonde vanligaste cancerdiagnosen bland kvinnor. I majoriteten av fallen innebär diagnosen en onormal vävnadstillväxt i det cellager som täcker insidan av urinblåsan. Histologisk analys av vävnad fungerar som ett viktigt verktyg för diagnos och forskning. För att bättre förstå den molekylära profilen och för att upptäcka prediktiva och prognostiska egenskaper används mikroskopimetoder, såsom immunfluorescens (IF).

För detta projekt föreslås en ny metod för att segmentera cancerepitel med kärnmorfologi efter IF-färgning. Metoden implementeras med hjälp av djupinlärningsalgoritmer och resultaten jämförs med litteraturen. Datasetet är färgat för cellkärnor (DAPI) och med en bred markör för cancerepitel (panEPI), som användes för att skapa en så kallad "ground truth". Tre populära "Convolutional Neural Networks" (CNN), U-Net, Residual U-Net och VGG16, implementerades för att utföra segmenteringen i mikroarraydatasetet. Dessutom testades en metod för "transfer learning" med VGG16-nätverket som förtränades med ImageNet-datasetet.

Vidare jämfördes prestanda från de tre nätverken med hjälp av trefaldig korsvalidering. De uppnådda tärningsnoggrannheterna var 83,32% för U-Net, 88,05% för Residual U-Net och 82,73% för VGG16. Dessa fynd tyder på att segmentering av cancervävnadsregioner, med endast kärnmorfologi, är möjlig med hög noggrannhet. Datorvisionsmetoder som bättre utnyttjar kärnmorfologi, där kärninfärgning tillämpats, är lovande metoder för att digitalt förstärka de konventionella IF-markörpanelerna och erbjuder därför förbättrad upplösning av molekylära egenskaper som kan användas i forskning.


Acknowledgement

I would like to express my sincere gratitude to all the people who supported me during my master thesis. First of all, I would like to thank my supervisor Riku Turkki for his constant encouragement and guidance. He walked me through all the stages of my thesis study and always gave me feedback to improve my work. I would also like to thank Olli Kallioniemi's group at SciLifeLab for giving me the opportunity to conduct my thesis work in their group and for always providing a nice and positive environment.

Further, I would also like to thank my reviewer at KTH, Chunliang Wang, for helping me and providing his knowledge and suggestions throughout the thesis work.

I thank my fellow friends at KTH Royal Institute of Technology: Blanca Cabrera Gil, Blanca Bastardés and Andrés, for the stimulating discussions, their feedback and constant support throughout my thesis work.

I would like to give a special thanks to Nitish Sehgal for his constant encouragement, support and valuable inputs.

And most importantly, I am extremely grateful to my wonderful parents. This project and my master's education would not have been possible without their unconditional love, support, guidance and trust in me. Their valuable suggestions were crucial throughout the entire degree. I owe everything to them.


List of Abbreviations

BCa: Bladder Cancer
IHC: Immunohistochemistry
H&E: Hematoxylin & Eosin
MRI: Magnetic Resonance Imaging
TMA: Tissue Microarray
IF: Immunofluorescence
DCNN: Deep Convolutional Neural Network
CNN: Convolutional Neural Network
IoU: Intersection over Union
ROI: Region of Interest


Contents

1 Introduction
  1.1 Aim

2 Methods
  2.1 Materials and Tools
    2.1.1 Dataset
    2.1.2 Ground Truth Generation
    2.1.3 Image Tiling
  2.2 Deep Convolutional Neural Network
    2.2.1 U-Net
    2.2.2 Residual U-Net
    2.2.3 VGG16
  2.3 Implementation Details
    2.3.1 Setting hyper-parameters
    2.3.2 Training
    2.3.3 Evaluation Metrics
    2.3.4 Cross Validation
  2.4 Computer Specification & Software

3 Results
  3.1 Ground Truth Generation
  3.2 Experiment 1: Variation in U-net Depth
  3.3 Experiment 2: Training on Downsampled Dataset
  3.4 Experiment 3: Training on Cropped Dataset
    3.4.1 Varying the resolution of tiles
    3.4.2 Effect of data augmentation
  3.5 Experiment 4: VGG16
  3.6 Comparison between the Architectures using Cross-Validation
  3.7 Prediction on Test Dataset

4 Discussion
  4.1 Impact of Tiling the Images
  4.2 Impact of Downsampling and Cropping
  4.3 Project Limitations
  4.4 Future Work

5 Conclusions

Bibliography

A State of Art Review
  A.1 Introduction
  A.2 Histopathology Images: Slide Preparation and Digitization
  A.3 Clinical Approach
    A.3.1 Segmentation Technique
    A.3.2 Grading and Staging of the tumor
  A.4 Deep Learning Approach
    A.4.1 Convolutional Neural Network
    A.4.2 Preprocessing of the Histopathology Images
    A.4.3 Deep Learning used in Classification and Segmentation
  A.5 Conclusion


List of Figures

2.1 DAPI channel and epithelial channel highlighting the cancer regions
2.2 High resolution full size image to tiles
2.3 Tiles from downsampled images and cropped images
2.4 Augmentation result on one tile with batch size of 9
3.1 Ground truth mask
3.2 Learning curve for downsampled images
3.3 Dice accuracy plots for cropped images with tile resolution of 512x512 with data augmentation
3.4 Prediction on full size downsampled images, showing that the model (U-Net) has difficulty learning and that the raw image tampers with the predicted mask
3.5 Prediction on full size images; (a)-(b) show a prediction without thresholding and (c)-(f) show thresholded predictions
A.1 Convolutional Neural Network Architecture
A.2 U-Net model architecture [13]
A.3 VGG16 transfer learning model architecture [18]
A.4 Residual U-Net model architecture [64]


List of Tables

2.1 Different hyper-parameter settings tried
3.1 Metrics accuracy scores on varied depths
3.2 Metric scores on training and validation set (downsampled images)
3.3 Dice accuracy on varying the resolution of tiles for both U-Net and Residual U-Net
3.4 Metrics accuracy scores on augmented dataset
3.5 Metric scores achieved on training the VGG16 model
3.6 Comparison using cross-validation
A.1 Declaration of Staging Criteria


Chapter 1

Introduction

Bladder cancer (BCa) is an abnormal growth of tissue which develops in the bladder lining. It is more commonly found in men than in women, due to the presence of plentiful androgen receptors [1]. The pathogenesis of the cancer can be due to genetic factors or to lifestyle and environmental factors. Smoking is one of the causes of this type of cancer and increases the risk by two to four times [2]. Diagnostic methods for bladder cancer include cystoscopy, cytology, biopsy and imaging tests [3]. Biopsy is still a very common procedure for examining and closely visualizing tissue or cell samples acquired from the living body. The samples obtained from the biopsy are examined under a microscope to classify the tumor as low grade or high grade. Microscopy is also an essential tool in research efforts to better understand the biology behind bladder cancer.

Histopathological images are microscopic images that are used in clinical pathology for the diagnosis of disease. To prepare and digitize the slides, different staining procedures are adopted, which help to highlight the features and contrast of the tissues. The staining procedure makes it easier for the observer to differentiate between healthy cells and cancerous cells, as it marks nuclei and proteins with different colors. Hematoxylin (H) is one such stain, used to enhance the contrast of the nuclei in the sample and color them blue, while eosin (E) stains the cytoplasm and extracellular structures pink [4]. H&E staining makes use of information based on the morphology of the cells to characterize tumors. Immunohistochemistry (IHC) is another staining method, which uses antibodies to detect the presence of proteins. Labeling of antibodies with fluorescent dyes is called immunofluorescence staining. A microscope is used to visualize the compounds that emit light of a certain wavelength when illuminated by light of a shorter wavelength [5]. These processes of converting raw slides into a digital format take around 13 to 24 hours, whereas other imaging modalities such as X-ray or Magnetic Resonance Imaging (MRI) are less time consuming.

The analysis of histopathological images in a quantitative and reproducible manner is a very challenging task for human observers. The drawbacks associated with manual analysis include efficiency, reproducibility and human factors. Manual segmentation of tumor cells from healthy cells can be inaccurate, as it depends on the visual estimation of each observer [6], and different observers may apply their own rules for classifying the tissues. Hence, computer vision and deep learning approaches are adopted, which have the ability to quantify and better interpret imaging data, and also assist observers by allowing them to compare classification and segmentation results with those obtained from the automated process [7]. Additionally, computer vision eliminates the need for manual and exhaustive inspection of the samples by the pathologists.

Recently, many advances have been made in the pathology field to automate the visualization and analysis of tissue sample images. In research work conducted by M. Kowal et al. [8], stained and digitized tissue samples were used to segment nuclei, achieving an accuracy of 83%. S. Graham et al. [9] implemented ResNet50 to segment nuclei from H&E stained samples and reported a Dice score of 80.01%. S. Sornapudi et al. also proposed an automated segmentation method using a neural network to segment nuclei from stained histopathological images and achieved a Dice accuracy of 98.02%. Research on segmenting epithelial tissue has also shown promising results in prostate and bladder cancer samples. W. Bulten et al. proposed a deep learning approach to segment epithelium from H&E stained prostatectomy slides using an IHC reference standard [10]; U-Net was the neural network implemented in their work and an F1 score of 89.3% was reported. Similar work on segmenting epithelium from prostate tissue biopsy samples was reported by Qinqin Yang et al., who achieved a pixel accuracy of 92.3% [11]. What these studies have in common is the type of dataset used: in all of them, the structure of interest is explicitly stained. In pathology, however, the samples can have a limited staining panel to label the biomarkers, which limits the information that can be retrieved from the tissue or cell samples. A method that enables pathologists to obtain such additional information from the samples would therefore give access to relevant information that is hard to find with limited staining procedures.

1.1 Aim

This project aims to:

• Prepare the dataset to make it appropriate for training a neural network.

• Evaluate the ability of neural networks to segment cancer epithelium using nuclei morphology only.


Chapter 2

Methods

This chapter discusses the materials and tools that were used to accomplish the aim of the project, as well as the implementation and training of the neural network models.

2.1 Materials and Tools

2.1.1 Dataset

The dataset consists of bladder cancer tissue microarray (TMA) samples collected in Sweden between the years 2001 and 2015. Using immunofluorescence (IF) techniques, the tissue samples were stained at the Institute for Molecular Medicine Finland with a nuclear marker (DNA staining with DAPI) and a pan-epithelium (PanEpi) marker (a cocktail of e-cadherin and pan-cytokeratins c11 and ae1/ae3), as shown in Figure 2.1. In total the dataset consists of 374 cancer TMA spots, of which only 263 samples were chosen for this project. The reason for eliminating samples was image quality issues that might have affected the training negatively. All images were saved in .tiff format.

2.1.2 Ground Truth Generation

Figure 2.1: DAPI channel and epithelial channel highlighting the cancer regions. (a) DAPI image; (b) epithelial mask.

We use the DAPI stained images as input and binary masks created from the PanEpi signal as the target. The Ilastik software by Stuart Berg et al. [12] was used to binarize the PanEpi signal, assigning the epithelial (cancer) regions the value 1 and the rest of the image the value 0. The software utilizes a random forest classifier to segment the image based on local image features. The following steps were carried out to generate the ground truth:

• Load the data into Ilastik.

• Select features based on intensity, edge and texture.

• Train the classifier: this requires user annotations. Two labels were used: background (0) and epithelial (1).

• The feature called live update enables the user to visualize the perfor- mance of the software and correct the manual annotations if required.

2.1.3 Image Tiling

Once the masks were prepared, the next step was to create tiles for both images and masks. The purpose of creating tiles from a high resolution image is to divide the image into small patches that are used to train the network. These tiles are later merged back to visualize the segmentation achieved by the neural network on the full size image. High resolution images from other modalities, such as MRI or X-ray, can usually be resized to a certain scale and then used for training the model. However, resizing histopathological images in this way leads to loss of information, as the relevant structures are at a microscopic level. To this end, two approaches were carried out in the project to study the impact of downsampling on the microscopic images.

In the first approach, the images of dimension 7296x5376 were resized to 12.5% of their original size, resulting in images of size 912x672, from which 25 tiles of size 128x128 were cut, as illustrated in Figure 2.3a. In the second approach, the high resolution images were cropped instead of downsampled: the images were first made square with dimension 5250x5250, after which tiling resulted in 25 tiles of size 1050x1050 each (Figure 2.3b).

Additionally, all the input images were normalized to the range of [0,1].

Figure 2.2: High resolution full size image to tiles

Figure 2.3: Tiles from downsampled images (a) and cropped images (b).
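As a rough illustration of these two preprocessing routes, the sketch below (not the project's actual code; the helper names and the 5x5 grid are assumptions) downsamples an image or crops it to a centred square and then cuts it into equally sized tiles normalized to [0, 1], assuming 8-bit input images.

import cv2
import numpy as np

def downsample(img, scale=0.125):
    # Approach 1: resize the full image to 12.5% of its original size.
    h, w = img.shape[:2]
    return cv2.resize(img, (int(w * scale), int(h * scale)),
                      interpolation=cv2.INTER_NEAREST)

def center_crop_square(img, side=5250):
    # Approach 2: crop a central square instead of downsampling.
    h, w = img.shape[:2]
    top, left = (h - side) // 2, (w - side) // 2
    return img[top:top + side, left:left + side]

def to_tiles(img, grid=5):
    # Cut the image into a grid x grid set of equally sized tiles,
    # normalized to [0, 1] before training.
    h, w = img.shape[:2]
    th, tw = h // grid, w // grid
    tiles = [img[r * th:(r + 1) * th, c * tw:(c + 1) * tw].astype(np.float32) / 255.0
             for r in range(grid) for c in range(grid)]
    return np.stack(tiles)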

2.2 Deep Convolutional Neural Network

A Deep Convolutional Neural Network (DCNN) is an algorithm in which the input is passed through a stack of layers. Usually, the layers perform three main operations, i.e. convolution, max pooling and activation. The following sub-sections give a brief introduction to the three neural networks that were used in this project. The model architectures for each of the three networks can be found in Appendix A.

2.2.1 U-Net

For this project, the first CNN chosen was U-Net, introduced by Ronneberger et al. [13] for biomedical image segmentation tasks. A few changes were introduced in the architecture to train the network, as shown in Figure A.2. The network consists of a contracting and an expanding path. The contracting path comprises 3x3 convolutional layers, ReLU activations and 2x2 max pooling with a stride of 2. The contracting path alone only provides information about what is present in an image; for segmentation we also need to know where the object is located. To recover this spatial information, the expanding (upsampling) path converts the low resolution feature maps back to high resolution and concatenates them with the feature maps from the corresponding level of the contracting path.

There are several reasons for choosing this model. It is easy to implement in Keras and fast to compute. Additionally, it is effective even when the dataset is small, because data augmentation techniques can be used to improve model performance [14]. Also, the network was originally designed for biomedical images, where it has been very successful in image segmentation tasks.
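A minimal Keras sketch of a U-Net-style encoder-decoder of the kind described above is shown below. The depth, filter counts and regularization actually used in the project are those listed in Table 2.1; the values here are illustrative only.

from tensorflow.keras import layers, models

def conv_block(x, filters):
    # Two 3x3 convolutions with ReLU, as in the contracting path.
    x = layers.Conv2D(filters, 3, padding="same", activation="relu")(x)
    x = layers.Conv2D(filters, 3, padding="same", activation="relu")(x)
    return x

def build_unet(input_shape=(512, 512, 1), filters=(16, 32, 64, 128, 256)):
    inputs = layers.Input(shape=input_shape)
    skips, x = [], inputs
    # Contracting path: convolution blocks followed by 2x2 max pooling.
    for f in filters[:-1]:
        x = conv_block(x, f)
        skips.append(x)
        x = layers.MaxPooling2D(2)(x)
    x = conv_block(x, filters[-1])          # bottleneck
    # Expanding path: upsample and concatenate with the matching skip connection.
    for f, skip in zip(reversed(filters[:-1]), reversed(skips)):
        x = layers.Conv2DTranspose(f, 2, strides=2, padding="same")(x)
        x = layers.concatenate([x, skip])
        x = conv_block(x, f)
    # 1x1 convolution with sigmoid for the binary epithelium mask.
    outputs = layers.Conv2D(1, 1, activation="sigmoid")(x)
    return models.Model(inputs, outputs)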

2.2.2 Residual U-Net

In addition to the U-Net, a Residual U-Net was implemented in this project, which takes advantage of both the U-Net layout and residual blocks [15]. The residual connections ease the training of the network. The skip connections within a residual unit facilitate information propagation without degradation, making it possible to maintain good performance with far fewer parameters [16]. A residual block consists of 3x3 convolutional layers and an identity mapping that connects its input and output [17].
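The sketch below shows one possible Keras implementation of a single residual unit of this kind; it is illustrative only and not the exact block used in the project. The convolutional path is added to a shortcut branch, with a 1x1 convolution adapting the shortcut when the number of channels changes.

from tensorflow.keras import layers

def residual_block(x, filters):
    # Shortcut branch: 1x1 convolution so the channel counts match for the addition.
    shortcut = layers.Conv2D(filters, 1, padding="same")(x)
    # Main branch: batch normalization, ReLU and two 3x3 convolutions.
    y = layers.BatchNormalization()(x)
    y = layers.Activation("relu")(y)
    y = layers.Conv2D(filters, 3, padding="same")(y)
    y = layers.BatchNormalization()(y)
    y = layers.Activation("relu")(y)
    y = layers.Conv2D(filters, 3, padding="same")(y)
    # Identity mapping: add the shortcut to the convolutional output.
    return layers.add([y, shortcut])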

2.2.3 VGG16

VGG16 is a CNN model that was introduced by K. Simonyan and A. Zisserman. It consists of 13 convolutional layers and 3 fully connected layers. The network is best suited for classification tasks, whereas in this project the focus is on segmenting the input image. Hence, after the 13 convolutional layers, the last 3 dense layers were replaced by de-convolutional layers, also known as transpose convolutional layers, to preserve the 2D structure of the input images instead of flattening them [18]. This method takes advantage of transfer learning, where a network pretrained on ImageNet is adapted to the new dataset and application.
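The following sketch illustrates this idea: an ImageNet-pretrained VGG16 convolutional base followed by transpose-convolution layers that restore the spatial resolution. The decoder layout, filter counts and frozen base are assumptions for illustration, not the exact architecture used in the project; note also that single-channel DAPI tiles would have to be replicated across three channels to match the pretrained weights.

from tensorflow.keras import layers, models
from tensorflow.keras.applications import VGG16

def build_vgg16_segmenter(input_shape=(128, 128, 3)):
    # Pretrained VGG16 convolutional base (dense layers removed).
    # Single-channel DAPI tiles would need to be repeated across 3 channels.
    base = VGG16(weights="imagenet", include_top=False, input_shape=input_shape)
    base.trainable = False                  # transfer learning: freeze the encoder
    x = base.output                         # 4x4 feature map for a 128x128 input
    # Decoder: transpose convolutions restore the 2D structure instead of flattening.
    for filters in (256, 128, 64, 32, 16):
        x = layers.Conv2DTranspose(filters, 3, strides=2, padding="same",
                                   activation="relu")(x)
    outputs = layers.Conv2D(1, 1, activation="sigmoid")(x)
    return models.Model(base.input, outputs)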

2.3 Implementation Details

To split the data, the train_test_split function from scikit-learn's model selection module [19] was used to randomly divide the whole dataset into training, validation and testing sets. In total there are 263 images and their corresponding masks. A patch based method was implemented as one of the preprocessing steps, dividing both images and masks into small square tiles; these tiles were used to train the neural network instead of the full size images. Figure 2.2 illustrates how a full size image looks after being divided into tiles. Each image and mask had 25 tiles, so in total 6575 tiles were generated. These tiles were split so that 4700 were used for training, 1200 for validation and 675 for testing, with no tile appearing in more than one set. Additionally, this same distribution (i.e. the split) was maintained throughout the experiments, which enables a fair comparison of the accuracy results of the networks.

The resizing of the image tiles was done using OpenCV with INTER_NEAREST interpolation. To find the best combination of hyper-parameters (as given in Table 2.1), different tests were performed for both U-Net and Residual U-Net. After the splitting, both the images and their corresponding masks were converted into NumPy arrays.
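The sketch below illustrates the splitting and resizing steps just described, assuming the tiles and masks have already been stacked into NumPy arrays; the function and variable names are illustrative, and the fixed split sizes follow the counts given above.

import cv2
import numpy as np
from sklearn.model_selection import train_test_split

def split_and_resize(images, masks, size=512, seed=1):
    # Nearest-neighbour interpolation keeps the resized masks strictly binary.
    images = np.array([cv2.resize(im, (size, size), interpolation=cv2.INTER_NEAREST)
                       for im in images])
    masks = np.array([cv2.resize(m, (size, size), interpolation=cv2.INTER_NEAREST)
                      for m in masks])
    # Hold out 675 test tiles, then split the rest into 4700 training and 1200
    # validation tiles; a fixed seed keeps the same distribution in all experiments.
    x_trval, x_test, y_trval, y_test = train_test_split(
        images, masks, test_size=675, random_state=seed)
    x_train, x_val, y_train, y_val = train_test_split(
        x_trval, y_trval, test_size=1200, random_state=seed)
    return (x_train, y_train), (x_val, y_val), (x_test, y_test)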

2.3.1 Setting hyper-parameters

The hyper-parameters are the variables that govern the entire training process of the neural network. For the project, different hyper-parameter settings were tested, including the learning rate, number of epochs, optimizer, image resolution, augmentation, dropout, batch normalization and batch size. The batch size describes how many samples are propagated through the network in each iteration. Tests with different hyper-parameter settings were conducted to find the combination that resulted in the highest Dice score. The accuracy metrics at the lowest validation loss were recorded along with the epoch number. Table 2.1 lists the hyper-parameter settings that were tried.

Hyper-parameter        Settings
Learning Rate          0.001, 0.0001, 0.00001
Optimizer              Adam, SGD
Epochs                 100, 200, 250, 300
Resolution             128x128, 256x256, 512x512
Augmentation           Horizontal and vertical flip
Batch Normalization    True
Dropout                0.1, 0.5
Depth of the U-Net     3, 4, 5
Filters per layer      (16,32,64,128), (16,32,64,128,256), (16,32,64,128,256,512)

Table 2.1: Different hyper-parameter settings tried

2.3.2 Training

In the first trial, the U-Net model was trained with the downsampled image dataset. To further explore the impact of downsampling, training was then carried out on the cropped image dataset for both U-Net and Residual U-Net. Data augmentation was implemented with horizontal and vertical flips applied to both the images and their corresponding masks, as shown in Figure 2.4. The augmentation values were given as ranges in accordance with the Keras 2.0 ImageDataGenerator class API, and a batch size of 8 was used for the cropped images and 16 for the downsampled images, with a seed of 1. A lower batch size was used for the cropped images because of their higher resolution, to prevent memory errors [20]. Additionally, to avoid over-fitting, dropout was introduced in all the architectures, along with batch normalization to improve the training speed. To compute the loss, the binary_crossentropy function from the Keras library was used; this loss function was chosen because there are two classes in the target set (i.e. 0 for background and 1 for epithelium).
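A sketch of the augmentation and loss setup described above is shown below. It uses the Keras ImageDataGenerator with horizontal and vertical flips applied to images and masks through a shared seed; the function name and the commented training call are illustrative rather than the project's exact code.

from tensorflow.keras.preprocessing.image import ImageDataGenerator

def make_train_generator(x_train, y_train, batch_size=8, seed=1):
    # The same flip augmentation is applied to images and masks;
    # a shared seed keeps the two generators aligned batch by batch.
    aug = dict(horizontal_flip=True, vertical_flip=True)
    image_gen = ImageDataGenerator(**aug).flow(x_train, batch_size=batch_size, seed=seed)
    mask_gen = ImageDataGenerator(**aug).flow(y_train, batch_size=batch_size, seed=seed)
    return zip(image_gen, mask_gen)

# Schematic training call:
# model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])
# model.fit(make_train_generator(x_train, y_train),
#           steps_per_epoch=len(x_train) // 8, epochs=300)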


Figure 2.4: Augmentation result on one tile with batch size of 9

2.3.3 Evaluation Metrics

An appropriate evaluation technique is of utmost importance in both clinical decision making and algorithm design. The evaluation in this project is carried out using three metrics: Dice coefficient, precision and recall. The detailed formulas for the metrics are found in Appendix A.
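For reference, the standard pixel-wise definitions of these metrics in terms of true positives (TP), false positives (FP) and false negatives (FN) are given below; the appendix of the original report contains the author's exact formulations.

\text{Precision} = \frac{TP}{TP + FP}, \qquad
\text{Recall} = \frac{TP}{TP + FN}, \qquad
\text{Dice} = \frac{2\,TP}{2\,TP + FP + FN} = \frac{2\,|X \cap Y|}{|X| + |Y|}

where X is the predicted epithelium region and Y is the ground truth mask.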

2.3.4 Cross Validation

Cross validation is a method used to evaluate the performance of a neural network on a limited dataset. The method has a parameter k, which defines the number of groups into which the dataset is split. After the splitting, one fold is used for validation while the other folds are used for training the network. This process is repeated for each fold, and finally the average and standard deviation of the acquired accuracies are computed. A major advantage of this technique is that it reduces the risk of over-fitting to a particular training split.
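A sketch of this k-fold procedure using scikit-learn's KFold is given below; the build_model callable and the schematic fit/evaluate calls are placeholders rather than the project's actual training loop.

import numpy as np
from sklearn.model_selection import KFold

def cross_validate(images, masks, build_model, k=3):
    # Each of the k folds serves once as the validation set.
    scores = []
    for train_idx, val_idx in KFold(n_splits=k, shuffle=True, random_state=1).split(images):
        model = build_model()
        model.fit(images[train_idx], masks[train_idx],
                  validation_data=(images[val_idx], masks[val_idx]))
        scores.append(model.evaluate(images[val_idx], masks[val_idx], verbose=0))
    # Mean and standard deviation over the folds, as reported in Table 3.6.
    return np.mean(scores, axis=0), np.std(scores, axis=0)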

2.4 Computer Specification & Software

The analysis computer had two RTX 2080 Ti GPUs, 128 GB of RAM and an i9 processor, and was running Ubuntu Linux 18.04 LTS. To construct the network architectures, the Keras framework was used with TensorFlow as the underlying library.


Chapter 3

Results

3.1 Ground Truth Generation

To avoid manual pixel level annotation of the cancer epithelium, ground truth masks were created from the PanEpi stain using semantic segmentation. The Ilastik software with its pixel classification workflow was used to create the binary masks. The algorithm assigns labels to pixels based on image features and user annotations. The segmentation resulted in one epithelial mask per image, where epithelial regions were labeled 1 and non-epithelial tissue and background were labeled 0. To visualize the masks, the Fiji (ImageJ) image analysis tool was used [21], as illustrated in Figure 3.1. Generating the binary masks took approximately 36 hours (1.5 days) of work with the Ilastik software, which was very time consuming.

Figure 3.1: Ground truth mask


3.2 Experiment 1: Variation in U-net Depth

The first experiment conducted in this project explored the impact of the network depth on the performance of the model. As shown in Table 3.1, three different depths were tried, and the best performance was achieved with a depth of 5 and filters per layer of (16, 32, 64, 128, 256, 512). The training was conducted on 128x128 input tiles for 100 epochs, and the Adam optimizer was chosen with a learning rate of 0.0001.

        Training                        Validation
Depth   Dice     Precision  Recall      Dice     Precision  Recall
3       0.7534   0.7461     0.7588      0.7557   0.7724     0.7851
4       0.7658   0.7486     0.7748      0.7668   0.7517     0.7922
5       0.7799   0.7901     0.7944      0.7815   0.7948     0.8005

Table 3.1: Metrics accuracy scores on varied depths

3.3 Experiment 2: Training on Downsampled Dataset

The second part of the project focused on training the model with downsampled images. Each raw image was resized to 12.5% of its original size using a Python library, and 128x128 tiles were then obtained from each image. The network was trained for 300 epochs; Figure 3.2 illustrates the learning curve and Table 3.2 shows the scores obtained after training the model. The accuracies were compared between the training and validation sets.

Evaluation Metric   Training Set   Validation Set
Dice                0.8945         0.9044
Precision           0.8972         0.8820
Recall              0.9183         0.8753

Table 3.2: Metric scores on training and validation set (downsampled images)


Figure 3.2: Learning curve for downsampled images

3.4 Experiment 3: Training on Cropped Dataset

Although the accuracies obtained with the downsampled images were promising, the predictions did not look good visually (shown in Figure 3.4). One of the reasons was the excessive downsampling of the images, which resulted in loss of information. Thus, a third experiment was conducted in which the raw images of size 7296x5376 were cropped instead of downsampled to obtain a square shape. This also simplified tiling, resulting in square tiles of size 1050x1050 each. The tiles were converted into NumPy arrays and, since this tile dimension is very large, they were further resized using the OpenCV resize function. The U-Net and Residual U-Net models were trained and compared on two different tile resolutions. It was observed that the Adam optimizer performed better than SGD, and hence all further tests were conducted with Adam as the optimizer and a learning rate of 1e-5.

The following experiments were conducted on both U-Net and Residual U-Net, and the performance was compared as shown in the following plots and tables.


3.4.1 Varying the resolution of tiles

The test was performed on different tile resolutions. It was observed that the best performance was achieved with a resolution of 512x512 by the Residual U-Net, as shown in Table 3.3.

Network          Resolution   Training Dice   Validation Dice   Testing Dice
U-Net            256x256      0.7772          0.7790            0.7887
U-Net            512x512      0.8607          0.8645            0.8720
Residual U-Net   256x256      0.8868          0.8727            0.9025
Residual U-Net   512x512      0.8965          0.8861            0.9120

Table 3.3: Dice accuracy on varying the resolution of tiles for both U-Net and Residual U-Net

3.4.2 Effect of data augmentation

The previous tests were done to decide which tile resolution is best for both models. To further improve the model performance and accuracy, data augmentation was then implemented and compared for both U-Net and Residual U-Net at the 512x512 tile resolution. We observed that the results obtained with the Residual U-Net were better than those of the baseline U-Net. The metric scores are reported in Table 3.4.

             U-Net                           Residual U-Net
Dataset      Dice     Precision  Recall      Dice     Precision  Recall
Training     0.8705   0.9317     0.8433      0.8721   0.9457     0.8990
Validation   0.8562   0.9285     0.8506      0.8627   0.9464     0.9010
Test         0.8849   0.9532     0.8620      0.9007   0.9683     0.9090

Table 3.4: Metrics accuracy scores on augmented dataset

Training loss for cropped and downsampled images

The lowest loss for the downsampled images was 0.2486 on the training set and 0.2532 on the validation set. For the cropped images, the lowest loss with U-Net was 0.3169 on the training set and 0.3243 on the validation set, and with Residual U-Net 0.3083 on the training set and 0.3173 on the validation set.


3.5 Experiment 4: VGG16

As the last part of the experiments, a transfer learning technique was implemented using VGG16 pretrained on the ImageNet dataset. A tile size of 128x128 was used instead of 512x512 to avoid memory issues. We observed that although the network reported accuracies similar to the Residual U-Net, it failed to predict the masks as well as the other two methods did. The accuracy metrics achieved by training VGG16 are shown in Table 3.5.

Evaluation Metric   Training Set   Validation Set   Test Set
Dice                0.8778         0.8380           0.8649
Precision           0.8872         0.8571           0.8275
Recall              0.9149         0.9027           0.9069

Table 3.5: Metric scores achieved on training the VGG16 model

3.6 Comparison between the Architectures using Cross-Validation

The three architectures were compared using 3-fold cross-validation. The results are shown in Table 3.6, from which it is clear that the overall best performance was achieved by the Residual U-Net, compared to U-Net and VGG16.

Network          Dice               Precision          Recall
U-Net            83.32 (±1.922)%    88.39 (±1.553)%    87.68 (±1.933)%
Residual U-Net   88.05 (±3.190)%    88.15 (±3.421)%    95.75 (±1.765)%
VGG16            82.73 (±1.767)%    83.14 (±2.633)%    92.28 (±1.135)%

Table 3.6: Comparison using cross-validation

3.7 Prediction on Test Dataset

The segmentation performance on the test dataset was first visualized by evaluating the U-Net model. The square tiles were merged back to visualize the segmentation result on the full size image. Figure 3.4 shows the prediction on the test dataset from downsampled images. As shown, the model is not able to predict cancer epithelium from the DAPI stained images. After performing Experiment 3 on the cropped image dataset, it was observed that both the U-Net and the Residual U-Net were able to segment cancer epithelium from the nuclei morphology, with the best segmentation obtained by the Residual U-Net. Figure 3.5 shows the prediction results on the test dataset after merging the tiles back into full size images.

Figure 3.3: Dice accuracy plots for cropped images with tile resolution of 512x512 with data augmentation. (a) U-Net; (b) Residual U-Net; (c) VGG16.
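To produce such full size visualizations, the predicted tiles can be stitched back into the original grid and thresholded into a binary mask, roughly as sketched below; the 5x5 grid and the 0.5 threshold are assumptions for illustration.

import numpy as np

def merge_and_threshold(pred_tiles, grid=5, threshold=0.5):
    # pred_tiles: array of shape (grid*grid, tile_h, tile_w, 1) with sigmoid outputs,
    # ordered row by row as the tiles were cut from the full image.
    tile_h, tile_w = pred_tiles.shape[1:3]
    full = np.zeros((grid * tile_h, grid * tile_w), dtype=np.float32)
    for i, tile in enumerate(pred_tiles):
        r, c = divmod(i, grid)
        full[r * tile_h:(r + 1) * tile_h, c * tile_w:(c + 1) * tile_w] = tile[..., 0]
    # Threshold the probabilities to obtain the final binary epithelium mask.
    return (full > threshold).astype(np.uint8)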

Figure 3.4: Prediction on full size downsampled images, showing that the model (U-Net) has difficulty learning and that the raw image tampers with (interferes with) the predicted mask.


Figure 3.5: Prediction on full size images, shown as pairs of ground truth mask and prediction. (a)-(b) show a prediction without thresholding; (c)-(f) show thresholded predictions.


Chapter 4

Discussion

The results reported in Section 3.3 and Section 3.4 show that the research question is well answered: the architectures tested in this project were able to segment cancer epithelium from nuclei morphology. However, for a few images the segmentation was not accurate, mainly because of low image quality and background debris. Additionally, to address the problem of the high resolution of histopathological images, the two approaches that were implemented (downsampling and cropping) showed a marked difference when evaluating the models on the test dataset. The validation loss for the downsampled images was 0.2532, which is much lower than the 0.3243 obtained for the cropped images. The learning curves for both datasets looked promising, but when the models were used for prediction it was observed that if the histopathological images are downsampled, the predicted mask looks as though the model has learned to reproduce the input image rather than the mask. This effect of downsampling on the prediction is referred to here as tampering, where the raw image tampers with (interferes with) the predicted mask. The predictions obtained with the VGG16 model were also not very accurate, and one of the reasons may be the very small tile size, which supports the argument made in Section 3.4: a very small tile size can give good numerical accuracy, but when the prediction is visualized on the full size image the input image tampers with the mask, resulting in poor predictions.

4.1 Impact of Tiling the Images

Tiling helps the model concentrate on small details and features, as tissue and cell samples contain small structures that are hard to segment if the full image is fed to the neural network. Additionally, this approach has the benefit of lower computation time, lower complexity and lower memory requirements. Tiling is needed to balance these requirements against image resolution.

4.2 Impact of Downsampling and Cropping

The results showed a clear improvement on the cropped images compared to the downsampled images. The reason for inaccurate predictions on downsampled images is the reduced image quality, which leads to a loss of information and fine detail from the tissue sample images. Histopathological images are viewed at high magnification to give a better understanding of the texture and morphology of the cells and tissues in the sample; if the images are downsampled, these small details are lost and it becomes harder for the network to learn from the morphology.

Cropping the image does not alter the morphology; instead, small patches are cut from the image so that it can be processed in parts. This preserves the details and fine features in the image and results in a better effective resolution, which enables the network to differentiate between non-epithelial and epithelial cells based on changes in the nuclear morphology only.

4.3 Project Limitations

The work also has some limitations. As the network is trained on a limited amount of data, it is difficult to overcome problems caused by the scanning of the tissue sample slides. For some images, the model did not segment the epithelium properly; this occurred when the immunofluorescence stain was faint or absent. The improper segmentation is also influenced by the chosen magnification level. For better results, a higher magnification should be used when segmenting cancer epithelium from DAPI channel images, which could also help lower the number of artifacts as the network can learn higher level shapes of the tissue. In addition, variations in stain concentration lead to variations in the intensity and brightness of the images. Segmenting individual epithelial cells requires input patches with enough detail to distinguish epithelial cells from the surrounding cells. One must also consider that for some cells it is simply impossible to assess their class using the immunofluorescence stain alone, especially in areas with active inflammation; for such cases, a perfect segmentation does not exist [10].

4.4 Future Work

In this project, data augmentation was used to cope with the limited amount of training data. It would be interesting to also implement brightness augmentation, altering the pixel intensities, which could allow the network to better learn the different shapes of the nuclei and epithelium. Furthermore, to visualize the segmentation on the whole image, the tiles were merged back, and the effect of tiling was visible in the full size image; it would be interesting to develop a solution that eliminates this tiling effect from the merged images. Additionally, other networks could be trained on the same dataset and their performance compared to determine which model is best suited for this kind of problem.


Chapter 5

Conclusions

This project focuses on data preparation and the implementation of neural network models to segment cancer epithelial tissue regions based only on nuclei morphology. To achieve this aim, hyper-parameter optimization was conducted for the U-Net and Residual U-Net models. Additionally, a transfer learning approach was explored and compared with the U-Net and Residual U-Net results. The novelty of this work lies in using only nuclear morphology to segment cancer epithelium in BCa.

The results show promising performance achieved by the Residual U-Net, with the highest Dice score of 90.07% on the test set and 86.27% on the validation set. The transfer learning approach did not perform as well as the Residual U-Net, and one possible cause is the small resolution of the tiles used to train the network. It could probably be improved by also using weights from shallower layers, which would help the network learn local patterns from the image.

This project provides a solution for segmenting cancer epithelium from nuclei morphology, and we think the proposed method could potentially be applied to assist pathologists with the diagnosis of BCa. However, more pathological data with variation (in terms of staining concentration, intensity variation, etc.) would be required to develop a more robust method that gives good results even when the scans contain artifacts.


Bibliography

[1] Yuan Chen et al. "Androgen receptor (AR) suppresses miRNA-145 to promote renal cell carcinoma (RCC) progression independent of VHL status". In: Oncotarget 6.31 (2015), pp. 31203–31215. doi: 10.18632/oncotarget.4522.

[2] Chao-Zhe Zhu et al. "A review on the accuracy of bladder cancer detection methods". In: Journal of Cancer 10.17 (July 2019), pp. 4038–4044. doi: 10.7150/jca.28989.

[3] Eline Oeyen et al. "Bladder Cancer Diagnosis and Follow-Up: The Current Status and Possible Role of Extracellular Vesicles". In: International Journal of Molecular Sciences 20.4 (Feb. 2019), p. 821. doi: 10.3390/ijms20040821.

[4] Hani A. Alturkistani, Faris M. Tashkandi, and Zuhair M. Mohammedsaleh. "Histological Stains: A Literature Review and Case Study". In: Global Journal of Health Science 8 (June 2015). doi: 10.5539/gjhs.v8n3p72.

[5] E. H. Beutner. "Immunofluorescent staining: the fluorescent antibody method". In: Bacteriological Reviews 25.1 (Mar. 1961), pp. 49–76.

[6] Thomas J. Fuchs and Joachim M. Buhmann. "Computational pathology: Challenges and promises for tissue analysis". In: Computerized Medical Imaging and Graphics 35.7-8 (Oct. 2011), pp. 515–530. doi: 10.1016/j.compmedimag.2011.02.006.

[7] Angel Cruz-Roa et al. "Automatic detection of invasive ductal carcinoma in whole slide images with convolutional neural networks". In: Medical Imaging (2014).

[8] Marek Kowal et al. "Cell Nuclei Segmentation in Cytological Images Using Convolutional Neural Network and Seeded Watershed Algorithm". In: Journal of Digital Imaging 33.1 (Feb. 2020), pp. 231–242. doi: 10.1007/s10278-019-00200-8.

[9] Simon Graham et al. "Hover-Net: Simultaneous segmentation and classification of nuclei in multi-tissue histology images". In: Medical Image Analysis 58 (2019), p. 101563. doi: 10.1016/j.media.2019.101563.

[10] Wouter Bulten et al. "Epithelium segmentation using deep learning in H&E-stained prostate specimens with immunohistochemistry as reference standard". In: Scientific Reports 9.1 (2019), p. 864. doi: 10.1038/s41598-018-37257-4.

[11] Qinqin Yang et al. "Epithelium segmentation and automated Gleason grading of prostate cancer via deep learning in label-free multiphoton microscopic images". In: Journal of Biophotonics 13.2 (2020), e201900203. doi: 10.1002/jbio.201900203.

[12] Stuart Berg et al. "ilastik: interactive machine learning for (bio)image analysis". In: Nature Methods (Sept. 2019). doi: 10.1038/s41592-019-0582-9.

[13] Olaf Ronneberger, Philipp Fischer, and Thomas Brox. "U-Net: Convolutional Networks for Biomedical Image Segmentation". In: CoRR abs/1505.04597 (2015). url: http://arxiv.org/abs/1505.04597.

[14] Haruki Imai et al. "Fast and Accurate 3D Medical Image Segmentation with Data-swapping Method". In: arXiv abs/1812.07816 (2018).

[15] Zhengxin Zhang, Qingjie Liu, and Yunhong Wang. "Road Extraction by Deep Residual U-Net". In: IEEE Geoscience and Remote Sensing Letters 15.5 (May 2018), pp. 749–753. doi: 10.1109/LGRS.2018.2802944.

[16] Md. Zahangir Alom et al. "Recurrent Residual Convolutional Neural Network based on U-Net (R2U-Net) for Medical Image Segmentation". In: arXiv (Feb. 2018).

[17] Foivos I. Diakogiannis et al. "ResUNet-a: a deep learning framework for semantic segmentation of remotely sensed data". In: CoRR abs/1904.00592 (2019). url: http://arxiv.org/abs/1904.00592.

[18] Karen Simonyan and Andrew Zisserman. "Very Deep Convolutional Networks for Large-Scale Image Recognition". 2014. arXiv: 1409.1556 [cs.CV].

[19] F. Pedregosa et al. "Scikit-learn: Machine Learning in Python". In: Journal of Machine Learning Research 12 (2011), pp. 2825–2830.

[20] Martín Abadi et al. TensorFlow: Large-Scale Machine Learning on Heterogeneous Systems. Software available from tensorflow.org. 2015. url: https://www.tensorflow.org/.

[21] Johannes Schindelin et al. "Fiji: an open-source platform for biological-image analysis". In: Nature Methods 9.7 (July 2012), pp. 676–682. doi: 10.1038/nmeth.2019.

[22] Jun Huang et al. "The roles and mechanism of IFIT5 in bladder cancer epithelial-mesenchymal transition and progression". In: Cell Death & Disease 10.6 (2019), p. 437. doi: 10.1038/s41419-019-1669-z.

[23] "Sex Differences in Urothelial Bladder Cancer Survival". In: Clinical Genitourinary Cancer 18.1 (2020), pp. 26–34.e6. doi: 10.1016/j.clgc.2019.10.020.

[24] Oner Sanli et al. "Bladder cancer". In: Nature Reviews Disease Primers 3.1 (2017), p. 17022. doi: 10.1038/nrdp.2017.22.

[25] Anke Richters, Katja K. H. Aben, and Lambertus A. L. M. Kiemeney. "The global burden of urinary bladder cancer: an update". In: World Journal of Urology (2019). doi: 10.1007/s00345-019-02984-4.

[26] John Arevalo, Angel Cruz-Roa, and Fabio González. "Histopathology image representation for automatic analysis: A state-of-the-art review". In: Revista Med 22 (July 2014), pp. 79–91.

[27] Kurt Thorn. "A quick guide to light microscopy in cell biology". In: Molecular Biology of the Cell 27.2 (Jan. 2016), pp. 219–222. doi: 10.1091/mbc.E15-02-0088.

[28] Caglar Senaras et al. "Optimized generation of high-resolution phantom images using cGAN: Application to quantification of Ki67 breast cancer images". In: PLOS ONE 13.5 (May 2018), pp. 1–12. doi: 10.1371/journal.pone.0196846.

[29] M. Khalid Khan Niazi, Erinn Downs-Kelly, and Metin N. Gurcan. "Hot spot detection for breast cancer in Ki-67 stained slides: image dependent filtering approach". In: SPIE Conference Series, vol. 9041 (2014), p. 904106. doi: 10.1117/12.2045586.

[30] Mohammad Faizal Ahmad Fauzi et al. "Classification of follicular lymphoma: the effect of computer aid on pathologists grading". In: BMC Medical Informatics and Decision Making 15.1 (2015), p. 115. doi: 10.1186/s12911-015-0235-6.

[31] Jianxin Chen et al. "Multiphoton microscopic imaging of histological sections without hematoxylin and eosin staining differentiates carcinoma in situ lesion from normal oesophagus". In: Applied Physics Letters 103.18 (2013), p. 183701. doi: 10.1063/1.4826322.

[32] Kassandra Larson et al. "Hematoxylin and Eosin Tissue Stain in Mohs Micrographic Surgery: A Review". In: Dermatologic Surgery 37 (June 2011), pp. 1089–1099. doi: 10.1111/j.1524-4725.2011.02051.x.

[33] Dilpreet Kaur and Yadwinder Kaur. "Various Image Segmentation Techniques: A Review". In: International Journal of Computer Science and Mobile Computing 3.5 (2014), pp. 809–814. url: https://ijcsmc.com/docs/papers/May2014/V3I5201499a84.pdf.

[34] Feng Zhao and Xianghua Xie. "An Overview of Interactive Medical Image Segmentation". In: Annals of the BMVA (2013). url: http://www.bmva.org/annals/2013/2013-0007.pdf.

[35] Alex Krizhevsky, Ilya Sutskever, and Geoffrey E. Hinton. "ImageNet Classification with Deep Convolutional Neural Networks". In: Advances in Neural Information Processing Systems (2012), pp. 1097–1105. url: http://papers.nips.cc/paper/4824-imagenet-classification-with-deep-convolutional-neural-networks.pdf.

[36] Jürgen Schmidhuber. "Deep Learning in Neural Networks: An Overview". In: CoRR abs/1404.7828 (2014). url: http://arxiv.org/abs/1404.7828.

[37] V. Badrinarayanan, A. Kendall, and R. Cipolla. "SegNet: A Deep Convolutional Encoder-Decoder Architecture for Image Segmentation". In: IEEE Transactions on Pattern Analysis and Machine Intelligence 39.12 (Dec. 2017), pp. 2481–2495. doi: 10.1109/TPAMI.2016.2644615.

[38] Nitish Srivastava et al. "Dropout: A Simple Way to Prevent Neural Networks from Overfitting". In: Journal of Machine Learning Research 15 (2014), pp. 1929–1958. url: http://jmlr.org/papers/v15/srivastava14a.html.

[39] Zhaoyang Xu. PhD thesis. Queen Mary University of London, 2020.

[40] Korsuk Sirinukunwattana et al. Gland Segmentation in Colon Histology Images: The GlaS Challenge Contest. 2016. arXiv: 1603.00275 [cs.CV].

[41] Adnan Khan et al. "A Non-Linear Mapping Approach to Stain Normalisation in Digital Histopathology Images using Image-Specific Colour Deconvolution". In: IEEE Transactions on Biomedical Engineering 61 (June 2014). doi: 10.1109/TBME.2014.2303294.

[42] X.-X. Yin et al. "Correction: Tensor based multichannel reconstruction for breast tumours identification from DCE-MRIs". In: PLoS ONE 12.4 (Apr. 2017), e0176133. doi: 10.1371/journal.pone.0176133.

[43] Wouter Bulten et al. "Automated segmentation of epithelial tissue in prostatectomy slides using deep learning". In: Proc. SPIE 10581 (2018), pp. 219–225. doi: 10.1117/12.2292872.

[44] Dan Ciresan et al. "Deep Neural Networks Segment Neuronal Membranes in Electron Microscopy Images". In: Advances in Neural Information Processing Systems (2012), pp. 2843–2851. url: http://papers.nips.cc/paper/4741-deep-neural-networks-segment-neuronal-membranes-in-electron-microscopy-images.pdf.

[45] Yuxin Cui et al. "A Deep Learning Algorithm for One-step Contour Aware Nuclei Segmentation of Histopathological Images". In: CoRR abs/1803.02786 (2018). url: http://arxiv.org/abs/1803.02786.

[46] Philipp Kainz, Michael Pfeiffer, and Martin Urschler. Semantic Segmentation of Colon Glands with Deep Convolutional Neural Networks and Total Variation Segmentation. 2015. arXiv: 1511.06919 [cs.CV].

[47] Aïcha Bentaieb, Jeremy Kawahara, and Ghassan Hamarneh. "Multi-loss convolutional networks for gland analysis in microscopy". In: 2016 IEEE 13th International Symposium on Biomedical Imaging (ISBI) (2016), pp. 642–645.

[48] Hao Chen et al. DCAN: Deep Contour-Aware Networks for Accurate Gland Segmentation. 2016. arXiv: 1604.02677 [cs.CV].

[49] Aïcha BenTaieb and Ghassan Hamarneh. Topology Aware Fully Convolutional Networks For Histology Gland Segmentation. url: http://www.sfu.ca/~abentaie/papers/miccai16.pdf.

[50] J. Long, E. Shelhamer, and T. Darrell. "Fully convolutional networks for semantic segmentation". In: 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), June 2015, pp. 3431–3440. doi: 10.1109/CVPR.2015.7298965.

[51] Liang-Chieh Chen et al. DeepLab: Semantic Image Segmentation with Deep Convolutional Nets, Atrous Convolution, and Fully Connected CRFs. 2016. arXiv: 1606.00915 [cs.CV].

[52] Shuai Zheng et al. "Conditional Random Fields as Recurrent Neural Networks". In: 2015 IEEE International Conference on Computer Vision (ICCV), Dec. 2015. doi: 10.1109/ICCV.2015.179.

[53] Yizhe Zhang et al. "Deep adversarial networks for biomedical image segmentation utilizing unannotated images". In: Lecture Notes in Computer Science. Springer, 2017, pp. 408–416. doi: 10.1007/978-3-319-66179-7_47.

[54] Yuanpu Xie et al. "Spatial Clockwork Recurrent Neural Network for Muscle Perimysium Segmentation". In: Medical Image Computing and Computer-Assisted Intervention (MICCAI) 9901 (Oct. 2016), pp. 185–193. doi: 10.1007/978-3-319-46723-8_22.

[55] ZhiFei Lai and HuiFang Deng. "Medical Image Classification Based on Deep Features Extracted by Deep Model and Statistic Feature Fusion with Multilayer Perceptron". In: Computational Intelligence and Neuroscience (2018). url: https://www.hindawi.com/journals/cin/2018/2061516/.

[56] Qing Li et al. "Medical image classification with convolutional neural network". In: 13th International Conference on Control Automation Robotics & Vision (ICARCV) (2014).

[57] Yaniv Bar et al. "Deep learning with non-medical training used for chest pathology identification". In: Proceedings of SPIE Medical Imaging, p. 94140V. International Society for Optics and Photonics, Orlando, FL, USA, Feb. 2015.

[58] Hoo-Chang Shin et al. "Deep Convolutional Neural Networks for Computer-Aided Detection: CNN Architectures, Dataset Characteristics and Transfer Learning". In: IEEE Transactions on Medical Imaging 35.5 (May 2016), pp. 1285–1298. doi: 10.1109/TMI.2016.2528162.

[59] F. Ercal et al. "Neural network diagnosis of malignant melanoma from color images". In: IEEE Transactions on Biomedical Engineering 41.9 (Sept. 1994), pp. 837–845. doi: 10.1109/10.312091.

[60] Cristofer Marín et al. "Detection of melanoma through image recognition and artificial neural networks". In: IFMBE Proceedings 51 (Jan. 2015), pp. 832–835. doi: 10.1007/978-3-319-19387-8_204.

[61] Karen Simonyan and Andrew Zisserman. "Very Deep Convolutional Networks for Large-Scale Image Recognition". In: arXiv:1409.1556 (2015).

[62] Christian Szegedy, Wei Liu, et al. "Going Deeper with Convolutions". (2015).

[63] Kaiming He et al. "Deep Residual Learning for Image Recognition". In: CoRR abs/1512.03385 (2015). url: http://arxiv.org/abs/1512.03385.

[64] Jonathan Long, Evan Shelhamer, and Trevor Darrell. Fully Convolutional Networks for Semantic Segmentation. 2014. arXiv: 1411.4038 [cs.CV].


Appendix A

State of Art Review

The main goal of this review is to summarize the different techniques currently used for the segmentation of cancer epithelial cells and the research studies conducted on this topic in recent years. The section starts with an introduction to bladder cancer and the epithelium, followed by the clinical practices used for segmentation and, finally, a description of how computer vision (using CNNs) can help improve the segmentation.

A.1 Introduction

Bladder cancer (BCa) is one of the most common types of cancer worldwide, and it is more commonly found in men than in women [22, 23]. It typically arises in the epithelial lining of the bladder, and the most common symptom is microscopic or macroscopic blood in the urine [24]. An advanced pathological stage (where the pathologist studies how different the cells look under the microscope compared to normal cells) is associated with macroscopic blood in the urine, whereas for microscopic blood there is no screening for the presence of cancer [24]. The incidence of BCa depends strongly on the geographical region: populations in European countries and North America are more likely to develop this type of cancer than those in South America and Southeast Asia [24]. This geographical variation is mainly due to different exposure factors, such as cigarette smoking, occupational contact with certain chemicals and the presence of arsenic in drinking water [25].

In order to diagnose BCa, the pathologist manually examines the images


obtained under the microscope and makes the diagnostic decision, which is time consuming and can also result in human error. To make the entire procedure automated, efficient and more accurate, there is a need for computer vision, which has the potential to give more promising results.

A.2 Histopathology Images: Slide Preparation and Digitization

Medical images are usually acquired with techniques such as ultrasound, MRI and CT. Histopathological images, in contrast, are acquired through a more involved procedure consisting of six steps: fixation, grossing, processing, embedding, sectioning and staining [26]. Fixation preserves the molecular structure and state of the tissue as it was when alive. Labelling and pre-processing of the sample is called grossing. The sample then goes through different chemical reactions and is solidified, with staining as the last step. The tissue samples are stained in order to visualize them; without staining the tissue is transparent and nothing can be seen. The most commonly used staining procedures are hematoxylin and eosin (HE) and immunohistochemistry (IHC). To visualize the stained samples, light microscopy is used, which can be categorized into bright-field and fluorescence microscopy [27]. Compared to other imaging modalities, acquiring histopathological images takes considerably more time.

A.3 Clinical Approach

A.3.1 Segmentation Technique

In pathology, different staining procedures are carried out to study the characteristics of the cells in a sample. Multiplexed immunohistochemistry (mIHC) is a staining technique that allows simultaneous labelling of up to seven markers of interest in a tissue specimen while maintaining the sample's spatial structure, facilitating single-cell resolution analysis for the diagnosis and grading of tumors [28]. It is also used to decide which cancer therapy should be given to the patient, depending on the grade of the tumor [29]. To assess the IHC staining, the pathologist usually counts the positively and negatively stained cells manually under the microscope. Traditionally, the pathologist is expected to manually segment or count the cells and then compare the


results with those of computerized image analysis methods [30]. However, this procedure is time consuming and prone to errors, as over- or under-staining can result in poor contrast and resolution during imaging [31, 32]. Additionally, the manual segmentation can vary from pathologist to pathologist depending on their expertise.

A.3.2 Grading and Staging of the tumor

Doctors use the TNM system to describe the stage of the cancer and decide on the treatment plan for the patient accordingly [10]. TNM stands for Tumor (T), Node (N) and Metastasis (M). The T stage describes how large the tumor is, the N stage describes whether it has spread to the lymph nodes, and the M stage describes whether the tumor has spread to other parts of the body [10].

Table A.1 describes how the tumor is categorized into different stages by pathologists and doctors. Once the TNM classification is confirmed, the doctor assigns a stage to the cancer. There are five stages in total, starting from 0 and ranging from I to IV (1 to 4) [10].

A.4 Deep Learning Approach

Deep learning is a family of machine learning algorithms that mimics the functioning of the human brain and learns through artificial neural networks with representation learning. Nowadays, it plays an important role in healthcare for early detection of disease, diagnosis and treatment planning. Among the various types of deep neural networks, the convolutional neural network (CNN) is a powerful tool for image classification and segmentation.

Although the characteristics of the images, for instance noise artifacts, imaging modality, variation in pathologies and other uncertainties, have a significant impact on the measured segmentation performance [33, 34], CNNs still have the potential to give promising results.

A.4.1 Convolutional Neural Network

A CNN is a type of deep neural network that, when pre-trained, carries prior knowledge about different objects learned from millions of images, which helps to compensate for a small dataset [35]. CNNs are especially effective at learning patterns from image data because such networks make assumptions about the


Staging   Letter/Number   Description
T         TX              tumor cannot be evaluated
          T0              no primary tumor in the bladder
          Ta              non-invasive carcinoma
          Tis             flat tumor
          T1              spread to connective tissue
          T2a             spread to inner muscle
          T2b             spread to deep muscle
          T3a             grown into the perivesical tissue
          T3b             grown into the perivesical tissue and is large
          T4a             spread to prostate
          T4b             spread to pelvic and abdominal wall
N         NX              cannot be evaluated
          N0              not spread to lymph nodes
          N1              spread to a single lymph node
          N2              spread to two or more lymph nodes
          N3              spread to lymph nodes
M         M0              not metastasized
          M1a             spread to lymph nodes
          M1b             spread to other parts of the body

Table A.1: Declaration of Staging Criteria


input data that simplify the learning problem. The capacity of the architectures used for training CNNs can be controlled by varying the depth of the network, and this holds for all deep neural networks (DNNs). With the flexibility of adjusting the depth, the network can learn features from different kinds of inputs and make more precise predictions [35, 36]. This is because higher-level features of increased complexity can be extracted at greater depths; shallow networks may suffice for simple tasks, while deep networks are necessary for complex ones. CNNs share with traditional neural networks the same optimization process during learning. More generally, all DNNs are neural networks with multiple layers that learn hierarchical representations through a sequence of functional compositions.
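As an illustration of how a pre-trained CNN carries such prior knowledge, the following minimal sketch (assuming the TensorFlow/Keras API, which is not prescribed by this review) reuses a VGG16 backbone pre-trained on ImageNet as a frozen feature extractor; the input size, layer widths and the num_classes placeholder are arbitrary choices for the example, not the implementation used in this work.

# Minimal transfer-learning sketch: reuse a VGG16 backbone pre-trained on ImageNet.
# All sizes are illustrative placeholders, not the settings used in this work.
import tensorflow as tf

num_classes = 2  # hypothetical: e.g. cancer epithelium vs. background

# Convolutional base with ImageNet weights, without the original classifier head.
base = tf.keras.applications.VGG16(weights="imagenet", include_top=False,
                                   input_shape=(224, 224, 3))
base.trainable = False  # freeze the pre-trained "prior knowledge"

model = tf.keras.Sequential([
    base,
    tf.keras.layers.GlobalAveragePooling2D(),
    tf.keras.layers.Dense(256, activation="relu"),
    tf.keras.layers.Dense(num_classes, activation="softmax"),
])
model.compile(optimizer="adam",
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])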

Architecture of CNN

A basic CNN consists of several layers, each responsible for a specific function during training. Convolutional layers produce feature maps from the input by applying convolutional filters that extract features from the input image. Pooling layers reduce the size of the representation, thereby reducing the computation time. They also increase the receptive fields of subsequent filters, which allows those filters to cover larger regions of the image and thus extract more high-level information. A flattening step converts the output of the previous layers into a single vector so that it can be fed to the successive fully connected layers, which apply weights to predict the actual label. Finally, the output layer represents the category to which the image belongs once the entire model has been trained [37] (see Figure A.1).

Figure A.1: Convolutional Neural Network Architecture
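As a concrete illustration of these layer types, a small CNN can be stacked as in the minimal Keras sketch below; the filter counts and input size are arbitrary placeholders, not the architectures evaluated in this work.

# Minimal CNN sketch: convolution, pooling, flatten and fully connected layers.
# Filter counts and the input size are illustrative placeholders only.
import tensorflow as tf
from tensorflow.keras import layers

model = tf.keras.Sequential([
    tf.keras.Input(shape=(128, 128, 1)),      # e.g. a single-channel image patch
    layers.Conv2D(16, 3, activation="relu"),  # convolutional layer: feature maps
    layers.MaxPooling2D(2),                   # pooling layer: smaller representation
    layers.Conv2D(32, 3, activation="relu"),
    layers.MaxPooling2D(2),
    layers.Flatten(),                         # flatten into a single vector
    layers.Dense(64, activation="relu"),      # fully connected layer applies weights
    layers.Dense(1, activation="sigmoid"),    # output layer: predicted label
])
model.summary()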

In addition to these, there is the dropout layer, first introduced in [38]. While training the neural network, the parameters are updated according to the gradient, and the gradient can therefore be misled by noise that is not representative of the underlying data. To avoid this, a dropout layer is introduced, which randomly ignores some of the units during each training iteration so that the model does not overfit to such noise and learns only the relevant details.
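A dropout layer is typically placed between fully connected layers, as in the short sketch below (again assuming Keras; the rate of 0.5 is only an example). Dropout is active only during training and is disabled at inference time.

# Dropout randomly deactivates a fraction of the units at every training iteration,
# which discourages the model from fitting noise in the training data.
import tensorflow as tf
from tensorflow.keras import layers

dense_head = tf.keras.Sequential([
    layers.Dense(64, activation="relu"),
    layers.Dropout(0.5),                    # example rate: drop 50% of the units
    layers.Dense(1, activation="sigmoid"),
])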

Evaluation metrics and Loss Functions

Several metrics can be used in order to evaluate the segmentation performance of the model. The main aim of evaluation is to determine which approach is


best suited for the given case. Some of the metrics used in previous studies are discussed below:

• Pixel Accuracy: It is determined from the predicted value at each pixel. At the pixel level, the true positives (TP), true negatives (TN), false positives (FP) and false negatives (FN) are counted [39].

• Mean Intersection over Union (MIoU): It is the mean of the Jaccard indices, where the Jaccard index is also called the intersection over union (IoU) [39]. The Jaccard and Dice indices are calculated as

  \[ \mathrm{Jaccard}(G, S) = \frac{|G \cap S|}{|G \cup S|} \quad \text{and} \quad \mathrm{Dice}(G, S) = \frac{2\,|G \cap S|}{|G| + |S|} \]

  where G is the ground truth and S is the segmented result. The MIoU is then

  \[ \mathrm{MIoU} = \frac{1}{c} \sum_{i=1}^{c} \frac{1}{n} \sum_{j=1}^{n} \mathrm{IoU}_{ij} \]

  where n is the number of samples and c is the number of classes [39]. A short code sketch computing these scores is given after this list.

• Boundary accuracy, also known as the Hausdorff distance, can be measured in order to evaluate the performance of the model [40].

• Other methods are the receiver operating characteristic (ROC) curve and the area under the curve (AUC). The ROC curve can be used for binary classifiers, and the AUC measures how well the model distinguishes between two groups [39].
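As mentioned above, a minimal NumPy sketch of the Jaccard (IoU) and Dice scores for a pair of binary masks is given below; it only illustrates the formulas and is not the evaluation code used in this work.

# Jaccard (IoU) and Dice scores for two binary masks, following the definitions
# above (G = ground truth, S = segmented result).
import numpy as np

def jaccard_and_dice(ground_truth, segmentation, eps=1e-7):
    g = ground_truth.astype(bool)
    s = segmentation.astype(bool)
    intersection = np.logical_and(g, s).sum()
    union = np.logical_or(g, s).sum()
    jaccard = intersection / (union + eps)
    dice = 2 * intersection / (g.sum() + s.sum() + eps)
    return jaccard, dice

# Tiny example: two 4x4 masks that overlap on one column.
gt = np.array([[0, 1, 1, 0]] * 4)
pred = np.array([[0, 1, 0, 0]] * 4)
print(jaccard_and_dice(gt, pred))  # approximately (0.5, 0.667)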
