

Contents lists available at ScienceDirect

Artificial Intelligence In Medicine

journal homepage: www.elsevier.com/locate/artmed

Using multi-layer perceptron with Laplacian edge detector for bladder cancer diagnosis

Ivan Lorencin (a), Nikola Anđelić (a,*), Josip Španjol (b,c), Zlatan Car (a)

(a) University of Rijeka, Faculty of Engineering, Vukovarska 58, 51000 Rijeka, Croatia
(b) University of Rijeka, Faculty of Medicine, Braće Branchetta 20/1, 51000 Rijeka, Croatia
(c) Clinical Hospital Center Rijeka, Krešimirova 42, 51000 Rijeka, Croatia

ARTICLE INFO

Keywords: Artificial intelligence; Image pre-processing; Laplacian edge detector; Multi-layer perceptron; Urinary bladder cancer

ABSTRACT

In this paper, a urinary bladder cancer diagnostic method based on a Multi-Layer Perceptron and a Laplacian edge detector is presented. The aim of this paper is to investigate whether a simpler method (Multi-Layer Perceptron) can be implemented alongside commonly used methods, such as Deep Learning Convolutional Neural Networks, for urinary bladder cancer detection. The dataset used for this research consisted of 1997 images of bladder cancer and 986 images of non-cancer tissue. The results showed that a Multi-Layer Perceptron trained and tested with images pre-processed with the Laplacian edge detector achieves AUC values up to 0.99. When different image sizes are compared, the best results are achieved with 50 × 50 and 100 × 100 images.

1. Introduction

Bladder cancer is one of the most common cancers arising from the tissues of the urinary bladder. It begins when cells in the urinary bladder mucosa start to grow uncontrollably; as more cancer cells develop, they eventually form a tumor and spread to other areas of the body. Symptoms that could indicate bladder cancer are blood in the urine, pain during urination and low back pain. Potential risk factors for bladder cancer are smoking, family history, prior radiation therapy, frequent bladder infections and exposure to certain chemicals (aromatic amines) [1]. Tobacco smoking is the main known contributor to urinary bladder cancer [2,3]. The research in [4] showed that obesity is associated with a risk of urinary bladder cancer. The types of urinary bladder cancer are:

• urothelial carcinoma (transitional cell carcinoma): one of the most common types of bladder cancer. This type of cancer originates in the urothelial cells, which are located on the inside of the urinary bladder [5,6].

• squamous cell carcinoma: associated with chronic irritation of the bladder, e.g. from an infection or long-term use of a urinary catheter. This type of bladder cancer is rare and is more common in world regions where the main cause of infection is of parasitic origin [7,8].

• adenocarcinoma: a very rare type of carcinoma which may arise in the bladder as well as in a number of other organs [9,10].

• small cell carcinoma: another very rare, highly aggressive type of bladder cancer. Small cell carcinoma is diagnosed at advanced stages and is generally believed to have high metastatic potential [11,12].

• sarcoma: an extremely rare type of bladder cancer [13,14]. The rarity of this cancer makes it difficult for cytologists to detect the spindle cell lesions in urine cytology for malignancy.

As mentioned previously, urothelial carcinoma, also known as transitional cell carcinoma (TCC), is the most common type of bladder cancer. This type of cancer starts in the urothelial cells, which are located inside the bladder. Urothelial cells are also found in other parts of the urinary tract, for example the part of the kidney that connects to the ureter (renal pelvis), the ureters and the urethra. Bladder cancers can be classified by the depth to which they penetrate the bladder wall and by their shape. By penetration depth, there are two types: non-invasive and invasive cancers. Non-invasive cancers are confined to the transitional epithelium, which is the inner cell layer of the bladder. Invasive cancers have grown into deeper layers of the bladder wall. Invasive cancers are more likely to spread to other parts of the body and are generally harder to treat. By shape, there are also two types: papillary carcinomas and flat carcinomas. Papillary carcinomas grow from the inner surface towards the hollow center of the bladder in finger-like projections. They often grow towards the center of the bladder without growing deeper into the bladder layers, in which case they are called non-invasive papillary cancers. Flat carcinomas, on the other hand, do not grow toward the hollow center of the bladder. For example, if a flat tumor is located only in the inner layer of bladder cells, it is called a non-invasive flat carcinoma or flat carcinoma in situ (CIS).

https://doi.org/10.1016/j.artmed.2019.101746
Received 21 May 2019; Received in revised form 22 October 2019; Accepted 27 October 2019

* Corresponding author. E-mail address: [email protected] (N. Anđelić).

Artificial Intelligence In Medicine 102 (2020) 101746

0933-3657/ © 2019 The Authors. Published by Elsevier B.V. This is an open access article under the CC BY-NC-ND license (http://creativecommons.org/licenses/BY-NC-ND/4.0/).

There have been studies that classify cancer using AI techniques such as Artificial Neural Networks (ANN) [15–18], Fuzzy Logic [19,20] and Support Vector Machines (SVM) [21–23]. While AI is a vast field with a variety of unexplored algorithms that can be used in medical research, ANNs are among the best studied so far. ANNs have been successfully implemented in image classification [24–28], but there is always room for improvement in classification performance while maintaining reasonable complexity of the ANN architecture. The extent of ANN implementation in medicine can be seen from the number of published papers [29]. The concept of ANN is not new, since ANNs have been a field of research for more than 80 years [30]. The first ANN models were perceptrons, artificial neurons capable of solving simple linear problems. The first ANN capable of solving non-linear problems was developed in 1974 [31]. The first useful computational models of supervised learning ANNs were developed in the 1960s [32,33] and mid-1980s [34].

Numerous research papers describe successful implementation of ANN in the detection, staging and prognosis of prostate and urinary bladder cancer. For the purposes of this paper, a chronological overview of ANN implementation in the diagnosis of bladder and prostate cancer is given in Table 1.

As seen in Table 1, over the last 25 years there have been many successful attempts to utilize AI tools to develop detection systems for bladder and prostate cancer. However, the results obtained in these research papers could be improved in terms of AUC, which can be a viable classification measure [51]. AUC stands for Area Under the ROC Curve and indicates the capability of a model to distinguish between classes. AUC values range between 0 and 1, where 1 represents perfect classification. ROC stands for Receiver Operating Characteristic and is a graphical plot that illustrates the classification performance of the classifier.

In this paper, the idea is to implement a Multi-Layer Perceptron (MLP) in combination with a Laplacian edge detector in order to investigate the possibility of implementing such an algorithm in bladder cancer detection and to achieve higher AUC values than the methods presented in Table 1. The MLP has a long history of implementation in medical research for image classification [52–57], detection [58–62] and prediction [63–65]. Although the MLP has been implemented in medical research over the years, there is still space for its implementation in addition to standard algorithms such as CNN.

Edge detection, a fundamental tool in image processing, computer vision and machine vision, is a mathematical method used to identify points in a digital image at which the image brightness changes sharply or has a discontinuity [66,67]. Edges are represented as curved lines, and each line consists of points at which the brightness changes sharply. Edge detection methods fall into two main categories: gradient-based methods [66,67] and Laplacian-based methods [66–68]. In gradient-based methods, edges are detected by applying the first derivative of the image. The gradient magnitude measures the edge strength, and the local edge orientation is computed from the gradient direction. This method has been successfully applied in edge detection of medical images [69,70]. The Laplacian-based method computes a second-order derivative expression of the image, which has a zero crossing at an edge. In this method, edges are found by searching for zero crossings of a nonlinear differential expression. In some cases, Gaussian smoothing is applied as a preprocessing (refining) step before Laplacian edge detection. The Laplacian method has also been successfully applied in different fields such as computing [71] and medicine [72]. According to [68,73,74], the Laplacian-based method performs better than gradient-based methods in terms of edge localization, so the Laplacian-based method is used in this research.

Table 1. Chronological overview of ANN implementation in the diagnosis of bladder and prostate cancer.

| Reference | Technique | Tissue | Results |
|---|---|---|---|
| [35] | ANN | Prostate | Sensitivity: 0.93, Specificity: 0.81, AUC: 0.87 |
| [36] | ANN | Bladder | Accuracy: 91% |
| [37] | ANN | Prostate | Sensitivity: 0.92, Specificity: 0.62, AUC: 0.77 |
| [38] | ANN | Bladder | Sensitivity: 1, Specificity: 0.78, AUC: 0.89, Accuracy: 75–82% |
| [39] | ANN | Prostate | Sensitivity: 0.59–0.67, 0.33–0.60, AUC: 0.88–0.91, 0.83–0.9 |
| [40] | ANN | Bladder | Accuracy: 95% |
| [41] | ANN | Bladder | Sensitivity: 0.93, Specificity: 0.52, Accuracy: 98%, AUC: 0.72 |
| [42] | ANN | Prostate | AUC: 0.74–0.76, 0.75–0.76 |
| [43] | ANN | Prostate | Sensitivity: 0.95, Specificity: 0.68, 0.8–0.54, Accuracy: 83%, AUC: 0.61–0.79 |
| [44] | ANN | Bladder | Accuracy: 90–97% |
| [45] | ANN | Bladder | Sensitivity: 0.80, Specificity: 0.90–0.97, AUC: 0.85–0.885 |
| [46] | FCMUHL | Bladder | Accuracy: 72–95% |
| [47] | Similarity classifier | Bladder | Accuracy: 99% |
| [48] | DL-CNN | Bladder | AUC: 0.7 ± 0.06 |
| [49] | RF, ERF, GBT | Bladder | AUC: 0.90–0.91 |
| [50] | DL-CNN | Bladder | Accuracy: 95–98% |

ANN: Artificial Neural Network.
AUC: Area under ROC curve.
DL-CNN: Deep Layered-Convolutional Neural Network.
ERF: Extremely Randomized Forest.
FCMUHL: Fuzzy Cognitive Maps with Unsupervised Hebbian Learning.
LR: Logistic Regression.
ROC: Receiver Operating Characteristic Curve.
RF: Random Forest.

To summarize the novelty of this paper, the idea is to implement an MLP in combination with Laplacian edge detection in order to achieve higher AUC values than those presented in Table 1 while maintaining reasonable complexity of the classification algorithm. From this statement, the following hypotheses can be drawn:

• to achieve higher AUC values in image classification of bladder cancer,

• to investigate the influence of image size, MLP model complexity, non-linear activation function type, and solver on the AUC value, and

• to investigate the influence of the aforementioned parameters on MLP training time.

Based on the results of this investigation, the MLP architectures that produce the highest AUC values will be determined.

2. Materials and methods

This section gives a detailed overview of the materials and methods used. It is divided into Dataset description, Laplacian edge detector, Image resizing, MLP with Laplacian of an image as an input, and Research methodology. The Dataset description subsection briefly describes how the images were obtained using flexible cystoscopy and describes the entire dataset used in this research. The Laplacian edge detector subsection gives a short overview and mathematical description of the Laplacian edge detector. The Image resizing subsection describes the procedure for image downsizing. The MLP with Laplacian subsection describes the dataflow of an MLP with the Laplacian of an image as its input, and the Research methodology subsection describes the research methodology.

2.1. Dataset description

According to [75], cystoscopy is an endoscopic procedure used to visually evaluate the urethra and bladder. Generally, the camera is inserted via the urethra under local or general anesthesia. Once inside the bladder, lesions can be biopsied or removed. According to [76], there are two types of cystoscopy:

• flexible cystoscopy: a procedure to check for any problems in the patient's bladder using a flexible fiber-optic instrument, and

• rigid cystoscopy: a procedure to check for any problems in the patient's bladder using a rigid fiber-optic instrument.

In this research, flexible cystoscopy is used to obtain images of urinary bladder tissue. Together with flexible cystoscopy, a biopsy is performed to evaluate the medical findings. An example of an image in which urinary bladder cancer is diagnosed and confirmed by biopsy findings is shown in Fig. 1.

Together with images that represent urinary bladder cancer, images of non-cancer tissue are also obtained. These findings are also confirmed with a biopsy. An example of such an image is shown in Fig. 2.

An image dataset is obtained in this way. The dataset contains 1997 images of bladder cancer and 986 images of non-cancer tissue. From the dataset, 1500 images that represent cancer and 500 non-cancer images are used for classifier training. All images are 720 pixels wide and 540 pixels high. Images that are not included in the training dataset form the test dataset.
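The size of the test dataset is not stated explicitly, but it follows from the counts above. A minimal sketch of the arithmetic, where the dictionary keys are illustrative labels rather than names from the paper:

```python
# Image counts stated in the text; the keys are illustrative labels only.
dataset = {"cancer": 1997, "non_cancer": 986}
train = {"cancer": 1500, "non_cancer": 500}

# Every image not used for training goes to the test dataset.
test = {label: dataset[label] - train[label] for label in dataset}

print(test)                # per-class test counts
print(sum(test.values()))  # total number of test images
```

Under these counts the test dataset holds 497 cancer images and 486 non-cancer images, which is nearly balanced even though the training dataset is not.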

2.2. Laplacian edge detector

In image classification, edge detection is a process of capturing image properties, which includes detecting discontinuities in an image [77]. For grayscale images, a discontinuity mostly occurs as a step between two regions with almost constant but different grey levels. Differences between two regions are, in fact, differences between their pixel values [78]. Step edges are defined as relative extrema of the first derivative, or zero crossings of the second derivative, as shown for a 1-D function in Fig. 3.

The first plot in Fig. 3 represents a pulse defined as

$$f(x) = u(x + 2) - u(x - 2), \tag{1}$$

where $u(x)$ represents the Heaviside step function. The second plot represents the first derivative of the pulse. Using the first derivative, the values of $x$ at which edges can occur are those for which

$$f'(x) \neq 0. \tag{2}$$
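The behavior of the pulse from Eq. (1) can be checked numerically. The sketch below samples the pulse on integer points and uses finite differences as a stand-in for the derivatives; the sampling grid and the convention u(0) = 1 are assumptions made for this illustration, not choices stated in the paper.

```python
def heaviside(x):
    """Heaviside step u(x); u(0) = 1 is a convention chosen for this sketch."""
    return 1.0 if x >= 0 else 0.0


def pulse(x):
    """Pulse from Eq. (1): f(x) = u(x + 2) - u(x - 2)."""
    return heaviside(x + 2) - heaviside(x - 2)


xs = list(range(-5, 6))
f = [pulse(x) for x in xs]

# Finite differences approximate the first and second derivatives.
d1 = [f[i + 1] - f[i] for i in range(len(f) - 1)]
d2 = [d1[i + 1] - d1[i] for i in range(len(d1) - 1)]

# Eq. (2): an edge candidate is any position where the first difference is non-zero.
edge_steps = [i for i, d in enumerate(d1) if d != 0]

# Zero crossings of the second difference: adjacent samples with opposite signs.
zero_crossings = [i for i in range(len(d2) - 1) if d2[i] * d2[i + 1] < 0]

print(edge_steps, zero_crossings)
```

Both criteria locate two edges, one on the rising side and one on the falling side of the pulse. The indices differ by one only because each differencing step shortens the sequence.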

Fig. 1. Image of urinary bladder cancer.

Fig. 2. Image of non-cancer urinary bladder tissue.

Fig. 3. First derivative and second derivative of a pulse function.

The third plot in Fig. 3 represents the second derivative of the pulse function. In this case, an edge can occur at every $x$ at which a zero crossing of the second derivative is present. In the more general case, the second derivative can be defined with the Laplacian as

$$\Delta M = \sum_{i=1}^{n} \frac{\partial^2 M}{\partial x_i^2}, \tag{3}$$

where $M$ represents a function of $n$ variables, so that $n$ is the dimensionality of the function. An image can therefore be expressed as a discrete two-variable function $M[x, y]$. The Laplacian of an image can be defined as

$$\Delta M[x, y] = \frac{\partial^2 M[x, y]}{\partial x^2} + \frac{\partial^2 M[x, y]}{\partial y^2}, \tag{4}$$

where $x$ represents the variable of image width and $y$ represents the variable of image height. In these terms, an edge is represented by a zero crossing of the Laplacian. An example of an image to which the Laplacian edge detector is applied is shown in Fig. 4.
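To make Eq. (4) concrete, the sketch below approximates the Laplacian of a small grayscale image with the common 5-point finite-difference stencil. The stencil, the zero-valued border and the function name `laplacian` are assumptions made for illustration; the text does not state which discrete kernel was used.

```python
def laplacian(image):
    """Approximate Eq. (4) with a 5-point stencil.

    `image` is a list of rows of grey levels. Border pixels are left at 0
    because they lack a full neighbourhood (a choice made for this sketch).
    """
    rows, cols = len(image), len(image[0])
    out = [[0.0] * cols for _ in range(rows)]
    for y in range(1, rows - 1):
        for x in range(1, cols - 1):
            out[y][x] = (
                image[y - 1][x] + image[y + 1][x]    # second difference along y
                + image[y][x - 1] + image[y][x + 1]  # second difference along x
                - 4 * image[y][x]
            )
    return out


# A vertical step edge: black on the left, white on the right.
step = [[0, 0, 0, 255, 255, 255] for _ in range(5)]
lap = laplacian(step)
print(lap[2])
```

In the middle row, the response changes sign between the two columns adjacent to the step, which is the zero crossing that marks the edge. In flat regions the response is zero.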

2.3. Image resizing

For the purposes of this study, an image resizing method is described. This method can be used only for image downsizing. The ratio of the sizes of two images can be expressed using scale factors: the width scale factor ($s_x$) and the height scale factor ($s_y$). The width scale factor can be defined as

$$s_x = \frac{n_{x1}}{n_{x2}}, \tag{5}$$

where $n_{x1}$ represents the original image width in pixels and $n_{x2}$ represents the new image width in pixels. The height scale factor can be defined as

$$s_y = \frac{n_{y1}}{n_{y2}}, \tag{6}$$

where $n_{y1}$ represents the original image height in pixels and $n_{y2}$ represents the new image height in pixels. The scale factors represent the width and height of the pixel group that will be transformed into one pixel of the resized image. Using the scale factors, the area of the pixel group used for resizing can be calculated as

$$A = s_x s_y. \tag{7}$$

From the standpoint of scale factors, two different cases of image resizing can be defined:

• Resizing with integer scale factors and

• Resizing with non-integer scale factors.

For the case of image resizing with integer scale factors, $s_x$ and $s_y$ can be defined as:

$$s_x, s_y \in \mathbb{Z}. \tag{8}$$

From Eqs. (7) and (8) it can be seen that only whole pixels will be included in the pixel group used for creating a new pixel, as shown by the shaded area in Fig. 5.

The intensity of each pixel in the pixel group ($I$) lies in the interval

$$0 \le I \le 255. \tag{9}$$

Because the original image is grayscale, zero represents a black pixel and 255 represents a white pixel. All values in between represent gray levels [79]. Using the described pixel values, the area ($A$) and the scale factors ($s_x$ and $s_y$), the value of the new image pixel ($I_R$) can be calculated as follows:

$$I_R = \frac{1}{A} \sum_{x=0}^{s_x} \sum_{y=0}^{s_y} I[x, y]. \tag{10}$$

From Eqs. (9) and (10) it can be concluded that the new pixel value will also lie in the range

$$0 \le I_R \le 255. \tag{11}$$
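The integer-scale-factor case amounts to averaging non-overlapping pixel groups. The sketch below is one way to write it, assuming the image is a list of rows and that the image dimensions are exact multiples of the scale factors; the function name `downsize_integer` is hypothetical. It uses 0-based indices so that each group contains exactly s_x · s_y = A pixels.

```python
def downsize_integer(image, s_x, s_y):
    """Average each s_y-by-s_x pixel group into one pixel (integer scale factors)."""
    rows, cols = len(image), len(image[0])
    if rows % s_y or cols % s_x:
        raise ValueError("image size must be a multiple of the scale factors")
    area = s_x * s_y  # Eq. (7)
    return [
        [
            sum(
                image[gy * s_y + dy][gx * s_x + dx]
                for dy in range(s_y)
                for dx in range(s_x)
            ) / area
            for gx in range(cols // s_x)
        ]
        for gy in range(rows // s_y)
    ]


# 4 x 4 image with grey levels 0..15, reduced to 2 x 2 with s_x = s_y = 2.
image = [[4 * r + c for c in range(4)] for r in range(4)]
small = downsize_integer(image, 2, 2)
print(small)
```

Because every output value is a mean of input grey levels, it stays within the range of the inputs, in line with Eq. (11).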

Fig. 4. Image of urinary bladder cancer with Laplacian edge detector.

Fig. 5. Example of image resizing with integer scale factors.

Fig. 6. Example of image resizing with non-integer scale factor.

In the case of scaling with a non-integer scale factor, some pixels will be only partially included in the pixel group, as shown in Fig. 6.

The part of a pixel that is included in the pixel group used for calculating the new pixel value can be determined as

$$P[x, y] = \begin{cases} 1, & x \le \lfloor s_x \rfloor \land y \le \lfloor s_y \rfloor \\ \{s_x\}, & x > \lfloor s_x \rfloor \land y \le \lfloor s_y \rfloor \\ \{s_y\}, & x \le \lfloor s_x \rfloor \land y > \lfloor s_y \rfloor \\ \{s_x\}\{s_y\}, & (x > \lfloor s_x \rfloor \land x \le N_x) \land (y > \lfloor s_y \rfloor \land y \le N_y) \end{cases} \tag{12}$$

where $\{s_x\}$ represents the fractional part of $s_x$ and $\{s_y\}$ represents the fractional part of $s_y$. As shown in the first case of Eq. (12), a whole pixel will be included only if $x \le \lfloor s_x \rfloor \land y \le \lfloor s_y \rfloor$. On the other hand, pixels will be only partially included if $(x > \lfloor s_x \rfloor \land x \le N_x)$ or $(y > \lfloor s_y \rfloor \land y \le N_y)$, as shown in the second, third and fourth cases of Eq. (12). Using $P[x, y]$, the new pixel value can be calculated as

$$I_R = \frac{1}{A} \sum_{x=0}^{N_x} \sum_{y=0}^{N_y} P[x, y]\, I[x, y]. \tag{13}$$

The described procedures are performed for all pixel groups of the picture. The number of pixel groups can be determined as

$$N_A = \lceil n_{x2} \rceil \lceil n_{y2} \rceil. \tag{14}$$

For the case of scaling with integer scale factors, each pixel is part of just one pixel group. This is not the case for scaling with non-integer scale factors, where one pixel can be partially included in more than one pixel group.
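The coverage weights of Eq. (12) can be illustrated for a single pixel group. The sketch below handles only the group anchored at the image origin, where whole pixels come first and a fractional pixel follows on the right and bottom edges. The helper names are hypothetical, and the treatment of later groups, whose leading edge can also be fractional, is left out.

```python
import math


def coverage_weights(s_x, s_y):
    """Weights P[y][x] for the pixel group anchored at the image origin.

    Whole pixels get weight 1; the trailing pixel along each axis gets the
    fractional part of the scale factor, as in the cases of Eq. (12).
    """
    def axis(s):
        whole = math.floor(s)
        frac = s - whole
        return [1.0] * whole + ([frac] if frac > 0 else [])

    w_x, w_y = axis(s_x), axis(s_y)
    return [[p_y * p_x for p_x in w_x] for p_y in w_y]


def weighted_pixel(block, s_x, s_y):
    """New pixel value following Eq. (13) for one pixel group."""
    weights = coverage_weights(s_x, s_y)
    area = s_x * s_y  # Eq. (7)
    total = sum(
        weights[y][x] * block[y][x]
        for y in range(len(weights))
        for x in range(len(weights[0]))
    )
    return total / area


P = coverage_weights(2.5, 1.5)
print(P)
print(weighted_pixel([[100, 100, 100], [100, 100, 100]], 2.5, 1.5))
```

The weights sum to the group area A = s_x s_y, so a uniformly grey group keeps its grey level after resizing.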

2.4. MLP with Laplacian of an image as an input

For the purposes of this study, a new method is described. In this method, the original urinary bladder tissue image is first resized using the described resizing method. After that, the Laplacian is applied to the resized image. The Laplacian of the image is then used as the input to the neural network. In this study, a Multi-Layer Perceptron (MLP) is used, as shown in the dataflow diagram in Fig. 7.

The MLP is a type of feedforward ANN that, in its basic form, consists of three layers: an input layer, a hidden layer and an output layer. This type of ANN has a high degree of connectivity, the extent of which is determined by the synaptic weights of the network [80]. The MLP is trained using a supervised learning method called the back-propagation algorithm. According to [80], the training process can be divided into two phases: a forward phase and a backward phase. In the forward phase, the synaptic weights of the MLP are fixed as the signal is propagated through the network. In this phase, the changes are confined to the activation potentials and outputs of the neurons in the network. In the backward phase, the error signal is produced as the difference between the generated output and the desired output. The error signal is then propagated backward through the network, and the synaptic weights are adjusted. In the hidden layer, each node is a neuron with a nonlinear activation function, and the network is trained with a supervised learning technique [81]. The activation function is the key element of an artificial neuron [82]. In this study, three different activation functions are used:

• Rectified Linear Unit (ReLU),

• Hyperbolic Tangent (Tanh) and

• Logistic sigmoid.
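For reference, the three activation functions listed above have standard textbook definitions, sketched below in plain Python. The function names are chosen for this illustration, and the two-branch form of the logistic sigmoid is only a guard against overflow for large negative inputs.

```python
import math


def relu(z):
    """Rectified Linear Unit: max(0, z)."""
    return max(0.0, z)


def tanh(z):
    """Hyperbolic tangent, with outputs in (-1, 1)."""
    return math.tanh(z)


def logistic(z):
    """Logistic sigmoid 1 / (1 + exp(-z)), with outputs in (0, 1)."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)  # avoids overflow of exp(-z) when z is very negative
    return e / (1.0 + e)


for z in (-2.0, 0.0, 2.0):
    print(z, relu(z), tanh(z), logistic(z))
```

ReLU is unbounded above, while Tanh and the logistic sigmoid saturate for inputs of large magnitude.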

In order to perform supervised learning, optimization algorithms (optimizers) are used. In this study, three different optimizers are used:

• Limited-memory Broyden–Fletcher–Goldfarb–Shanno algorithm (L-BFGS) [83],

• Stochastic gradient descent algorithm (SGD) [84] and

• Adaptive learning rate optimization algorithm (Adam) [85].

The numbers of input and output nodes are determined by the data characteristics. For example, if the image is 30 pixels wide and 30 pixels high, the number of input nodes will be 900, which is the exact number of image pixels. The number of output nodes depends on the number of classes into which the data is divided. In this study, the output layer contains only one output node. The output value is 1 if the input image represents cancer and 0 if the input image represents healthy tissue. The proposed hidden layer configurations are determined experimentally by varying the configurations presented in Table 2. In the first column of Table 2, a numerical designation is assigned to each configuration, while columns 2–7 show the number of nodes in each of the hidden layers.
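The layer sizes that follow from this description can be written out directly. The sketch below builds the size list for a 30 × 30 input with hidden layers of 40, 20 and 10 nodes (configuration No. 7 in Table 2) and counts the trainable parameters. The one-bias-per-neuron assumption and the function names are illustrative; the text does not say whether biases were used.

```python
def layer_sizes(width, height, hidden):
    """Input nodes = number of pixels; one output node (1 = cancer, 0 = healthy)."""
    return [width * height, *hidden, 1]


def parameter_count(sizes, with_bias=True):
    """Synaptic weights between consecutive layers, plus optional biases."""
    weights = sum(a * b for a, b in zip(sizes, sizes[1:]))
    biases = sum(sizes[1:]) if with_bias else 0
    return weights + biases


sizes = layer_sizes(30, 30, (40, 20, 10))
print(sizes)
print(parameter_count(sizes, with_bias=False), parameter_count(sizes))
```

Most parameters sit between the input layer and the first hidden layer, so the image size affects the parameter count much more than the depth of the hidden part does.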

2.5. Research methodology

In this research, MLPs designed with the hidden layer configurations reported in Table 2 and three different activation functions were used. Each of these 45 MLPs is trained using three different optimizers. Finally, all 135 variations of the MLP were trained and tested using images of urinary bladder cancer resized to four different sizes:

• 30 × 30,

• 50 × 50,

• 100 × 100 and

• 200 × 200.
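The experimental grid described above can be enumerated to confirm the counts. In the sketch below, the hidden layer tuples are taken from Table 2, while the short string labels for activation functions and solvers are placeholders chosen for this illustration.

```python
from itertools import product

# Hidden layer configurations 1-15 from Table 2 (nodes per hidden layer).
HIDDEN_LAYERS = [
    (20,), (40,), (20, 20), (40, 20), (40, 40), (80, 20),
    (40, 20, 10), (40, 20, 20), (80, 20, 20), (80, 40, 20),
    (40, 20, 20, 10), (80, 40, 40, 20),
    (40, 20, 20, 20, 10), (80, 40, 40, 40, 20),
    (40, 40, 20, 20, 20, 10),
]
ACTIVATIONS = ["relu", "tanh", "logistic"]  # placeholder labels
SOLVERS = ["lbfgs", "sgd", "adam"]          # placeholder labels
IMAGE_SIZES = [30, 50, 100, 200]            # square images, side length in pixels

architectures = list(product(HIDDEN_LAYERS, ACTIVATIONS))
variations = list(product(HIDDEN_LAYERS, ACTIVATIONS, SOLVERS))
runs = list(product(IMAGE_SIZES, variations))

print(len(architectures), len(variations), len(runs))
```

This reproduces the 45 MLPs and 135 variations stated in the text, and gives 540 training runs once the four image sizes are included.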

Using the test dataset, Receiver Operating Characteristic (ROC) analysis is performed and the Area Under the ROC Curve (AUC) value is obtained for all variations. In this study, the horizontal axis of the ROC plot represents the false positive rate (FPR) and the vertical axis represents the true positive rate (TPR). FPR can be defined as

$$\mathrm{FPR} = 1 - \text{specificity} \tag{15}$$

and TPR can be defined as

$$\mathrm{TPR} = \text{sensitivity}. \tag{16}$$

The AUC value can be used to determine classification quality, as shown in [51]. In this study, AUC is used as a performance measure together with MLP training time. These measures are used to determine the optimal hidden layer design and image size. Furthermore, the influence of the image size and of different hidden layer configurations on MLP performance is investigated.
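As an illustration of how an AUC value follows from the FPR and TPR definitions in Eqs. (15) and (16), the sketch below sweeps a decision threshold over classifier scores, collects the ROC points and integrates them with the trapezoidal rule. The function name `roc_auc` and the toy labels and scores are made up for this example and are not results from the study.

```python
def roc_auc(labels, scores):
    """Return the ROC points (FPR, TPR) and the trapezoidal AUC.

    `labels` holds 1 for cancer and 0 for non-cancer; a higher score means
    the classifier is more confident that the image shows cancer.
    """
    pos = sum(labels)
    neg = len(labels) - pos
    if pos == 0 or neg == 0:
        raise ValueError("both classes must be present")

    ranked = sorted(zip(scores, labels), key=lambda pair: -pair[0])
    points = [(0.0, 0.0)]
    tp = fp = 0
    i = 0
    while i < len(ranked):
        threshold = ranked[i][0]
        # Handle tied scores together so they share one ROC point.
        while i < len(ranked) and ranked[i][0] == threshold:
            if ranked[i][1] == 1:
                tp += 1
            else:
                fp += 1
            i += 1
        points.append((fp / neg, tp / pos))  # (1 - specificity, sensitivity)

    auc = sum(
        (x1 - x0) * (y0 + y1) / 2
        for (x0, y0), (x1, y1) in zip(points, points[1:])
    )
    return points, auc


points, auc = roc_auc([1, 1, 0, 1, 0, 0], [0.9, 0.8, 0.7, 0.6, 0.4, 0.2])
print(points)
print(auc)
```

In this toy example one non-cancer image is scored above one cancer image, so 8 of the 9 cancer/non-cancer pairs are ordered correctly and the AUC is 8/9. A classifier that separates the classes perfectly gives an AUC of 1.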

Fig. 7. Dataflow diagram of MLP with Laplacian of an image as an input.

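Table 2. Configurations of hidden layers used for determination of optimal hidden layers design. Columns 1–6 give the number of nodes in each hidden layer.

| Configuration | Hidden layer 1 | Hidden layer 2 | Hidden layer 3 | Hidden layer 4 | Hidden layer 5 | Hidden layer 6 |
|---|---|---|---|---|---|---|
| 1. | 20 | – | – | – | – | – |
| 2. | 40 | – | – | – | – | – |
| 3. | 20 | 20 | – | – | – | – |
| 4. | 40 | 20 | – | – | – | – |
| 5. | 40 | 40 | – | – | – | – |
| 6. | 80 | 20 | – | – | – | – |
| 7. | 40 | 20 | 10 | – | – | – |
| 8. | 40 | 20 | 20 | – | – | – |
| 9. | 80 | 20 | 20 | – | – | – |
| 10. | 80 | 40 | 20 | – | – | – |
| 11. | 40 | 20 | 20 | 10 | – | – |
| 12. | 80 | 40 | 40 | 20 | – | – |
| 13. | 40 | 20 | 20 | 20 | 10 | – |
| 14. | 80 | 40 | 40 | 40 | 20 | – |
| 15. | 40 | 40 | 20 | 20 | 20 | 10 |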


3. Results and discussion

In this section, the AUC values and training times of different configurations and settings of the MLP with the Laplacian edge detector are compared. Detailed results achieved with each configuration and setting are shown in the Appendix.

3.1. Results

For the MLP designed with the ReLU activation function and 30 × 30 input images, the highest AUC values are achieved when an MLP with six hidden layers is trained using the L-BFGS solver. In this case, a significant increase of AUC cannot be observed. The lowest AUC is achieved with the SGD solver, as shown in Fig. 8.

When the maximal AUC values achieved with each of the defined configurations are observed, it can be seen that the highest AUC of 0.96 is achieved with the MLP designed with configuration No. 15. AUC values higher than 0.8 are achieved regardless of the configuration used, provided that the right solver is utilized, as shown in Fig. 9.

When MLP training time is observed, SGD requires the longest training time. With an increasing number of hidden layers, the training procedure becomes slower. With L-BFGS and Adam, the training procedure is significantly faster and shows lower oscillations, as shown in Fig. 10.

If the MLP designed with the Tanh activation function is used, learning with the L-BFGS solver achieves higher AUC values when the MLP is designed with fewer hidden layers. If the Adam solver is used, a slight increase of the AUC value with an increasing number of hidden layers is noticed for all configurations with five or fewer hidden layers. If the SGD solver is used, the highest AUC value is achieved with an MLP with six hidden layers, as shown in Fig. 11.

When the maximal AUC achieved with each of the defined configurations is observed, it can be seen that the highest AUC value of 0.95 is achieved with the MLP designed with configuration No. 7. When the more complex configurations No. 14 and No. 15 are used, a decrease of the maximal AUC value can be noticed, as shown in Fig. 12.

When training time is measured, it can be observed that, again, the longest time is required if the SGD solver is used. With Adam and L-BFGS, performance is similar to that of the MLP designed with the ReLU activation function, as shown in Fig. 13.

In the case of MLPs designed with the Logistic sigmoid activation function, a significant decrease of AUC value can be observed with an increasing number of hidden layers. If the number of hidden layers is three or fewer, acceptable performance is achieved only if L-BFGS is used. Use of the Adam or SGD solver results in poor AUC performance, as shown in Fig. 14.

Fig. 8. AUC value versus the number of hidden layers for MLP designed with ReLU activation function and 30 × 30 input images.

Fig. 9. AUC value for each configuration of hidden layers for MLP designed with ReLU activation function and 30 × 30 input images.

Fig. 10. Training time versus the number of hidden layers for MLP designed with ReLU activation function and 30 × 30 input images.

Fig. 11. AUC value versus the number of hidden layers for MLP designed with Tanh activation function and 30 × 30 input images.

Fig. 12. AUC value for each configuration of hidden layers for MLP designed with Tanh activation function and 30 × 30 input images.

When the AUC performance of each configuration is discussed, it can be seen that the maximal AUC is achieved if configuration No. 6 is used. Configurations No. 11 and higher perform poorly from the AUC standpoint, as shown in Fig. 15.

Because of low AUC values, training times of MLPs trained by using the SGD and Adam solvers cannot be included in the MLP performance evaluation. If the training time of MLPs with three or fewer Logistic sigmoid hidden layers is measured, an increase of training time with the number of hidden layers can be observed, as shown in Fig. 16.

When all results achieved with 30 × 30 input images are summed up, it can be noticed that the highest AUC value of 0.96 is achieved if the MLP with parameters reported in Table 3 is utilized.

If 50 × 50 input images are used, the MLP designed with the ReLU activation function achieves the highest AUC value if a configuration with four hidden layers is used. If the MLP is trained by using the L-BFGS optimizer, the AUC value falls if five hidden layers are used and rises again if an MLP with six hidden layers is used. This is not the case if the Adam or SGD solvers are used: there, the AUC value decreases if the number of hidden layers is higher than four, as shown in Fig. 17.

If AUC performance is observed for each configuration individually, it can be noticed that the highest AUC value of 0.99 is achieved if configurations No. 10 and No. 12 are utilized. If more complex configurations are used, a decrease of AUC value can be noticed, as shown in Fig. 18.

From Fig. 19 it can be seen that if the SGD optimizer is used, the training procedure is significantly longer. As in the case of training with 30 × 30 images, the MLP takes more time to train if a configuration with more hidden layers is used.

An MLP designed with the Tanh activation function achieves the highest AUC values if an MLP with one hidden layer is trained using the SGD optimizer. If a configuration with two or more hidden layers is used, the MLP performs badly if it is trained by using the SGD optimizer. A decrease of AUC value can also be observed if Adam and L-BFGS are used, as shown in Fig. 20.

Fig. 13. Training time versus the number of hidden layers for MLP designed with Tanh activation function and 30 × 30 input images.

Fig. 14. AUC value versus the number of hidden layers for MLP designed with Logistic sigmoid activation function and 30 × 30 input images.

Fig. 15. AUC value for each configuration of hidden layers for MLP designed with Logistic sigmoid activation function and 30 × 30 input images.

Fig. 16. Training time versus the number of hidden layers for MLP designed with Logistic sigmoid activation function and 30 × 30 input images.

Table 3
MLP with the highest AUC for 30 × 30 input images.

Configuration   Activation function   Solver   AUC    Training time (s)
15.             ReLU                  L-BFGS   0.96   18.48

Fig. 17. AUC value versus the number of hidden layers for MLP designed with ReLU activation function and 50 × 50 input images.

When the AUC value is observed for each configuration, it can be seen that the highest AUC is achieved if configuration No. 11 is utilized. If more complex configurations are used, a significant fall of AUC can be noticed, as shown in Fig. 21.

Training time can be taken into account only for MLPs designed with three or fewer hidden layers if Adam or L-BFGS are used. For MLPs trained with SGD, only the training time of the configuration with one hidden layer can be taken into account. These constraints are introduced because no meaningful conclusions can be drawn from the training times of MLPs with low AUC values (Fig. 22).

If an MLP designed with the Logistic sigmoid activation function is utilized, it can be noticed that the maximal AUC is achieved if an MLP with four hidden layers is used. MLPs designed with fewer hidden layers achieve slightly lower AUC values, regardless of the solver utilized. For MLPs designed with a higher number of hidden layers, a fall of AUC value can be observed for all three solvers, as shown in Fig. 23.

When maximal AUC values are observed individually for each configuration, it can be noticed that the highest AUC is achieved if configuration No. 9 is used. Configurations with lower and higher complexity achieve lower AUC values, as shown in Fig. 24.

In this case, the training time of the MLP trained with the SGD optimizer is significantly longer than in the case of Adam or L-BFGS. If an MLP with five hidden layers is used, an anomaly can be observed: for this configuration, the MLP takes approximately 55 s to train, regardless of the solver utilized, as shown in Fig. 25.

Fig. 18. AUC value for each configuration of hidden layers for MLP designed with ReLU activation function and 50 × 50 input images.

Fig. 19. Training time versus the number of hidden layers for MLP designed with ReLU activation function and 50 × 50 input images.

Fig. 20. AUC value versus the number of hidden layers for MLP designed with Tanh activation function and 50 × 50 input images.

Fig. 21. AUC value for each configuration of hidden layers for MLP designed with Tanh activation function and 50 × 50 input images.

Fig. 22. Training time versus the number of hidden layers for MLP designed with Tanh activation function and 50 × 50 input images.

Fig. 23. AUC value versus the number of hidden layers for MLP designed with Logistic sigmoid activation function and 50 × 50 input images.

From the shown results it can be seen that the highest AUC value of 0.99 in the case of 50 × 50 input images is achieved if the MLP with parameters reported in Table 4 is utilized.

If 100 × 100 input images are used for training MLPs designed with the ReLU activation function, a decrease of AUC value can be observed for MLPs designed with a higher number of hidden layers. The maximal AUC is achieved if configurations with four hidden layers are used. This is the case for all three described solvers, as shown in Fig. 26.

When the maximal AUC values achieved with each of the defined configurations of hidden layers are compared, it can be seen that the highest AUC value of 0.98 is achieved if configuration No. 11 is utilized. Slightly lower AUC values are achieved if simpler configurations (No. 1–10) are utilized. A significant drop of AUC value can be noticed if more complex configurations (No. 12–15) of hidden layers are used, as shown in Fig. 27.

As in the cases of 30 × 30 and 50 × 50 images, the longest time to train is required if training is performed with the SGD optimizer. This holds for all configurations of hidden layers. In the case of the Adam and L-BFGS optimizers, a shorter time is required, with a tendency to decrease for configurations with more hidden layers, as shown in Fig. 28.

For MLPs designed with the Tanh activation function, the highest AUC value is achieved if the SGD solver is used for training an MLP with one hidden layer. An MLP with more than one hidden layer does not achieve a sufficient AUC value. If Adam and L-BFGS are used, the MLP achieves a constant AUC value for all configurations with four or fewer hidden layers. MLPs designed with more hidden layers produce an insufficient AUC value, as shown in Fig. 29.

If the maximal AUC values achieved with each configuration are compared, it can be seen that the maximal AUC is achieved if configuration No. 5 is utilized. If more complex configurations are utilized, a significant drop of AUC value can be noticed, especially for configurations No. 13, No. 14 and No. 15, as shown in Fig. 30.

Fig. 24. AUC value for each configuration of hidden layers for MLP designed with Logistic sigmoid activation function and 50 × 50 input images.

Fig. 25. Training time versus the number of hidden layers for MLP designed with Logistic sigmoid activation function and 50 × 50 input images.

Table 4
MLP with the highest AUC for 50 × 50 input images.

Configuration   Activation function   Solver   AUC    Training time (s)
10.             Tanh                  SGD      0.99   6.79

Fig. 26. AUC value versus the number of hidden layers for MLP designed with ReLU activation function and 100 × 100 input images.

Fig. 27. AUC value for each configuration of hidden layers for MLP designed with ReLU activation function and 100 × 100 input images.

Fig. 28. Training time versus the number of hidden layers for MLP designed with ReLU activation function and 100 × 100 input images.

In the case of an MLP with one hidden layer that is trained with SGD, the training time is significantly longer than the training times of MLPs trained with the Adam or L-BFGS solver. Other MLPs cannot be taken into account if the SGD solver is used. For MLPs trained with the L-BFGS solver, the longest time is required if the design with three hidden layers is used. If an MLP with four hidden layers is used, training is shorter. If the Adam solver is used, training time is constant for all configurations of hidden layers, as shown in Fig. 31. Designs with more than four hidden layers cannot be taken into account if the Adam and L-BFGS solvers are used.

The MLP with four hidden layers that uses the Logistic sigmoid activation function achieves the highest AUC values. This can be observed for all solvers. With an increasing number of hidden layers, a decrease of AUC value can occur regardless of the solver used, as shown in Fig. 32. It can be noticed that the highest AUC value is achieved if the MLP designed with configuration No. 7 is utilized. More complex configurations of hidden layers achieve lower AUC, as shown in Fig. 33.

Similar to most of the cases shown, the MLP trained by using SGD requires the longest training time. Training times of MLPs decrease significantly if deeper configurations are used. This trend is manifested for all solvers used, as can be seen in Fig. 34.

In the case of 100 × 100 input images, it can be seen that the highest AUC of 0.99 is achieved if the MLP with parameters reported in Table 5 is utilized.

In the case of 200 × 200 input images, MLPs designed with the ReLU activation function achieve the highest AUC values for all configurations of hidden layers if the SGD solver is used. A significant decrease in AUC value can be observed for MLPs trained with the L-BFGS and Adam solvers, in comparison with the previous MLP designs with ReLU, as shown in Fig. 35.

When the AUC performance of each configuration is observed individually, it can be noticed that the maximal AUC value of 0.91 is achieved if configuration No. 12 is utilized, as shown in Fig. 36. Similar to the majority of previously discussed cases, when more complex configurations of hidden layers are utilized, a drop in AUC value occurs.

If training times are compared, it can be seen that MLPs trained with the L-BFGS solver converge significantly faster than MLPs trained with the Adam solver. MLPs trained with the SGD solver converge faster if configurations with fewer hidden layers are used. For deeper configurations, an MLP trained with the SGD solver takes longer to converge, similarly to the previously described cases. For the configuration with six hidden layers, an MLP trained with SGD takes the same amount of time to train as one trained with the Adam solver, as shown in Fig. 37.

Fig. 29. AUC value versus the number of hidden layers for MLP designed with Tanh activation function and 100 × 100 input images.

Fig. 30. AUC value for each configuration of hidden layers for MLP designed with Tanh activation function and 100 × 100 input images.

Fig. 31. Training time versus the number of hidden layers for MLP designed with Tanh activation function and 100 × 100 input images.

Fig. 32. AUC value versus the number of hidden layers for MLP designed with Logistic sigmoid activation function and 100 × 100 input images.

Fig. 33. AUC value for each configuration of hidden layers for MLP designed with Logistic sigmoid activation function and 100 × 100 input images.

If an MLP designed with the Tanh activation function is trained and tested with 200 × 200 images, the best results are achieved if MLPs with one or two hidden layers are trained by using the SGD solver. MLPs trained with the SGD solver have low AUC values if configurations with more than two hidden layers are used. For MLPs trained with the Adam and L-BFGS solvers, a significant fall of AUC value is noticeable if configurations with five or six hidden layers are used, as shown in Fig. 38.

In the case of the MLP designed with the Tanh activation function, it can be seen that the maximal AUC of 0.92 is achieved if hidden layer configuration No. 5 is utilized. For more complex configurations, a drop of AUC value occurs, as shown in Fig. 39.

The MLP takes the longest time to converge if it is trained with the SGD solver. For this solver, configurations with three or more hidden layers cannot be taken into account in terms of training time because of low AUC values. If the L-BFGS solver is used for training, an increase of training time with deeper configurations can occur. If the Adam solver is used, no significant increase of training time is noticeable, as seen in Fig. 40. In the case of the L-BFGS and Adam solvers, only MLPs with four or fewer hidden layers can be taken into account.

Fig. 34. Training time versus the number of hidden layers for MLP designed with Logistic sigmoid activation function and 100 × 100 input images.

Table 5
MLP with the highest AUC for 100 × 100 input images.

Configuration   Activation function   Solver   AUC    Training time (s)
5.              Tanh                  SGD      0.99   181.79

Fig. 35. AUC value versus the number of hidden layers for MLP designed with ReLU activation function and 200 × 200 input images.

Fig. 36. AUC value for each configuration of hidden layers for MLP designed with ReLU activation function and 200 × 200 input images.

Fig. 37. Training time versus the number of hidden layers for MLP designed with ReLU activation function and 200 × 200 input images.

Fig. 38. AUC value versus the number of hidden layers for MLP designed with Tanh activation function and 200 × 200 input images.

Fig. 39. AUC value for each configuration of hidden layers for MLP designed with Tanh activation function and 200 × 200 input images.

When the Logistic sigmoid activation function is used for the MLP design, there are no significant oscillations of AUC value with a change in the number of hidden layers. This characteristic is present for all three solvers, as shown in Fig. 41. The AUC value is noticeably lower than in the previous variations.

If the AUC performance of each configuration of hidden layers is observed, it can be seen that the highest AUC of 0.88 is achieved if configuration No. 4 is utilized. AUC values achieved with other configurations are lower than 0.85, as shown in Fig. 42.

With the SGD solver, MLPs take more time to converge than with the L-BFGS or Adam solver. If the MLP is designed with six hidden layers, training time is significantly shorter for training with any of the three described solvers, as shown in Fig. 43.

When the results for 200 × 200 input images are summed up, it can be seen that the highest AUC value of 0.92 is achieved if the MLP with parameters reported in Table 6 is utilized.

When average AUC values of MLPs designed with the ReLU activation function are compared, it is evident that the highest AUC value is achieved if 50 × 50 images are used. In comparison with MLPs with 50 × 50 input images, MLPs with 100 × 100 input images achieve, on average, a slightly lower AUC value. For both 30 × 30 and 200 × 200 images, lower AUC values are observed. When results are compared for each of the solvers used, it can be concluded that AUC values follow the described trend regardless of the solver, as shown in Fig. 44.

By comparing the average training time for each of the described solvers and image sizes, it can be concluded that MLPs trained with the SGD solver require a longer time to converge. By observing the change of training time with an increasing number of image pixels, an exponential growth trend can be seen. In other words, as the number of input nodes increases, training time becomes exponentially longer, as shown in Fig. 45.
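To make the link between image size and network size concrete, the following sketch counts input nodes and trainable parameters for a fully connected MLP. It assumes one input node per pixel of the resized image and a single output node, neither of which is stated in this section; `count_parameters` is a hypothetical helper. It does not model training time itself, only the quantity the text ties it to.

```python
def count_parameters(n_inputs, hidden_layers, n_outputs=1):
    """Weights plus biases of a fully connected MLP."""
    sizes = [n_inputs, *hidden_layers, n_outputs]
    return sum(fan_in * fan_out + fan_out
               for fan_in, fan_out in zip(sizes, sizes[1:]))


CONFIG_NO_5 = (40, 40)  # two hidden layers of 40 nodes (Tables 5 and 6)

for side in (30, 50, 100, 200):
    n_inputs = side * side  # assumed: one input node per pixel
    n_params = count_parameters(n_inputs, CONFIG_NO_5)
    print(f"{side} x {side}: {n_inputs} input nodes, {n_params} parameters")
```

Doubling the side length quadruples the number of input nodes, and under these assumptions the first layer holds almost all of the parameters at every image size considered.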

In the case of MLPs designed with the Tanh activation function, higher AUC values are achieved if the Adam or L-BFGS solver is used. MLPs trained by using SGD perform worse from the AUC standpoint. If the L-BFGS or Adam solver is used, higher AUC values are achieved if images with fewer pixels are used. A significant drop of AUC value can be observed for all three solvers if 200 × 200 images are used, as seen in Fig. 46.

Fig. 40. Training time versus the number of hidden layers for MLP designed with Tanh activation function and 200 × 200 input images.

Fig. 41. AUC value versus the number of hidden layers for MLP designed with Logistic sigmoid activation function and 200 × 200 input images.

Fig. 42. AUC value for each configuration of hidden layers for MLP designed with Logistic sigmoid activation function and 200 × 200 input images.

Fig. 43. Training time versus the number of hidden layers for MLP designed with Logistic sigmoid activation function and 200 × 200 input images.

Table 6
MLP with the highest AUC for 200 × 200 input images.

Configuration   Activation function   Solver   AUC    Training time (s)
5.              Tanh                  SGD      0.92   530.73

Fig. 44. AUC versus the image size for MLPs designed with ReLU activation function.

As in the case of MLPs designed with the ReLU activation function, for MLPs designed with the Tanh activation function the SGD solver results in significantly longer training time. Training times are exponentially longer when input images are larger, as shown in Fig. 47.

Similar to MLPs designed with ReLU and Tanh, MLPs designed with the Logistic sigmoid activation function achieve lower AUC values if 200 × 200 input images are used, regardless of the solver used. If SGD is used, the AUC value is also significantly lower if 30 × 30 images are used. This is not the case when the MLPs are trained by using the L-BFGS or Adam solver. The highest average AUC value is achieved by using the L-BFGS solver to train MLPs with 50 × 50 images as inputs, as shown in Fig. 48.

When average training times are observed, it can be seen that MLPs take longer to train if they are trained with the SGD solver. Training times get exponentially longer if larger images are used as inputs. This characteristic can be noticed for training with all three solvers, as shown in Fig. 49.

3.2. Discussion

When all shown results are summed up, it can be noticed that, except in the case of 30 × 30 images, the maximal AUC is achieved if an MLP model of intermediate complexity is utilized. This property shows the agreement of the experimental results with theoretical knowledge of model selection [86,87]. If the classification performances of the four proposed MLPs with Laplacian edge detector are compared, it can be noticed that all four methods achieve an AUC value of 0.92 or higher. Furthermore, the highest AUC value is achieved if 50 × 50 and 100 × 100 input images are used. If 30 × 30 and 200 × 200 input images are used, lower AUC values of 0.96 and 0.92 are achieved, respectively, as shown in Table 7. In the case of 30 × 30 input images, the highest AUC is achieved if the MLP is designed with six hidden layers with 40, 40, 20, 20, 20 and 10 ReLU nodes, respectively. This MLP achieves the highest AUC value if it is trained by using the L-BFGS solver. In the case of 50 × 50 input images, the highest AUC is achieved if the SGD solver is used for training an MLP designed with three hidden layers with 80, 40 and 20 Tanh nodes, respectively (configuration No. 10, Table 4). The configuration with two hidden layers of 40 nodes each achieves the highest AUC in the cases of 100 × 100 and 200 × 200 input images. In both cases, the highest AUC is achieved if nodes in hidden layers are designed with the Tanh activation function and if the MLP is trained by using the SGD solver.
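The per-size winners reported in Tables 3 to 6 can be collected into records and ranked. The sketch below does this for illustration; the `BestRun` type and the tie-break by training time are assumptions introduced here, not a selection rule stated by the authors.

```python
from typing import NamedTuple


class BestRun(NamedTuple):
    image_side: int
    configuration: int
    activation: str
    solver: str
    auc: float
    training_time_s: float


# Values copied from Tables 3, 4, 5 and 6.
BEST_RUNS = [
    BestRun(30, 15, "ReLU", "L-BFGS", 0.96, 18.48),
    BestRun(50, 10, "Tanh", "SGD", 0.99, 6.79),
    BestRun(100, 5, "Tanh", "SGD", 0.99, 181.79),
    BestRun(200, 5, "Tanh", "SGD", 0.92, 530.73),
]

# Highest AUC first; among equal AUC values, shorter training time first
# (the tie-break is an assumption made for this example).
ranked = sorted(BEST_RUNS, key=lambda r: (-r.auc, r.training_time_s))

for run in ranked:
    print(f"{run.image_side} x {run.image_side}: config No. {run.configuration}, "
          f"{run.activation}/{run.solver}, AUC {run.auc}, {run.training_time_s} s")
```

Keeping the training time next to the AUC in one record makes it easy to apply a different ranking rule, for example one that weights training time more heavily.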

When the AUC performances of the proposed MLPs are compared with the AUC performance of related ANN-based methods [37,38,41,45,48], it can be noticed that higher AUC values are achieved if the MLP with Laplacian is utilized for urinary bladder cancer diagnosis, as shown in Table 7. The MLP with Laplacian achieves higher AUC values than the related methods for all four input image sizes.

Fig. 45. Average training times versus the image size for MLPs designed with ReLU activation function.

Fig. 46. AUC versus the image size for MLPs designed with Tanh activation function.

Fig. 47. Average training times versus the image size for MLPs designed with Tanh activation function.

Fig. 48. AUC versus the image size for MLPs designed with Logistic sigmoid activation function.

Fig. 49. Average training times versus the image size for MLPs designed with Logistic sigmoid activation function.

When the influence of MLP model complexity on training time is observed, it can be noticed that, in the case of 30 × 30 input images, there is no significant difference in training time regardless of model complexity. A similar property can be noticed in the case of 50 × 50 input images if the L-BFGS and Adam solvers are utilized. Significantly longer training times are required if the SGD solver is utilized. In the case of 100 × 100 input images, the described trend with regard to the SGD solver can also be noticed. If MLPs are designed with Logistic sigmoid, a significant drop of training times can be noticed when more complex MLP models are utilized. In the case of 200 × 200 input images, training times are significantly shorter when more complex MLP models are utilized. In contrast to the case of 100 × 100 input images, where this property is noticed only for MLPs designed with Logistic sigmoid, in the case of 200 × 200 images the drop of training times can be noticed regardless of the activation function utilized. As in previous cases, MLPs take significantly longer to train if SGD is utilized.

When average AUC values are observed for each activation function, input image size and solver utilized, it can be noticed that the highest AUC values are achieved for 50 × 50 and 100 × 100 input image sizes. On the other hand, if average training times are observed, an exponential growth of training time with increasing size of input images can be noticed. This property can be noticed regardless of the activation function and solver utilized.

4. Conclusion

In this paper, a urinary bladder cancer diagnostic method based on MLP is presented. The method uses image resizing and a Laplacian edge detector for pre-processing of input images. The described method offers an alternative approach to bladder cancer diagnosis. It can be used as a main or secondary algorithm in the design of an expert system for the diagnosis of urinary bladder cancer, alongside the commonly used DL-CNN. Based on the performed investigation, the following conclusions can be drawn:

• higher AUC values are achieved if the MLP with Laplacian edge detector is used when compared to other related methods for urinary bladder cancer diagnosis. The highest AUC values are achieved by two methods, with 50 × 50 and 100 × 100 input image sizes. The first method achieves an AUC value of 0.99 if an MLP designed with three hidden layers of 80, 40 and 20 Tanh nodes and the SGD solver are utilized. The second method also achieves an AUC value of 0.99 if an MLP designed with two hidden layers of 40 Tanh nodes and the SGD solver are utilized,

• the investigation showed that MLPs with intermediate model complexity and image size achieve higher AUC values; the highest AUC values were achieved if the Tanh activation function and the SGD solver were utilized, and

• the investigation showed that as input image size increases, the training time of the MLP model becomes exponentially longer regardless of model complexity and solver utilized.

From the results of the conducted research it can also be concluded that different MLP architectures achieve viable classification results, which allows parallel utilization of different MLP architectures. The conducted research also showed that the MLP, as a simpler algorithm in comparison with DL-CNN, could give viable classification results.
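As an illustration of the pre-processing step named above, the sketch below applies a discrete Laplacian to a small grayscale image and flattens the result into an MLP input vector. The 4-neighbour kernel, the zero padding at the border and the helper names are assumptions made for this example; this section does not specify the kernel or border handling used in the study.

```python
# 4-neighbour discrete Laplacian kernel (assumed; other variants exist).
LAPLACIAN_KERNEL = (
    (0, 1, 0),
    (1, -4, 1),
    (0, 1, 0),
)


def laplacian(image):
    """Convolve a 2-D list of pixel values with the kernel, zero padded."""
    height, width = len(image), len(image[0])
    output = [[0] * width for _ in range(height)]
    for row in range(height):
        for col in range(width):
            total = 0
            for d_row in (-1, 0, 1):
                for d_col in (-1, 0, 1):
                    r, c = row + d_row, col + d_col
                    if 0 <= r < height and 0 <= c < width:
                        weight = LAPLACIAN_KERNEL[d_row + 1][d_col + 1]
                        total += weight * image[r][c]
            output[row][col] = total
    return output


def flatten(image):
    """Row-major flattening: one MLP input node per pixel."""
    return [value for row in image for value in row]


# Toy 4 x 4 image: dark left half, bright right half (vertical edge).
toy_image = [
    [0, 0, 9, 9],
    [0, 0, 9, 9],
    [0, 0, 9, 9],
    [0, 0, 9, 9],
]
edges = laplacian(toy_image)
features = flatten(edges)
print(edges[1])
```

The response is non-zero on both sides of the intensity step, and also at the image border, where it is an artefact of the zero padding chosen here.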

Acknowledgements

This research has been (partly) supported by the CEEPUS network CIII-HR-0108, the European Regional Development Fund under the grant KK.01.1.1.01.0009 (DATACROSS) and the University of Rijeka scientific grant uniri-tehnic-18-275-1447.

Appendix A. Supplementary Data

Supplementary data associated with this article can be found, in the online version, at https://doi.org/10.1016/j.artmed.2019.101746.

References

[1] Janković Slavenka, Radosavljević Vladan. Risk factors for bladder cancer. Tumori J 2007;93(1):4–12.

[2] Zeegers Maurice PA, Tan Frans ES, Dorant Elisabeth, van den Brandt Piet A. The impact of characteristics of cigarette smoking on urinary tract cancer risk: a meta-analysis of epidemiologic studies. Cancer 2000;89(3):630–9.

[3] Burger Maximilian, Catto James WF, Dalbagni Guido, Grossman H Barton, Herr Harry, Karakiewicz Pierre, Kassouf Wassim, Kiemeney Lambertus A, La Vecchia Carlo, Shariat Shahrokh, et al. Epidemiology and risk factors of urothelial bladder cancer. Eur Urol 2013;63(2):234–41.

[4] Sun Jiang-Wei, Zhao Long-Gang, Yang Yang, Ma Xiao, Wang Ying-Ying, Xiang Yong-Bing. Obesity and risk of bladder cancer: a dose–response meta-analysis of 15 cohort studies. PLOS ONE 2015;10(3):e0119313.

[5] Cancer Genome Atlas Research Network, et al. Comprehensive molecular characterization of urothelial bladder carcinoma. Nature 2014;507(7492):315.

[6] Takahashi Keigo, Kimura Go, Endo Yuki, Akatsuka Jun, Hayashi Tatsuro, Toyama Yuka, Hamasaki Tsutomu, Kondo Yukihiro. Urothelial carcinoma of the bladder, lipid cell variant: a case report and literature review. J Nippon Med School 2019. pages JNMS-2019_86.

[7] Dotson Aaron, May Allison, Davaro Facundo, Raza Syed Johar, Siddiqui Sameer, Hamilton Zachary. Squamous cell carcinoma of the bladder: poor response to neoadjuvant chemotherapy. Int J Clin Oncol 2019;24(6):706–11.

[8] Celis Julio E, Wolf Hans, Østergaard Morten. Bladder squamous cell carcinoma biomarkers derived from proteomics. Electrophoresis Int J 2000;21(11):2115–21.

[9] Dadhania Vipulkumar, Czerniak Bogdan, Guo Charles C. Adenocarcinoma of the urinary bladder. Am J Clin Exp Urol 2015;3(2):51.

[10] Sharma Amit, Fröhlich Holger, Zhang Rong, Ebert Anne-Karoline, Rösch Wolfgang, Reis Henning, Kristiansen Glen, Ellinger Jörg, Reutter Heiko. Classic bladder exstrophy and adenocarcinoma of the bladder: methylome analysis provide no evidence for underlying disease-mechanisms of this association. Cancer Genet 2019.

[11] Ismaili Nabil. A rare bladder cancer-small cell carcinoma: review and update. Orphanet J Rare Dis 2011;6(1):75.

[12] Gil Rui Tiago, Esteves Gonçalo. Small cell carcinoma of the urinary bladder: a rare and aggressive tumor. Acta Radiol 2019;31(1):23–6.

[13] Mitra Suvradeep, Kaur Gurwinder, Kakkar Nandita, Singh Priya, Dey Pranab. Sarcoma in urine cytology; an extremely rare entity: a report of two cases. J Cytol 2017;34(3):171.

[14] Daga Garima, Kerkar Prashant. Sarcomatoid carcinoma of urinary bladder: a case report. Indian J Surg Oncol 2018;9(4):644–6.

[15] Hu HP, Niu ZJ, Bai YP, Tan XH. Cancer classification based on gene expression using neural networks. Genet Mol Res 2015;14:17605–11.

[16] Araújo Teresa, Aresta Guilherme, Castro Eduardo, Rouco José, Aguiar Paulo, Eloy Catarina, Polónia António, Campilho Aurélio. Classification of breast cancer histology images using convolutional neural networks. PLOS ONE 2017;12(6):e0177544.

[17] Esteva Andre, Kuprel Brett, Novoa Roberto A, Ko Justin, Swetter Susan M, Blau Helen M, Thrun Sebastian. Dermatologist-level classification of skin cancer with deep neural networks. Nature 2017;542(7639):115.

[18] Sharma Harshita, Zerbe Norman, Klempert Iris, Hellwich Olaf, Hufnagl Peter. Deep convolutional neural networks for automatic classification of gastric carcinoma using whole slide images in digital histopathology. Comput Med Imaging Graph 2017;61:2–13.

Table 7
Classification performances comparison of different methods for urinary bladder cancer diagnosis.

Method                                    AUC
MLP with Laplacian (30 × 30 images)       0.96
MLP with Laplacian (50 × 50 images)       0.99
MLP with Laplacian (100 × 100 images)     0.99
MLP with Laplacian (200 × 200 images)     0.92
ANN [37]                                  0.77
ANN [38]                                  0.89
ANN [41]                                  0.72
ANN [45]                                  0.85–0.885
DL-CNN [48]                               0.7 ± 0.06


[19] Soria Daniele, Garibaldi Jonathan M, Green Andrew R, Powe Desmond G, Nolan Christopher C, Lemetre Christophe, Ball Graham R, Ellis Ian O. A quantifier-based fuzzy classification system for breast cancer patients. Artif Intell Med 2013;58(3):175–84.

[20] Barboni Miranda Gisele Helena, Felipe Joaquim Cezar. Computer-aided diagnosis system based on fuzzy logic for breast cancer categorization. Comput Biol Med 2015;64:334–46.

[21] Gao Lingyun, Ye Mingquan, Wu Changrong. Cancer classification based on support vector machine optimized by particle swarm optimization and artificial bee colony. Molecules 2017;22(12):2086.

[22] Geeitha S, Thangamani M. Incorporating EBO-HSIC with SVM for gene selection associated with cervical cancer classification. J Med Syst 2018;42(11):225.

[23] Li Jiance, Weng Zhiliang, Xu Huazhi, Zhang Zhao, Miao Haiwei, Chen Wei, Liu Zheng, Zhang Xiaoqin, Wang Meihao, Xu Xiao, et al. Support vector machines (SVM) classification of prostate cancer Gleason score in central gland using multiparametric magnetic resonance images: a cross-validated study. Eur J Radiol 2018;98:61–7.

[24] Lo Shih-Chung B, Chan Heang-Ping, Lin Jyh-Shyan, Li Huai, Freedman Matthew T, Mun Seong K. Artificial convolution neural network for medical image pattern recognition. Neural Netw 1995;8(7–8):1201–14.

[25] Jiang Jianmin, Trundle P, Ren Jinchang. Medical image analysis with artificial neural networks. Comput Med Imaging Graph 2010;34(8):617–31.

[26] Li Qing, Cai Weidong, Wang Xiaogang, Zhou Yun, Feng David Dagan, Chen Mei. Medical image classification with convolutional neural network. 2014 13th International Conference on Control Automation Robotics & Vision (ICARCV) 2014:844–8.

[27] Kusumoto Dai, Yuasa Shinsuke. The application of convolutional neural network to stem cell biology. Inflam Regenerat 2019;39(1):14.

[28] Monisha M, Suresh A, Rashmi MR. Artificial intelligence based skin classification using GMM. J Med Syst 2019;43(1):3.

[29] Lisboa Paulo J, Taktak Azzam FG. The use of artificial neural networks in decision support in cancer: a systematic review. Neural Netw 2006;19(4):408–15.

[30] Graupe Daniel. Principles of artificial neural networks, volume 7. World Scientific; 2013.

[31] Werbos PJ. Beyond regression: new tools for prediction and analysis in the behavioral sciences. Ph. D. Thesis. Cambridge, MA: Harvard University; 1974.

[32] Widrow Bernard, Hoff Marcian E. Adaptive switching circuits. Stanford Univ CA Stanford Electronics Labs; 1960. Technical report.

[33] Rosenblatt F. Principles of neurodynamics. Washington, DC: Spartan Books; 1962.

[34] Stearns Samuel D. Adaptive signal processing. 1985.

[35] Babaian R Joseph, Fritsche Herbert A, Zhang Zhen, Zhang Hong K, Madyastha Rama K, Barnhill Stephen D. Evaluation of prostasure index in the detection of prostate cancer: a preliminary report. Urology 1998;51(1):132–6.

[36] Pantazopoulos D, Karakitsos P, Iokim-Liossi A, Pouliakis A, Botsoli-Stergiou E, Dimopoulos C. Back propagation neural network in the discrimination of benign from malignant lower urinary tract lesions. J Urol 1998;159(5):1619–23.

[37] Babaian Joseph R, Fritsche Herbert, Ayala Alberto, Bhadkamkar Vijaya, Johnston Dennis A, Naccarato William, Zhang Zhen. Performance of a neural network in detecting prostate cancer in the prostate-specific antigen reflex range of 2.5 to 4.0 ng/ml. Urology 2000;56(6):1000–6.

[38] Qureshi Khaver N, Naguib Raouf NG, Hamdy Freddie C, Neal David E, Mellon Kilian J. Neural network analysis of clinicopathological and molecular markers in bladder cancer. J Urol 2000;163(2):630–3.

[39] Djavan Bob, Remzi Mesut, Zlotta Alexandre, Seitz Christian, Snow Peter, Marberger Michael. Novel artificial neural network for early detection of prostate cancer. J Clin Oncol 2002;20(4):921–9.

[40] Spyridonos Panagiota, Cavouras Dionisios, Ravazoula Panagiota, Nikiforidis George. Neural network-based segmentation and classification system for automated grading of histologic sections of bladder carcinoma. Anal Quant Cytol Histol 2002;24(6):317–24.

[41] Parekattil Sijo J, Fisher Hugh AG, Kogan Barry A. Neural network using combined urine nuclear matrix protein-22, monocyte chemoattractant protein-1 and urinary intercellular adhesion molecule-1 to detect bladder cancer. J Urol 2003;169(3):917–20.

[42] Porter Christopher R, Crawford E David. Combining artificial neural networks and transrectal ultrasound in the diagnosis of prostate cancer. Oncology-Williston Park then Huntington the Melville New York 2003;17(10):1395–418.

[43] Remzi Mesut, Anagnostou Theodore, Ravery Vincent, Zlotta Alexandre, Stephan Carsten, Marberger Michael, Djavan Bob. An artificial neural network to predict the outcome of repeat prostate biopsies. Urology 2003;62(3):456–60.

[44] Tasoulis Dimitris K, Spyridonos Panagiota, Pavlidis Nicos G, Cavouras D, Ravazoula Panagiota, Nikiforidis George, Vrahatis Michael N. Urinary bladder tumor grade diagnosis using on-line trained neural networks. International Conference on Knowledge-Based and Intelligent Information and Engineering Systems 2003:199–206.

[45] Mueller J, Von Eggeling F, Driesch D, Schubert J, Melle C, Junker K. Proteinchip technology reveals distinctive protein expression profiles in the urine of bladder cancer patients. Eur Urol 2005;47(6):885–94.

[46] Papageorgiou Elpiniki I, Spyridonos PP, Stylios Chrysostomos D, Ravazoula Panagiota, Groumpos Peter P, Nikiforidis GN. Advanced soft computing diagnosis method for tumour grading. Artif Intell Med 2006;36(1):59–70.

[47] Luukka Pasi. Similarity classifier in diagnosis of bladder cancer. Comput Methods Progr Biomed 2008;89(1):43–9.

[48] Cha Kenny H, Hadjiiski Lubomir M, Samala Ravi K, Chan Heang-Ping, Cohan Richard H, Caoili Elaine M, Paramagul Chintana, Alva Ajjai, Weizer Alon Z. Bladder cancer segmentation in CT for treatment response assessment: application of deep-learning convolution neural network-a pilot study. Tomography 2016;2(4):421.

[49] Sokolov I, Dokukin ME, Kalaparthi V, Miljkovic M, Wang A, Seigne JD, Grivas P, Demidenko E. Noninvasive diagnostic imaging using machine-learning analysis of nanoresolution images of cell surfaces: detection of bladder cancer. Proc Natl Acad Sci 2018;115(51):12920–5.

[50] Bogović Korino, Lorencin Ivan, Anđelić Nikola, Blažević Sebastijan, Smolčić Klara, Španjol Josip, Car Zlatan. Artificial intelligence-based method for urinary bladder cancer diagnostic. International Conference on Innovative Technologies, IN-TECH 2018:2018.

[51] Fawcett Tom. An introduction to ROC analysis. Pattern Recognit Lett 2006;27(8):861–74.

[52] Amiri S, Movahedi MM, Kazemi K, Parsaei H. An automated MR image segmentation system using multi-layer perceptron neural network. J Biomed Phys Eng 2013;3(4):115.

[53] Ditzler Gregory, Polikar Robi, Rosen Gail. Multi-layer and recursive neural networks for metagenomic classification. IEEE Trans Nanobiosci 2015;14(6):608–16.

[54] Sabanci Kadir, Kayabasi Ahmet, Toktas Abdurrahim. Computer vision-based method for classification of wheat grains using artificial neural network. J Sci Food Agric 2017;97(8):2588–93.

[55] Makris Georgios-Marios, Pouliakis Abraham, Siristatidis Charalampos, Margari Niki, Terzakis Emmanouil, Koureas Nikolaos, Pergialiotis Vasilios, Papantoniou Nikolaos, Karakitsos Petros. Image analysis and multi-layer perceptron artificial neural networks for the discrimination between benign and malignant endometrial lesions. Diagn Cytopathol 2017;45(3):202–11.

[56] Li Yang, Yang Hao, Lei Baiying, Liu Jingyu, Wee Chong-Yaw. Novel effective connectivity inference using ultra-group constrained orthogonal forward regression and elastic multilayer perceptron classifier for MCI identification. IEEE Trans Med Imaging 2018;38(5):1227–39.

[57] Chica Juan Fernando, Zaputt Sayonara, Encalada Javier, Salamea Christian, Montalvo Melissa, et al. Objective assessment of skin repigmentation using a multilayer perceptron. J Med Signals Sens 2019;9(2):88.

[58] Taravat Alireza, Oppelt Natascha. Adaptive Weibull multiplicative model and multilayer perceptron neural networks for dark-spot detection from SAR imagery. Sensors 2014;14(12):22798–810.

[59] Zhang Yudong, Sun Yi, Phillips Preetha, Liu Ge, Zhou Xingxing, Wang Shuihua. A multilayer perceptron based smart pathological brain detection system by fractional Fourier entropy. J Med Syst 2016;40(7):173.

[60] Ma Congcong, Li Wenfeng, Gravina Raffaele, Fortino Giancarlo. Posture detection based on smart cushion for wheelchair users. Sensors 2017;17(4):719.

[61] Sriraam N, Raghu S, Tamanna Kadeeja, Narayan Leena, Khanum Mehraj, Hegde AS, Kumar Anjani Bhushan. Automated epileptic seizures detection using multi-features and multilayer perceptron neural network. Brain Inform 2018;5(2):10.

[62] Ipina Karmele Lopez-de, Lizarduy Unai Martinez-de, Calvo Pilar M, Mekyska Jiri, Beitia Blanca, Barroso Nora, Estanga Ainara, Tainta Milkel, Ecay-Torres Mirian. Advances on automatic speech analysis for early detection of Alzheimer disease: a non-linear multi-task approach. Curr Alzheimer Res 2018;15(2):139–48.

[63] Heddam Salim. Multilayer perceptron neural network-based approach for modeling phycocyanin pigment concentrations: case study from lower Charles River Buoy, USA. Environ Sci Pollut Res 2016;23(17):17210–25.

[64] Sun WZ, Jiang MY, Ren L, Dang J, You T, Yin FF. Respiratory signal prediction based on adaptive boosting and multi-layer perceptron neural network. Phys Med Biol 2017;62(17):6822.

[65] Fujita Takaaki, Sato Atsushi, Narita Akira, Sone Toshimasa, Iokawa Kazuaki, Tsuchiya Kenji, Yamane Kazuhiro, Yamamoto Yuichi, Ohira Yoko, Otsuki Koji. Use of a multilayer perceptron to create a prediction model for dressing independence in a small sample at a single facility. J Phys Therapy Sci 2019;31(1):69–74.

[66] Moeslund Thomas B. Introduction to video and image processing: Building real systems and applications. Springer Science & Business Media; 2012.

[67] Pratt William K. Introduction to digital image processing. CRC press; 2013.

[68] Wang Xin. Laplacian operator-based edge detectors. IEEE Trans Pattern Anal Mach Intell 2007;29(5):886–90.

[69] Zheng Yinfei, Zhou Yali, Zhou Hao, Gong Xiaohong. Ultrasound image edge detection based on a novel multiplicative gradient and Canny operator. Ultrason Imaging 2015;37(3):238–50.

[70] Chen Delphic, Lui G-T, Kuo J-C. Application of edge detection method based on image quality gradient for twin detection. J Microsc 2009;236(1):44–51.

[71] Yildirim Melih, Kacar Firat. Adapting Laplacian based filtering in digital image processing to a retina-inspired analog image processing circuit. Analog Integrated Circuits and Signal Processing, pages 1–9.

[72] Ranjbaran Ali, Abu Hassan Anwar Hasni, Jafarpour Mahboobe, Ranjbaran Bahar. A Laplacian based image filtering using switching noise detector. SpringerPlus 2015;4(1):119.

[73] Kaur Satbir, Singh Ishpreet. Comparison between edge detection techniques. Int J Comput Appl 2016;145(15):15–8.

[74] Van Dokkum Pieter G. Cosmic-ray rejection by Laplacian edge detection. Publications of the Astronomical Society of the Pacific 2001;113(789):1420.

[75] Parker Stephen. General surgery & urology. Eureka; 2015.

[76] Hashim Hashim, Abrams Paul, Dmochowski Roger. The handbook of office urological procedures. Springer; 2008.

[77] Shrivakshan GT, Chandrasekar C. A comparison of various edge detection techniques used in image processing. Int J Comput Sci Issues (IJCSI) 2012;9(5):269.

[78] Ujjainiya Lohit, Chakravarthi M Kalyan. Raspberry-pi based cost effective vehicle collision avoidance system using image processing. ARPN J Eng Appl Sci 2015;10(7).

[79] Yue Fuyong, Zhang Chunmei, Zang Xiao-Fei, Wen Dandan, Gerardot Brian D, Zhang Shuang, Chen Xianzhong. High-resolution grayscale image hidden in a laser beam. Light: Sci Appl 2018;7(1):17129.

[80] Haykin Simon S, et al. Neural networks and learning machines/Simon Haykin. New York: Prentice Hall; 2009.

[81] Pal Sankar K, Mitra Sushmita. Multilayer perceptron, fuzzy sets, classification. 1992.

[82] Karlik Bekir, Olgac A Vehbi. Performance analysis of various activation functions in generalized MLP architectures of neural networks. Int J Artif Intell Expert Syst 2011;1(4):111–22.

[83] Zhu Ciyou, Byrd Richard H, Lu Peihuang, Nocedal Jorge. Algorithm 778: L-BFGS-B: Fortran subroutines for large-scale bound-constrained optimization. ACM Trans Mathem Softw (TOMS) 1997;23(4):550–60.

[84] Bottou Léon. Large-scale machine learning with stochastic gradient descent. Proceedings of COMPSTAT’2010 2010:177–86. Springer.

[85] Kingma Diederik P, Ba Jimmy. Adam: A method for stochastic optimization. 2014. arXiv preprint arXiv:1412.6980.

[86] Hastie Trevor, Tibshirani Robert, Friedman Jerome, Franklin James. The elements of statistical learning: data mining, inference and prediction. Math Intell 2005;27(2):83–5.

[87] Bishop Christopher M. Pattern recognition and machine learning. Springer; 2006.
