Deep Stacked Sparse Autoencoders – A Breast Cancer Classifier

Breast cancer is one of the non-communicable diseases that are a major cause of mortality among women around the globe. Early diagnosis of breast cancer significantly reduces deaths. This chronic disease requires careful and lengthy prognostic procedures before a rational decision about the optimum clinical treatment can be reached. During the last decade, machine learning and deep learning-based approaches have been implemented in Computer-Aided Diagnostic (CAD) systems to provide solutions with the least error probability in breast cancer screening practices. These methods aim for optimal and acceptable results with little human intervention. In this article, Deep Stacked Sparse Autoencoders for breast cancer diagnosis and classification are proposed. The proposed algorithms and methods are evaluated and tested using MATLAB R2017b on the Breast Cancer Wisconsin (Diagnostic) Data Set (WDBC), and the achieved results surpass the compared CAD techniques and methods in terms of classification accuracy and efficiency.


INTRODUCTION
Health concerns and risks are on the rise with the fast-growing population around the globe. Non-Communicable Diseases (NCDs) are a major cause of mortality in the world. In [1], the World Health Organization (WHO) reported that an active and healthy lifestyle may contribute to the reduction and prevention of NCDs; it was observed that 3 in 4 adolescents and 1 in 4 adults did not meet the physical activity standard set by WHO, and that the effects are more prevalent in developed countries. WHO proposed a Global Action Plan (GAP) [1] to reduce physical inactivity by up to 15% and to cope with the adverse effects of COVID-19 on the GAP. In [2], more coordinated actions for developing a robust and sustainable health sector have been proposed. According to [3], these types of diseases contribute [...] patient ratio is far low and even worse in low-income countries. According to [6], in 44% of WHO-associated countries, the availability of medical doctors per 1000 people is less than 1. These statistics are even more alarming for pathologists, even in the United States. According to [7], their availability is 5.7 per 100 000 population and may drop from 5.7 to 3.7 per 100 000 people by 2030 [8], so the ultimate burden will shift to computer-assisted diagnostic procedures with the least human intervention and expectations of optimum performance.
Even in today's technology-driven world, the working limitations of human doctors remain one of the constraints on their performance. Even a minor mistake may lead to a major loss to the patient in terms of cost and life. In this regard, CAD approaches are being appreciated for disease diagnosis.
With the current advancements in technology, the capacity for disease prognosis is being enhanced by the collaboration between human medical experts and computers. In the last couple of decades, computer-assisted diagnostic methods based on artificial intelligence have been adopted. Considerable studies have been carried out, and are still underway, to reach an optimal solution using machine learning and deep learning algorithms. Screening for breast cancer among women is a complicated and costly process. An expert oncologist requires many manual investigative procedures to reach a final decision about the disease and its treatment. Although computer-aided methods have made breast cancer detection easier, a lot of manual work is still required before the data can be fed to computers. In the artificial intelligence domain, machine learning and deep learning-based algorithms play a vital role in expediting the diagnostic process with the help of data scientists who have minimal medical field knowledge. Specifically, Deep Neural Network (DNN) based algorithms have gained popularity in medical diagnosis due to their exceptional performance in challenges. In the majority of artificial intelligence-based research articles on breast cancer classification [9][10][11][12][13][14][15][16][17][18], model performance was evaluated either on breast cancer images taken through different computer-based techniques or on numerical feature datasets extracted from the cell nuclei of analyzed biopsy samples.
In this study, the Deep Stacked Sparse Autoencoder for breast cancer classification is proposed, and its validity and efficiency are tested using the Breast Cancer Wisconsin (Diagnostic) Data Set (WDBC), a publicly available numerical dataset [19].
The rest of the paper is organized as follows. The literature review and related work are discussed in Section 2, Section 3 details the proposed algorithm architecture, and the experimental approaches and comparative schemes are discussed in Section 4. Section 5 presents the results and discussion, and Section 6 concludes the paper.

RELATED WORK
A methodology with reasonably acceptable accuracy has long been sought in decades of research on CAD-based medical diagnostics. Kadam et al. [9] proposed a deep learning model for breast cancer diagnosis based on ensemble learning with stacked sparse Autoencoders. They used Softmax regression and obtained reasonable accuracy while experimenting with only three hidden layer sizes. Despite a grid search for parameter optimization, their performance plateaued at 98.60% in terms of true accuracy. To support the breast cancer screening process, several data mining techniques with and without feature selection through a genetic algorithm were proposed in [10], where two Wisconsin Breast Cancer datasets, (Diagnostic) and (Original), from the University of California Irvine (UCI) Machine Learning Repository were employed to validate the two-stage data mining technique for the breast cancer classification task. The optimal features extracted through the genetic algorithm were fed to several data mining techniques, namely Logistic Regression, Decision Trees, Random Forest, Bayesian Network, Multilayer Perceptron, Radial Basis Function Networks, Support Vector Machine (SVM), and Rotation Forest.
The Rotation Forest algorithm achieved notable accuracy, and the authors claimed that the combination of the genetic feature selection algorithm and the rotation forest approach was superior in performance. The achieved accuracy was 99.48%.
In [20], two machine learning techniques, namely Naive Bayes (NB) and K-Nearest Neighbor (KNN), were evaluated on the Wisconsin breast cancer dataset for tumor classification; KNN achieved 97.51% accuracy with the lowest error rate, while the NB classifier achieved 96.19%. For breast cancer classification, [12] implemented six machine learning techniques, namely Support Vector Machine (SVM), Decision Tree Classifier, Naive Bayes, Logistic Regression, Linear Discriminant Analysis, and K-Nearest Neighbor, and achieved the highest classification accuracy of 98% with the support vector machine using 3-fold cross-validation. Stacked Autoencoders with sparsity constraints have observable effects on feature learning and classification when different numbers of hidden units are used. In [21], the effect of the number of hidden units was observed on a handwriting recognition task, where the accuracy varied as features were learned from the handwritten images. The authors of [13] proposed a deep neural network technique combining supervised and unsupervised algorithms for WDBC dataset classification and achieved an accuracy of 99.68%. In [14], a feature selection based KNN approach is presented to classify the Wisconsin breast cancer datasets. Appreciable classification accuracies are achieved when appropriate features and k values are selected. On the WBC dataset, an accuracy of 99.42% with a corresponding Area Under the Curve (AUC) value of 1.0 was achieved using the Manhattan distance function with k = 1 and Chi-square feature selection. For the second dataset, WDBC, an accuracy of 98.62% with an AUC value of 0.999 is claimed using Chi-square feature selection, with k chosen as 7 or 8 and the Manhattan and Canberra distance functions.
In [15], a grid search parameter optimization technique was employed to enhance KNN classification accuracy for Wisconsin breast cancer detection; it achieved 94.35% classification accuracy with tuned parameters, compared to 90.10% for the model with default parameter settings. In [16], 12 distinct machine learning techniques were applied to breast cancer classification on the WBC original dataset, and it was concluded that approximately 99% efficiency was achieved only through the lazy and tree classifier approaches. In [17], in contrast to a typical convolutional neural network (CNN), which is popular mainly for classifying unstructured image data, a new architecture with a fully connected layer placed first, before the convolutional layer (FCLF-CNN), is proposed for the binary classification of the Wisconsin breast cancer datasets. These structured datasets are difficult to classify with a conventional CNN, but promising accuracies are achieved through FCLF-CNN with fivefold cross-validation for the WDBC database and the Wisconsin breast cancer database (WBC). In that paper, the breast cancer classification accuracies are 99.28% and 98.71% for the WDBC and WBC datasets, respectively.

MATERIALS AND METHODS
Diverse CAD-based classification approaches have been established for the diagnosis of breast cancer and validated on the Wisconsin breast cancer datasets, but the majority are still striving for the highest possible classification accuracy. Our proposed Deep Stacked Sparse Autoencoder (SSAE) for breast cancer classification achieves the best accuracy and surpasses the best methodologies cited in the literature, including [10] with an accuracy of 99.48%.
The network architecture of the proposed SSAE is built from a single-layer autoencoder (see Fig. 1) by successively stacking and connecting single-layer Sparse Autoencoders (SAE), and is finally connected to a Softmax classifier (see Fig. 2), where an SAE is essentially an Autoencoder [22] with a sparsity constraint. These typical architectures are generated through simulation. The Stacked Autoencoder has the distinctive ability to generate output features from unlabeled input, and the Softmax layer produces an accurate classification from these features. With these combined properties, these networks outperform the conventional classifiers used in binary classification of breast cancer.
In general, the encoder maps the input features x into a coded representation h, and the hidden layer h can be viewed as a new feature representation of the input attributes [23,24]. The output layer decodes this coded representation h to produce output values $\hat{x}$ that approximate the input values x, that is, the network learns an approximation of the identity function. The purpose of training the network is to find the optimal weights w and biases b of this learned identity function so that the approximation error between $\hat{x}$ and x is reduced.
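The encoder and decoder mapping described above can be illustrated with a minimal pure-Python sketch. The layer sizes, the random initial weights, the sigmoid hidden activation, and the linear decoder are all assumptions chosen for illustration; they are not taken from the paper's MATLAB configuration.

```python
import math
import random


def sigmoid(z):
    """Logistic activation assumed for the hidden layer."""
    return 1.0 / (1.0 + math.exp(-z))


def encode(x, W_enc, b_enc):
    """Map input x to the hidden code h = sigmoid(W_enc x + b_enc)."""
    return [sigmoid(sum(w * xi for w, xi in zip(row, x)) + b)
            for row, b in zip(W_enc, b_enc)]


def decode(h, W_dec, b_dec):
    """Map the code h back to a reconstruction x_hat = W_dec h + b_dec."""
    return [sum(w * hj for w, hj in zip(row, h)) + b
            for row, b in zip(W_dec, b_dec)]


def reconstruction_error(x, x_hat):
    """Squared error between the input and its reconstruction."""
    return sum((a - b) ** 2 for a, b in zip(x, x_hat))


rng = random.Random(0)      # fixed seed so the sketch is repeatable
n_in, n_hidden = 4, 2       # hypothetical sizes, not the paper's
W_enc = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_hidden)]
b_enc = [0.0] * n_hidden
W_dec = [[rng.uniform(-0.5, 0.5) for _ in range(n_hidden)] for _ in range(n_in)]
b_dec = [0.0] * n_in

x = [0.2, 0.7, 0.1, 0.9]    # one made-up, already scaled sample
h = encode(x, W_enc, b_enc)
x_hat = decode(h, W_dec, b_dec)
print(len(h), len(x_hat), reconstruction_error(x, x_hat))
```

Training would adjust the weights and biases so that the printed reconstruction error shrinks; with untrained random weights it is simply a starting value.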
The cost function of the single-layer Sparse Autoencoder with Softmax Classifier (SAE-1) comprises three terms and is given as equation (1) [24]. The error between the input and its approximation is taken as the average of the squared errors over all N data samples. In the sparsity term, the summation runs over the hidden units j, and n denotes the number of units in the hidden layer. Kullback-Leibler (KL) is the divergence function between ρ and $\hat{\rho}_j$, which are the desired activation and the average activation of hidden unit j, respectively. To avoid overfitting, a weight decay term is added to the cost function, which is given as equation (2), where $n_l$ is the number of layers, $s_l$ is the number of neurons in layer l, and $W_{ji}^{(l)}$ is the weight of the connection between the i-th neuron in layer l-1 and the j-th neuron in layer l.
To learn the high-level features in the WDBC data, the SSAE is trained by finding the optimal parameters θ (the weights W and the biases b) that minimize the error between the input features and the approximated features learned at the output of the encoder. This feature learning may be accomplished in two ways. Setting the number of neurons in the hidden layer lower than in the previous layer forces the network to learn a compressed representation. Alternatively, choosing more neurons in the hidden layer than in the previous layer is known as enforced learning, where the network learns expanded features. There are no hard and fast rules for selecting the number of hidden layers and the number of neurons in each hidden layer during training; instead, a trial-and-error approach is used to achieve the best approximation of the input features at the output of the encoders [21,25]. After the high-level feature learning is complete, the learned features and target labels are fed to the output layer.
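The displayed forms of equations (1) and (2) are not legible in this text. A standard sparse autoencoder cost that matches the description above can be written as follows. This is a sketch under stated assumptions: the coefficient symbols β (sparsity weight) and λ (weight decay weight) are assumed names, the input layer is assumed to be indexed as layer 0 so that $s_0$ is the input size, and any constant factor such as 1/2 used in the original is unknown.

```latex
% Equation (1), assumed form: reconstruction error + sparsity penalty + weight decay
J_{\mathrm{SAE}}(\theta) =
    \frac{1}{N}\sum_{k=1}^{N} \bigl\lVert x(k) - \hat{x}(k) \bigr\rVert^{2}
  + \beta \sum_{j=1}^{n} \mathrm{KL}\bigl(\rho \,\Vert\, \hat{\rho}_j\bigr)
  + \lambda \,\lVert W \rVert_2^2

% Kullback-Leibler divergence between the desired and average activations
\mathrm{KL}\bigl(\rho \,\Vert\, \hat{\rho}_j\bigr) =
    \rho \log\frac{\rho}{\hat{\rho}_j}
  + (1-\rho)\log\frac{1-\rho}{1-\hat{\rho}_j}

% Equation (2), assumed form: sum of squared weights over all layers
\lVert W \rVert_2^2 =
    \sum_{l=1}^{n_l} \sum_{i=1}^{s_{l-1}} \sum_{j=1}^{s_l}
    \bigl( W_{ji}^{(l)} \bigr)^{2}
```

The KL term is zero when the average activation of a hidden unit equals the desired activation ρ and grows as the two diverge, which is what pushes the hidden code toward sparsity.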
For the classification of multi-class data, the Softmax classifier (SMC) is used, which is an advanced version of logistic regression [26] that generalizes it to multiple classes [24], as given by equation (3), where $w$ denotes the parameters of the sigmoid function $f_w(\cdot)$. After the learned high-level features of the SSAE are fed as input to the SMC layer, its parameters $w$ are trained on the training set $\{(h(k), y(k))\}$ by minimizing the cost function with the gradient descent approach [27].

[19]. The characteristics of cell nuclei are represented through computed features from the image.
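The Softmax classification step described above, in which the classifier parameters are fitted to learned features by gradient descent, can be sketched as follows. This is an illustrative pure-Python version, not the MATLAB routine used in the paper; the two-dimensional features, the labels, the learning rate, and the epoch count are made-up values.

```python
import math


def softmax(scores):
    """Convert raw class scores into probabilities that sum to 1."""
    m = max(scores)                      # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def predict_proba(W, h):
    """Class probabilities for one feature vector h (bias is the last weight)."""
    hb = h + [1.0]
    return softmax([sum(w * v for w, v in zip(row, hb)) for row in W])


def cross_entropy(W, H, y):
    """Average negative log-likelihood of the true labels."""
    return -sum(math.log(predict_proba(W, h)[t]) for h, t in zip(H, y)) / len(H)


def train_softmax(H, y, n_classes, lr=0.5, epochs=200):
    """Full-batch gradient descent on the cross-entropy cost."""
    n_feat = len(H[0]) + 1
    W = [[0.0] * n_feat for _ in range(n_classes)]
    for _ in range(epochs):
        grad = [[0.0] * n_feat for _ in range(n_classes)]
        for h, t in zip(H, y):
            p = predict_proba(W, h)
            hb = h + [1.0]
            for c in range(n_classes):
                err = p[c] - (1.0 if c == t else 0.0)
                for i in range(n_feat):
                    grad[c][i] += err * hb[i] / len(H)
        for c in range(n_classes):
            for i in range(n_feat):
                W[c][i] -= lr * grad[c][i]
    return W


# Made-up 2-D "learned features" with labels 0 and 1 (hypothetical data).
H = [[0.1, 0.2], [0.2, 0.1], [0.15, 0.3], [0.8, 0.9], [0.9, 0.7], [0.85, 0.8]]
y = [0, 0, 0, 1, 1, 1]

W = train_softmax(H, y, n_classes=2)
predictions = [max(range(2), key=lambda c: predict_proba(W, h)[c]) for h in H]
print(predictions, round(cross_entropy(W, H, y), 4))
```

With two classes this reduces to ordinary logistic regression; the same code handles more classes by changing `n_classes`.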

Comparative Effectiveness
Experiments were performed to measure the comparative performance of Deep Stacked Sparse Autoencoders (SSAE) as a breast cancer classifier against several other state-of-the-art classifiers, such as SAE-1, SSAE-2, SSAE-3, SVM, NB, KNN, and ANN. The classification efficiencies of several models from the literature that use the Wisconsin breast cancer datasets for model validation are also recorded for comparison (see Table 1).

Learning from Deep Neural Network (DNN) Parameters
The training of the proposed Deep Stacked Sparse Autoencoders with Softmax Classifier (SSAE) is done using a greedy layer-wise learning algorithm [28] (see Table 2).
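The idea of greedy layer-wise training can be sketched as follows: each autoencoder is trained on its own, and its hidden codes become the input of the next one. This is a simplified illustration under stated assumptions, not the paper's MATLAB procedure: the class `TinyAutoencoder`, the layer sizes, the learning rate, and the random data are hypothetical, and the sparsity and weight decay terms are omitted to keep the example short.

```python
import math
import random


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


class TinyAutoencoder:
    """Single hidden layer autoencoder: sigmoid encoder, linear decoder."""

    def __init__(self, n_in, n_hidden, rng):
        self.W1 = [[rng.uniform(-0.1, 0.1) for _ in range(n_in)] for _ in range(n_hidden)]
        self.b1 = [0.0] * n_hidden
        self.W2 = [[rng.uniform(-0.1, 0.1) for _ in range(n_hidden)] for _ in range(n_in)]
        self.b2 = [0.0] * n_in

    def encode(self, x):
        return [sigmoid(sum(w * v for w, v in zip(row, x)) + b)
                for row, b in zip(self.W1, self.b1)]

    def decode(self, h):
        return [sum(w * v for w, v in zip(row, h)) + b
                for row, b in zip(self.W2, self.b2)]

    def loss(self, X):
        """Mean squared reconstruction error over the data set."""
        total = 0.0
        for x in X:
            x_hat = self.decode(self.encode(x))
            total += sum((a - b) ** 2 for a, b in zip(x_hat, x))
        return total / len(X)

    def train(self, X, lr=0.1, epochs=200):
        """Plain stochastic gradient descent on the squared error."""
        for _ in range(epochs):
            for x in X:
                h = self.encode(x)
                x_hat = self.decode(h)
                d_out = [2.0 * (a - b) for a, b in zip(x_hat, x)]
                # Backpropagate to the hidden layer before W2 is updated.
                d_hid = [sum(self.W2[i][j] * d_out[i] for i in range(len(x)))
                         * h[j] * (1.0 - h[j]) for j in range(len(h))]
                for i in range(len(x)):
                    for j in range(len(h)):
                        self.W2[i][j] -= lr * d_out[i] * h[j]
                    self.b2[i] -= lr * d_out[i]
                for j in range(len(h)):
                    for i in range(len(x)):
                        self.W1[j][i] -= lr * d_hid[j] * x[i]
                    self.b1[j] -= lr * d_hid[j]


def greedy_layerwise_pretrain(X, hidden_sizes, seed=0):
    """Train one autoencoder per layer; each layer sees the previous layer's codes."""
    rng = random.Random(seed)
    layers, features = [], X
    for n_hidden in hidden_sizes:
        ae = TinyAutoencoder(len(features[0]), n_hidden, rng)
        ae.train(features)
        layers.append(ae)
        features = [ae.encode(x) for x in features]   # input to the next layer
    return layers, features


data_rng = random.Random(1)
X = [[data_rng.random() for _ in range(6)] for _ in range(20)]   # made-up data
layers, codes = greedy_layerwise_pretrain(X, hidden_sizes=[4, 2])
print(len(layers), len(codes), len(codes[0]))
```

The final `codes` play the role of the learned high-level features that would then be passed, together with the labels, to the Softmax layer.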

Validation Method
To validate the performance of our method and datasets for breast cancer classification, two options are chosen.
As the first option, the typical model parameter configuration from experiment No. 5 (see Table 2) is used for simulation, and this model is run on two standard numerical breast cancer datasets, WDBC and WBC, obtained from the UCI machine learning public repository. This repository is used by the majority of the research community in the literature, and the results are recorded (see Table 3).

Table fragment (accuracy in %, as reported in [12]): SVM 98; LR 97.23; LDR 95.73; KNN 94.73; NB 93.46; DT 91.…

The former dataset details are discussed in Section 4. The latter dataset, which was used to validate the SAE-2 performance, contains 699 samples with 9 characteristics each for classification purposes. The WBC dataset is distributed as 458 non-cancerous and 241 cancerous samples. As the second option, a new Artificial Neural Network (ANN) model with hidden size 10 is chosen, and its performance on both datasets is evaluated and recorded (see Table 3). All the performance metrics of the above four experiments are measured in terms of confusion matrices, Receiver Operating Characteristics (ROC), error histograms, best training performance curves (see Fig. 3-6), and the area under the curve (AUC) (see Table 3).
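The AUC reported alongside the ROC curves can be computed directly from classifier scores. The sketch below uses the rank interpretation of AUC, the probability that a randomly chosen positive sample scores higher than a randomly chosen negative one; the labels and scores are made-up values and the function name `roc_auc` is hypothetical.

```python
def roc_auc(labels, scores):
    """AUC as the chance that a positive outranks a negative (ties count 0.5)."""
    positives = [s for s, y in zip(scores, labels) if y == 1]
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    if not positives or not negatives:
        raise ValueError("AUC needs at least one positive and one negative sample")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


labels = [1, 1, 0, 1, 0, 0]                      # made-up ground truth
scores = [0.95, 0.80, 0.70, 0.60, 0.30, 0.10]    # made-up classifier scores
print(round(roc_auc(labels, scores), 4))
```

An AUC of 1.0 means every positive sample is ranked above every negative one, while 0.5 corresponds to random ranking.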

RESULTS AND DISCUSSION
In the first performance comparison strategy, the performance on the WDBC dataset of three types of Sparse Autoencoders with different internal architectures is evaluated for varying DNN parameters (see Table 2). The first two classifier models exhibit a 100 percent correct classification rate in 8 out of 10 experiments, while model 3 (SSAE-3) does so in only 40 percent of the 10 experimental instances. These results suggest that stacking more layers can deteriorate the efficiency of the classifier model. Moreover, for the SAE-1 and SSAE-2 models, precision is at or near its maximum of 1, while recall shows a minor error of less than 2% in two experiments. The SSAE-3 model exhibits the maximum precision of 1 in 7 experiments, with a precision error of up to 5% otherwise, and a recall error of up to 33% in 6 experimental instances (see Table 5).
For a typical case with Hidden Layer Size 03, L2WRP 0.03, SRP 3, and SPP 0.3, three experiments were performed, all three of which achieved 100% accuracy, whereas going deeper by stacking up to level 5 made the results worse. Likewise, classification performance is evaluated for the other state-of-the-art classifiers discussed earlier (see Table 1), and they lag behind the proposed SSAE in accuracy (see Table 4). The classification efficiency of the SAE-1 model improves with increasing DNN parameters: it shows an error of less than 0.5% in two of the 10 experiments, reaches 100% accuracy after the third experiment, and remains stable in all 7 subsequent experimental instances. The SSAE-2 classification model has a 0.5% classification error in the first experimental instance and a 2.5% error in the last experiment, while 100% classification accuracy is maintained in the other 8 experiments with increasing parameter values. The breast cancer classification model SSAE-3 has the worst accuracy, starting at 62.7%, improving until the fifth trial, reaching 100% after the fifth trial and remaining stable, but deteriorating again in the last two tests.
The quantitative performance measurements for SSAE and the other comparative models discussed in Table 1 were evaluated in terms of Precision (Pr), Recall or True Positive Rate (TPR), False Positive Rate (FPR), and F-measure (see Table 5). The definitions of these metrics are given in equations (4)-(7), where TP, FP, TN, and FN denote the numbers of true positives, false positives, true negatives, and false negatives:

$$\mathrm{Pr} = \frac{TP}{TP + FP} \tag{4}$$
$$\mathrm{TPR} = \frac{TP}{TP + FN} \tag{5}$$
$$\mathrm{FPR} = \frac{FP}{FP + TN} \tag{6}$$
$$F\text{-measure} = \frac{2 \cdot \mathrm{Pr} \cdot \mathrm{TPR}}{\mathrm{Pr} + \mathrm{TPR}} \tag{7}$$

The Precision (Pr) reaches its maximum of 100% for the three types of Autoencoders, while the other 3 models show a reduction of up to 9%. Among the compared models, the only one that achieves a 100% True Positive Rate (TPR) by classifying all 357 cases as benign is SSAE-2, while the other 5 classifiers exhibit a TPR error of up to 4.2%. The False Positive Rate (FPR) is zero for the two proposed models (1-2) (see Table 5), as no benign case is wrongly classified, and the F-measure is 100% for the proposed model SSAE-2, which outclasses all other classifiers.
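As a worked illustration of the four metrics, the following sketch computes them from confusion matrix counts using their standard definitions. The counts in the example are made up and do not come from the paper's tables; the function names are hypothetical.

```python
def safe_ratio(numerator, denominator):
    """Return numerator / denominator, or 0.0 when the denominator is zero."""
    return numerator / denominator if denominator else 0.0


def classification_metrics(tp, fp, fn, tn):
    """Precision, recall (TPR), false positive rate, and F-measure."""
    precision = safe_ratio(tp, tp + fp)
    recall = safe_ratio(tp, tp + fn)
    fpr = safe_ratio(fp, fp + tn)
    f_measure = safe_ratio(2 * precision * recall, precision + recall)
    return {"precision": precision, "recall": recall,
            "fpr": fpr, "f_measure": f_measure}


# Hypothetical confusion matrix counts, for illustration only.
metrics = classification_metrics(tp=90, fp=10, fn=5, tn=95)
for name, value in metrics.items():
    print(f"{name}: {value:.4f}")
```

A classifier with no false positives and no false negatives yields a precision, recall, and F-measure of 1 and an FPR of 0, which is the situation described for SSAE-2.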
We further validate the performance of our proposed model against an artificial neural network (ANN) model and on the WBC dataset for breast cancer classification. Both the SSAE and ANN models are tested on the WDBC and WBC datasets. The SSAE model with the parameter selection from experiment No. 5 (see Table 2) is chosen for simulation. The ANN model with hidden size 10 is evaluated on both datasets. All the performance metrics of the above four experiments are measured in terms of confusion matrices, receiver operating characteristics (ROC), error histograms, best training performance curves (see Fig. 3-6), and the area under the curve (AUC) (see Table 3), where it can be observed that our proposed model outperforms the ANN on both the WDBC and WBC datasets. These two datasets have also been used for validation in other research articles, several of which are cited along with their reported breast cancer classification accuracies (see Table 3); the Deep Stacked Sparse Autoencoders (SSAE) outperform all of them in comparative efficiency.

CONCLUSIONS
In this paper, we have proposed the Deep Stacked Sparse Autoencoders (SSAE) for breast cancer classification, whose performance is tested specifically on the WDBC dataset, with the WBC dataset used as a performance contrast. Our model is also compared against shallow and deep Autoencoders as well as the other state-of-the-art classifier models discussed earlier. Analysis of several simulation experiments shows that our proposed model can perform optimally with a reasonable choice of deep neural network parameters and model architecture. It is also observed, through the literature review and experiments on other state-of-the-art machine learning classifiers, that their breast cancer classification accuracy remains lower than that of the SSAE model. In the future, the proposed model may be applied to other disease types for binary and multiclass classification tasks.