Explainable extreme gradient boosting model for breast cancer diagnosis

ABSTRACT


INTRODUCTION
Machine learning has been widely applied to the diagnosis of breast cancer. Different models, namely extreme gradient boosting (XGBoost) [1] and random forest (RF) [2], have been applied to develop models for breast cancer diagnosis. Although XGBoost and random forest have achieved encouraging prediction accuracies of 97% and 98.3%, respectively, the explanation of their diagnosis results remains uninterpretable.
Several studies have applied machine-learning algorithms to breast cancer diagnosis. Dhahri et al. [3] investigated the application of decision trees (DT), random forest, support vector machines (SVM), Gaussian naïve Bayes, k-neighbors, and linear regression methods to breast cancer diagnosis. The study emphasized that encouraging results are obtained with supervised learning algorithms; adaptive boosting achieved the highest accuracy of 97.44% on the breast cancer dataset.
Kabiraj et al. [4] studied XGBoost and random forest for analyzing the risk of breast cancer. The simulations of XGBoost and random forest were conducted with 275 observations, training on 12 features. The results indicate that the random forest algorithm achieves 74.73% accuracy, while XGBoost achieves 73.63% accuracy.
Liang et al. [5] evaluated and compared the performance of XGBoost, k-nearest neighbor (KNN), AdaBoost, DT, RF, and gradient boosting decision tree (GBDT) models for breast cancer diagnosis. The results revealed that XGBoost scores the highest accuracy of 90.24%, while KNN shows the lowest. Derangula et al. [6] developed an optimized XGBoost model for breast cancer diagnosis. The study applied feature selection to optimize the XGBoost model and obtained an accuracy of 96.49%, showing that feature selection improves the accuracy of the XGBoost model for breast cancer diagnosis. The developed model improves on the accuracy of the similar XGBoost model presented in the previous study [5] by 6.09%. Hence, feature selection significantly improves the model's prediction accuracy in identifying breast cancer.
Inan et al. [7] evaluated the synthetic minority oversampling technique for optimizing the XGBoost model for breast cancer diagnosis. The study developed the XGBoost model with a dataset collected from the UCI repository. The experimental results demonstrate that XGBoost achieves a validation score of 98.4% on breast cancer identification, which proves the XGBoost model's effectiveness in identifying malignant and benign breast tissue. However, the study does not show a method to interpret the predicted result for the patient or oncologist.
Phankokkruad [8] developed a cost-sensitive XGBoost-based model for breast cancer diagnosis. The study suggested that the cost-sensitive model achieves better accuracy than the non-cost-sensitive XGBoost model. In addition, the results on the breast cancer (Wisconsin) dataset reveal that the cost-sensitive XGBoost model achieves 99.12% accuracy on breast cancer diagnosis. The investigation of performance shows that the XGBoost model achieved higher accuracy; however, the model does not explain the diagnosis results it produces.
Additionally, Vamvakas et al. [9] studied the effectiveness of the XGBoost model and improved its diagnostic performance in differentiating breast cancer lesions. The study experimented with 140 samples, of which 70 were benign tissues and 70 malignant tissues. The XGBoost and light gradient boosting models achieved high accuracies of 95% and 94%, respectively. However, the study does not investigate the explainability of the model.
Most of the machine learning models applied to breast cancer diagnosis are black box models [10]-[15]. Contrary to interpretable models, black box models provide no mechanism to explain how they make decisions in breast cancer diagnosis. Thus, developing models whose behavior and diagnosis outcomes are understandable to humans requires much research effort.
Developing a model for breast cancer diagnosis with machine learning has been effective. For instance, several studies [16]-[19] have suggested that using a machine learning model for breast cancer diagnosis is encouraging, with a machine-learning model achieving 97.77% accuracy. While the 97.77% accuracy is an encouraging achievement, explaining the prediction outcome and interpreting the internal workings of the supervised model still require much research effort to develop a practical, trustworthy, and transparent model for the diagnosis of breast cancer.
Recently, different researchers have proposed methods for explaining complex models for breast cancer diagnosis. As a result, several approaches exist to explain the prediction outcome of a machine-learning model. One approach for explaining the prediction outcome of a machine learning model on classification problems such as breast cancer diagnosis is the Shapley additive explanations (SHAP) method [20]-[22].
This research aims to develop an XGBoost model with SHAP to explain its output for breast cancer diagnosis. The study is motivated by the need to investigate the interpretability of the XGBoost model output with SHAP. The objectives of the study are: i) to study the relation between breast cancer features and target labels with SHAP; ii) to investigate how the XGBoost model makes decisions in breast cancer diagnosis; and iii) to analyze the features that impact the XGBoost model's decision-making process. The rest of the study is organized as follows: section 2 discusses the method, section 3 presents the results, and section 4 concludes and discusses the implications of the results obtained.

METHOD
This research employed the University of California Irvine (UCI) dataset for experimental analysis. Several studies [23]-[25] have employed the UCI dataset to develop machine learning models for breast cancer diagnosis. The open-source scikit-learn library was employed to develop the XGBoost model. The 569 samples collected from the UCI data repository were split into a training set (70% of the samples) and a testing set (30%). The model was then evaluated on the testing set, and the prediction results were explained with the SHAP values produced by force plots for the samples used in the experiment. Figure 1 indicates the research chronology and the procedures employed in this research.

SHAP explanation for benign tissue
Figure 2 shows the SHAP values for a benign tissue sample, the first in the training set. The SHAP values are useful for interpreting individual outputs of the XGBoost model in breast cancer diagnosis. The explanation provided by the SHAP values narrows the XGBoost model's generalization down to a single prediction, which makes the interpretation intuitive and the resulting explanations more useful. The individual prediction results of the XGBoost model are presented with force plots: the horizontal axis shows the SHAP value of the final prediction, and each feature's contribution is shown as a block, with positive SHAP values (forcing the probability higher) to the left of the result and negative SHAP values (forcing the probability lower) to the right. The final SHAP value, which can be transformed from the continuous log-odds domain to a probability using a sigmoid, is 2.29 and higher than the base value, so the sample is predicted as benign. The representation shows how, although the texture has anomalous values, the concavity and perimeter features finally turn the prediction to the positive side. The largest contribution is made by the concave points worst feature.
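The experimental pipeline described in this section can be sketched as follows. This is a minimal illustration, assuming the scikit-learn copy of the UCI breast cancer (Wisconsin) data (the same 569 samples), and using scikit-learn's GradientBoostingClassifier as a stand-in where the xgboost package is not installed; the 70/30 split matches the text, while the random seed is arbitrary.

```python
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.model_selection import train_test_split

# Load the 569-sample Wisconsin breast cancer dataset (30 features).
data = load_breast_cancer()
X, y = data.data, data.target

# 70% training / 30% testing split, as described in the text.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.30, random_state=42)

# GradientBoostingClassifier stands in for XGBoost here; with the
# xgboost package installed, xgboost.XGBClassifier would be used instead.
model = GradientBoostingClassifier(random_state=42)
model.fit(X_train, y_train)

# Accuracy on the held-out 30% testing set.
accuracy = model.score(X_test, y_test)
```

With the shap package available, the per-sample explanations would then be produced by fitting a tree explainer to the trained model and drawing a force plot for an individual test sample.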

SHAP explanation for malignant tissue
The experiment considered a malignant tissue sample to analyze the explanation provided by the SHAP values for the positive prediction outcome of the XGBoost model. The result of the SHAP explanation for the malignant tissue is shown in Figure 3. As shown in Figure 3, the concavity, texture, and perimeter of the cells make the model predict it as a negative case. The model makes accurate predictions and, thanks to these SHAP interpretations, its explanations are also available, providing practitioners with useful information for understanding the model.
Figure 3. SHAP explanation of XGBoost model output for malignant tissue
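Reading a force plot as described above can be illustrated numerically. The base value and the per-feature contributions below are hypothetical, chosen only so that they sum to the final value of 2.29 quoted for the benign sample; the sigmoid mapping from log-odds to probability is the standard one.

```python
import math

def sigmoid(x: float) -> float:
    """Map a log-odds margin to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))

# Hypothetical base value and per-feature SHAP contributions
# (e.g. concave points, perimeter, texture).
base_value = 0.50
contributions = [1.20, 0.90, -0.31]

# A force plot's final value is the base value plus all contributions.
final_value = base_value + sum(contributions)  # 2.29, as in the text

# A final value above the base value pushes the prediction toward
# the positive (benign) side; the sigmoid turns it into a probability.
probability = sigmoid(final_value)
```

Positive contributions push the output above the base value (toward benign in Figure 2), and negative contributions pull it below; the sigmoid of 2.29 is about 0.91, which is why the sample is confidently classified.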

SHAP explanation for malignant and benign tissue
The experiment further considered a malignant tissue sample to analyze the explanation provided by the SHAP values for the prediction outcome of the XGBoost model. The result of the SHAP explanation for the malignant tissue is indicated in Figure 4. As shown in Figure 4, the concavity, texture, and perimeter of the cells forced the XGBoost model to predict it as a negative case.

CONCLUSION
This study investigated the effectiveness of SHAP in explaining an XGBoost-based model for breast cancer diagnosis. Moreover, the study developed an explainable XGBoost-based model that assists in the diagnosis of breast cancer as a preliminary examination. The study also shows that SHAP explains the diagnostic outcome of the XGBoost model with force plots, providing insight into the impact of each breast cancer feature on the XGBoost model output. Furthermore, the explanation of the XGBoost model provided by SHAP is vital for developing an explainable model to diagnose breast cancer. The developed XGBoost model for breast cancer diagnosis achieves 98.42% accuracy, and the SHAP explanation provides an interpretation of its diagnosis outcomes. For future work, the study recommends investigating the explanation of other machine learning models, such as support vector machines, random forest, and deep learning, to develop explainable models for breast cancer diagnosis.