Random forest age estimation model based on length of left hand bone for Asian population

Soft Computing and Intelligent System (SPINT), Faculty of Computer Systems & Software Engineering, Universiti Malaysia Pahang, Kuantan, Pahang, Malaysia
Faculty of Computer Science and Information Technology, Universiti Tun Hussein Onn Malaysia, Parit Raja, Johor, Malaysia
Faculty of Industrial Technology, Universitas Ahmad Dahlan, Yogyakarta, Indonesia
College of Computer and Information Technology, Albaha University, Albaha, Saudi Arabia


INTRODUCTION
Age estimation plays a vital role in establishing an individual's identity, given the rise in human trafficking, asylum seeking, refugee movements, questions of criminal responsibility, child pornography, and age falsification. Traditional age estimation models that take the left hand bones as input, such as Tanner-Whitehouse (TW) [1] and Greulich and Pyle (GP) [2], are based on qualitative data, namely a forensic anthropologist's observation of bone morphology in a radiograph of the left hand. These models suffer from considerable intra- and inter-observer variability, because the estimated age depends on the judgement of the forensic expert. Experts with different levels of experience are therefore likely to produce different predicted ages for the same radiograph. Improving the accuracy of bone age assessment is thus very important.
Several case studies have used hand bone measurements (quantitative data) as input for age estimation [3][4][5][6][7]. These studies show that hand bone measurements can serve as an indicator for age estimation. This study chooses the length of the hand bones as an alternative parameter for age estimation. In addition, soft computing models such as the artificial neural network (ANN), support vector machine (SVM) and random forest (RF) have proved reliable on quantitative data, especially for prediction, classification and optimization problems [6][7][8][9][10][11][12][13]. This study therefore applies the RF model to hand bone length data to perform age estimation. To evaluate the performance of the proposed model, its accuracy is compared with that of the existing ANN and SVM models from [6], which used the same dataset.

RESEARCH METHOD

Research materials
A total of 333 X-ray scans of Asian subjects' left hands, 166 male and 167 female, were taken from an online dataset [14]. The subjects' ages range from newborn to 18 years. The age distribution of the subjects is given in Table 1. The online dataset covers four populations (Asian, Hispanic, African American and Caucasian) and has been used in many case studies, such as [15][16][17][18]. It comprises individuals below 20 years old with no record of bone problems or bone disease, for instance fractures, osteoarthritis, rheumatoid arthritis, bone cancer or genetic disorders. Bones with such problems tend to be weak, brittle, misshapen and easily broken, which could lead to inaccurate measurements, so they were excluded from the study. The X-ray scans originate from Children's Hospital Los Angeles and come with patient demographic data and radiologists' readings, assigned to 19 age groups (newborn and 1 to 18 years old) for both males and females. The details of each subject (image name, race, gender, chronological age, date of birth (DOB), exam date, height (cm), weight (kg), trunk (cm), reading 1 and reading 2) were documented for reference and validation purposes. Several previous case studies have also used this dataset to develop age estimation models [6,7,17,18].

Table 1. Age distribution of the subjects by age group
Female: 16, 19, 23, 41, 38, 30 (total 167)
Male: 17, 20, 18, 44, 37, 30 (total 166)

Structurally, the bones of the hand are categorized into four groups: the distal phalanges, the middle phalanges, the proximal phalanges and the metacarpals. Three of the four groups consist of five bones each, while the middle phalanx group has four bones. The total number of bones measured in a hand is therefore 19.
Throughout childhood and adolescence, the development of the left hand bones can be divided into six stages. The first stage is infancy (newborn to 10 months for females, newborn to 14 months for males), the second is the toddler stage (10 months to 2 years for females, 14 months to 3 years for males), the third is pre-puberty (2-7 years for females, 3-9 years for males), the fourth is early and mid-puberty (7-13 years for females, 9-14 years for males), the fifth is late puberty (13-15 years for females, 14-16 years for males) and the sixth is post-puberty (15-17 years for females, 16-19 years for males).
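The six stages above can be expressed as a small lookup table. The following is only a sketch: the function name `stage_for` and the table `STAGES` are hypothetical, months are converted to fractions of a year, and each range is treated as half-open (start included, end excluded), which is an assumption because the text does not say how boundary ages are assigned.

```python
from typing import Optional

# Stage boundaries in years, transcribed from the text (10 months = 10/12 years).
# Assumption: each range is half-open, [start, end).
STAGES = {
    "female": [
        ("infancy", 0.0, 10 / 12),
        ("toddler", 10 / 12, 2.0),
        ("pre-puberty", 2.0, 7.0),
        ("early and mid-puberty", 7.0, 13.0),
        ("late puberty", 13.0, 15.0),
        ("post-puberty", 15.0, 17.0),
    ],
    "male": [
        ("infancy", 0.0, 14 / 12),
        ("toddler", 14 / 12, 3.0),
        ("pre-puberty", 3.0, 9.0),
        ("early and mid-puberty", 9.0, 14.0),
        ("late puberty", 14.0, 16.0),
        ("post-puberty", 16.0, 19.0),
    ],
}


def stage_for(age_years: float, sex: str) -> Optional[str]:
    """Return the developmental stage for an age, or None if outside all ranges."""
    for name, start, end in STAGES[sex]:
        if start <= age_years < end:
            return name
    return None
```

For example, a one-year-old falls in the toddler stage for females but is still in infancy for males, since the male infancy range extends to 14 months.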
To measure the length of every bone in each stage, photo manager software was used to draw a line along each of the nineteen bones, from the base-center point to the end-center point of the bone on every X-ray image. The software then reported the length of the line in centimetres (cm). For the infancy stage, the line was drawn without the epiphysis, if one was present. For the other stages, the line included the epiphysis, even when only a small epiphysis was visible in the image. Figure 1 shows an example of measuring the bone lengths of a male subject at each stage from his X-ray images with the help of the software. For experiment and analysis purposes, all data measured from the images were then organized in a spreadsheet. Before the proposed model is developed, the data need to be normalized, as described next.

Data normalization is usually carried out before training and testing begin. The inputs and outputs can be scaled to a standard range, for example -1 to 1 or 0 to 1. In particular, when nonlinear transfer functions such as the logistic sigmoid function are used at the output nodes, the desired output values need to be transformed into the range of the actual outputs of the system. Even when a linear output transfer function is used, it is still beneficial to normalize the outputs as well as the inputs to avoid computational problems. To normalize the measured bone lengths, the normalization equation from previous studies [19,20] was used, as given in Equation 1,

$$x'_i = \frac{x_i - x_{\min}}{x_{\max} - x_{\min}} \tag{1}$$

where $x_i$ refers to the ith input/output data, $x_{\min}$ refers to the minimum value of the input/output data and $x_{\max}$ refers to the maximum value of the input/output data.
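The normalization step described above can be sketched in a few lines. This sketch assumes the standard min-max form, which maps the smallest value to 0 and the largest to 1. The function name `min_max_normalize` and the sample bone lengths are hypothetical and are not taken from the dataset.

```python
from typing import List, Sequence


def min_max_normalize(values: Sequence[float]) -> List[float]:
    """Scale values to the range 0 to 1 using (x - min) / (max - min)."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # All values identical: the scaling is undefined (division by zero).
        raise ValueError("cannot normalize a constant sequence")
    return [(v - lo) / (hi - lo) for v in values]


# Hypothetical bone lengths in centimetres, for illustration only.
lengths_cm = [1.2, 2.5, 3.1, 4.4]
normalized = min_max_normalize(lengths_cm)
```

In practice the minimum and maximum would be taken over the training data for each bone, so that test samples are scaled with the same constants.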
Table 2 shows an example of the measured data, after normalization, for the late puberty X-ray scan in Figure 1. The total number of X-ray scans for males and females in this study is 333, which means there are 333 distinct tables like the one shown in Table 2, one per X-ray scan. Small data sets are insufficient for reliable investigation. To address this, the k-fold cross-validation algorithm was applied to the RF model. The algorithm separates the data into two parts, a training set and a testing set. The former is used to construct the model, whereas the latter is used to validate it. The two parts rotate over successive iterations. The samples in each group were divided into ten sets of roughly equal size, so that every group was represented in each set. In every iteration, one set was held out for testing and the remaining sets were used for training. An MSE value was computed from the testing data in each iteration, and the average MSE was then taken as the performance of each soft computing model. The MSEs of the models were summarized in a table and then compared to determine the best model for age estimation. MSE was chosen because the previous case study that used ANN and SVM also applied MSE as its performance function [6].
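The cross-validation procedure above can be illustrated with a minimal sketch in plain Python. It is not the implementation used in the study: the helper names (`k_fold_indices`, `mse`, `cross_validated_mse`, `fit_mean`) are hypothetical, the folds are formed by a simple shuffled split without stratification by group, and a mean-age predictor stands in for the RF model so that the example stays self-contained.

```python
import random
from typing import Callable, List, Sequence

Model = Callable[[Sequence[float]], float]
Fitter = Callable[[List[Sequence[float]], List[float]], Model]


def k_fold_indices(n: int, k: int = 10, seed: int = 0) -> List[List[int]]:
    """Shuffle indices 0..n-1 and deal them into k folds of near-equal size."""
    if not 2 <= k <= n:
        raise ValueError("k must be between 2 and n")
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]


def mse(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Mean squared error between actual and predicted ages."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def cross_validated_mse(X: List[Sequence[float]], y: List[float],
                        fit: Fitter, k: int = 10, seed: int = 0) -> float:
    """Average the test MSE over k folds; each fold is held out exactly once."""
    scores = []
    for test_idx in k_fold_indices(len(y), k, seed):
        held_out = set(test_idx)
        train_idx = [i for i in range(len(y)) if i not in held_out]
        predict = fit([X[i] for i in train_idx], [y[i] for i in train_idx])
        preds = [predict(X[i]) for i in test_idx]
        scores.append(mse([y[i] for i in test_idx], preds))
    return sum(scores) / len(scores)


def fit_mean(X_train: List[Sequence[float]], y_train: List[float]) -> Model:
    """Placeholder model (assumption): always predicts the mean training age."""
    mean_age = sum(y_train) / len(y_train)
    return lambda x: mean_age


# Synthetic stand-in data: 333 samples with ages 0..18 (not the real dataset).
X = [[float(i)] for i in range(333)]
y = [float(i % 19) for i in range(333)]
average_mse = cross_validated_mse(X, y, fit_mean)
```

Replacing `fit_mean` with a function that trains a random forest on the 19 normalized bone lengths would give the per-structure average MSE described in the text.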

Random forest
The Random Forest (RF) model was developed by Leo Breiman [21] and has become a standard data analysis tool in bioinformatics. It has demonstrated outstanding performance in settings where the number of observations is much smaller than the number of variables. It copes well with complicated interaction structures and highly correlated variables, and it returns measures of variable importance [22]. RF is a regression and classification model based on an ensemble of a large number of decision trees. In particular, it is an aggregation of trees built from a training data set and internally validated to yield a prediction of the response, given the predictors, for future observations.
The development flow of the RF model for age estimation is shown in Figure 2. The RF model requires a parameter m, the number of variables (a subset of the P available predictor variables) considered when deciding the split at each node of a tree. Several studies [23,24] have set m to the square root of the number of input variables, as suggested by Breiman. Breiman also suggested setting m to the first integer less than $\log_2 P + 1$, where P is the number of input variables. The optimal value of m can also be identified with the tuneRF function of the R software. R is free software that offers a wide range of linear and nonlinear statistical modelling, classification, time-series analysis, clustering, graphical techniques and classical statistical tests, and is highly extensible. The tuneRF function starts from the default value of m and then searches for the value of m that is optimal with respect to the out-of-bag error estimate. In this study, m was chosen using all three methods. The second parameter required by RF is the number of trees, t, for which this study uses four values: 100, 200, 500 and 1000. Each combination of m and t defines one RF structure for age estimation, giving twelve RF structures in total, as shown in Table 3. Each RF structure yields an R-square value and an MSE value for both males and females, and these are reported in the same table. The best RF structures according to R-square and MSE were then selected and compiled into Table 4 for comparison with the ANN and SVM results from the previous case study [6].

Table 3 presents the age estimation results, as R-square and MSE values for both males and females, for each RF structure.
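The parameter grid described above, three candidate values of m combined with four values of t, can be enumerated as follows. This is a sketch under stated assumptions: the function names are hypothetical, the "first integer less than" rule is implemented by truncating $\log_2 P + 1$ to an integer, and the tuned value `tuned_m=6` is a placeholder, since the value returned by tuneRF is not reported in the text.

```python
import math
from itertools import product
from typing import List, Tuple

# Number of trees considered in the study.
TREE_COUNTS = [100, 200, 500, 1000]


def candidate_m(num_predictors: int, tuned_m: int) -> List[int]:
    """Three candidate values for m, the number of variables tried per split."""
    m_sqrt = math.isqrt(num_predictors)           # square-root rule, rounded down
    m_log = int(math.log2(num_predictors) + 1)    # log rule, truncated (assumption)
    return [m_sqrt, m_log, tuned_m]               # tuned_m: supplied by a tuning step


def rf_structures(num_predictors: int, tuned_m: int) -> List[Tuple[int, int]]:
    """All (m, t) pairs: every candidate m combined with every tree count."""
    return list(product(candidate_m(num_predictors, tuned_m), TREE_COUNTS))


# 19 bone lengths as predictors; tuned_m=6 is a hypothetical placeholder.
structures = rf_structures(19, tuned_m=6)
```

With 19 predictors the square-root rule gives m = 4, which matches the number of variables in the best structures reported in the results. If the tuned value coincides with one of the other two candidates, the grid contains duplicate pairs and fewer than twelve distinct structures.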
The table reveals that, for males, the RF structure with 200 trees and 4 variables is the best, producing the greatest R-square value of 0.914 and the lowest MSE value of 1.958. For females, the RF structure with 1000 trees and 4 variables is the best, producing an R-square value of 0.846 and an MSE value of 3.438. Figure 3 plots the age predicted by the RF model against the real age for both genders. The graphs show a good relationship between the predicted age and the real age. Table 4 shows the results of the best RF structures selected above, together with the ANN and SVM results taken from the previous case study. For males, the SVM produced the greatest R-square value of 0.916 and the lowest MSE value of 1.917, while for females the RF model shows the greatest R-square value of 0.846 and the lowest MSE value of 3.438.

Forensic anthropologists are continually working to improve the approaches to assessing age through skeletal identification [25]. In a study on sexual dimorphism in carpal bones, Sulzmann et al. [26] stated that right-hand dominance is very common among human populations and that the greater functional loading on the dominant hand leads to larger bones. The left hand was therefore used in this study, since it is used relatively less, which makes the growth of its bones more similar across participants. Only participants under 19 years old were investigated in this study, because some research has shown that age estimation methods are unreliable after 30 years of age, with an error of about twelve years [27].

RESULTS AND ANALYSIS
For routine forensic application, Rösing and Kvaal [28] stated that a model producing a standard error of regression of more than 5 or 7 years cannot be accepted for age estimation. With a standard error of seven years, 95% confidence intervals of around 14 years and above need to be taken into account when estimating age. Selecting less reliable methods can be problematic, because their results may not improve on the estimate of an "apparent age" that investigating teams usually make, for instance in living persons and fresh corpses. Such methods would therefore be a waste of money and time. Many reported methods use bones for age estimation, and they fall into three main categories: image processing [29][30][31], comparison with a bone age atlas [32][33][34], and statistical regression analysis [35][36][37][38]. In the atlas comparison method, as the name suggests, the subject's X-ray image is compared with an atlas containing a set of radiographs of known gender and age. In the image processing method, different bone features are extracted reliably from the image, and this approach is reported to yield more reliable information for age estimation. Compared with the other two methods, regression analysis is a popular choice because of its relative simplicity and accuracy. Its primary purpose is to determine the relationship between one or more independent variables and a dependent variable, as measured by the R-square value produced by the models used. The independent variables are also known as explanatory or predictor variables.
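The link between a standard error of seven years and an interval of about 14 years can be checked with a short calculation. This sketch assumes normally distributed estimation errors and reads the 14-year figure as the half-width of the 95% interval (plus or minus about two standard errors). The function name `ci_half_width` is hypothetical.

```python
from statistics import NormalDist


def ci_half_width(standard_error: float, level: float = 0.95) -> float:
    """Half-width of a two-sided normal confidence interval at the given level."""
    # For level = 0.95 the quantile is taken at 0.975, giving z of about 1.96.
    z = NormalDist().inv_cdf(0.5 + level / 2)
    return z * standard_error


# Standard error of seven years, as in the text.
half_width = ci_half_width(7.0)
```

Under these assumptions the half-width is about 13.7 years, which rounds to the 14 years quoted above.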
Soft computing models such as the RF model can be used as an alternative because they offer several advantages: knowledge of internal system variables is not required, they provide factual calculation, and they give simpler solutions for problems with multiple variables. Soft computing is an innovative approach to building computationally intelligent systems. According to Zadeh [39], soft computing is an emerging approach to computing that parallels the remarkable ability of the human mind to reason in an environment of imprecision and uncertainty. In this study, a total of 19 bones in the left hand were measured, and RF soft computing models were applied to all of the bones to estimate age. For males, the best soft computing age estimation model according to the performance measures is the SVM model, with R-square and MSE values of 0.916 and 1.917, respectively. For females, RF is the best soft computing model compared with the other models, with R-square and MSE values of 0.846 and 3.438, respectively.

Figure 3 shows the graphs of the output predicted by the RF model and the actual age for both males and females. For males from newborn up to 16 years old, the predicted ages were consistent with the actual ages. For males above 16 years old, however, the predicted age deviated from the actual age (see the black line between 15 and 17 years old). For females, the predicted ages were consistent with the actual ages up to 15 years old, after which the predicted age also deviated from the actual age, as for males (see the black line between 14 and 16 years old). These findings are similar to those of the previous case study, in which the graphs produced by ANN and SVM show the same behaviour.

Ritz-Timme et al. [40] stated that most morphological methods of age estimation are least accurate in adulthood. Santos et al. [41], in their study on age estimation using the computerized Sempé method, the Maturos 4.0 (MT) program, showed that the MT program only produced reliable results for ages under 16 years. Molinari et al. [42] also stated that skeletal growth has practically stopped at a bone age of 16.5 years for boys and 15 years for girls. Beyond that age, age estimates tend to be inaccurate, resulting in a large deviation between the real and the estimated age. Based on this literature, the best age range for age estimation is from newborn to 16 years old for males, and from newborn to 15-16 years old for females. In addition, based on the graphs, the RF model predicts well for both males and females within that age range.
In general, contributing variables such as different methodological approaches, diverse racial backgrounds or dissimilar environmental conditions could explain the differences between multiracial investigations of skeletal development [43]. Furthermore, many factors, such as nutrition, occupation, endocrine factors, genetics, overall lifestyle and health, growth, and activity, significantly influence these indicators in an unpredictable way [44,45]. For these reasons, the application of a technique should be limited to the population from which the bones were gathered. Age estimation for a particular population should be analyzed specifically for that population, because the applicable regression models or mathematical functions may differ as a result of these differences.

CONCLUSION
In this study, 333 X-ray scans of the left hand from a data set of Asian children were used for age estimation. One soft computing model, the RF model, was developed and compared with the ANN and SVM models from the previous case study. Based on the findings, the RF model is comparable with the ANN and SVM models, especially for females, where it produced better results than both ANN and SVM in terms of the performance measures used. For males, however, the RF model performed worse than the SVM model but better than the ANN model.
According to the graphs produced by the RF model and the supporting literature, the RF model estimates age well in the range from newborn to 16 years old for males and from newborn to 15 years old for females. This finding also indicates that bone length is a reliable indicator for age estimation. To conclude, the RF model is comparable with the other models and suitable for age estimation. Further study will, however, limit the subject age to newborn to 16 years old for males and newborn to 15 years old for females, in line with the supporting literature and the findings. Future work will also seek to improve the age estimation results by studying other algorithms used in other available case studies, such as those by Lenin, Reddy, and Kalavathi [46], Ismail et al. [45], Ismail et al. [46], Khaleel et al. [47] and other classification methods [48][49][50][51][52][53][54][55][56][57].