Extracted features based multi-class classification of orthodontic images

Received Jul 27, 2019. Revised Dec 2, 2019. Accepted Feb 1, 2020.

The purpose of this study is to investigate computer vision and machine learning methods for the classification of orthodontic images, in order to provide orthodontists with a solution for multi-class classification of patients' images to evaluate the evolution of their treatment. We propose three algorithms based on extracted features, such as facial features and skin colour in the YCbCr colour space, assigned to the nodes of a decision tree that classifies orthodontic images: an algorithm for intra-oral images, an algorithm for mould images, and an algorithm for extra-oral images. We then compared our method against an approach that uses the Local Binary Pattern (LBP) algorithm to extract textural features from the images. After that, we applied the principal component analysis (PCA) algorithm to reduce redundant parameters, and classified the LBP features with six classifiers: Quadratic Support Vector Machine (SVM), Cubic SVM, Radial Basis Function (RBF) SVM, Cosine K-Nearest Neighbours (KNN), Euclidean KNN, and Linear Discriminant Analysis (LDA). The presented algorithms were evaluated on a dataset of images of 98 different patients; experimental results demonstrate the good performance of our proposed method, with a higher accuracy than the machine learning algorithms, among which the LDA classifier achieves an accuracy of 84.5%.


INTRODUCTION
The image set classification problem can be solved by combining several fields, such as computer vision algorithms, machine learning classifiers, and image processing [1][2][3][4][5][6], with their three well-known steps: transform and represent the images appropriately, extract meaningful information from them, and use these extracted features to build a classification model [7][8][9][10][11][12]. Methods can be classified into two categories, parametric and non-parametric, and can also be categorized by the type of classification problem treated: binary or multi-class classification.
Many methods for image set classification have been proposed. Mefraz et al. presented a novel sparse non-parametric support vector machine classifier [6]. Wang and Zhang [13] presented a method for image recognition and classification based on an improved Bag of Features (BOF). Affonso et al. compared deep learning architectures with classic machine learning techniques: decision tree induction algorithms, neural networks, nearest neighbours, and support vector machines [14]. Dahmane et al. presented a head pose estimation technique based on the orientation of the symmetry axis, which indicates the roll angle of the head; the symmetrical area of the face with respect to this orientation offers features, such as the width of the region, that allow them to classify and then predict yaw angles [15]. Moreover, according to the exhaustive review in [16], most face detection methods fall into four major classes: knowledge-based methods, feature invariant approaches, template matching methods, and appearance-based methods. Knowledge-based methods rely on rules deduced from knowledge about the components of a typical face and how a face can appear in an image; generally, the rules are based on the relationships among facial structures [17]. Feature invariant approaches are designed to find structural features that persist even when the pose, view angle, or illumination conditions change, and then use these invariant features to locate the positions of faces [18]. The main idea of template matching methods is to create standard models able to describe a face or a portion of a face; the correlation between the input image and the model is then computed, and faces are thereby identified in the image [19]. Finally, appearance-based methods [20] treat face detection as a classification problem: a captured pattern is assigned to one of two classes, face or non-face.
The aim of our study is to design recognition techniques for extra-oral, mould, and intra-oral images, in order to provide orthodontists with a solution to the problem of classifying patients' images by type, so that the evolution of a patient's status can be evaluated from the images of previous visits. This brings efficiency and saves time in their daily practice. These techniques are based on extracted features and simple mathematical methods. In this paper, we therefore present three local algorithms, based on extracted features and organized in a decision tree, to classify and recognize sixteen classes of orthodontic images: an algorithm for intra-oral images, one for mould images, and one for extra-oral images. We then combine these algorithms into a single structure covering all classes of orthodontic images and evaluate the confusion of the decision between all classes. This hierarchical representation can be interpreted as a set of hierarchical types stored in the leaves of the tree structure, and it relies on several features extracted from the orthodontic images, such as invariant facial features and skin colour in the YCbCr colour space. We then compared our method against six machine learning classifiers: Quadratic Support Vector Machine (SVM), Cubic SVM, Radial Basis Function (RBF) SVM, Cosine K-Nearest Neighbours (KNN), Euclidean KNN, and Linear Discriminant Analysis (LDA), using the Local Binary Pattern (LBP) to extract features from the images. After the feature extraction step, we applied the principal component analysis (PCA) algorithm to reduce redundant parameters before classifying the LBP features with those six classifiers. The presented algorithms were evaluated on a large dataset of images of different patients, and experimental results demonstrate the good performance of our proposed method, with a high accuracy.
The rest of this paper is organized as follows. The materials used to establish the experiments are presented in Section 2. In Section 3, the proposed method for classifying the three main classes (mould, intra-oral, and extra-oral) of orthodontic images is presented. The research methods are presented in Section 4, and the experimental results and corresponding analyses are provided in Section 5. Finally, concluding remarks are given in Section 6.

MATERIAL

2.1. Data sample description
All presented approaches are trained using 1207 images of 98 different patients, detailed per main class in Table 1. The images were acquired with digital cameras following the protocol described in [21], with resolutions varying between 1024*768 and 3888*2592, in RGB colour space and JPEG format.

PROPOSED METHOD
In this section, we describe our proposed approach for extracting features from and classifying orthodontic images. To identify an image type among the sixteen different types of orthodontic images illustrated in Figure 1, we define a hierarchy of these types, shown in Figure 2, as a classification tree based on the extracted features described below.

Skin colour detection using the YCbCr colour space
Orthogonal colour spaces reduce the redundancy present in the RGB colour channels and represent colour with statistically independent components, since the luminance and chrominance components are explicitly separated. The YCbCr colour space is one of the most widely used choices for skin detection [22, 23]. Thus, we use the ranges of Cr and Cb values that correspond to the skin-colour reference map in the CbCr plane defined by Chai and Ngan [24].
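As an illustration, the CbCr skin test can be sketched as follows. The range bounds (Cb in [77, 127], Cr in [133, 173]) are those commonly attributed to the Chai-Ngan skin-colour reference map, and the function names and the JPEG-style conversion are our own choices, not the paper's C implementation.

```python
import numpy as np

def rgb_to_ycbcr(rgb):
    """Convert an RGB image (uint8, H x W x 3) to Y, Cb, Cr planes (JPEG convention)."""
    rgb = rgb.astype(np.float64)
    r, g, b = rgb[..., 0], rgb[..., 1], rgb[..., 2]
    y  =        0.299    * r + 0.587    * g + 0.114    * b
    cb = 128 - 0.168736 * r - 0.331264 * g + 0.5      * b
    cr = 128 + 0.5      * r - 0.418688 * g - 0.081312 * b
    return y, cb, cr

def skin_mask(cb, cr):
    """Boolean mask of skin pixels using the Chai-Ngan CbCr ranges."""
    return (cb >= 77) & (cb <= 127) & (cr >= 133) & (cr <= 173)
```

A pixel is kept as skin only if both chrominance components fall inside the reference rectangle; the luminance plane Y is not used by the test, which is what makes the map relatively robust to illumination.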

Classification method of mould images
In this subsection, we have five types of mould images to classify: maxillary, mandibular, front, right, and left. To distinguish these classes, we use ten features in algorithm 1, including the mean of the pixels in each of the four corners of the image, taken over a window of size s², where I is the matrix of pixels in grayscale, h is the image height, and w is the image width.
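The corner-mean feature can be sketched as follows; the function name is ours, and this is only an illustration of the feature described above, not the authors' implementation.

```python
import numpy as np

def corner_means(I, s):
    """Mean intensity of the four s-by-s corner windows of a grayscale image I (h x w)."""
    h, w = I.shape
    corners = [I[:s, :s],    # top-left
               I[:s, w-s:],  # top-right
               I[h-s:, :s],  # bottom-left
               I[h-s:, w-s:]]  # bottom-right
    return [float(c.mean()) for c in corners]
```

Mould photographs are taken against a dark background, so comparing these four means against the threshold alpha1 (see the experimental settings) separates mould orientations by where the cast occupies the frame.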

Classification method of intra-oral images
As for mould images, in this subsection we have five classes of intra-oral images: intra maxillary, intra mandibular, intra right, intra front, and intra left. We propose an approach based on algorithm 2 to classify them. We differentiate between intra maxillary and intra mandibular by studying the monotony of the function tracing the exterior edge of the teeth in the left half of the image, where i and j are the pixel coordinates of the teeth, detected by applying a morphological filter to teeth segmented by thresholding. If this function is increasing, the image is of intra mandibular type; if it is decreasing, it is of intra maxillary type; and if it is increasing on one part of the left half of the image and decreasing on the other, the image belongs to a class of the FLR (Face Left Right) node of the tree illustrated in Figure 2.
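Assuming the tooth-edge profile has already been extracted as a sequence of row indices j sampled along columns i, the three-way decision rule above can be sketched as:

```python
def classify_by_monotony(edge):
    """Classify a sampled tooth-edge profile (row index j at successive columns i)
    as 'intra mandibular' (increasing), 'intra maxillary' (decreasing), or 'FLR'
    (mixed monotony), following the decision rule described in the text.
    The function name and input format are illustrative assumptions."""
    diffs = [b - a for a, b in zip(edge, edge[1:]) if b != a]
    if all(d > 0 for d in diffs):
        return "intra mandibular"
    if all(d < 0 for d in diffs):
        return "intra maxillary"
    return "FLR"
```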
For the FLR node, we use the location of the pixels that correspond to skin colour. In intra right images, the dental retractor is present on the left side of the image, while the right side shows the interior of the mouth; hence skin-colour pixels are scarce on the left side and abundant on the right. The same holds for the intra left type, except that the retractor is on the right side, which leads to an abundance of skin-colour pixels on the left side. Finally, we distinguish the intra front class by the presence of skin-colour pixels with an approximately similar distribution on both sides of the image.

Classification method of extra-oral images
In this subsection, we propose the classification approach detailed in algorithm 3 for extra-oral images. We have six classes: left portrait profile, right portrait profile, face profile, three-quarter profile (45° orientation), face profile with smile, and three-quarter with smile. We distinguish these classes by computing the number of pixels corresponding to skin colour in the upper-left quarter of the image (s1) and in the upper-right quarter (s2). If the quotient s1/s2 is less than a threshold α5, the image is a right portrait profile; if it is greater than a threshold α4, it is a left portrait profile; and if it lies between α5 and α4, the image belongs to the FTQ (Face Three-Quarter) node of the tree illustrated in Figure 2.
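The ratio test above is small enough to write out directly; the default threshold values are the ones reported in the experimental settings (α4 = 1.25, α5 = 0.75), and the function name is ours.

```python
def profile_from_ratio(s1, s2, alpha4=1.25, alpha5=0.75):
    """Decide left/right portrait profile or the FTQ node from the skin-pixel
    counts in the upper-left (s1) and upper-right (s2) image quarters."""
    q = s1 / s2
    if q < alpha5:
        return "right portrait profile"  # skin scarce on the left
    if q > alpha4:
        return "left portrait profile"   # skin abundant on the left
    return "FTQ"                         # roughly balanced: face or three-quarter
```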
For the FTQ node, we distinguish the portrait face from the three-quarter view by performing a method that locates the pixels belonging to the left and right eyes, using features such as the presence of skin around the eye and the presence of the crystalline lens and the white area inside the eye region. We then rely on the distribution of skin-colour pixels to the left of the left eye and to the right of the right eye to differentiate the portrait face from the portrait three-quarter. Next, we locate the mouth using the YCbCr-based skin detection method; in the mouth area, the Cr value increases to nearly double that of the other parts of the face. Finally, we check for the presence of teeth in this area to determine whether or not it is a smile case.

Classification method of orthodontic images
To summarize, we differentiate between the three main classes (extra-oral, intra-oral, and mould) using algorithm 4, which distinguishes extra-oral images from the others by the presence of both background and skin colour in the image. Unlike extra-oral images, intra-oral images contain no background but a massive presence of skin-colour pixels, whereas mould images contain no skin-colour pixels at all. In each case, we then call the appropriate algorithm.

RESEARCH METHOD

4.1. Linear discriminant analysis
Discriminant analysis is a classification method whose model assumes that the samples follow a Gaussian mixture distribution. For linear discriminant analysis (the Fisher discriminant), the model uses the same covariance matrix for each class; only the means vary.
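A minimal two-class sketch of this shared-covariance model in numpy, assuming equal class priors (an assumption of ours, not stated in the paper); the function name is also ours.

```python
import numpy as np

def fit_lda(X, y):
    """Fit a two-class LDA with a pooled (shared) covariance matrix.
    Returns (w, b) such that sign(x @ w + b) separates class 1 (positive)
    from class 0, assuming equal priors."""
    X0, X1 = X[y == 0], X[y == 1]
    m0, m1 = X0.mean(axis=0), X1.mean(axis=0)
    # pooled covariance: same matrix for both classes, as the LDA model assumes
    S = (np.cov(X0.T) * (len(X0) - 1) + np.cov(X1.T) * (len(X1) - 1)) / (len(X) - 2)
    w = np.linalg.inv(S) @ (m1 - m0)
    b = -0.5 * (m1 + m0) @ w
    return w, b
```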

Support vector machines
Support vector machines (SVM) form a supervised machine learning method that can be used for classification and regression. SVM uses a priori information, in the form of cluster labels, for the training step; it then generates, from the support vectors, an optimal hyperplane that minimizes the expected classification error on unlabelled test samples [27].
The SVM classifier is usually used for binary or multi-class classification problems. In this paper, we implement three types of SVM classifier based on the following kernels:
- Polynomial: K(u, v) = (γ u·v + c)^d
- Gaussian or Radial Basis Function (RBF): K(u, v) = exp(-γ ||u - v||²)
where γ and c are kernel parameters.
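The two kernels translate directly into code; setting d = 2 or d = 3 in the polynomial kernel yields the quadratic and cubic SVMs used in this paper. A sketch (function names ours):

```python
import numpy as np

def poly_kernel(u, v, gamma=1.0, c=1.0, d=3):
    """Polynomial kernel K(u, v) = (gamma * u.v + c)^d; d = 3 gives the cubic SVM."""
    return (gamma * np.dot(u, v) + c) ** d

def rbf_kernel(u, v, gamma=0.5):
    """Gaussian / RBF kernel K(u, v) = exp(-gamma * ||u - v||^2)."""
    u, v = np.asarray(u, float), np.asarray(v, float)
    return np.exp(-gamma * np.sum((u - v) ** 2))
```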

K-Nearest Neighbours
K-Nearest Neighbours (KNN) is one of the oldest non-parametric learning algorithms. KNN is based on computing similarities between examples: it classifies an instance using the classes of the k closest training instances in the feature space of the dataset [14]. In this paper, we implement two types of KNN classifier based on the following distances, where U and V are two n-dimensional vectors:
- Euclidean distance, d(U, V) = sqrt(Σᵢ (Uᵢ - Vᵢ)²), with the number of neighbours equal to 1
- Cosine distance, d(U, V) = 1 - (U·V) / (||U|| ||V||)
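Both distances and the nearest-neighbour vote can be sketched compactly (helper names are ours; this is an illustration, not the toolbox implementation used in the experiments):

```python
import numpy as np

def euclidean_dist(u, v):
    """Euclidean distance between two n-dimensional vectors."""
    u, v = np.asarray(u, float), np.asarray(v, float)
    return float(np.sqrt(np.sum((u - v) ** 2)))

def cosine_dist(u, v):
    """Cosine distance: 1 - cos(angle between u and v)."""
    u, v = np.asarray(u, float), np.asarray(v, float)
    return float(1.0 - np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v)))

def knn_predict(x, X_train, y_train, k=1, dist=euclidean_dist):
    """Label x by majority vote among its k nearest training instances."""
    order = sorted(range(len(X_train)), key=lambda i: dist(x, X_train[i]))
    votes = [y_train[i] for i in order[:k]]
    return max(set(votes), key=votes.count)
```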

RESULTS AND DISCUSSION
In this section, we evaluate the performance of the presented methods. First, the experimental settings are described; then the classification results are presented; finally, all presented methods are evaluated. Different measures are used to evaluate the classification results, such as accuracy, recall, precision, F1-measure, sensitivity, specificity, and Cohen's kappa. Accuracy is the number of correctly classified images relative to the total number of classified images. Recall (also known as sensitivity) is the rate of real positive cases that are correctly predicted positive; conversely, precision is the rate of predicted positive cases that are correctly real positives. The F1-measure is the weighted harmonic mean of precision and recall, and summarizes the quality of the classification. These measures can be expressed as follows:

Recall = TP / (TP + FN)
Precision = TP / (TP + FP)
F1 = 2 × Precision × Recall / (Precision + Recall)

where TP is the total number of true positive images, FN is the total number of false negative images, and FP is the total number of false positive images. Cohen's kappa is computed from a confusion matrix as follows [28]:

K = (n Σᵢ hᵢᵢ - Σᵢ Trᵢ Tcᵢ) / (n² - Σᵢ Trᵢ Tcᵢ)

where hᵢᵢ is the number of true positives for each class, on the diagonal of the confusion matrix, n is the number of examples, m is the number of classes (the sums run over i = 1, ..., m), and Trᵢ and Tcᵢ are the row and column totals, respectively.
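These formulas can be checked with a few lines of code (function names ours):

```python
import numpy as np

def precision_recall_f1(tp, fp, fn):
    """Precision, recall and F1 from true positive, false positive and
    false negative counts."""
    p = tp / (tp + fp)
    r = tp / (tp + fn)
    return p, r, 2 * p * r / (p + r)

def kappa_from_confusion(H):
    """Cohen's kappa from a confusion matrix H:
    K = (n * sum_i h_ii - sum_i Tr_i * Tc_i) / (n^2 - sum_i Tr_i * Tc_i)."""
    H = np.asarray(H, float)
    n = H.sum()
    chance = (H.sum(axis=1) * H.sum(axis=0)).sum()  # sum of Tr_i * Tc_i
    return float((n * np.trace(H) - chance) / (n ** 2 - chance))
```

For a perfect diagonal confusion matrix kappa equals 1, and for a classifier that predicts a single class for everything it drops to 0, which is why kappa complements raw accuracy on imbalanced class sets like this one.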

Experimental setting
Our proposed method is implemented in the C programming language with the Libjpeg library to decompress JPEG images, and we use the JPEG/YCbCr colour space for skin detection. For algorithm 1, we set α1 to 60 to test whether the mean intensities in the four corners of the image are lower than α1, and we set α2 to 50 to examine the difference between the left and right quarters of the image, in order to differentiate the left and right mould classes. For algorithm 2, we set α3 to 35, because the dental retractor is present on the left side for the intra right class, on the right side for the intra left class, and on both sides for the intra front class; we use α3 to locate the retractor's position by computing the mean pixel intensity of the Cb component on both sides of the image. For algorithm 3, we set α4 to 1.25, which means that skin-colour pixels are abundant on the left side of the image compared to the right side, and we set α5 to 0.75, which corresponds to the opposite case. For algorithm 4, we set θ to 100: in principle, N should be zero for mould images, but the skin colour detection method has its own uncertainty [26, 29], so we use θ to absorb the false detection error. We set θ1 to 600 to verify whether skin-colour pixels are abundant or not in intra-oral and extra-oral images.
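A simplified sketch of the top-level dispatch of algorithm 4, using the threshold values above; the exact ordering of the background and abundance tests in the full algorithm is not spelled out in the text, so this combination is our assumption, as is the function name.

```python
def main_class(n_skin, has_background, theta=100, theta1=600):
    """Top-level dispatch (simplified sketch of algorithm 4).
    n_skin: count N of skin-colour pixels; has_background: background detected.
    theta absorbs skin-detector false positives; theta1 tests skin abundance."""
    if n_skin <= theta:
        return "mould"        # essentially no skin pixels
    if has_background and n_skin > theta1:
        return "extra-oral"   # background present plus abundant skin
    return "intra-oral"       # no background, skin present
```

In each branch, the corresponding class-specific algorithm (1, 2, or 3) would then be invoked.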

Classification results
In this subsection, we merge the three samples of images (mould, intra-oral, and extra-oral) into one sample and extract features using the LBP algorithm. After that, the principal component analysis (PCA) algorithm is applied to reduce the redundant parameters produced by the LBP. We then train the SVM, KNN, and LDA classifiers on the generated dataset. Figure 4 shows the confusion matrices of the experimental results for the KNN, LDA, and SVM classifiers; in each confusion matrix, the rows represent the model's predictions for each image class, while the columns represent the ground truth classes. After applying our proposed approach described in Section 3.4, we construct a confusion matrix over the sixteen classes of orthodontic images to evaluate the classification performance. Table 2 shows this confusion matrix between the sixteen classes of orthodontic views.

Table 2. Confusion matrix of the proposed approach, representing the predicted class according to the true class
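The LBP-then-PCA feature pipeline can be sketched compactly in numpy; this is the basic non-uniform 8-neighbour LBP with 256 histogram bins and a plain SVD-based projection, written as an illustration with our own helper names, not the toolbox code used in the experiments.

```python
import numpy as np

def lbp_image(I):
    """Basic 8-neighbour LBP code for each interior pixel of grayscale image I."""
    I = np.asarray(I, float)
    c = I[1:-1, 1:-1]
    # the 8 neighbours, each as a shifted view aligned with the centre pixels c
    nbrs = [I[:-2, :-2], I[:-2, 1:-1], I[:-2, 2:], I[1:-1, 2:],
            I[2:, 2:],   I[2:, 1:-1],  I[2:, :-2], I[1:-1, :-2]]
    code = np.zeros(c.shape, dtype=int)
    for bit, nb in enumerate(nbrs):
        code += (nb >= c).astype(int) << bit  # set bit where neighbour >= centre
    return code

def lbp_histogram(I, bins=256):
    """Normalized LBP histogram used as the texture feature vector."""
    h = np.bincount(lbp_image(I).ravel(), minlength=bins).astype(float)
    return h / h.sum()

def pca_reduce(X, k):
    """Project the rows of feature matrix X onto the k leading principal
    components, computed via SVD of the centred data."""
    Xc = X - X.mean(axis=0)
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    return Xc @ Vt[:k].T
```

Each image yields a 256-bin LBP histogram; stacking these rows into X and applying `pca_reduce` gives the lower-dimensional features that are then fed to the SVM, KNN, and LDA classifiers.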

Evaluation of classification
According to Table 2, we succeeded in classifying the images with an average rate of 94.28%. We therefore compute the accuracy, kappa, sensitivity, specificity, and F1-measure to evaluate the classification methods presented in this paper. Table 3 shows the performance measures of all presented methods. Overall, the proposed method led to the best scores on the performance measures used in this paper. Although the Euclidean KNN led to the worst classification result, it achieved the highest value of specificity.

CONCLUSION
In this paper, we proposed an automatic approach to classify orthodontic images (mould, intra-oral, and extra-oral) without operator intervention, by presenting an algorithm for each main class of orthodontic images based on a decision tree structure, extracting and selecting features to classify these images. To compare our results, we implemented six machine learning algorithms (Quadratic SVM, Cubic SVM, RBF SVM, Cosine KNN, Euclidean KNN, and LDA) based on textural feature extraction using the LBP descriptor, with the PCA algorithm to reduce the redundant parameters in the extracted features. These classifiers, with the exception of the Euclidean KNN, which gave the worst results, demonstrated good classification performance with high accuracy and kappa values, as did our proposed approach. In future work, we will improve this approach towards the development of a framework that can track the evolution of a patient's condition by analysing images from previous visits in the same class, especially for the goal of giving a patient their best smile.