Application of Multiple Kernel Support Vector Regression for Weld Bead Geometry Prediction in Robotic GMAW Process

ABSTRACT


INTRODUCTION
In today's industrial world, growing competition in the global economy has increased the need for manufacturers to deliver customized products quickly in order to meet customer demands [1]. To this end, rapid prototyping emerged in 1988 and developed quickly on the back of rapid advances in digital computing systems and technologies such as lasers. This technology overcame the shortcomings of traditional prototyping methods and resulted in a remarkable reduction in prototype production time [2]. In rapid prototyping, the constituent parts are usually constructed or assembled by additive manufacturing, in which a three-dimensional object is created by adding material layer upon layer under computer control.
In recent years, many studies have addressed rapid prototyping of metallic parts based on gas metal arc welding (GMAW) [3]. In this process, an electric arc is generated between a consumable wire electrode and the metallic workpiece, which heats the metals and causes them to melt and join. In some GMAW processes, a robotic system controls the deposition of the weld material, the position of the welding torch and other parameters such as the slope and rotation of the torch. A schematic view of the robotic GMAW process is depicted in Figure 1 [4].
In welding processes, the weld bead geometry, namely the weld bead width and height shown in Figure 2, is one of the most important characteristics of the weld line. In the GMAW process in particular, the bead geometry has a significant effect on the layer thickness, surface quality and dimensional precision. The most important parameters affecting the bead geometry are the welding current and voltage, the type and percentage of inert gas, and the nozzle-to-workpiece distance. Since an appropriate weld bead geometry results in high weld quality, proper selection of the welding parameters to obtain a desired weld geometry receives great attention in the GMAW process. To this end, it is important to develop a global model of the weld geometry based on the process parameters. Because of the highly nonlinear and coupled multivariable effect of the input parameters on the weld geometry, this model cannot be defined through an explicit mathematical expression, and advanced modelling techniques are investigated for this purpose.
Machine learning is a subfield of computer science that studies the construction of algorithms capable of learning from a limited set of observed data and making predictions based on it. Such algorithms build a model from example inputs in order to make data-driven predictions or decisions. Supervised learning is the machine learning task of inferring a function from a set of labeled training data [5]. Algorithms of this kind can be used to establish a model from a limited set of observations and to make predictions for cases that have not been observed. They can therefore be used for modelling and prediction of the weld geometry in the GMAW process, and several studies have been performed in this field, mainly based on neural networks and fuzzy systems [6].
In the field of robotic GMAW, a global database of process parameters and the corresponding weld geometry has been provided by [7], where predictive modelling was performed by both neural network and second-order regression analysis, and the neural network approach was shown to be more accurate than the second-order regression. The support vector machine (SVM) is a state-of-the-art approach to supervised learning, used for classification and regression analysis, which has proven to be a powerful method in many practical applications [8]. The main advantage of SVMs over neural networks is that they minimize the structural risk alongside the empirical risk, which results in a better generalization capability in many cases [9]. The accuracy of SVM-based modelling can be enhanced by a newer approach known as multiple kernel learning, which is introduced in Section 3 [10]. Given the high degree of accuracy required in the prediction of weld bead geometry in the robotic GMAW process, this paper discusses the application of the multiple kernel support vector machine to this prediction and shows that the approach provides higher accuracy and better generalization capability.

SUPPORT VECTOR MACHINE
Support vector machines (SVMs) are supervised learning models with associated learning algorithms which analyze data and recognize patterns, used for classification and regression analysis [11]. A linear SVM-based classifier finds the hyperplane which leads to the maximum margin between the samples of the two classes in the training dataset, while minimizing the classification error. Such a classifier can be described by Equation (1), in which x is the input vector and w and b are the weight vector and the bias, respectively [12]. The optimum values of w and b are obtained by minimizing the risk function R(w) expressed in Equation (2), subject to the constraints of Equation (3), for the N samples (x_i, y_i) of the training dataset [13].
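A common textbook form consistent with this description is sketched below. The slack variables $\xi_i$ and the sign-function form of the decision rule are assumptions taken from the usual soft-margin formulation, so the notation may differ from that of Equations (1) to (3).

```latex
\[
\begin{aligned}
f(x) &= \operatorname{sign}\left(w^{T}x + b\right) \\
R(w) &= \frac{1}{2}\lVert w\rVert^{2} + C\sum_{i=1}^{N}\xi_i \\
y_i\left(w^{T}x_i + b\right) &\ge 1 - \xi_i, \qquad \xi_i \ge 0, \qquad i = 1,\dots,N
\end{aligned}
\]
```

In this form, a small $\lVert w\rVert$ corresponds to a wide margin, while the sum of the slacks $\xi_i$ measures how far training samples fall on the wrong side of it.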
In the risk function of Equation (2), the first term stands for the structural risk, i.e. the margin between the two classes, and the second term stands for the empirical risk, i.e. the training error. The parameter C is the regularization factor, which trades off the relative importance of the structural and empirical risks. Figure 3 shows the SVM-based classification. For data with a nonlinear border between the two classes, the original feature space can be mapped to a higher-dimensional feature space in which the training set is separable, through a nonlinear mapping φ(x) associated with a kernel function, as depicted in Figure 4 [14]. SVM-based classification can therefore be expressed as Equation (4), in which w and b are obtained by minimizing the risk function R(w) in Equation (5) subject to the constraints of Equation (6).
The concept of SVM classification can be generalized to regression by introducing a margin of tolerance ε for the function to be estimated, based on the permitted estimation error. Given a limited number of observations of the function f(x) with the permitted margin of tolerance ε, SVM-based classification between f(x) + ε and f(x) − ε can be regarded as estimating f(x) within the permitted margin of tolerance, as depicted in Figure 5 [15]. In other words, in SVM-based regression (SVR), the input space is mapped into a high-dimensional feature space via the kernel function and a linear optimal regression is then performed in this space. The formulation of support vector machines can thus be generalized to regression: the optimal regression is obtained by maximizing the objective function L in Equation (7) subject to the constraints given by Equation (8) [15].
Figure 5. Generalization of SVM-based classification to SVM-based regression [15]
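The margin of tolerance can be made concrete with the usual ε-insensitive primal problem of SVR, sketched below. The slack variables $\xi_i$, $\xi_i^{*}$ and the mapping symbol $\varphi$ are assumptions borrowed from the standard formulation rather than the notation of this paper.

```latex
\[
\begin{aligned}
\min_{w,\,b,\,\xi,\,\xi^{*}} \quad & \frac{1}{2}\lVert w\rVert^{2} + C\sum_{i=1}^{N}\left(\xi_i + \xi_i^{*}\right) \\
\text{subject to} \quad & y_i - w^{T}\varphi(x_i) - b \le \varepsilon + \xi_i, \\
& w^{T}\varphi(x_i) + b - y_i \le \varepsilon + \xi_i^{*}, \\
& \xi_i \ge 0, \quad \xi_i^{*} \ge 0, \qquad i = 1,\dots,N
\end{aligned}
\]
```

Samples whose prediction error stays within ε incur no penalty; only deviations beyond the tolerance band contribute to the empirical risk through the slacks.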
According to Mercer's theorem [14], the inner product 〈φ(x_i), φ(x_j)〉 can be defined through a kernel function as K(x_i, x_j) = 〈φ(x_i), φ(x_j)〉. The objective function L in Equation (7) can therefore be expressed in terms of the kernel function, as given in Equation (9). The optimization problem can be solved via quadratic programming, and the estimated function is expressed in terms of the optimal values as Equation (10) [16].
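For reference, the standard kernelized dual problem of ε-SVR and the resulting estimate are sketched below. The Lagrange multipliers $\alpha_i$, $\alpha_i^{*}$ and the exact arrangement of terms are assumptions based on the usual derivation and may differ in notation from Equations (8) to (10).

```latex
\[
\begin{aligned}
L(\alpha, \alpha^{*}) = {} & -\frac{1}{2}\sum_{i=1}^{N}\sum_{j=1}^{N}(\alpha_i - \alpha_i^{*})(\alpha_j - \alpha_j^{*})K(x_i, x_j) \\
& - \varepsilon\sum_{i=1}^{N}(\alpha_i + \alpha_i^{*}) + \sum_{i=1}^{N} y_i(\alpha_i - \alpha_i^{*}) \\
\text{subject to} \quad & \sum_{i=1}^{N}(\alpha_i - \alpha_i^{*}) = 0, \qquad 0 \le \alpha_i,\ \alpha_i^{*} \le C \\
f(x) = {} & \sum_{i=1}^{N}(\alpha_i - \alpha_i^{*})K(x_i, x) + b
\end{aligned}
\]
```

Only the training samples with a nonzero difference $\alpha_i - \alpha_i^{*}$ (the support vectors) contribute to the estimate $f(x)$.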
The most common formulations for the kernel function are listed in Table 1.
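As an illustration, three widely used kernel functions can be sketched in plain Python as follows. The parameterizations are assumptions (in particular the Gaussian form exp(−‖x − y‖² / (2σ²)) and the unit offset in the polynomial kernel) and may differ from the definitions used in Table 1 or in a particular toolbox.

```python
import math


def linear_kernel(x, y):
    """Inner product of two equally long input vectors."""
    return sum(a * b for a, b in zip(x, y))


def polynomial_kernel(x, y, degree, offset=1.0):
    """Polynomial kernel (<x, y> + offset) ** degree; the offset is an assumed default."""
    return (linear_kernel(x, y) + offset) ** degree


def gaussian_kernel(x, y, sigma):
    """Gaussian kernel exp(-||x - y||^2 / (2 * sigma^2)); assumed parameterization."""
    squared_distance = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-squared_distance / (2.0 * sigma ** 2))
```

A Gaussian kernel with a small σ reacts only to very close samples, while a large σ gives a smoother similarity measure, which is why the choice of this parameter matters for generalization.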

SUPPORT VECTOR REGRESSION BASED ON MULTIPLE KERNEL LEARNING
In SVM-based regression, the performance of the learning algorithm depends strongly on the data representation, which is chosen through the kernel function. The kernel function measures the nonlinear similarity between samples, so an efficient kernel should represent the data adaptively. In addition, an appropriate regularization term is defined for the learning problem in terms of the parameters of the kernel function. In most cases, the parameters of a single kernel function are tuned for the whole dataset. Although the kernel parameter can be chosen optimally to enhance the generalization capability, learning with a single kernel is not very data-adapted or discriminative. Multiple kernel learning (MKL) provides a more flexible framework than a single kernel and mines the information in the data more adaptively and more effectively [17]. In the MKL framework, the kernel function is formed as a convex linear combination of M basis kernel functions which satisfy Mercer's conditions, formulated as:
$$K(x_i, x_j) = \sum_{m=1}^{M} d_m K_m(x_i, x_j) \tag{11}$$
where d_m is the weight of the m-th basis kernel function K_m and must satisfy the conditions of:
$$d_m \ge 0, \qquad \sum_{m=1}^{M} d_m = 1 \tag{12}$$
The combining weights are collected in a vector of weights, namely d = [d_1, ..., d_M]^T. The MKL problem can be described as learning the combining weights d and the solutions of the original problem, for example the solution variables of the SVR problem in Equation (15), in a single optimization problem. By substituting Equation (11) into Equation (9), the optimization problem of MKL-based SVR is obtained as maximization of the objective function L in Equation (13) subject to the constraints of Equation (14) [17].
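The convex combination of basis kernels can be sketched as below, assuming that the Gram matrices of the basis kernels have already been computed as nested lists. The function name `combine_kernels` and the tolerance argument are hypothetical and not part of any toolbox mentioned in this paper.

```python
def combine_kernels(kernel_matrices, weights, tol=1e-9):
    """Return sum_m d_m * K_m for square Gram matrices given as nested lists.

    Raises ValueError if the weights are not a valid convex combination
    (non-negative and summing to one within the tolerance).
    """
    if len(kernel_matrices) != len(weights):
        raise ValueError("one weight is required per basis kernel")
    if any(d < 0 for d in weights) or abs(sum(weights) - 1.0) > tol:
        raise ValueError("weights must be non-negative and sum to one")

    n = len(kernel_matrices[0])
    combined = [[0.0] * n for _ in range(n)]
    for d, K in zip(weights, kernel_matrices):
        for i in range(n):
            for j in range(n):
                combined[i][j] += d * K[i][j]
    return combined
```

Because each basis kernel satisfies Mercer's conditions and the weights are non-negative, the combined matrix is again a valid kernel matrix and can be passed to an ordinary SVR solver.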
Recently, a simple and efficient algorithm for multiple kernel learning has been proposed which solves the optimization problem by the gradient descent method [18]. This approach, known as SimpleMKL, is based on the fact that the objective function L in Equation (13) is convex and differentiable. The optimum vector of weights d can therefore be obtained by updating it along the gradient descent direction of L. In this method, the gradient of the objective function is computed from the partial derivatives of L with respect to the weights d_m. Next, the descent direction D is found from the gradient and d is updated as:
$$d \leftarrow d + \gamma D \tag{16}$$
where γ is the step length. The gradient of the objective function is only updated when the objective value decreases. This update procedure is repeated until the stopping criterion is met [18].
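One weight update of this kind can be sketched as follows. This is a simplified illustration rather than the SimpleMKL implementation: it assumes that the SVR dual variables have already been obtained for the current weights and are passed in as `beta[i] = alpha_i - alpha_i*`, that the derivative with respect to d_m is −½ βᵀ K_m β, and that a reduced-gradient direction is used to keep the weights on the simplex. All function names are hypothetical.

```python
def mkl_gradient(beta, kernel_matrices):
    """Assumed derivative of the objective w.r.t. each d_m: -0.5 * beta^T K_m beta."""
    n = len(beta)
    grads = []
    for K in kernel_matrices:
        quad = sum(beta[i] * K[i][j] * beta[j] for i in range(n) for j in range(n))
        grads.append(-0.5 * quad)
    return grads


def reduced_descent_direction(d, grads):
    """Descent direction D whose components sum to zero, so sum(d) stays equal to one."""
    mu = max(range(len(d)), key=lambda m: d[m])  # index of the largest weight
    D = [0.0] * len(d)
    for m in range(len(d)):
        if m == mu:
            continue
        diff = grads[m] - grads[mu]
        # A weight already at zero is not pushed further down.
        D[m] = 0.0 if (d[m] == 0 and diff > 0) else -diff
    D[mu] = -sum(D[m] for m in range(len(d)) if m != mu)
    return D


def update_weights(d, D, gamma):
    """Apply d <- d + gamma * D, capping gamma so that no weight becomes negative."""
    limits = [-d[m] / D[m] for m in range(len(d)) if D[m] < 0]
    step = min([gamma] + limits)
    return [max(0.0, d[m] + step * D[m]) for m in range(len(d))]
```

In a full solver these three steps would sit inside a loop that re-solves the SVR problem for the new weights and accepts the step only if the objective value decreases.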

RESULTS AND DISCUSSION
In order to establish the MK-SVR predictive models, a database of measured values of weld bead geometry together with the corresponding process parameters, provided by Xiong et al., was utilized, which is shown in Table 2 [7]. From this database, the first thirty-one samples were used to train the MK-SVR models, and the predictive accuracy of the established models was evaluated on the next twelve samples, marked in bold. To improve the accuracy, all the input and target values were normalized between −1 and +1 as:
$$x_n = \frac{2\,(x - \min)}{\max - \min} - 1$$
where max and min are, respectively, the maximum and minimum values of the input or output over the whole dataset, x is the input or output and x_n is the corresponding normalized value. Based on the normalized dataset, the single kernel and multiple kernel SVM models were implemented with the SVM-KM [20] and the SimpleMKL Matlab toolboxes, respectively. After training the models and calculating the predicted normalized outputs, the outputs were scaled back to their original range as:
$$\hat{y} = \frac{(\hat{y}_n + 1)(\max - \min)}{2} + \min$$
where ŷ is the predicted output in the original range and ŷ_n is the normalized predicted output.
For predictive modelling of the bead width, the kernel function was selected as a combination of 401 Gaussian basis kernel functions with parameters varying from 8 to 12 in increments of 0.01. For the bead height, it was selected as a combination of three polynomial basis functions with parameters of 1, 2 and 3 and 81 Gaussian basis kernel functions with parameters varying from 0.2 to 1 in increments of 0.01. The single kernel models were implemented based on the Gaussian kernel function. The parameters of the single kernel and multiple kernel models are listed in Table 3.
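The scaling step and the grids of Gaussian kernel parameters described above can be sketched as follows. The function names are hypothetical, and the code only illustrates the arithmetic; it does not reproduce the Matlab toolboxes used in this work.

```python
def normalize(value, lo, hi):
    """Map a value from the range [lo, hi] to [-1, +1]."""
    return 2.0 * (value - lo) / (hi - lo) - 1.0


def denormalize(value_n, lo, hi):
    """Map a normalized value from [-1, +1] back to the range [lo, hi]."""
    return (value_n + 1.0) * (hi - lo) / 2.0 + lo


def parameter_grid(start, stop, step):
    """Evenly spaced kernel parameters from start to stop inclusive.

    The number of points is computed first to avoid accumulating
    floating-point error from repeated addition of the step.
    """
    count = round((stop - start) / step) + 1
    return [round(start + k * step, 10) for k in range(count)]


# Grids matching the counts quoted in the text.
width_sigmas = parameter_grid(8.0, 12.0, 0.01)   # 401 Gaussian parameters
height_sigmas = parameter_grid(0.2, 1.0, 0.01)   # 81 Gaussian parameters
```

Note that lo and hi are taken over the whole dataset, as stated in the text, so the same pair of values must be reused when predictions are scaled back to the original range.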
The accuracy of the final models was evaluated based on the root mean square error (RMSE), normalized root mean square error (NRMSE) and mean absolute percentage error (MAPE) statistical indices. In the defining equations, y_i and ŷ_i are the measured and predicted outputs, respectively, N is the number of training or testing samples and ȳ is the mean value of all measured outputs. The calculated values of the indices are listed in Table 4. Besides the superior performance of MK-SVR over SK-SVR, this method has a better testing MAPE than the ANN-based approach proposed by [7], which reported a MAPE of 2.013% for the test data.
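The three indices can be sketched in Python as below. RMSE and MAPE follow their usual definitions; the NRMSE shown here divides the RMSE by the mean of the measured outputs, which is only one of several common normalizations and is an assumption rather than the definition used for Table 4.

```python
import math


def rmse(measured, predicted):
    """Root mean square error over paired samples."""
    n = len(measured)
    return math.sqrt(sum((y - p) ** 2 for y, p in zip(measured, predicted)) / n)


def nrmse(measured, predicted, mean_measured):
    """RMSE divided by the mean measured output (assumed normalization)."""
    return rmse(measured, predicted) / mean_measured


def mape(measured, predicted):
    """Mean absolute percentage error, in percent; measured values must be nonzero."""
    n = len(measured)
    return 100.0 * sum(abs((y - p) / y) for y, p in zip(measured, predicted)) / n
```

For example, measured values of 2 and 4 with predictions of 1 and 5 give an RMSE of 1 and a MAPE of 37.5%.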
The SK-SVR testing root mean square error can be further reduced to 0.1493 by changing the SVM kernel and model parameters, but this value of the kernel parameter cannot be obtained from the training database. In other words, the SK-SVR model is overfitted in the training process and cannot be trained to make the best predictions for the test data as well as the training data. The MK-SVR method therefore benefits from better generalization capability as well as higher precision for the test data. The measured outputs together with the outputs predicted by the MK-SVR method are depicted in Figure 6, and a good agreement can be observed between them.

CONCLUSION
In this paper, multiple kernel SVM regression analysis has been proposed for modelling and prediction of the weld geometry in the robotic GMAW-based rapid manufacturing process, based on the input parameters of wire feed rate, welding speed, arc voltage and nozzle-to-plate distance. In this analysis, the kernel function is formed as a linear combination of basis kernel functions, and the optimized combination of the kernel functions and the solutions of the SVR problem are obtained using the SimpleMKL algorithm. The results show that single kernel SVR cannot achieve the best prediction results for both the training and test data, whereas multiple kernel SVR gives the best results for both datasets. The prediction results also show the higher accuracy of multiple kernel SVR, in addition to its enhanced generalization capability, over the single kernel SVM and ANN regression approaches. Based on the multiple kernel SVM models, the input parameters can be tuned to obtain a desired weld geometry in this manufacturing process with a higher degree of accuracy.