Support vector machine-based object classification for robot arm system

ABSTRACT


INTRODUCTION
The continuous development of science and technology has had a great impact on people's lives and working habits. People increasingly demand better working environments, income, and other social benefits. Automated machines have gradually replaced humans in dangerous, toxic, or repetitive and boring jobs. In particular, automation systems using robotic arms have developed strongly in recent years. Robotic arms, with flexible movement and tireless continuous operation, have been applied in many different industries. Robotic arms can perform both tasks that humans will not undertake and tasks that humans cannot [1]-[3]. The use of robotic arms to replace humans has brought many benefits, such as increased product quality and productivity and reduced material waste and costs [4]-[8].
Nowadays, with the support of sensor technology, especially vision sensors, the robot's functionality has been expanded [9], [10]. Robots not only execute pre-programmed tasks in a stationary environment, but they can also execute different tasks in unstructured environments by using sensors to perceive their surroundings. Visual sensors, with their ability to extract complex information, have been widely used in robot systems, and machine vision-based robotic manipulators have recently been employed in automatic systems [11]-[14]. The computer vision system provides the information a robot needs to respond to changes in the environment [15], [16]. It captures images of the target objects and processes them with algorithms to extract data from the images. These data are transmitted to the robot system, which uses them to control the motion of the manipulators and other devices [17]-[20]. For example, in an automatic pick-and-place system, the camera captures images of objects in the workspace. The images are processed to determine the coordinates of the objects, and these coordinates are transmitted to the robot system. The robot arm is then controlled to move to an object's position, grasp the object, and place it at another position. Many other studies have used vision systems in robotic systems for industrial automation applications.
The problems of detection, identification, and classification are very common in automated production [21], [22]. Using cameras for classification provides benefits such as high repeatability, consistent accuracy, and high classification speed. Many image processing algorithms have been proposed for classifying different objects. The support vector machine (SVM) is a machine learning algorithm that is widely used for classification applications [23]-[26]. Many studies have revealed that the SVM algorithm can perform better than an artificial neural network (ANN) [27]. In this paper, we train an SVM model to classify objects in an application that uses a robot arm to sort objects. The system consists of a 3-DOF robot arm to grab objects and a camera to capture images of the objects. The SVM is trained on a personal computer, and the trained model runs on a Raspberry Pi computer. The SVM classifies objects with different shapes. To train the SVM model, we took images of the objects and processed them to obtain a dataset. The dataset is expanded with rotation augmentation to improve the accuracy and generalization of the SVM model. The rest of the paper describes the robot system and the SVM algorithm for object classification in detail.

RESEARCH METHOD

Proposed system
This paper uses a developed robot arm as shown in Figure 1. The hardware and software used to build the robot arm are described as follows. The hardware system comprises a 3-DOF robot arm for grabbing objects, a fixed camera to take pictures of the objects, and a Raspberry Pi computer to process images and classify objects using the SVM model. A stepper motor is mounted at each joint of the robot arm to provide the rotational movement of the links. The motors are controlled by an Arduino Uno board, which sends pulse and direction signals to the stepper motor drivers. The drivers convert the signals from the Arduino into current signals that rotate the motors.
− Stepper motors: the stepper motors are bipolar types equipped with a high-precision planetary gearbox with a gear ratio of 5:1. Each motor can provide a maximum holding torque of 30 kgf.cm, has a step resolution of 1.8 degrees, and measures 42×42×97 mm.
− The A4988 drivers: the A4988 is used to control small stepper motors. Its supply voltage ranges from 8 to 35 V, and the maximum output current to each motor phase is 2 A. It supports multiple operating modes of a bipolar stepper motor with five different step resolutions: full-step, 1/2, 1/4, 1/16, and 1/32.
Figure 2 shows the motor and driver used to build the robot arm: Figure 2(a) is the Nema 17 motor with a gearbox, and Figure 2(b) is the A4988 driver. The image processing software is built with OpenCV-Python, a library of Python bindings designed to solve computer vision problems; OpenCV provides easy-to-implement versions of a multitude of algorithms.
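The motor parameters above determine how many driver pulses correspond to a given joint rotation. A minimal sketch of that calculation, assuming the 1/16 microstepping mode (the chosen microstep setting is an illustrative assumption, not stated in the text):

```python
# Sketch: convert a desired joint angle into stepper driver pulses, using
# the motor parameters from the text (1.8 deg/step, 5:1 planetary gearbox).
# The 1/16 microstepping setting is an assumption chosen for illustration.

FULL_STEP_DEG = 1.8   # motor step angle in degrees
GEAR_RATIO = 5        # planetary gearbox ratio 5:1
MICROSTEP = 16        # assumed microstep divisor on the driver

def pulses_for_angle(joint_deg: float) -> int:
    """Number of driver pulses to rotate the output shaft by joint_deg."""
    motor_deg = joint_deg * GEAR_RATIO   # gearbox multiplies motor travel
    steps = motor_deg / FULL_STEP_DEG    # full motor steps required
    return round(steps * MICROSTEP)      # pulses at 1/16 microstepping

print(pulses_for_angle(90))   # 90 deg at the joint -> 4000 pulses
```

In a setup like this, the Raspberry Pi would send the resulting pulse count to the Arduino, which generates the actual step and direction signals for the drivers.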

Support vector machine
The support vector machine (SVM) is a well-known supervised machine learning algorithm for classification and regression. It has been shown to be highly accurate and generalizable compared with other data classification algorithms. SVMs were first presented by Vapnik and have since been applied in many applications thanks to their many attractive features and high empirical performance.
The SVM is fundamentally a binary classifier that separates data into two classes with a hyperplane. Two parallel margin hyperplanes, one on each side of the separating hyperplane, are constructed so that the data lie only on the outer side of each plane. The optimal separating hyperplane is the one that maximizes the distance between the two margin hyperplanes. For complex data sets, the data are mapped to a higher-dimensional space using kernel functions, and a hyperplane is constructed in that space to separate the data.
Given a set of data points (xᵢ, yᵢ), i = 1, …, n, where xᵢ ∈ ℝᵈ is the input vector and yᵢ ∈ {−1, 1} is the output class label, the SVM is required to solve the optimization problem defined as (1) [28]:

min_{w,b,ξ} (1/2)‖w‖² + C ∑ᵢ₌₁ⁿ ξᵢ (1)

subject to yᵢ(w·xᵢ + b) ≥ 1 − ξᵢ, ξᵢ ≥ 0, where w is the weight vector, b is the bias, ξᵢ is the error for a given training point xᵢ, and C is a constant that adjusts the penalty on the error. The optimization problem of the SVM is reformulated into its dual form by using the Lagrange method [29].
If the data are mapped into a higher-dimensional space by a function φ and K(xᵢ, xⱼ) = φ(xᵢ)ᵀφ(xⱼ) is the kernel function, we obtain the nonlinear support vector machine, whose dual problem is [29]:

max_α ∑ᵢ₌₁ⁿ αᵢ − (1/2) ∑ᵢ₌₁ⁿ ∑ⱼ₌₁ⁿ αᵢαⱼyᵢyⱼK(xᵢ, xⱼ)
subject to ∑ᵢ₌₁ⁿ αᵢyᵢ = 0 and 0 ≤ αᵢ ≤ C. There are many kernel functions for SVMs, and the choice of kernel function has a significant effect on SVM performance. Some popular kernel functions are [22]:
− Linear kernel: K(xᵢ, xⱼ) = xᵢᵀxⱼ
− Sigmoid kernel: K(xᵢ, xⱼ) = tanh(γ xᵢᵀxⱼ + r)
To apply the SVM to multi-class classification, we can either use a multi-class version of the SVM algorithm or combine binary classifiers using a decision function. Common methods for multi-class SVM are one-vs-one (OvO), one-vs-all (OvA), directed acyclic graph (DAG), error-correcting output codes (ECOC), and binary tree architecture (BTA) [29]. In this paper, we use the SVM to classify multiple objects, and the one-vs-one (OvO) method is chosen.
Figure 3 shows the steps to implement the SVM algorithm for classifying objects. First, an image of the objects is taken by the camera in the RGB color space and converted to a grayscale image. The image is then filtered to remove noise before being converted to a binary image. A Gaussian filter removes high-frequency content (e.g., noise and edges) from the grayscale image. A fixed threshold then converts the grayscale image to a binary image: if the gray value of a pixel exceeds the threshold, the new pixel value is set to 1; otherwise, it is set to 0. By choosing an appropriate threshold, the objects appear as white blobs in the binary image. We identify these blobs by using contour detection. A contour is a curve that simply joins all continuous points of the same color and intensity [20]. The image moments of a contour can be calculated to determine its centroid:

x̄ = M₁₀/M₀₀, ȳ = M₀₁/M₀₀ (8)

where (x̄, ȳ) is the centroid of the object and M₁₀, M₀₁, M₀₀ are image moments. The OpenCV library provides functions to find the contour of an object and calculate its image moments.
The performance of the SVM classifier is also influenced by the selection of features. Selecting suitable features to represent the images, as well as a suitable kernel and parameter values, will increase classification performance. In this work, we use an area zoning feature.
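The threshold-and-centroid step can be sketched with plain NumPy. The paper's pipeline uses OpenCV functions (cv.GaussianBlur, cv.threshold, cv.findContours, cv.moments); the sketch below replicates only the moment computation of (8) on a synthetic image, so the image contents and threshold value are illustrative assumptions:

```python
import numpy as np

# Minimal NumPy sketch of thresholding and centroid-from-moments, mirroring
# what cv.threshold + cv.moments compute for a single white blob.

def centroid_from_binary(gray: np.ndarray, thresh: int) -> tuple:
    """Threshold a grayscale image and return the blob centroid (x, y)."""
    binary = (gray > thresh).astype(np.uint8)   # white blob pixels = 1
    ys, xs = np.nonzero(binary)
    m00 = binary.sum()                          # zeroth moment M00
    m10 = xs.sum()                              # first moment M10
    m01 = ys.sum()                              # first moment M01
    return m10 / m00, m01 / m00                 # (M10/M00, M01/M00)

# Synthetic 200x200 image with one bright 20x20 square blob
img = np.zeros((200, 200), dtype=np.uint8)
img[100:120, 50:70] = 255
print(centroid_from_binary(img, 128))   # -> (59.5, 109.5)
```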
The object images are divided into 10×10 identical zones; the size of each zone is 20×20 pixels. In each zone, the sum of all white pixels is taken as one feature. A feature vector with a dimension of 100 is formed from the features of all zones. Finally, the feature vector is used to train the SVM model for classification. We used Python's scikit-learn library to implement SVM training and prediction; scikit-learn provides built-in classes for different SVM algorithms.
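The zoning feature and the training step can be sketched as follows. The toy "square" and "bar" shapes stand in for the real object images, which are not available here; the scikit-learn SVC class implements the Gaussian (RBF) kernel and uses the one-vs-one scheme internally:

```python
import numpy as np
from sklearn.svm import SVC

# Sketch: area zoning features (10x10 zones of 20x20 pixels -> 100-dim
# vector) fed to an RBF-kernel SVM, as described in the text. The two toy
# shapes below are illustrative stand-ins for the real object images.

def zoning_features(binary: np.ndarray, zones: int = 10) -> np.ndarray:
    """Sum of white pixels in each zone of a 200x200 binary image."""
    h, w = binary.shape
    zh, zw = h // zones, w // zones                      # 20x20 zones here
    feats = binary.reshape(zones, zh, zones, zw).sum(axis=(1, 3))
    return feats.ravel().astype(float)                   # 100-dim vector

def square_img():
    img = np.zeros((200, 200))
    img[80:120, 80:120] = 1                              # centered square
    return img

def bar_img():
    img = np.zeros((200, 200))
    img[90:110, 20:180] = 1                              # horizontal bar
    return img

X = np.array([zoning_features(square_img())] * 5 + [zoning_features(bar_img())] * 5)
y = np.array([0] * 5 + [1] * 5)

clf = SVC(kernel="rbf")    # Gaussian kernel; SVC uses one-vs-one for multi-class
clf.fit(X, y)
print(clf.predict([zoning_features(bar_img())]))   # -> [1]
```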

RESULTS AND DISCUSSION
In the experiment, the robot sorts four objects with different shapes, as shown in Figure 4. To train an SVM model for classification, we took images of the objects and processed them to obtain a dataset. The dataset consists of 100 images of objects (25 images per object) located at different positions in the workspace. The dataset is then augmented using the rotation technique: each cropped object image is rotated clockwise in fixed increments through a full revolution to create 72 different images, yielding a new dataset of 7,200 images. This data augmentation improves the accuracy and generalization of the SVM model. The dataset is split into a training set and a testing set, with the training set making up 80 percent. The features of the training images are extracted to train an SVM model with a Gaussian kernel. After training, we use the testing set to evaluate the accuracy of the SVM model. The results in Table 1 show that the classification accuracy is 99.72%, 99.4%, 99.4%, and 99.88% for the four objects, respectively.
To grasp an object and place it at the correct position, we need to determine its 3D coordinate relative to the robot base. The 2D image coordinates of the object have already been extracted. To transform the 2D image coordinate into a 3D coordinate, the camera's intrinsic and extrinsic parameters must be known. Therefore, we calibrate the camera using a chessboard and the calibration functions in OpenCV. First, we capture 16 images of the chessboard from different viewpoints. The corner points in the chessboard images are found with the function cv.findChessboardCorners(). The size of a chessboard square is measured with a ruler, so we have both the 3D corner points and the corresponding image corner points. Then, the function cv.calibrateCamera() is used to calibrate the camera; it returns the camera's intrinsic and extrinsic matrices.
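The dataset sizes reported earlier in this section can be checked with quick arithmetic; 72 rotated copies of each image imply a 5-degree rotation increment (0, 5, …, 355 degrees):

```python
# Back-of-envelope check of the dataset sizes: 100 base images, rotated in
# 5-degree increments (implied by "72 different images" per base image),
# then an 80/20 train/test split.

base_images = 4 * 25                   # four objects, 25 images each
angles = list(range(0, 360, 5))        # 72 rotation angles
augmented = base_images * len(angles)  # augmented dataset size
train = int(augmented * 0.8)           # training set (80%)
test = augmented - train               # testing set (20%)
print(len(angles), augmented, train, test)   # 72 7200 5760 1440
```

This matches the 7,200 images reported for the augmented dataset and a testing set of 1,440 images.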
Using the pinhole model, a 3D point is related to its image projection by

s [u, v, 1]ᵀ = K [R | t] [X, Y, Z, 1]ᵀ

where K is the intrinsic matrix, [R | t] is the extrinsic matrix, and s is a scale factor. From the object's centroid coordinates in (8) and this model, the X and Y values of the 3D coordinate can be determined. These coordinates are used to compute the rotation angles of the robot joints via the inverse kinematic equations of the robot. Finally, the robot is controlled to move from its initial position to the object's position for grasping. Table 2 shows the accuracy of the proposed algorithm in computing the 3D coordinates of objects.
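The back-projection step can be sketched as follows. The intrinsic values and the fixed camera-to-workspace depth Z are illustrative assumptions; in the paper they come from the cv.calibrateCamera() result and the known height of the workspace plane:

```python
import numpy as np

# Sketch: back-project a pixel (u, v) to a 3D point with the pinhole model,
# assuming the depth Z of the workspace plane is known. The intrinsic
# matrix K below is an assumed example, not a calibrated value.

K = np.array([[800.0,   0.0, 320.0],    # [fx, 0, cx]
              [  0.0, 800.0, 240.0],    # [0, fy, cy]
              [  0.0,   0.0,   1.0]])

def back_project(u: float, v: float, Z: float) -> tuple:
    """Invert u = fx*X/Z + cx and v = fy*Y/Z + cy for a known depth Z."""
    fx, fy = K[0, 0], K[1, 1]
    cx, cy = K[0, 2], K[1, 2]
    X = (u - cx) * Z / fx
    Y = (v - cy) * Z / fy
    return X, Y, Z

print(back_project(400.0, 300.0, 500.0))   # -> (50.0, 37.5, 500.0)
```

The resulting (X, Y, Z) is expressed in the camera frame; the extrinsic matrix would then map it into the robot base frame before solving the inverse kinematics.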

CONCLUSION
This paper has trained an SVM model to classify four objects with different shapes, and a computer vision system has been developed to grasp the objects with a 3-DOF robot arm. The dataset consists of 100 images of objects located at different positions in the workspace, captured by the camera and processed by the proposed algorithm to extract features for SVM training and to calculate the 3D positions. Experimental tests were performed to verify the feasibility of the proposed method. The SVM misclassified only 9 out of a total of 1,440 test objects.