Fruit tree disease classification system using generative adversarial networks

Received Jul 3, 2020; Revised Dec 10, 2020; Accepted Dec 26, 2020

A smart farm is a farm that can remotely and automatically maintain proper growth and management of crops and livestock by integrating technology with agriculture. Currently, smart farms are concentrated in the field of smart horticulture, and although they are spreading, research is being conducted only in limited spaces. In addition, it is difficult to obtain enough data for learning, and data imbalance occurs because it is difficult to obtain a similar amount of data for each class. In this paper, we propose a method that amplifies a small amount of data and resolves data imbalance by exploiting the ability of a generative adversarial network to learn to mimic data. The proposed method can create datasets for various crops and achieves high accuracy. The generated crop datasets can then be used for training to resolve data imbalance.


INTRODUCTION
Smart farming is evolving with the application of Fourth Industrial Revolution technologies. It refers to technology that enables remote control as well as automated and intelligent services by integrating new technologies such as ICT, IoT, big data, cloud, and AI into the growth environment of crops or livestock [1][2][3]. The need for smart farms is closely linked to global climate change and to food shortages as populations grow. Food demand is rising with the world's population, but there is a growing shortage of people to grow that food because the farming population is shrinking and aging. In other words, as the population increases, urbanization reduces the area under cultivation, and the farmers at production sites are aging. Even in Korea, as of 2017, the average age of farm managers was 67, and it continues to rise. As a result, smart farms are becoming increasingly important. However, smart farms are concentrated in the field of smart horticulture, and although they have recently been spreading, research is conducted in limited spaces such as facilities. In addition, it is difficult to obtain enough data for learning, and data imbalance occurs because it is difficult to obtain a similar amount of data for each class [2].
In this paper, we propose a method that amplifies the amount of data through generative neural networks and resolves data imbalance between classes by generating data for various classes. The proposed technique preprocesses the data to reduce the amount of computation, which improves speed, and to highlight features, which helps learning. A generative neural network then generates new data from the preprocessed data, and a filter removes low-quality data to preserve data integrity. With the proposed technique, a small amount of data can be amplified into enough data for learning, and the result can be applied to an existing system as it is. Based on these advantages, the proposed method can be used to compensate for the limitations of current smart farms.

RELATED RESEARCH
In this section, we review the generative adversarial network, the region of interest extraction method used in preprocessing, and the problems that can be caused by data imbalance.

Generative adversarial networks (GAN)
The generative adversarial network (GAN) was introduced by Ian Goodfellow et al. in 2014, and various studies have since been conducted based on it [4][5][6][7]. As the name suggests, a generative adversarial network learns and produces its results through the competition of two neural networks, a generator and a discriminator. Equation (1) gives the objective of the generative adversarial network.

$$\min_G \max_D V(D, G) = \mathbb{E}_{x \sim p_{data}(x)}[\log D(x)] + \mathbb{E}_{z \sim p_z(z)}[\log(1 - D(G(z)))] \tag{1}$$

In (1), $x \sim p_{data}(x)$ denotes an image drawn from the probability distribution of the actual images, and $z \sim p_z(z)$ denotes the noise from which the generator produces an image $G(z)$. $D(x)$ is the output of the discriminator, a value between 0 and 1 indicating the probability that the image is real. $D(G(z))$ is the discriminator's output for an image produced by the generator, and it is likewise a probability between 0 and 1 that the image is real. To maximize (1), the discriminator should drive $D(x)$ close to 1 and $D(G(z))$ close to 0. To minimize it, the generator should drive $D(G(z))$ close to 1. In this way, the generator and the discriminator learn by solving a minimax problem.
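As a rough numerical illustration of the competition described above, the sketch below estimates the standard GAN value function, that is, the mean of log D(x) over real images plus the mean of log(1 - D(G(z))) over generated images, from a handful of discriminator outputs. The function name `gan_value` and the sample probabilities are assumptions made for illustration and do not come from the paper.

```python
import math


def gan_value(d_real, d_fake):
    """Batch estimate of V(D, G) from discriminator outputs.

    d_real: D(x) for real images, each strictly between 0 and 1
    d_fake: D(G(z)) for generated images, each strictly between 0 and 1
    """
    real_term = sum(math.log(p) for p in d_real) / len(d_real)
    fake_term = sum(math.log(1.0 - p) for p in d_fake) / len(d_fake)
    return real_term + fake_term


# Illustrative values: a discriminator that separates real from generated well
strong_d = gan_value(d_real=[0.9, 0.95], d_fake=[0.1, 0.05])
# Illustrative values: a discriminator that the generator has fully fooled
fooled_d = gan_value(d_real=[0.5, 0.5], d_fake=[0.5, 0.5])
```

The discriminator prefers the larger value (`strong_d`), while the generator pushes the value down toward the fooled case, where every output is 0.5 and the value equals -2 log 2.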

Crop region of interest (RoI)
A region of interest is an area of interest in an image. When an object is detected by processing an image [8][9][10], the detected area may be referred to as a region of interest [11]. There are three main reasons for specifying a region of interest. The first is to remove the unnecessary parts of the image around the object, and the second is to reduce the amount of computation and resources. The third is to improve the accuracy of learning: by eliminating unnecessary information in advance, only the necessary parts are used for learning, which increases accuracy.
The most representative methods for extracting a crop region of interest are RGB separation and contour extraction. RGB separation exploits the fact that each channel of an image has different values: a range is specified for each channel, and the pixels whose values fall within that range are output as the region of interest. Contour extraction means extracting the points where the brightness of the image changes from a low value to a high value or vice versa. Contour detection finds the pixels that belong to the contour by calculating the gradient with a partial differential operator [12][13].
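The two ideas above can be sketched in a few lines of plain Python. This is a minimal illustration, not the paper's implementation: the function names, the channel bounds, and the tiny test image are all assumptions chosen for the example.

```python
def rgb_range_mask(image, lower, upper):
    """Binary mask: 1 where every channel lies within [lower, upper].

    image: list of rows, each row a list of (R, G, B) tuples
    lower, upper: per-channel bounds (illustrative values, not from the paper)
    """
    return [
        [
            1 if all(lo <= c <= hi for c, lo, hi in zip(pixel, lower, upper)) else 0
            for pixel in row
        ]
        for row in image
    ]


def horizontal_gradient(gray):
    """Central-difference approximation of the partial derivative along x."""
    return [
        [row[x + 1] - row[x - 1] for x in range(1, len(row) - 1)]
        for row in gray
    ]


# Tiny 2x3 example: greenish "leaf" pixels on a grey background
image = [
    [(120, 120, 120), (40, 160, 50), (35, 170, 45)],
    [(118, 119, 121), (38, 150, 52), (122, 121, 120)],
]
leaf_mask = rgb_range_mask(image, lower=(0, 100, 0), upper=(90, 255, 90))

# A dark-to-bright step produces a large gradient at the boundary
edges = horizontal_gradient([[10, 10, 200, 200]])
```

Pixels inside the mask would be kept as the region of interest, and large gradient values mark candidate contour pixels.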

Data imbalance
If there is a large difference in the amount of data that each class has, the data is said to have a class imbalance. Most real-world data has this problem, and the usual remedy is to rebalance the classes by resampling before training the model. Undersampling selects a subset of the larger class so that its size matches the smaller class. Undersampling can reduce execution time by reducing the size of the data set, but useful data may be left out or the sample may be biased [14][15][16][17]. Oversampling copies data from the smaller class until its size matches the larger class. Because oversampling yields a larger data set, learning is better; in other words, it generally performs better than undersampling. However, there is a possibility of overfitting when the same data is extracted repeatedly [18][19].
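A minimal sketch of random undersampling and oversampling, using only the Python standard library, is given below. The class labels, sample values, and function names are hypothetical and serve only to show how the two strategies change the class sizes.

```python
import random


def undersample(data_by_class, seed=0):
    """Randomly keep only as many samples per class as the smallest class has."""
    rng = random.Random(seed)
    target = min(len(items) for items in data_by_class.values())
    return {
        label: rng.sample(items, target)
        for label, items in data_by_class.items()
    }


def oversample(data_by_class, seed=0):
    """Duplicate random samples until every class matches the largest class."""
    rng = random.Random(seed)
    target = max(len(items) for items in data_by_class.values())
    balanced = {}
    for label, items in data_by_class.items():
        extra = [rng.choice(items) for _ in range(target - len(items))]
        balanced[label] = list(items) + extra
    return balanced


# Hypothetical imbalanced data set: 10 healthy samples, 3 diseased samples
data = {"healthy": list(range(10)), "black_rot": list(range(100, 103))}
under = undersample(data)
over = oversample(data)
```

After oversampling, the minority class has ten entries but still only three distinct samples, which is the repetition that can lead to overfitting.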

Problems and solutions
Smart farms face the following data problems: it is difficult to obtain enough data for learning, and even the data that is available has an adverse effect on learning because it is unbalanced across classes [20][21][22][23]. To address these problems, this paper proposes a method that amplifies the amount of data through a generative adversarial network and generates data for various classes to resolve the data imbalance between classes. Using a generative adversarial network not only amplifies a small amount of data into enough data for learning, but also has the advantage of providing oversampling with less redundancy in the data used.

SYSTEM DESIGN

Overall system design
The proposed system can be divided into a data processing stage, which handles data preprocessing and postprocessing, and a network stage, which defines the models for generating data and classifying images. Figure 1 shows this structure. The data processing stage is divided into preprocessing and postprocessing. In the preprocessing step, image resizing, contrast adjustment, and region of interest extraction are performed. Image resizing reduces the size of the image, lowering the memory footprint of the system and reducing computation. The image is resized to a power of 2 so that the sizes produced while training the model remain whole numbers. Region of interest extraction reduces the amount of computation by removing data outside the region of interest. Contrast adjustment makes the characteristics of the data stand out by making the contrast and hue clearer. Postprocessing consists of a data filter and data storage. The data filter removes low-quality images from those generated by the generative adversarial network to ensure the quality of the generated images. Data storage sends the data to the classification system once more than a predetermined amount has been gathered. The network stage is divided into a generator network, a discriminator network, and a classification network that classifies images. The generative adversarial network takes preprocessed data as input and generates new images based on it. The classification system is trained on the generated images and then classifies the actual images. In the image discrimination step, the validity of the proposed system is verified by comparing it with the existing system. Figure 2 shows the flow chart. The flow of the proposed system starts with resizing the input data, which speeds up the subsequent work.
The region of interest is then extracted from the resized image, which reduces the amount of later computation and so speeds up learning. The final preprocessing step is to adjust the contrast of the image. The image is then generated through the generative adversarial network. The generated image is based on an uninfected image, so an image resembling a diseased leaf can be created from the original. The generated image is saved or discarded according to its quality, and this is repeated until the number of images reaches a set value. Once more than that number of images has been created, generation stops and the images are used to train the classification model.
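The generate, filter, and repeat loop described above can be sketched as follows. The `generate` and `quality` callables, the threshold, and the attempt limit are hypothetical stand-ins: the paper does not specify how image quality is scored, so a random number plays the role of a generated sample here.

```python
import random


def generate_until(generate, quality, threshold, target_count, max_attempts=10000):
    """Collect generated samples that pass the quality filter.

    generate: hypothetical callable that returns one generated sample
    quality: hypothetical callable that scores a sample (higher is better)
    """
    kept = []
    attempts = 0
    while len(kept) < target_count and attempts < max_attempts:
        sample = generate()
        attempts += 1
        if quality(sample) >= threshold:
            kept.append(sample)  # good enough: save it
        # otherwise the sample is simply discarded
    return kept, attempts


# Stand-ins for the generator network and the quality score
rng = random.Random(0)
kept, attempts = generate_until(
    generate=lambda: rng.random(),
    quality=lambda sample: sample,
    threshold=0.7,
    target_count=5,
)
```

The `max_attempts` guard is a design choice added for the sketch so that the loop terminates even if the generator never produces acceptable samples.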

Crop region of interest design
The region of interest was extracted using a Sobel mask method based on RGB extraction. Figure 3 shows the flow chart. First, the whole image is scanned and the average value is calculated using a Sobel mask. Once the average is found, every pixel is checked against it. If a pixel does not exceed the average value, the next pixel is checked. If a pixel exceeds the average value, it is checked to see whether it has already been stored. If not, the pixel is stored and the search moves to that pixel. During this process, the search proceeds in a constant direction. If the pixel has already been stored, the search ends and a mask is created from the stored pixels to extract the region.
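A simplified sketch of the thresholding idea follows. It computes the Sobel response, uses the image-wide mean response as the threshold, and then takes the bounding box of the pixels above the mean. Note that the bounding box is a simplification swapped in for the pixel-by-pixel contour search described above, and all function names and the test image are assumptions for illustration.

```python
SOBEL_X = [[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]]
SOBEL_Y = [[-1, -2, -1], [0, 0, 0], [1, 2, 1]]


def sobel_magnitude(gray):
    """Gradient magnitude |Gx| + |Gy| for interior pixels; borders stay 0."""
    h, w = len(gray), len(gray[0])
    out = [[0] * w for _ in range(h)]
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            gx = gy = 0
            for dy in range(3):
                for dx in range(3):
                    value = gray[y + dy - 1][x + dx - 1]
                    gx += SOBEL_X[dy][dx] * value
                    gy += SOBEL_Y[dy][dx] * value
            out[y][x] = abs(gx) + abs(gy)
    return out


def roi_bounding_box(gray):
    """Bounding box (top, left, bottom, right) of the pixels whose Sobel
    response exceeds the mean response (a simplification of contour tracing)."""
    mag = sobel_magnitude(gray)
    h, w = len(mag), len(mag[0])
    mean = sum(map(sum, mag)) / (h * w)
    points = [(y, x) for y in range(h) for x in range(w) if mag[y][x] > mean]
    if not points:
        return None
    ys = [p[0] for p in points]
    xs = [p[1] for p in points]
    return min(ys), min(xs), max(ys), max(xs)


# 7x7 dark background with a bright 3x3 square at rows and columns 2..4
gray = [[0] * 7 for _ in range(7)]
for y in range(2, 5):
    for x in range(2, 5):
        gray[y][x] = 100
box = roi_bounding_box(gray)
```

Because the 3x3 Sobel kernel responds one pixel outside the bright square, the resulting box is one pixel wider than the square on each side.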

GAN design
The generative adversarial network is designed based on DCGAN. The generator uses the existing generative adversarial network model, but takes 64x64x3, the size of the image to be generated, as its input value [24][25]. Figure 4 shows the model of the discriminator network.

The discriminator was modified to suit the input images, which show a leaf of a certain shape together with the disease occurring on it and have a size of 64x64x3. The size of the convolution filter was changed from 5x5 to 3x3 to reduce the amount of computation and save memory. In addition, the activation function uses softmax rather than sigmoid so that images of various classes can be generated.
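The saving from moving from 5x5 to 3x3 filters can be checked with simple arithmetic. In the sketch below, the channel counts are illustrative assumptions and are not taken from the paper; the ratio of 9/25 holds for any channel counts.

```python
def conv_weights(kernel, in_channels, out_channels):
    """Number of weights (biases excluded) in one 2D convolutional layer."""
    return kernel * kernel * in_channels * out_channels


# Assumed channel counts for illustration: RGB input, 64 output filters
in_ch, out_ch = 3, 64
w5 = conv_weights(5, in_ch, out_ch)  # 5x5 filters
w3 = conv_weights(3, in_ch, out_ch)  # 3x3 filters
ratio = w3 / w5
```

The multiply-accumulate count per output pixel scales the same way, so a 3x3 layer needs roughly 36% of the weights and computation of a 5x5 layer with the same channels.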

Classification system design
The classification model was designed based on a CNN. First, in the input phase, the images are encoded and the amount of data for each class is evened out. The number of convolutional layers depends on the size of the input image. The smaller the image, the smaller the features, so there is no need to build a deeper stack of convolutional layers. Since the images in this paper are 64x64x3, 5 convolutional layers and 1 fully-connected layer are used. Each convolutional layer extracts features with a 3x3 filter and is followed by a max pooling layer that ignores minor changes. Max pooling creates a smaller output image by extracting only the main values from the filtered image. A 2x2 pooling window is used, which halves the size of the image each time. After the image has passed through the convolutional layers repeatedly, only its features remain; these are flattened into one dimension and passed to the fully-connected layer, which classifies the image.
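The effect of repeated 2x2 max pooling on a 64x64 input can be traced with a short sketch. It assumes that each convolution preserves the spatial size (for example through padding), which the paper does not state, and the function names are hypothetical.

```python
def max_pool_2x2(feature):
    """2x2 max pooling with stride 2 on a 2D list with even dimensions."""
    return [
        [
            max(feature[y][x], feature[y][x + 1],
                feature[y + 1][x], feature[y + 1][x + 1])
            for x in range(0, len(feature[0]), 2)
        ]
        for y in range(0, len(feature), 2)
    ]


def sizes_after_pooling(size, layers):
    """Spatial size after each of `layers` successive 2x2 pooling steps."""
    sizes = []
    for _ in range(layers):
        size //= 2
        sizes.append(size)
    return sizes


pooled = max_pool_2x2([[1, 3, 2, 0],
                       [4, 2, 1, 1],
                       [0, 0, 5, 6],
                       [1, 2, 7, 8]])
sizes = sizes_after_pooling(64, 5)
```

Under that assumption, five pooling steps reduce the 64x64 input to 2x2, which is consistent with the remark that a smaller image does not need a deeper stack of convolutional layers.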

SYSTEM IMPLEMENTATION
To implement the system, the Windows 10 Pro operating system was installed on a computer equipped with an i5-6500 processor, 16 GB of memory, and a GeForce GTX 1060 graphics card. We also installed CUDA 10.0 to use the GPU and implemented the system using Python 3.5 in Jupyter Notebook.

Implement preprocess
The crop region of interest step is responsible for finding the key parts of the image before the generative adversarial network is applied. First, we scan all pixels to find the average value, and then search through the pixel values of the image as designed. After the search, only the part of the image inside the newly created mask is output and saved. Figure 5 shows the results of the region of interest extraction.
Figure 5 shows, from left to right, the original image, the result after one pass of region of interest extraction, and the result after two passes. As can be seen from Figure 5, two passes are needed to extract the region of interest properly. This is a consequence of how the average value is calculated: in the first pass, the region of interest is extracted using an average computed over the background, the shadow, and the leaves together. In addition, shadows appear in the picture, and the problem also occurs when the focus is blurred at the edge of the leaf. Therefore, in this paper, we applied the crop region of interest extraction twice to concentrate the region of interest. Figure 6 shows the preprocessed image.
Figure 6 shows, from left to right, the original image, the extracted region of interest, and the contrast-adjusted image. In the last image, the affected area is more intense in color. The image is then generated through the generative adversarial network. The generated image is based on an uninfected image, so an image resembling a diseased leaf can be created from the original. The generated image is saved or discarded according to its quality, and this is repeated until the number of images reaches a set value. Once more than that number of images has been created, generation stops and the images are used to train the classification model.
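The resizing and contrast steps of the preprocessing can be sketched as below. The paper does not say which contrast method was used, so linear min-max stretching is assumed here purely for illustration, and both function names are hypothetical.

```python
def largest_power_of_two_at_most(n):
    """Largest power of 2 that does not exceed n (for n >= 1)."""
    power = 1
    while power * 2 <= n:
        power *= 2
    return power


def stretch_contrast(gray):
    """Linearly rescale intensities so they span the full 0..255 range.

    Min-max stretching is an assumed stand-in for the paper's contrast step.
    """
    flat = [v for row in gray for v in row]
    lo, hi = min(flat), max(flat)
    if hi == lo:
        return [row[:] for row in gray]  # flat image: nothing to stretch
    return [[round((v - lo) * 255 / (hi - lo)) for v in row] for row in gray]


# A 100-pixel side would be resized down to a 64-pixel side
target = largest_power_of_two_at_most(100)
stretched = stretch_contrast([[50, 100], [150, 200]])
```

Choosing a power-of-2 side length such as 64 means that every halving step in the networks yields a whole-number size.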

Implement GAN
The generative adversarial network learns the preprocessed data and generates new images. The generator network has four convolution layers, and batch normalization is used to normalize the outputs and activation values. The discriminator is implemented with three convolutional layers and one fully-connected layer, and sigmoid is used as an activation function to discriminate images of various classes. Figure 7 is an image created by applying the disease called black rot to blueberry leaves. In Figure 7, the black rot leaves are well formed. In addition, because the region of interest was restricted during preprocessing, the disease is properly located on the leaves.
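One way four generator layers can reach a 64x64 output is sketched below using the standard output-size formula for a transposed convolution. Everything specific here is an assumption rather than a detail from the paper: that the layers are transposed convolutions as in DCGAN, that the generator starts from a 4x4 feature map, and the kernel, stride, and padding values.

```python
def conv_transpose_out(size, kernel, stride, padding, output_padding=0):
    """Output size of a transposed convolution along one spatial dimension
    (dilation of 1 assumed)."""
    return (size - 1) * stride - 2 * padding + (kernel - 1) + output_padding + 1


# Assumed settings, not stated in the paper: 4x4 seed, 3x3 kernels, stride 2
size = 4
trace = [size]
for _ in range(4):  # four layers, matching the generator described above
    size = conv_transpose_out(size, kernel=3, stride=2, padding=1, output_padding=1)
    trace.append(size)
```

With these settings each layer doubles the spatial size, giving 4, 8, 16, 32, and finally 64.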

Implement proposed system
We implemented the designed system and compared it with existing systems. Figure 8 shows the classification accuracy per epoch when classifying without preprocessing (system 1) and when oversampling by increasing the data through rotation (system 2). This graph shows the accuracy comparison.
Each unit on the epoch axis represents 10 epochs. In Figures 8 and 9, system 1 took a long time to reach its maximum accuracy due to lack of data. System 2 reached its maximum at an epoch similar to the proposed system, but its accuracy was confirmed to be lower. In contrast, the proposed system reached its maximum faster than system 1 and showed an average 2% accuracy improvement over system 2. Table 1 compares the accuracy obtained on the test set. The experimental results show that system 1 achieved 77.99% because it lacked data, while system 2 achieved 91.14% because some classes had little data. The proposed system, on the other hand, increased the amount of data used for learning by using the generative adversarial network and provided the same amount of data for each class, reaching a high accuracy of 98.17%.

CONCLUSION
A smart farm is a farm that can remotely and automatically maintain and manage appropriate growth conditions for crops and livestock by using ICT in agriculture. In addition, it can improve the productivity and quality of agricultural products by creating an optimal growth environment based on crop growth and environmental data. However, the crops to which smart farms are currently applied are limited to horticultural or greenhouse crops, and introducing a smart farm is costly. In addition, the provided data may differ greatly from the actual data, and because the amount of data differs between classes, the data imbalance must be resolved before learning.
To solve this problem, this paper proposes a method that learns the data through a generative adversarial network and generates data similar to the actual data from the input data. To reduce the resources used during learning, the system includes a preprocessing method that reduces the amount of computation and speeds it up. We then used the generative adversarial network to increase the amount of data used for learning and to resolve the data imbalance between classes. This made it possible to generate data for smart farms across a wide range of crops and raised the classification accuracy to about 98 percent. Future research should generate data that can be used not only in the classification system used in this paper but also in logistics and automation systems. We expect that this will make it possible to build a wider variety of smart farms.