Detection of citrus leaf diseases using a deep learning technique

Received Apr 29, 2020; Revised Sep 11, 2020; Accepted Nov 8, 2020
Plant diseases, including those affecting citrus, are a major threat to food security, so early identification is very important. Timely disease recognition helps the grower respond immediately and plan protective measures. This recognition can be performed automatically from plant leaf images. Many machine learning (ML) methods have been used for classification and detection, but with recent advances in computer vision, deep learning (DL) shows great potential for increasing accuracy. In this paper, two convolutional neural network models, AlexNet and ResNet, are used with and without data augmentation. Data augmentation creates new data points by manipulating the original data, which increases the number of training images without the need to add new photos and is therefore appropriate for small datasets. A self-collected dataset of 200 images of diseased and healthy citrus leaves is used. The models trained with data augmentation give the best results, with 95.83% and 97.92% for ResNet and AlexNet, respectively.


INTRODUCTION
In Iraq, citrus is one of the most valuable crops. The 2019 survey of citrus tree production was completed by the Directorate of Agricultural Statistics as part of the Central Statistics Bureau's annual plan and covered five main types: orange, sour lemon, sweet lemon, mandarin, and bitter orange. Because of inadequate care and a lack of pesticide use, many problems affecting citrus trees have spread, such as Phyllocnistis citrella, lack of elements, and scale insects. These diseases have killed large numbers of citrus trees and lowered productivity. The average productivity of the orange tree in Iraq was estimated at only 13.5 kg, which is very low considering that it is the leading citrus tree in Iraq, and the average productivity of other citrus trees is similar [1].
In this paper, three citrus diseases and healthy leaves are discussed and detected. The first is Phyllocnistis citrella, a significant pest of commercial citrus production worldwide. Eggs are laid on young leaves and the larvae feed inside the leaf tissue in distinctive serpentine mines, eventually pupating in a pupal cell at the leaf margin; the developmental period varies from 13 to 52 days depending on temperature [2]. The second is lack of elements, which occurs because the supply of elements such as Zn, Mn, and Fe is related to soil pH. Deficiency symptoms of these three elements may appear concurrently within a tree canopy and often mask one another [3]. The third is scale insects, a broad group of insects within the superfamily Coccoidea. Scale insects feeding on young, developing tips may cause warped foliage. Feeding on leaves can turn them yellow, and plants can look water-stressed. Heavy infestations can cause branches and stems to die back [2].
DL is a form of ML based on deep neural networks with several hidden layers. It is one of the most recent research directions in ML and artificial intelligence (AI) [4]. Today, DL is becoming one of the most relevant identification techniques. The convolutional neural network (CNN) is the basic DL method; it increases accuracy by using multiple hidden layers to extract features from a large amount of data [5]. In [6], Krizhevsky et al. trained a deep CNN to classify 1.2 million ImageNet images and achieved the best top-1 and top-5 error rates in the image recognition competition at the time, after which the field attracted wide research interest. DL has also been used for plant disease diagnosis and detection. Kaur et al. [7] used the GoogLeNet CNN model to detect and classify healthy and diseased leaves of different kinds of plants and achieved 97.82% accuracy.
In [8], Sahidan et al. proposed leaf recognition using a convolutional neural network and a bag of features on a public dataset named Folio. The experimental results indicate that the bag of features achieves better accuracy than the basic CNN, with 82.03% accuracy. In [9], K. P. Ferentinos used an open dataset of 87,848 images of healthy and diseased plant leaves with several CNN models, namely VGG, AlexNet, GoogLeNet, and Overfeat; VGG gave the best result, with 0.47% error and 99.53% accuracy on the test set. Xing et al. [10] introduced a recognition model for citrus diseases and pests using a weakly dense connected convolution network. They used a self-collected citrus dataset and applied it to different CNN models. The experimental results show that NIN-16 achieved 91.66% test accuracy, higher than the SENet-16 model with 88.36%. WeaklyDenseNet-16 achieved the highest accuracy, 93.33%, while VGG-16 achieved the second-highest classification accuracy, 93%, with the largest model size, 120.2 MB. The goal of this study is to identify and classify healthy leaves and different types of disease in citrus leaves using two convolutional neural network models, AlexNet and ResNet, with data augmentation and different parameters to achieve the best accuracy. The work was carried out on a PC with a Core i7 processor using MATLAB R2019b.

MATERIALS AND METHOD

Dataset
In this work, a dataset of 200 images is used, covering healthy leaves and the Phyllocnistis citrella, lack of elements, and scale insects diseases, each with 50 images. The dataset is divided into 70 percent for training, 20 percent for validation, and 10 percent for testing of the proposed method. Figure 1 shows the three types of citrus leaf diseases. The images are resized to the format (height x width x number of channels): (227x227x3) for AlexNet and (224x224x3) for the ResNet model. Data augmentation is then applied to the resized images. Although a CNN is very powerful, it may overfit and fail to achieve the target results when the number of images is not sufficient, so the dataset is artificially enlarged using label-preserving transformations [11]. Data augmentation creates new data points by manipulating the original data, which increases the number of training images in DL without the need to add new photos [5,6]. In this work, the augmentation is done by:
- Random reflection in the left-right direction.
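The split and the left-right reflection can be illustrated with a minimal sketch. It is written in plain Python for illustration only, not in the MATLAB environment used for the experiments. The function names, the random seed, the 0.5 reflection probability, and the tiny 2x3 stand-in image are all assumptions, and the third part of the split is assumed to be the test set.

```python
import random

def split_indices(n_images, train_frac=0.7, val_frac=0.2, seed=0):
    """Shuffle image indices and split them into train, validation and test lists."""
    order = list(range(n_images))
    random.Random(seed).shuffle(order)  # seed is an assumption, for repeatability
    n_train = round(train_frac * n_images)
    n_val = round(val_frac * n_images)
    return order[:n_train], order[n_train:n_train + n_val], order[n_train + n_val:]

def reflect_left_right(image):
    """Mirror an image stored as rows of pixels by reversing each row."""
    return [row[::-1] for row in image]

def random_reflection(image, rng, prob=0.5):
    """Return the mirrored image with probability `prob`, else the original."""
    return reflect_left_right(image) if rng.random() < prob else image

# 200 images split 70/20/10, as described in the text.
train_idx, val_idx, test_idx = split_indices(200)

# A tiny 2x3 single-channel "image" stands in for a resized leaf photo.
tiny_image = [[1, 2, 3],
              [4, 5, 6]]
mirrored = reflect_left_right(tiny_image)
```

Because a mirrored leaf keeps its class, the reflection is label-preserving: the training set gains variation without any new photographs.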

CNN architecture
The CNN is one of the DL architectures most commonly used to solve image classification problems, and it is among the most effective and powerful DL techniques. CNNs are an evolution of traditional artificial neural networks (ANNs), focusing primarily on applications with patterns that repeat in different areas of the modelling space, particularly images. Their main characteristic is that, through the way their layers are organized, they drastically reduce the number of structural elements (artificial neurons) required compared to traditional feedforward neural networks. The CNN is a feed-forward network and a highly effective detection method: its structure is simple, it has fewer training parameters, and the complexity of the network model and the number of weights are reduced. Figure 2 shows the main structure of a CNN, which contains five main layers: the input layer, the convolution layer with activation function, the pooling layer, the fully connected layer, and finally the softmax layer [12][13][14][15]. The convolution layers consist of a series of convolutional kernels in which each neuron behaves as a kernel; the convolution operation becomes a correlation operation when the kernel is symmetric [16]. Convolution has three primary advantages. First, weight sharing within the same feature map reduces the number of parameters and hence the number of operations. Second, local connectivity allows the analysis of associations between adjacent pixels. Lastly, invariance to the object's location allows the target to be found independently of its place in the picture [17].
The pooling layer is used to progressively reduce the dimensions of the feature maps and the number of network parameters. Pooling layers are also invariant to small translations, since their computations take adjacent pixels into account. The two most widely used types are max pooling and average pooling [18]. Most implementations use max pooling because it can lead to faster convergence, select superior invariant features, and improve generalization [19].
Fully connected (FC) layers comprise about 90 percent of a CNN's parameters. The FC layer feeds the network output into a vector of predefined length. This vector can either be mapped to the image classification categories or be taken as a feature vector for follow-up processing. Although changing the structure of the fully connected layer is uncommon, some effort has gone into making it more efficient. The input to the FC layer is the higher-level representation of the input signal produced by the preceding convolution, activation, and pooling layers. Those layers are not intended to provide classification estimates; the FC layer is used at this stage to classify the input picture according to the training set by looking at the features [19,20]. A ReLU activation layer is conventionally used after each convolutional layer to introduce non-linearity into the network. ReLU is more computationally efficient than the tanh or sigmoid functions without a significant change in accuracy [21].
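The relation between convolution and correlation for a symmetric kernel can be checked with a small sketch in plain Python. The function names and the example image and kernels are assumptions chosen for illustration. The same nine kernel weights are reused at every image position, which is the weight sharing mentioned above.

```python
def correlate2d(image, kernel):
    """Valid-mode cross-correlation: slide the kernel over the image without flipping it."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [[sum(image[i + a][j + b] * kernel[a][b]
                 for a in range(kh) for b in range(kw))
             for j in range(out_w)]
            for i in range(out_h)]

def convolve2d(image, kernel):
    """Valid-mode convolution: cross-correlation with the kernel flipped in both axes."""
    flipped = [row[::-1] for row in kernel[::-1]]
    return correlate2d(image, flipped)

image = [[1, 2, 3, 4],
         [5, 6, 7, 8],
         [9, 10, 11, 12],
         [13, 14, 15, 16]]

# Unchanged by a 180-degree flip, so convolution and correlation agree.
symmetric_kernel = [[0, 1, 0],
                    [1, 4, 1],
                    [0, 1, 0]]

# Not symmetric under the flip, so the two operations differ.
edge_kernel = [[1, 0, -1],
               [1, 0, -1],
               [1, 0, -1]]

same = convolve2d(image, symmetric_kernel) == correlate2d(image, symmetric_kernel)
different = convolve2d(image, edge_kernel) != correlate2d(image, edge_kernel)
```

Here `same` and `different` both evaluate to true, which is why many frameworks implement their "convolution" layers as correlation: the learned kernel simply absorbs the flip.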
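The two pooling variants can be compared on a small feature map. This is a minimal plain-Python sketch; the function name, the 2x2 non-overlapping window, and the example values are assumptions for illustration.

```python
def pool2d(feature_map, size=2, mode="max"):
    """Apply non-overlapping size x size pooling to a 2-D feature map."""
    if mode == "max":
        reduce_window = max
    else:
        reduce_window = lambda window: sum(window) / len(window)
    out_h = len(feature_map) // size
    out_w = len(feature_map[0]) // size
    pooled = []
    for i in range(out_h):
        row = []
        for j in range(out_w):
            # Collect the values covered by the current window.
            window = [feature_map[i * size + a][j * size + b]
                      for a in range(size) for b in range(size)]
            row.append(reduce_window(window))
        pooled.append(row)
    return pooled

feature_map = [[1, 3, 2, 0],
               [5, 7, 1, 1],
               [0, 2, 9, 4],
               [4, 2, 3, 8]]

max_pooled = pool2d(feature_map, mode="max")      # [[7, 2], [4, 9]]
avg_pooled = pool2d(feature_map, mode="average")  # [[4.0, 1.0], [2.0, 6.0]]
```

Both variants halve each spatial dimension; max pooling keeps the strongest response in each window, while average pooling smooths it.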
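How an FC layer followed by softmax turns a feature vector into a class decision can be sketched as follows. The feature vector, weights, and biases are hypothetical values chosen only so the arithmetic is easy to follow; they are not taken from the trained models. The four class names follow the dataset described in this paper.

```python
import math

def fully_connected(features, weights, biases):
    """Compute one score per class: dot(weights[k], features) + biases[k]."""
    return [sum(w * x for w, x in zip(row, features)) + b
            for row, b in zip(weights, biases)]

def softmax(scores):
    """Turn raw class scores into probabilities that sum to 1."""
    shift = max(scores)  # subtracting the max keeps exp() numerically stable
    exps = [math.exp(s - shift) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

CLASSES = ["healthy", "Phyllocnistis citrella", "lack of elements", "scale insects"]

features = [0.5, 1.0, -0.5]   # hypothetical flattened feature vector
weights = [[1.0, 0.0, 0.0],   # hypothetical weights, one row per class
           [0.0, 2.0, 0.0],
           [0.0, 0.0, 1.0],
           [1.0, 1.0, 1.0]]
biases = [0.0, 0.0, 0.0, 0.0]

scores = fully_connected(features, weights, biases)
probabilities = softmax(scores)
predicted = CLASSES[probabilities.index(max(probabilities))]
```

With these made-up numbers the largest score belongs to the second class, so `predicted` is "Phyllocnistis citrella".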

PROPOSED WORK AND EXPERIMENTAL RESULTS
In this paper, the input 180 images are divided into training, validation, and test images. The images are first resized and then used to train one of the two CNN models to classify the disease type of citrus leaves. Figure 3 shows the main flow chart of the image classification with data augmentation.
Figure 3. The flow chart of image classification with data augmentation using CNN

AlexNet
AlexNet, created by Alex Krizhevsky, is a state-of-the-art pre-trained CNN [22]. It has been used for comparisons in many different fields, and for this reason its architecture has been used for image classification in several different experiments [12]. It is a deep CNN consisting of twenty-five layers, including one input layer, five convolution layers, seven ReLU layers, two cross-channel normalization layers, one softmax layer, and finally one output layer. The rectified linear unit (ReLU), the nonlinear activation function, thresholds its input by setting values less than zero to zero. It can be described mathematically as follows [23]:
$$f(x) = \max(0, x)$$
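The thresholding behaviour of ReLU described above can be written in a few lines of plain Python. The function name and the sample inputs are illustrative assumptions.

```python
def relu(x):
    """Rectified linear unit: return x for positive inputs and 0 otherwise."""
    return max(0.0, x)

# Negative inputs are set to zero; non-negative inputs pass through unchanged.
activations = [relu(v) for v in [-2.0, -0.5, 0.0, 1.5, 3.0]]
```

The result is `[0.0, 0.0, 0.0, 1.5, 3.0]`.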

Residual network architectures (ResNet)
ResNet is a deep CNN with a specially built residual structure that can support a very deep network. Classic deep convolutional neural networks cannot be made very deep, because accuracy degrades as the depth increases. ResNet's authors conjecture that the identity mapping is hard to learn. Deep residual learning is proposed to solve this problem: the network learns the residual rather than the direct mapping [24]. This model won the ImageNet competition in 2015. ResNet's fundamental breakthrough was that it allowed very deep neural networks with 150+ layers to be trained successfully. ResNet-50, with 50 layers, was proposed by He et al. in [25]. Its convolution layers use 3×3 filters, and the model has an input size of 224×224 [20].
Each model is trained with stochastic gradient descent with momentum (SGDM) for a maximum of 10 epochs, with a mini-batch size of 6 and an initial learning rate of 1e-4. Table 1 and Table 2 show the validation and test accuracy of the two models with and without data augmentation. The AlexNet model gave the best test accuracy with a shorter elapsed time than the ResNet model. Figure 4 and Figure 5 show the training process for AlexNet and ResNet: in the first plot the blue line indicates training accuracy and the black line validation accuracy, while in the second plot the red line indicates training loss and the black line validation loss.
Figure 6(a) shows the test confusion matrix for AlexNet with data augmentation: the Phyllocnistis citrella images are predicted wrongly once, as scale insects. Figure 6(b) shows the AlexNet confusion matrix in which the healthy images are predicted wrongly as lack of elements and the Phyllocnistis citrella images are predicted wrongly once as scale insects. Blue cells indicate correct predictions and pink cells indicate wrong predictions.
Figure 7(a) shows the test confusion matrix for ResNet with data augmentation: the healthy images are predicted wrongly twice as lack of elements, and the Phyllocnistis citrella images are predicted wrongly once as scale insects. Figure 7(b) shows that the healthy images are predicted wrongly three times as lack of elements, and the lack of elements images are predicted wrongly once as healthy.
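The link between a confusion matrix and the reported accuracy can be made concrete with a short sketch. The matrix below is hypothetical: the size of the test set is not stated in this section, so a 48-image test set with 12 images per class is assumed purely because a single error out of 48 gives 97.92%, the accuracy reported for AlexNet with data augmentation. The function name is also an assumption.

```python
def accuracy_from_confusion(matrix):
    """Overall accuracy: correct predictions (diagonal) divided by all predictions."""
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    total = sum(sum(row) for row in matrix)
    return correct / total

CLASSES = ["healthy", "Phyllocnistis citrella", "lack of elements", "scale insects"]

# Hypothetical 48-image test set (12 per class, an assumption).
# Rows are true classes and columns are predicted classes, in CLASSES order.
# One Phyllocnistis citrella image is predicted as scale insects.
confusion = [[12, 0, 0, 0],
             [0, 11, 0, 1],
             [0, 0, 12, 0],
             [0, 0, 0, 12]]

accuracy = accuracy_from_confusion(confusion)
accuracy_percent = round(accuracy * 100, 2)
```

Under this assumption `accuracy_percent` is 97.92; each additional misclassified image lowers the accuracy by about 2.08 percentage points.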

CONCLUSION
In this paper, two deep CNN models, AlexNet and ResNet, are each used to classify a set of images of healthy citrus leaves and of leaves with different diseases: Phyllocnistis citrella, lack of elements, and scale insects. The results show that AlexNet gives the best accuracy with data augmentation, 97.92%, while ResNet gives 95.83%. Without data augmentation the accuracy is lower, at 95.83% for AlexNet and 93.75% for ResNet. From these results we conclude that training DL neural network models on more data leads to more skillful models, and that augmentation techniques generate image variations that can improve the models' ability to generalize what they have learned to new images. The elapsed training time shows that AlexNet has a simpler structure than ResNet, with a training time of 14 min 9 sec. For further analysis, the confusion matrices of the two models were computed on the test images as a real test of the models. All the work was done with MATLAB R2019b.