Optimization techniques on fuzzy inference systems to detect Xanthomonas campestris disease

Received Jul 16, 2020; Revised Nov 10, 2020; Accepted Dec 5, 2020

This paper presents the outcomes of four optimization models based on fuzzy inference systems, optimized with Quasi-Newton and genetic algorithms, for the early assessment of bean plant leaves for Xanthomonas campestris disease. The status of the plant (sane or ill) is defined through the color intensity in the RGB scale, using pixel datasets and images to analyze the implementation of the models. The best model reaches 99.68% performance against the training data and a 94% effectiveness rate in detecting Xanthomonas campestris in a bean leaf image. These results would allow farmers to take early measures to reduce the impact of the disease on the appearance and performance of green bean crops.


INTRODUCTION
Growing crops demands a great deal of observation and care from farmers; one of their most time- and resource-consuming tasks is the traditional early detection and treatment of diseases, which safeguards the final production and quality of the crop. Bearing in mind that "currently, disease transmission is easier and new ones might appear in places they have never been detected before" [1], great economic losses are common, driven by the belief that a crop's appearance is its most important sign of quality [2]; furthermore, the accelerated growth of the population does not favor any slowdown in agronomical processes, as farmers must keep a steady production rhythm to satisfy demand [3]. Therefore, to reduce the amount of poor-looking and poor-quality crops, it is of great importance to detect diseases caused by bacteria, such as Xanthomonas campestris, in their early stages to avoid evident sequelae [4] on the final product [5]. The need for constant monitoring can be reduced by using digital image processing (DIP) [6] to identify the most relevant characteristics of the grown plant. Convolutional neural networks [7] are commonly used together with DIP because of their capacity to evaluate multiple depth layers and filters that yield precise outcomes; however, if the structure of the model is unknown, its outcome loses significance. A second way to approach DIP is to use classification methods such as those offered by fuzzy inference systems [8], in which the input is part of a normalized or previously processed dataset [9] and the output is quantifiable information for the analysis and detection of ill leaves [10], which later determines the classification of the plant as ill (disease present) or sane (disease free).
This paper proposes an alternative to full-time professional accompaniment of the crop monitoring process, through an early identification model based on image sampling for Xanthomonas campestris disease in green bean leaves. To achieve that goal, fuzzy inference systems are evaluated through their performance in terms of the mean squared error (MSE) [11] and optimized with heuristic (genetic) [12] and exact (Quasi-Newton) [13] algorithms. The classification variables are those defined in the RGB scale to assess each pixel in different layers [14,15]. Then, the input and output sets of the fuzzy inference systems, as well as their rules, are defined to relate the sets to the color intensity scale for sane and ill plants. Later, the proposed systems are optimized by modifying the membership functions of the fuzzy input and output sets [16] through training that uses a dataset of RGB pixel values for both sane and ill pixels. The search therefore focuses on the correct definition of the membership functions of the fuzzy sets in order to assess the status of the crops accurately.

RESEARCH METHOD
This research is classified as experimental. It focuses on the analysis of the provided dataset to establish the critical variables and the most relevant tools for pattern and behavior recognition [17,18] related to the complex system phenomenon "plant status" [17,18]. For this research, the tool chosen to evaluate such patterns is a fuzzy logic system, even though it is not common to apply such systems in a field like the one described above.
The development process starts with the design of the disease detection models and ends with the evaluation of the outcomes of the identification techniques resulting from each model, as shown in Figure 1: i) identification of the pixel-level color components in images of green bean leaves; ii) loading and normalization of the processed dataset (pixel color components and green bean leaf images); iii) design of the initial disease detection models, which are trained and optimized using genetic algorithms or the BFGS algorithm with the training dataset (pixels); iv) evaluation of the optimized model against an established error threshold. If the threshold is met, the model is validated using the classification metrics precision, recall, f1-score, and accuracy on images (validation dataset) of green bean leaves (ill or sane).

Models' design and implementation
The models for detecting Xanthomonas campestris on green bean leaves are designed around the variables related to color intensity in an image [14,15]. Different chromatic color scales must be taken into account [19]; however, the RGB scale is selected because of its capacity to provide homogeneous tones by combining red, green, and blue hues [20]. Afterward, the pixel color values are set for normal leaves (sane) and for leaves affected by Xanthomonas campestris (ill), using the white circled segments in Figure 2 as the reference for both pixel values and status. The values of each component are normalized to the interval between 0 and 1, as shown in Table 1. There are 45 pixel values, each related to the status (sane or ill) of a green bean leaf, that make up the training dataset.
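The normalization step described above can be sketched as follows. This is a minimal illustration that assumes 8-bit channels are simply divided by 255; the paper does not state the exact formula, and the sample pixels are hypothetical rather than values from Table 1.

```python
def normalize_rgb(pixel):
    """Scale an 8-bit (R, G, B) triple from 0..255 to the interval 0..1.

    Assumption: plain division by 255; the source only says values are
    normalized to [0, 1].
    """
    return tuple(channel / 255.0 for channel in pixel)


# Hypothetical sample pixels (NOT taken from the paper's Table 1).
samples = [(0, 255, 51), (102, 153, 204)]
normalized = [normalize_rgb(p) for p in samples]

for raw, scaled in zip(samples, normalized):
    print(raw, "->", tuple(round(v, 3) for v in scaled))
```

Working in the 0 to 1 range keeps the limits of every membership function on the same scale for the three color components.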
Normalization makes it easier to define the membership functions of the fuzzy system input and output sets with values between 0 and 1, instead of 0 and 255, making the process more efficient [21,22]. The problem of identifying a sick leaf is modeled with Mamdani-type fuzzy inference systems based on the RGB value indicators, each system having three input sets (R, G, and B color components) and the pixel status output set [23]. The first model has a set of nine rules representing the equivalence of a sane or ill pixel by relating the system's inputs to its output; its input sets have three triangular membership functions that depict low, medium, and high intensities for each RGB component, as shown in Figure 3(a). The second system is defined by 17 rules and six membership functions in the input sets corresponding to the intensity value in the RGB scale (low, low-mid, mid-mid, mid-high, high), as shown in Figure 3.

As a general rule, fuzzy inference systems are characterized by high interpretability and low accuracy; hence, it was necessary to apply genetic [24] and Quasi-Newton BFGS [25] optimization algorithms to every model to adjust the membership functions of the input and output sets, so that an effective logical correlation with the system rules and the inference process could be established across all four models developed. This led to a closer identification of the pixel status: a pixel is classified as ill when the output value tends to 0.3884 and as sane when the value sits around 0.62. The Quasi-Newton algorithm is implemented in MATLAB through the optimset function [26] and the genetic algorithm through the gaoptimset function [27]. The fuzzy systems, set up with the training data (45 pixel values labeled sane or ill), were tested to verify that they correctly recognize the status of each pixel value, by calculating the MSE between the expected outcome and the obtained results.
This test shows a medium to low accuracy (as expected from a fuzzy inference system); as a consequence, the definitions of the membership functions of the input and output sets are modified using the genetic and Quasi-Newton algorithms stated above. Such a modification requires the input of the fuzzy system, the expected outcome, and the outcome obtained from both algorithms [28]. The aim is to reach a global minimum that reduces the difference between the expected outcome of the fuzzy inference system and its current outcome by adjusting the limits of the membership functions in the input and output sets of the fuzzy systems.
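To make the three-set input partition more concrete, the sketch below evaluates triangular membership functions for low, medium, and high intensity on a normalized color component. The limits in SETS are hypothetical placeholders chosen for illustration; the paper's actual limits are those shown in its Figure 3 and are later changed by the optimizers.

```python
def triangular(x, a, b, c):
    """Triangular membership with feet at a and c and peak at b.

    Degenerate shoulders (a == b or b == c) are allowed so that the
    edge sets can peak exactly at 0 or 1.
    """
    if x <= a or x >= c:
        return 1.0 if x == b else 0.0
    if x < b:
        return (x - a) / (b - a)
    return (c - x) / (c - b)


# Hypothetical limits (a, b, c) for one normalized color component.
SETS = {
    "low": (0.0, 0.0, 0.5),
    "medium": (0.0, 0.5, 1.0),
    "high": (0.5, 1.0, 1.0),
}


def fuzzify(x):
    """Return the membership degree of x in each intensity set."""
    return {name: triangular(x, *abc) for name, abc in SETS.items()}


print(fuzzify(0.25))
```

In a Mamdani system these degrees would feed the rule base; adjusting the tuples in SETS is the kind of change the optimization step performs.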
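The idea of treating the membership limits as parameters of an MSE objective can be illustrated with a deliberately small stand-in. The function toy_system below is not the paper's Mamdani system: it is a hypothetical one-input model with a single adjustable limit (split), and the training pairs are invented for the example. Only the target values 0.3884 (ill) and 0.62 (sane) come from the text.

```python
ILL, SANE = 0.3884, 0.62  # convergence values reported in the text


def mse(expected, predicted):
    """Mean squared error between two equally long sequences."""
    return sum((e - p) ** 2 for e, p in zip(expected, predicted)) / len(expected)


def toy_system(intensity, split):
    """Hypothetical one-input stand-in for a fuzzy system.

    Membership in a 'high' set ramps linearly from 0 at split - 0.1
    to 1 at split + 0.1; the output blends the ILL and SANE targets.
    """
    high = min(1.0, max(0.0, (intensity - (split - 0.1)) / 0.2))
    return high * SANE + (1.0 - high) * ILL


def objective(split, inputs, expected):
    """MSE of the toy system as a function of its adjustable limit."""
    predicted = [toy_system(x, split) for x in inputs]
    return mse(expected, predicted)


# Invented training pairs, for illustration only.
inputs = [0.2, 0.3, 0.7, 0.8]
expected = [ILL, ILL, SANE, SANE]

print(objective(0.5, inputs, expected))
print(objective(0.75, inputs, expected))
```

An optimizer, whether genetic or Quasi-Newton, would search over split (in the real systems, over all membership limits) for the value that minimizes this objective.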
To explore most of the solution domain, following the criterion established in [29] for the minimum number of iterations needed to cover the majority of that domain, a total of 30 iterations are carried out with the genetic algorithm optimization; each iteration starts at different points of the solution domain, and therefore a different minimum can be reached in each repetition. In contrast, the Quasi-Newton algorithm is executed only once because it takes as input the already established minimums, global or local, and its outcome therefore always tends to the same minimum value. The membership functions of each initial system are adjusted, generating four proposals for fuzzy inference systems. Using a test image dataset taken from [30], each proposal is assessed by presenting it with an image of a green bean leaf and having it identify whether or not the image shows signs of Xanthomonas campestris, based on a pixel-level segmentation of the image. Once the pixels are evaluated, the arithmetic mean of the pixel outputs is calculated to determine whether the plant is sane or not.
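The image-level decision described above, averaging the per-pixel outputs, can be sketched as follows. The 0.6 cut-off is an assumption derived from the verification ranges reported later in the paper (0.3884 to 0.59 and 0.6 to 0.7); the pixel outputs are invented for illustration.

```python
def classify_image(pixel_outputs, threshold=0.6):
    """Label an image from the mean of its per-pixel fuzzy outputs.

    Assumption: a mean at or above `threshold` is read as sane,
    anything below as ill.
    """
    mean_output = sum(pixel_outputs) / len(pixel_outputs)
    label = "sane" if mean_output >= threshold else "ill"
    return label, mean_output


# Invented per-pixel outputs for two tiny "images".
mostly_sane = [0.62, 0.65, 0.61, 0.63]
mostly_ill = [0.39, 0.41, 0.62, 0.40]

print(classify_image(mostly_sane))
print(classify_image(mostly_ill))
```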

Genetic algorithm
With the MSE established as the performance indicator, the results after 30 iterations show that the Mamdani fuzzy inference system with three membership functions in the input sets reached its best configuration in the 22nd repetition, while the Mamdani fuzzy inference system with six membership functions in the input sets did so in the first iteration. The comparison between the simulated and expected values for the best iteration of each system is shown in Figure 4. Figure 5 shows the error, that is, the difference between the expected and obtained results, and the MSE for the best-configured iteration of each system. The fuzzy inference system with three membership functions had an MSE of 7.83×10⁻⁵, while the fuzzy inference system with six membership functions had an MSE of 1.01×10⁻⁵.
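The procedure of repeating the genetic optimization 30 times and keeping the run with the lowest MSE can be sketched with a minimal real-coded genetic algorithm. This is not MATLAB's genetic algorithm and not the authors' configuration: the operators (truncation selection, arithmetic crossover, Gaussian mutation), the population size, and toy_objective are all assumptions made for illustration.

```python
import random


def run_ga(objective, bounds, seed, pop_size=20, generations=40):
    """One run of a minimal real-coded genetic algorithm (minimization)."""
    rng = random.Random(seed)
    lo, hi = bounds
    population = [rng.uniform(lo, hi) for _ in range(pop_size)]
    for _ in range(generations):
        # Selection: keep the better half unchanged (elitist truncation).
        population.sort(key=objective)
        parents = population[: pop_size // 2]
        children = []
        while len(parents) + len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            child = (a + b) / 2.0                      # arithmetic crossover
            child += rng.gauss(0.0, 0.05 * (hi - lo))  # Gaussian mutation
            children.append(min(hi, max(lo, child)))   # clamp to bounds
        population = parents + children
    best = min(population, key=objective)
    return best, objective(best)


def toy_objective(x):
    """Hypothetical stand-in for the MSE of a system with one limit x."""
    return (x - 0.3) ** 2


# 30 independent runs, each from a different random starting population.
runs = [run_ga(toy_objective, (0.0, 1.0), seed) for seed in range(30)]
best_run = min(range(len(runs)), key=lambda i: runs[i][1])
best_x, best_mse = runs[best_run]
print("best run:", best_run + 1, "x:", best_x, "MSE:", best_mse)
```

Because each run starts from a different population, the best configuration can appear in any repetition, which is consistent with the text reporting the 22nd repetition for one system and the first for the other.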

Quasi-Newton algorithm
A comparative analysis between the expected and obtained results is carried out by applying the Quasi-Newton BFGS algorithm in a single run to both systems, with three and six membership functions in the input sets, as shown in Figure 6. The comparison shows significant differences with respect to the proposal optimized with a genetic algorithm. Figure 7 shows the difference between the expected and obtained results and their MSE values: the MSE of the fuzzy inference system with three membership functions was 1.53×10⁻⁴, while the MSE of the fuzzy inference system with six membership functions was 9.65×10⁻³. Both cases show greater MSE values than those obtained with the genetic algorithm.

System proposals' performance related to the training dataset
This section presents the best system configuration achieved in each proposal in terms of the minimum and maximum error and the MSE. Table 2 shows the iteration in which the best configuration was obtained for each model. According to the results, the best proposal for identifying the status (sane or ill) of a pixel based on the training dataset is the Mamdani fuzzy inference system with six membership functions optimized with a genetic algorithm, which has an MSE of 1.01×10⁻⁵ and an effectiveness of 99.68%.

System proposals' performance related to the test dataset
Once the optimized Mamdani fuzzy inference systems with six membership functions in the input sets are obtained, their configuration is evaluated using a set of 100 images of green bean leaves taken from [30]: 50 sane leaves and 50 ill leaves with Xanthomonas campestris disease. The verification ranges defined for the two subsets are 0.3884 to 0.59 for ill leaves and 0.6 to 0.7 for sane leaves. Table 3 shows the results of the tests: the model with the greatest effectiveness rate in detecting the disease is the Mamdani fuzzy inference system with six membership functions optimized with a genetic algorithm, with 94% accuracy, 97.96% precision, 90.57% recall, and 94.12% f1-score. The values of all metrics were obtained from the confusion matrix of each model. Figure 8 shows the results for the Mamdani model with six membership functions optimized with a genetic algorithm (best model).

Figure 9 shows the rule set of the best performing model (the Mamdani fuzzy inference system with six membership functions in the input sets, optimized with a genetic algorithm), that is, the rules obtained after the color, tone, and pixel analysis of sane leaves and leaves ill with Xanthomonas campestris. The inference process that takes place in the model can be shown through the correspondence between the established rules and the membership functions of its input and output sets, as presented in Figure 10. Figure 11 shows the relationship between the input and output sets according to the inference process. It was produced by contrasting the input sets related to the red (X-axis) and blue (Z-axis) components with the feasibility of a sane pixel (Y-axis); this shows the convergence values for ill leaves (0.3884) and sane leaves (0.62).
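For reference, the four metrics named above follow from a binary confusion matrix as sketched below. The counts passed in the example are hypothetical: they were chosen only because they yield percentages that can be compared with those quoted in the text, and they are not the paper's confusion matrix, which is not reproduced in this section.

```python
def classification_metrics(tp, fp, fn, tn):
    """Precision, recall, f1-score, and accuracy from confusion-matrix counts."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * precision * recall / (precision + recall)
    accuracy = (tp + tn) / (tp + fp + fn + tn)
    return {
        "precision": precision,
        "recall": recall,
        "f1": f1,
        "accuracy": accuracy,
    }


# Hypothetical counts for 100 images (NOT taken from the paper).
metrics = classification_metrics(tp=48, fp=1, fn=5, tn=46)

for name, value in metrics.items():
    print(f"{name}: {value * 100:.2f}%")
```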

Comparison between the best model attained and related research works
The model developed in this research has proven to have a high accuracy level (94%), similar to the outcomes achieved through methods based on neural networks (90% for unsupervised neural networks, and up to 99.42% for a deep neural network). One of the most important characteristics of neural networks is their high accuracy at the cost of interpretability; the best model in this research, developed using a fuzzy inference system, does not sacrifice interpretability. Table 4 compares the techniques most commonly used in the research field (disease detection on leaves) with the best performing model obtained in this work. Our model not only matches the accuracy standards but also shows high interpretability, a feature that represents a novelty in this application field.

Table 4 (rows recoverable from the source). Deep neural networks: 96.3%, X. Plant leaf disease diagnosis from color imagery using co-occurrence matrix and artificial intelligence system [32], unsupervised neural networks: 90%, X.

CONCLUSION
Applying Quasi-Newton and genetic optimization algorithms to Mamdani fuzzy inference systems increased the effectiveness rates (from 90.17% to 99.68% for the most precise system) in the pixel-level assessment of green bean leaves for Xanthomonas campestris disease. Such systems are accurate and highly interpretable; furthermore, fuzzy inference systems allow the graphical representation of their inference processes through the correspondence between the rule set and the membership functions of the input and output sets, and through a surface graph of the process outcome. The tested fuzzy systems obtained from the optimization process can detect the presence of Xanthomonas campestris disease in green bean leaves from image inputs (100×100 pixels each), with effectiveness rates in classifying 100 leaf images that range from 84% (least accurate model) to 94% (most accurate model). The Mamdani fuzzy inference system with six membership functions in the input sets, optimized using the genetic algorithm, has the greatest effectiveness both in pixel-level classification based on the training dataset, 99.68%, and in the detection of Xanthomonas campestris disease based on the test dataset, 94%. This indicates that the optimized model has high accuracy and interpretability and a greater capacity to detect the presence of the disease in a plant.
Compared with other proposals for detecting plant diseases through computerized techniques, the outcome of the most accurate fuzzy logic system developed in this research falls within their range, 90% to 99.42%. Furthermore, because this model is based on a fuzzy logic system, its structure can be visualized, which provides a high level of interpretability and allows farmers to understand the dynamics of the model and take early measures to reduce the impact of the disease on the appearance and performance of green bean crops.