Feature selection for sky image classification based on self adaptive ant colony system algorithm

ABSTRACT


INTRODUCTION
The classification of cloud types from ground-based sky images continues to receive attention. The different forms of cloud affect both weather prediction and the exchange of energy between the atmosphere and the Earth's surface [1], [2]. The primary distinction between cloud images and other images is their variability, which depends on atmospheric circumstances. A cloud does not always have a definite spatial distribution. Even clouds of the same genus can vary in size and shape. Additionally, the structure information and cloud distribution can exhibit complex curving shapes, crossing borders, and angles [3]-[5]. The equipment used to collect sky photographs includes meteorological balloons, satellite-based systems, and ground-based systems [6], [7]. The meteorological balloon and satellite-based approaches enable direct observation of how clouds affect the Earth's radiation at the top of the atmosphere. The purpose of a ground-based approach is to observe cloud bottoms over a local area in order to obtain complete data on the cloud. These ground-based sky images are readily available, relatively inexpensive, and of high spatial resolution [8].

ISSN: 2088-8708, Int J Elec & Comp Eng, Vol. 13, No. 6, December 2023: 7037-7047
Creating significant features that can distinguish between different cloud types in ground-based sky images is a crucial issue in this field. Different algorithms have been used to extract visual attributes from sky images, such as texture, color, and shape, which are taken into account when determining the type of cloud. The work in [9] was the first to accomplish automated cloud classification, combining spectral and textural attributes using the grey level co-occurrence matrix. Its k-nearest neighbor (KNN) classifier, evaluated on random test samples, categorized seven different sky patterns with an accuracy of 87.52%. In [4], a feature extraction method was described that uses the average ranking of occurrence of all rotation-invariant patterns offered by the local binary pattern (LBP). The cloud image representation becomes more robust when the occurrence rates are considered across various changing patterns. Four patch-LBP and region LBP techniques have also been proposed in [1]. Support vector machine (SVM) and linear discriminant analysis are used to classify the various cloud types represented in a histogram. However, insignificant extracted features remain when the most important feature set is selected.
Feature selection is a preprocessing step that aims to reduce the dimensionality of the data by supplying only a significant feature subset, as small as possible, to the learning algorithm [10]. The majority of researchers have addressed the feature selection problem for image classification by using bio-inspired algorithms [11]. Grey wolf optimization (GWO) [12], ant colony optimization (ACO) [13], and the bat algorithm (BA) [14] are examples of these techniques, which are used for prioritizing features and improving classification accuracy. An intelligent transportation system with traffic sign detection and recognition based on the ACO algorithm was proposed by Jayaprakash and KeziSelvaVijila [15]. This work achieved significant progress when tested on a public road sign database. Liu et al. [14] presented a feature selection method to increase the final detection accuracy for image steganalysis. The relevant binary feature subset was retrieved from the whole feature set by using BA. A Levy flight-based GWO was also proposed in [12] to address feature selection for image steganalysis.
A modified version of the ant lion optimizer (ALO) algorithm with a wavelet SVM classifier was proposed in [16] to overcome highly correlated bands in hyperspectral image classification. This method performed better than previous algorithms because of its ability to leverage the global best ants in the local search. In [17], a feature subset of computed tomography images was generated using the ACO algorithm with a rough dependency measure. SVM and naïve Bayes classifiers were trained on the selected subsets to predict lung cancer. These techniques suffer from early convergence and are drawn towards local optimum regions. ACO-based feature selection has also been proposed for the high-dimensional space of image classification problems [13]. Nevertheless, the ACO algorithm has low classification accuracy due to premature convergence and a poor balance between the exploitation and exploration mechanisms.
The ant colony system (ACS) was first presented by Dorigo and Stützle [18]. The ACS improved the ACO algorithm in three different ways [19], [20]. First, the state transition rule for determining the route to the next node is replaced with an aggressive action choice between exploitation and exploration. Second, the global pheromone update rule deposits pheromone on the routes of the best ant's tour. Finally, the ants reduce some pheromone on the previously traveled path to encourage consideration of the remaining paths [21]. The accuracy and effectiveness of ACS are considerably higher than those of other algorithms because few parameters have to be tuned to aid in the identification of significant features. ACS deals with nonlinear global optimization problems and was initially developed for finding the shortest path in the travelling salesman problem [22]. Other combinatorial optimization problems have been tackled using ACS, including vehicle routing [23], job scheduling [24], network communication [19], [25], and image processing [26]. However, the exploration mechanism of ACS is ineffective. The search step is carried out using a trial-and-error method and is error prone, while the local pheromone update parameter is fixed and kept constant. Additionally, effective results might not always be reproducible. In this paper, a self-adaptive ACS exploration mechanism for selecting significant features is proposed. The remainder of the paper is organized as follows: section 2 describes the proposed algorithm and the dataset collection, while section 3 presents the experimental findings and discussion. Finally, section 4 gives the conclusion and recommendations for future work.

METHOD

Proposed algorithm
This section presents the self-adaptive ACS (SAACS) algorithm for feature selection. Figure 1 displays three main components: i) the feature extraction and graph modeling, ii) the SAACS algorithm, and iii) the classification process. The first component produces the weighted directed graph data for SAACS. The SAACS provides the significant feature subset to be used in the classification process. The SAACS algorithm uses the sigmoid function to adaptively control the local pheromone update value, drawing inspiration from the activation function of neural networks [27]. The experiments are carried out using different numbers of features in order to determine the influence of these selections on the accuracy results. First, the sky images were rescaled to a fixed size of 500×500 pixels and converted to greyscale. Next, 70% of the sky images were randomly selected for training and 30% for testing. The classifiers are SVM, multilayer perceptron (MLP), kernel support vector machine (KSVM), KNN, decision tree (DT), and random forest (RF). The $n$ features are represented as $f_1, f_2, \ldots, f_n$ in the graph model $G = (V, E)$, where node $v_1$ represents feature $f_1$ and $e$ is the edge linking to the nearest node. The directed graph is formed by connecting two nodes $v_i$ and $v_j$ if the weight of the related features of the two nodes is more than 1.
The SAACS for feature selection starts with the initialization of the parameters shown in Table 1. Nodes $v_i$ are associated with pheromone values $\tau_i$. In the first step, each ant $k$ is assigned to a particular node, from which it can travel to and consider any other node in the graph, $S_k \leftarrow \{v_i\}$, where $S_k$ is the subset built by ant $k$. Ants carry out a forward selection in which each ant $k$ grows its subset $S_k$ incrementally by adding new features. Each ant $k$ explores all features in the set $F - S_k$ and selects the next feature to include in $S_k$ based on the ACS-based feature selection algorithm. In the ant transition of the SAACS algorithm, each ant chooses a feature subset at random from a total of $N$ features. In the first stage, each ant selects a node that corresponds to its direction at time $t$. The random transition uses the proposed (1) and (2), where the parameter $q_0$ is specified as $0 \le q_0 \le 1$ and $q$ is a random variable with uniform distribution in the closed interval [0,1]. In these equations, α and β are two parameters for balancing the weights between the pheromone and heuristic value, and $J_i^k$ refers to the neighbor nodes of node $i$ which ant $k$ has not yet visited. The total number of features is given by $N$. The amount of pheromone for each feature in the initial iteration is assigned a minimal random value. The heuristic value also influences a feature's productivity. The mean decrease in impurity, which offers a relative feature importance, is used to set the heuristic value $\eta_i$. This feature significance score provides a relative ordering of the relevance of the features [28].
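As a rough illustration of the transition step, the sketch below follows the standard ACS pseudo-random proportional rule. It is not the paper's own (1) and (2), whose exact form may differ. The function name `choose_next_feature`, the dictionaries `tau` and `eta`, and the default parameter values are assumptions made for illustration only.

```python
import random


def choose_next_feature(unvisited, tau, eta, alpha=1.0, beta=2.0, q0=0.9, rng=random):
    """Pick the next feature for one ant (standard ACS rule, assumed form).

    unvisited: list of candidate feature indices not yet in the ant's subset
    tau, eta:  dicts mapping feature index -> pheromone / heuristic value
    """
    # Attractiveness of each candidate: pheromone^alpha * heuristic^beta.
    scores = {j: (tau[j] ** alpha) * (eta[j] ** beta) for j in unvisited}
    q = rng.random()  # uniform random draw
    if q <= q0:
        # Exploitation: take the most attractive candidate.
        return max(unvisited, key=lambda j: scores[j])
    # Exploration: sample a candidate with probability proportional to its score.
    total = sum(scores.values())
    threshold = rng.random() * total
    running = 0.0
    for j in unvisited:
        running += scores[j]
        if running >= threshold:
            return j
    return unvisited[-1]  # guard against floating-point round-off
```

With `q0` close to 1 the ant almost always exploits the best-scoring feature, while smaller values of `q0` shift the balance towards exploration.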
The ACS-based selection process includes the ant transition and the local pheromone update rule as two crucial steps to improve the exploration mechanism. The transition probability of ACS is used to determine whether the current λ is dominated or not. As a result, the ants determine which features are sufficiently significant to include in the feature set based on the value of the parameter λ. In other words, these steps are guided by an appropriate λ value for direct subset formation, which can maximize dependency and minimize redundancy among the features.
In the global pheromone updating process, the quantity of pheromone of each chosen node and feature is updated once all the ants have fully traversed the graph in each iteration $t$. The updating rule in (3) [29] is used to update the quantity of pheromone for each feature, where ρ is the evaporation factor, $\tau_{ij}$ corresponds to the current quantity of pheromone on the link $(i, j)$, $T^{+}$ is the iteration's best tour so far, and $\Delta\tau_{ij}$ is the reward provided to the best tour.
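A minimal sketch of a global update of this kind is given below. It assumes the standard ACS form, in which a fraction ρ of the pheromone evaporates and the reward is deposited only on elements of the best tour; the paper's (3) may differ in detail. The function name `global_pheromone_update` and the per-feature dictionary `tau` are illustrative assumptions.

```python
def global_pheromone_update(tau, best_subset, reward, rho=0.1):
    """Return updated pheromone values after one iteration (standard ACS form, assumed).

    tau:         dict mapping feature index -> current pheromone value
    best_subset: features on the best tour found so far
    reward:      amount deposited on the best tour (hypothetical scalar)
    rho:         evaporation factor in [0, 1]
    """
    updated = dict(tau)  # leave the caller's values untouched
    for j in best_subset:
        # Evaporate a fraction rho and deposit the reward on best-tour features only.
        updated[j] = (1.0 - rho) * tau[j] + rho * reward
    return updated
```

Features outside the best tour keep their pheromone unchanged in this sketch, which is what concentrates the search around the best solution found so far.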

Each node that has been visited undergoes the local pheromone update procedure by the corresponding ant [29]. Choosing an appropriate pheromone level is therefore essential for navigating the search space and discovering the best overall solution. In the early stages of the evolution process, all individuals are motivated to thoroughly explore the entire search space. In a later stage of the process, the ants are encouraged to converge to the global optimum and establish the optimal solution. The local pheromone update is adaptively controlled using an adaptive weighting strategy through the proposed (4). This is done by employing a sigmoid function based on feedback collection and a reward mechanism for determining the significant features to be included in the final subset. Significant features are determined according to (5) [27], which involves the weighted activation function and the initial pheromone value $\tau_0$; here $e$ is the base of the natural logarithm, and the input to the function is determined by (4). The evaluation function is the core component of any feature selection method. It evaluates the quality of candidate features based on their ability to distinguish between classes in order to determine the optimality of subsets. The wrapper technique is used in this study's evaluation process. This approach evaluates various combinations of features using a learning algorithm, and after a number of evaluation rounds, the best features are reported.
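The idea of letting a sigmoid control the local update can be sketched as follows. This is only an illustration under stated assumptions: it takes the standard ACS local update and replaces its fixed coefficient with a sigmoid of a feedback signal. The `feedback` argument and the function `adaptive_local_update` are hypothetical and do not reproduce the paper's (4) and (5).

```python
import math


def sigmoid(x):
    """Logistic function 1 / (1 + e^(-x)), written to avoid overflow for large |x|."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    z = math.exp(x)
    return z / (1.0 + z)


def adaptive_local_update(tau_j, tau0, feedback):
    """Local pheromone update with a sigmoid-controlled weight (illustrative only).

    tau_j:    current pheromone on the visited feature
    tau0:     initial pheromone value
    feedback: hypothetical scalar signal, e.g. a change in fitness; its
              definition here is an assumption, not the paper's equation (4)
    """
    weight = sigmoid(feedback)  # always between 0 and 1
    # Standard ACS local update form, with the fixed coefficient replaced by `weight`.
    return (1.0 - weight) * tau_j + weight * tau0
```

In this sketch a strongly positive feedback pulls the pheromone towards `tau0`, making the visited feature less attractive to the following ants, while a strongly negative feedback leaves it nearly unchanged.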
The experiments are repeated 50 times and the average accuracy is used for comparison. This implies a classification decision feedback mechanism and evaluation criteria that guide the search for significant features. The average of all the fitness function values is calculated. Following convergence, the best ant's relevant feature set is chosen to prune the feature dimension. The fitness function of a solution is defined as in (6) [28].
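To make the wrapper idea concrete, the sketch below scores a feature subset by the held-out accuracy of a classifier trained on only those features. It is an assumption-laden example: the 1-nearest-neighbour classifier and the names `knn1_predict` and `wrapper_fitness` are chosen for brevity and are not the paper's (6) or its classifiers.

```python
def knn1_predict(train_X, train_y, x, features):
    """Predict the label of x with a 1-nearest-neighbour rule on the chosen features."""
    def dist(a, b):
        # Squared Euclidean distance restricted to the selected feature indices.
        return sum((a[f] - b[f]) ** 2 for f in features)

    nearest = min(range(len(train_X)), key=lambda i: dist(train_X[i], x))
    return train_y[nearest]


def wrapper_fitness(features, train_X, train_y, test_X, test_y):
    """Fitness of a feature subset = held-out accuracy of the wrapped classifier."""
    if not features:
        return 0.0  # an empty subset cannot classify anything
    correct = sum(
        knn1_predict(train_X, train_y, x, features) == y
        for x, y in zip(test_X, test_y)
    )
    return correct / len(test_y)
```

A subset containing an informative feature scores higher than one containing only noise, which is the feedback an ant-based search can use to rank candidate subsets.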

Dataset
The widely accepted sky image datasets, i.e., Kiel, Singapore whole-sky imaging categories (SWIMCAT), multimodal ground-based cloud database (MGCD), and ground-based cloud dataset (GCD), were used in this study. Experiments are conducted on these benchmark classification problems from the literature to demonstrate the performance of the proposed algorithm. The description of the datasets is shown in Table 2. Figure 2 displays the images and classes of all four datasets. Kiel and SWIMCAT are small datasets, whereas MGCD and GCD are very large datasets.
Figure 2(a) presents SWIMCAT, collected with a calibrated ground-based whole sky imager (WSI) created by [30] in Singapore between January 2013 and May 2014. The automatic wide-angle high-resolution sky imaging camera collected image patches for the whole-sky imaging categories database. This database includes images from each of the 5 cloud categories. A total of 784 image patches were chosen. Each image patch has a size of 125×125 pixels.
The Kiel dataset categorizes sky images into 7 classes with a resolution of 2,272×1,704 pixels, as shown in Figure 2(b). The sky images were captured while a German research team was working on a project named "Polarstern" during a transit of a research vessel. The sky images differ in illumination and intra-class variation. Different climates, seasons, and sunray angles affect the images. The camera was programmed to capture one sky image every 15 seconds [9]. Heinle et al. [9] selected approximately 1,500 sky images from a total of 75,000 based on independence in time and with respect to a predefined universal cloud classification system.
The MGCD dataset, shown in Figure 2(c), contains 8,000 cloud samples in JPEG format. This dataset was captured using a sky camera with a fisheye lens at a size of 1,024×1,024 pixels. The WMO's taxonomic classification guidelines and similarities in cloud appearance were used in MGCD to categorize the sky conditions into 7 sky classes. "Mix cloud" is a category in which at least two distinct cloud types are typically present. Additionally, a sky image with 10% or less cloud coverage is regarded as clear sky [31], [32].
GCD, the largest dataset, shown in Figure 2(d), comprises 19,000 cloud images collected by camera sensors in nine Chinese regions. GCD has a lot of variation in sky conditions because it was gathered over an extended period. GCD includes 7 different types of clouds according to the WMO's classification rules. The resolution of cloud images in GCD is 512×512 pixels, and they are recorded in JPEG format [8], [32].

RESULTS AND DISCUSSION
Experiments were carried out on the four datasets to evaluate the SAACS algorithm. The classifiers used in the classification process are SVM, KSVM, MLP, RF, KNN, and DT. Six benchmark bio-inspired algorithms were used for performance comparison, namely road sign detection and recognition (RSDR) [15], Levy flight-based GWO (LFGWO) [12], binary BA (BBA) [14], ACO [13], modified ALO (MALO) [16], and ACO with rough dependency measure (ACO_RDM) [17]. The classification accuracy, number of selected features, similarity score, precision, recall, and f-measure are used as the performance metrics. The accuracy is calculated as in (7) [33], which gives the percentage of correctly classified cloud types out of the total number of samples in the dataset.

$$\%\text{Accuracy} = \left(\frac{\text{number of correctly classified samples}}{\text{total number of samples}}\right) \times 100 \quad (7)$$

Precision is the rate of true positives among all predicted positives, while recall measures the ability to identify the positive cases. The f-measure is the harmonic mean of recall and precision. Precision, recall, and f-measure are computed using (8), (9), and (10):

$$\text{Positive predictive value} = \left(\frac{\text{true positives}}{\text{true positives} + \text{false positives}}\right) \times 100 \quad (8)$$

$$\text{True positive rate} = \left(\frac{\text{true positives}}{\text{true positives} + \text{false negatives}}\right) \times 100 \quad (9)$$

$$\text{F-measure} = 2 \times \frac{\text{Positive predictive value} \times \text{True positive rate}}{\text{Positive predictive value} + \text{True positive rate}} \quad (10)$$

Cosine similarity is the cosine of the angle between two feature vectors. This measurement provides details regarding the direction of two feature vectors without taking into account their magnitudes. The similarity is measured using (11) [17] as:

$$\text{sim}(f_i, f_j) = \frac{f_i \cdot f_j}{\lVert f_i \rVert \, \lVert f_j \rVert} \quad (11)$$

where $f_i$, $f_j$ are any two features in the $n$ feature vectors. The performance of the algorithms was also compared using the Friedman test and the Mann-Whitney U test. The Friedman test is used to generate a ranking across multiple algorithms. This nonparametric statistical test ascertains the average classification accuracy rank of the algorithms. The smallest rank value implies the best performance [34]. The Mann-Whitney U test is used to show a significant difference between two independent groups. The $p$ value of the Mann-Whitney U test reveals whether there is a significant difference in the algorithms' performance [35].
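The metrics in (7) to (11) can be computed with a few lines of code. The sketch below uses the standard definitions and treats one class as positive (one-vs-rest); how the per-class values are averaged across cloud classes is not specified here and is left as an assumption. All function names are illustrative.

```python
import math


def accuracy(y_true, y_pred):
    """Percentage of samples whose predicted label matches the true label."""
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return 100.0 * correct / len(y_true)


def precision_recall_f(y_true, y_pred, positive):
    """Precision, recall (both in percent) and F-measure for one class treated as positive."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    precision = 100.0 * tp / (tp + fp) if tp + fp else 0.0
    recall = 100.0 * tp / (tp + fn) if tp + fn else 0.0
    total = precision + recall
    f_measure = 2.0 * precision * recall / total if total else 0.0
    return precision, recall, f_measure


def cosine_similarity(u, v):
    """Cosine of the angle between two non-zero feature vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)
```

Because cosine similarity depends only on direction, two vectors that differ by a positive scale factor give a value of 1 (up to rounding), while orthogonal vectors give 0.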
The number of selected features and the cosine similarity of the algorithms are shown in Table 3. In summary, SAACS achieved the best similarity value on the Kiel, SWIMCAT, and MGCD datasets, while on the GCD dataset SAACS obtained the smallest number of selected features. Table 4 tabulates the average classification accuracy for all four datasets, which are combined to form a single dataset. The figure in parentheses is the performance rank. The results demonstrate that SAACS reaches the best average accuracy compared to the other algorithms. Using the Gaussian smoothness standard deviation increases the possibility of capturing the dominant information in sky images, which supports accurate classification. This technique properly captures the mean information of the pixel distribution in each patch of the hierarchy in the sky image. Then, the graph modeling technique is introduced to create the relationships between features and highlight the highly relevant features. Furthermore, the final feature subset is selected by formulating an activation function in the local pheromone update value and heuristic information to select the significant features in every step of an ant. The benchmark image feature selection algorithms have two main drawbacks, i.e., becoming trapped in local optima and premature convergence in later stages [12]-[17], due to incomplete search space exploration. In other words, those algorithms rely on the positive feedback principle to reinforce the best solution, which converges prematurely before the best solution is found. Thus, the concept of the local pheromone update of ACS has proved to be effective for searching for the solution in a wider feature space. ACS provides the local pheromone update parameter, which can avoid stagnation and premature convergence by decreasing the pheromone value on previously used edges, making them less attractive to other ants. The SAACS parameters were designed considering the feedback of the fitness function and are all automatically adapted when the solution is generated. In this fashion, the path that leads to significant features may be chosen by the ants using the pheromone level and heuristic information.

Tables 5 to 8 show the results of three performance metrics, namely average precision, recall, and f-measure, on each dataset. It can be concluded that SAACS provides the best average performance for all datasets. In Table 10, the average accuracies of the first- and second-place algorithms (SAACS and LFGWO) are tested for any significant difference. The method used for this test is the nonparametric Mann-Whitney U test at a confidence level of 95%. Any $p$ value less than 0.05 indicates a significant difference. The result of the Mann-Whitney U test indicates that SAACS is significantly better than the LFGWO benchmark algorithm for all classifiers.
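For readers who want to reproduce this kind of comparison, the sketch below computes the Mann-Whitney U statistic and a two-sided $p$ value with the normal approximation, without tie or continuity correction. It is a simplified stand-in under those assumptions; an established statistics package (for example `scipy.stats.mannwhitneyu`) is preferable in practice. The function name `mann_whitney_u` and any sample values passed to it are illustrative, not the paper's data.

```python
import math


def mann_whitney_u(x, y):
    """Two-sided Mann-Whitney U test using the normal approximation (no tie correction).

    x, y: two independent samples, e.g. accuracies from repeated runs
    Returns the smaller U statistic and the approximate p value.
    """
    n1, n2 = len(x), len(y)
    # U for sample x: count pairs where x beats y; a tie counts one half.
    u1 = sum(1.0 if a > b else 0.5 if a == b else 0.0 for a in x for b in y)
    u2 = n1 * n2 - u1
    u = min(u1, u2)
    mean_u = n1 * n2 / 2.0
    sd_u = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12.0)
    z = (u - mean_u) / sd_u
    p = math.erfc(abs(z) / math.sqrt(2.0))  # two-sided tail probability
    return u, p
```

A $p$ value below 0.05 from such a test would be read, as in the text, as a significant difference between the two groups of runs.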

CONCLUSION
This paper has proposed a new feature selection algorithm that improves the local pheromone update value and heuristic information of the original ACS for classifying cloud types from ground-based sky images. The local pheromone update value and heuristic information are adaptively controlled by employing a sigmoid activation function based on feedback information and a reward mechanism. As a result, the most significant feature subset of the extracted features is generated. The benchmark comparison with six image feature selection algorithms on four sky image datasets has shown that the SAACS algorithm outperforms the benchmark algorithms. SAACS was able to escape from local optima because of its ability to explore a wider feature space, which significantly increases classification accuracy with a small number of features. This paper emphasizes the exploration mechanism. As a recommendation for future work, the balance between the exploration and exploitation mechanisms should also be improved. SAACS can also be used to determine the most important features in other domains where significant information has to be selected from images. Disaster management, medical diagnosis, industrial inspection, sports management, and content-based image retrieval are examples of these domains.
Figure 1. Three main components of the proposed approach: i) the feature extraction and graph modeling, ii) the SAACS algorithm, and iii) the classification process

Feature selection for sky image classification based on self adaptive ant colony system … (Montha Petwan)


Figure 2. Representation of four sky image datasets: (a) SWIMCAT, (b) Kiel, (c) MGCD, and (d) GCD

Table 1. Parameter setting for SAACS algorithm

Table 3. Number of selected features and cosine similarity of each algorithm for each dataset

Table 4. Average classification accuracy for combined datasets

Table 5. Precision, recall, and f-measure for Kiel dataset

Table 6. Precision, recall, and f-measure for SWIMCAT dataset

Table 7. Precision, recall, and f-measure for MGCD dataset

Table 8. Precision, recall, and f-measure for GCD dataset

Table 9 depicts the performance rank of the 6 benchmark algorithms and the proposed algorithm for different classifiers using the Friedman test. This statistical test computes the rank of each algorithm in terms of classification accuracy. The smallest rank value indicates the best performance. SAACS obtained the first rank for all the classifiers.
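The ranking step behind a Friedman-style comparison can be sketched as below: within each row (for example, one classifier), algorithms are ranked by accuracy, with rank 1 for the highest value and tied values sharing the average of their ranks, and the ranks are then averaged over rows. The function name `average_ranks` and the input layout are assumptions; the sketch does not compute the Friedman test statistic itself.

```python
def average_ranks(accuracy_table):
    """Mean rank of each algorithm across rows (rank 1 = highest accuracy).

    accuracy_table: list of rows, one per classifier or dataset; each row is a
    dict mapping algorithm name -> accuracy. Ties share the average rank.
    """
    names = list(accuracy_table[0])
    totals = {name: 0.0 for name in names}
    for row in accuracy_table:
        ordered = sorted(names, key=lambda n: row[n], reverse=True)
        i = 0
        while i < len(ordered):
            j = i
            # Extend j over the run of algorithms tied with ordered[i].
            while j + 1 < len(ordered) and row[ordered[j + 1]] == row[ordered[i]]:
                j += 1
            shared = (i + 1 + j + 1) / 2.0  # average of ranks i+1 .. j+1
            for name in ordered[i:j + 1]:
                totals[name] += shared
            i = j + 1
    return {name: totals[name] / len(accuracy_table) for name in names}
```

The algorithm with the smallest mean rank is the one that performs best on average across the rows.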

Table 9. Performance rank of all datasets for each classifier

Table 10. p values from Mann-Whitney U test