Land scene classification from remote sensing images using improved artificial bee colony optimization algorithm

ABSTRACT


INTRODUCTION
The rapid growth in satellite development has led to an increase in the number of high-resolution remote sensing images. High-resolution remote sensing images provide more detailed geographic information than low-resolution ones, which is advantageous for agriculture, defense, geology, and atmospheric science [1], [2]. Owing to developing trends in remote sensing satellites, imaging radars, and unmanned aerial vehicle technology, remote sensing images now offer fine resolution, multiple sources, expanded range, and quantitative data. As a result, the classification of vast volumes of remote sensing image data depicting land areas is a crucial subject of study, and researchers have focused over the last few decades on extracting a variety of efficient feature representations to enhance the performance of remote sensing image scene categorization [3]-[5]. Remote sensing scene classification can be characterized as a technique that divides remote sensing scenes into a number of categories in accordance with their contents [6], [7]. Areas with a lot of unstructured data require varying levels of annotation to prevent categorization oversight. However, the majority of remote sensing images now in use lack properly created ontological structures [8], [9], making it impossible to associate learned features with high-level semantic interpretations from category labels [10].
High-resolution remote sensing images come in a range of sizes and include varied content, such as multidirectional targets set against a complicated background [11], [12]. The features used for remote sensing scene classification are categorized into handcrafted features, mid-level features, and deep features [13], [14]. Feature fusion has emerged as a potential means of enhancing the performance of the deep features extracted from pre-trained convolutional neural networks (CNNs) [15], [16]. A successful CNN with deep learning features can extract various layers of image data for scene semantic classification and captures in-depth domain knowledge for identifying and interpreting remote sensing image features [17], [18]. To extract features from images and subsequently create feature representations for scene categorization, deep CNN models are frequently employed as feature learning techniques [19]. An improved artificial bee colony algorithm is used in this study to choose the pertinent features for remote sensing scene classification.
The related works on classifying remote sensing scenes are presented as follows. Xu et al. [20] developed an enhanced classification system that combined a recurrent neural network with a random forest (RNN-RF) classifier to classify images of land scenes obtained from satellites. The input images were obtained from the UC Merced land (UCM) dataset, which consists of 21 classes of labeled high-resolution images. The RNN-RF classification system provided an effective visual analysis of various features and also enhanced classification to detect objects. The combination of RNN and RF obtained higher optimization during cross-validation on the UCM dataset and provided multi-scale views with better accuracy during classification. Since UCM is a small dataset, the RNN-RF classification system performed well. However, for larger and more varied datasets, classifying images with the RNN-RF system remained a challenge.
Li et al. [21] developed a multi-augmented attention-based convolutional neural network (MAA-CNN) to classify land scenes from high-spatial-resolution remote sensing (HRRS) images. The MAA-CNN captured the discriminative regions of the HRRS images and performed attention cropping and augmentation. Attention cropping was used to enhance the required regions and produce attention maps. The augmentation was used to push the attention map channels to obtain various features. Bilinear attention pooling was used to combine the features, and the interclass distance was narrowed using a regularized loss function. The MAA-CNN exhibited poor performance in classifying the images of the NWPU-45 dataset.
Xu et al. [22] presented a framework based on deep feature aggregation (DFA) using a graph convolutional network (GCN) to classify high spatial resolution (HSR) images obtained from remote sensing. Initially, feature extraction was performed using the pre-trained VGG-16 to obtain multiple layers of features. The GCN extracted the feature maps of each layer and represented the scenes of HSR images by generating an undirected adjacency graph. The multiple features produced by VGG-16 were combined using the weighted concatenation method. Finally, the images were predicted using a linear classifier. However, the DFA-GCN had difficulty classifying similar ground objects such as man-made structures and roads.
Shen et al. [23] presented a dual-model architecture with a grouping attention fusion strategy to improve the efficiency of land scene classification. The dual model consisted of two CNNs to extract the features, and a grouping attention fusion strategy was used to combine the CNN features in a multi-scale way. A loss function was designed for small intra-class diversity and large inter-class distance, which produced an enhanced classification performance. The dual-model architecture showed better performance than the single CNN model by providing enhanced feature selection. However, misclassification occurred when classifying similar objects.
Guo et al. [24] introduced a saliency dual attention residual network (SDAResNet) to mine cross-channel and spatial information for classifying land scenes in remote sensing images. The SDAResNet consists of two types of attention, spatial attention and channel attention. The spatial attention was embedded in the low-level features to highlight relevant regions and suppress background information. The channel attention was used to combine the high-level features and mine information from meaningful saliencies. Land scene classification using SDAResNet relied on this attention mechanism and offered better classification accuracy. However, it was challenging to classify natural scenes such as trees and mountains, since they appear visually similar.
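The major contributions of this research are listed as follows:
a. The artificial bee colony (ABC) optimization algorithm is used in remote sensing land scene classification, but it produces poor classification accuracy with probability selection. The improved ABC (IABC) algorithm therefore replaces probability selection with neighborhood selection.
b. Feature extraction is performed with the help of VGG-16, where the image data is converted into array data, which helps to extend the dimensions of the array to various sizes.
c. This research employs a multiclass support vector machine (MSVM) to create an optimal hyperplane and classify the data in a linear pattern. MSVM verifies the input pattern and performs classification, providing better classification results in lower dimensions.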
The remainder of the paper is organized as follows: section 2 presents a detailed explanation of the IABC algorithm. The results and discussion are provided in section 3. Finally, section 4 presents the overall conclusion of this paper.

IABC OPTIMIZATION ALGORITHM FOR FEATURE SELECTION
The classification of land scenes from remote sensing images is a challenging process in remote sensing interpretation. This research proposes an IABC optimization algorithm that is employed for feature selection. The features selected using the proposed IABC make the classification process easier by providing better accuracy and removing unnecessary portions of the image. A detailed description of the overall process involved in land scene classification is given in the following sections. The overall classification process is represented diagrammatically in Figure 1.

Dataset
This research uses three publicly available remote sensing land scene classification datasets to evaluate the efficiency of the proposed method. The three datasets are the aerial image dataset (AID), Northwestern Polytechnical University-Remote Sensing Image Scene 45 (NWPU-RESIS45), and the University of California Merced (UCM) dataset. The datasets are described in the following sub-sections:

AID
The AID [25] is a large-scale land-use dataset of HRRS images collected from Google Earth by Wuhan University, covering 30 land scenes. The 30 land scenes include airports, parking lots, stadiums, beaches, schools, deserts, farmland, meadows, bridges, and ports. The AID dataset includes 10,000 images, each with a size of 600×600 pixels, and these images are labeled by remote sensing image experts.

NWPU-RESIS45
The NWPU-45 [26] is one of the most challenging databases for classifying HRRS images. The database contains 45 categories of HRRS images with a size of 256×256 pixels in the red, green, and blue (RGB) color space, and each category in NWPU-45 consists of 700 images. The NWPU-45 dataset therefore contains 31,500 images in total. The 45 categories include commercial plots, forests, mountains, and harbors. Images in different categories often look similar, so it is quite challenging to attain superior performance on this dataset.

UCM dataset
The UCM dataset [27] is used for classifying the various land surfaces captured in high-resolution remote sensing images. The UCM dataset was compiled by the University of California, Merced from United States Geological Survey (USGS) imagery and is categorized into 21 classes with a total of 2,100 images. Each image in the UCM dataset has a size of 256×256 pixels.

Pre-processing
After the data are collected from the datasets, pre-processing must be performed. In this research, the collected data is first processed by radiometric and geometric corrections, followed by clipping of the images. The useless portions of the images are removed, while noise and discrepancies in the images are removed using digital filters.

Normalization
Normalization is an essential process that eases the extraction of features and the classification of remote sensing land scenes. Normalization is defined as processing the images to account for varying pixel intensity values. In this research, the image data is scaled using min-max normalization to improve the learning speed. The normalization function takes image data $I$ and produces a normalized image $I_{norm}$. The intensity of the image may lie in the range of 0 to 255 and is sometimes negative, so min-max normalization is applied to the input image data to transform it into the range of 0 to 1. The min-max normalization is computed using (1).
$$I_{norm} = \frac{I - I_{\min}}{I_{\max} - I_{\min}} \qquad (1)$$
where the maximum and minimum pixel values are represented as $I_{\max}$ and $I_{\min}$, respectively. The output of normalization is provided as input to the feature extraction process.
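The min-max scaling described above can be sketched in a few lines. This is a minimal illustration, not the authors' implementation: the function name `min_max_normalize`, the flat-list image representation, the sample intensities, and the handling of a constant image are all assumptions made for the example.

```python
def min_max_normalize(pixels):
    """Scale a flat list of pixel intensities into the range [0, 1]."""
    lo, hi = min(pixels), max(pixels)
    if hi == lo:
        # Assumption: a constant image has no intensity range,
        # so every pixel is mapped to 0.0 to avoid dividing by zero.
        return [0.0 for _ in pixels]
    return [(p - lo) / (hi - lo) for p in pixels]


# Hypothetical intensities, including a negative value as mentioned in the text.
sample = [-10, 0, 120, 245]
normalized = min_max_normalize(sample)
```

The smallest intensity maps to 0 and the largest to 1, regardless of whether the input range starts below zero.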

Feature extraction
The normalized output image is provided as the input to the feature extraction process. In this research, the visual geometry group-16 (VGG-16) model, which is a convolutional neural network (CNN), is deployed to extract deep features from the image. VGG-16 is used in this research because it achieves high accuracy on the ImageNet dataset (a vast dataset with around fifteen million images). VGG-16 has 13 convolutional layers with a filter size of 3×3, and its pooling layers have a size of 2×2. Moreover, the architecture of VGG-16 contains three fully connected layers, the last of which is followed by a softmax function. In VGG-16, the image data is converted into array data, which helps to extend the dimensions of the array to various sizes. Generally, the architecture of the VGG-16 model is deep and helps to adjust the pixel values, which aids in efficiently extracting features from the image data. The architectural diagram of the VGG-16 model is represented in Figure 2.
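To make the layer description concrete, the following sketch traces how the spatial size of an image changes through the convolutional part of VGG-16. It assumes the standard VGG-16 layout (five blocks, 3×3 convolutions with padding that preserves size, one 2×2 max pooling with stride 2 per block) and assumes that deep features are read after the last pooling layer; the text does not state which layer is used. The names `VGG16_BLOCKS` and `vgg16_feature_shape` are hypothetical.

```python
# (number of 3x3 convolutional layers, output channels) for each VGG-16 block.
VGG16_BLOCKS = [(2, 64), (2, 128), (3, 256), (3, 512), (3, 512)]


def vgg16_feature_shape(height, width):
    """Return (height, width, channels) after the last pooling layer."""
    channels = 3  # RGB input
    for _num_convs, out_channels in VGG16_BLOCKS:
        # Assumption: 3x3 convolutions use stride 1 and padding 1,
        # so they change the channel count but not height or width.
        channels = out_channels
        # Each 2x2 max pooling with stride 2 halves both spatial dimensions.
        height, width = height // 2, width // 2
    return height, width, channels


total_conv_layers = sum(n for n, _ in VGG16_BLOCKS)
ucm_shape = vgg16_feature_shape(256, 256)   # UCM and NWPU-45 image size
aid_shape = vgg16_feature_shape(600, 600)   # AID image size
```

Under these assumptions a 256×256 image yields an 8×8×512 feature volume, which illustrates why a feature selection step is useful before classification.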

Feature selection using IABC optimization algorithm

ABC optimization
Various optimization algorithms have been inspired by nature. Likewise, the ABC algorithm is inspired by the intelligent behavior of bees. Generally, bees keep searching for a better food source with more nectar. The bees in the swarm are divided into three categories, namely employer bees, onlooker bees, and scout bees. During each iteration, the bees perform various operations to find a good source of nectar. In the initial stage, the swarm has $N$ solutions, and the $i$-th solution $X_i$ used to find food in the swarm is represented in (2), where $N$ is the swarm size, $X_i$ is the abandoned solution, and $D$ is the dimension size. The complete operation process involved in the food search of ABC is described below.

Search phase of employer bee
Each employer bee generates a candidate food position using (3).
$$V_{i,j^*} = X_{i,j^*} + \phi_{i,j^*}\,(X_{i,j^*} - X_{k,j^*}) \qquad (3)$$
where $V_{i,j^*}$ is the candidate food position, $X_k$ is a solution taken from the whole swarm, and $j^*$ is a random integer in the range $[1, D]$. The weight $\phi_{i,j^*}$ is a random value in the range $[-1, 1]$. When the search of the employer bee is completed as per (3), the new solution $V_i$ is created as represented in (4).
$$V_{i,j} = \begin{cases} V_{i,j^*}, & j = j^* \\ X_{i,j}, & \text{otherwise} \end{cases} \qquad (4)$$
where $j = 1, 2, \ldots, D$. To obtain a better solution, greedy selection is applied as given in (5).
$$X_i = \begin{cases} V_i, & f(V_i) < f(X_i) \\ X_i, & \text{otherwise} \end{cases} \qquad (5)$$
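The employer bee search and the greedy selection step can be sketched as follows. This is a minimal illustration under stated assumptions: the objective is minimised, indices are 0-based, the partner solution is drawn so that it differs from the current one (the usual ABC convention), and the names `employed_bee_step` and `sphere` as well as the test swarm are hypothetical.

```python
import random


def employed_bee_step(swarm, i, objective, rng=random):
    """Apply one employer-bee search and greedy selection to solution i."""
    n, d = len(swarm), len(swarm[i])
    k = rng.choice([idx for idx in range(n) if idx != i])  # partner solution
    j_star = rng.randrange(d)                              # random dimension
    phi = rng.uniform(-1.0, 1.0)                           # weight in [-1, 1]

    # The candidate copies the current solution and changes one dimension only.
    candidate = list(swarm[i])
    candidate[j_star] = swarm[i][j_star] + phi * (swarm[i][j_star] - swarm[k][j_star])

    # Greedy selection: keep the candidate only if it lowers the objective.
    if objective(candidate) < objective(swarm[i]):
        swarm[i] = candidate
        return True
    return False


def sphere(x):
    """Hypothetical test objective: sum of squares, minimum at the origin."""
    return sum(v * v for v in x)


rng = random.Random(0)
swarm = [[rng.uniform(-5, 5) for _ in range(4)] for _ in range(6)]
before = [sphere(x) for x in swarm]
for i in range(len(swarm)):
    employed_bee_step(swarm, i, sphere, rng)
after = [sphere(x) for x in swarm]
```

Because of the greedy step, no solution in the swarm can get worse after a pass of the employer bees.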

Search phase of onlooker bee
At this stage, the employer bees have completed the neighborhood search for all solutions. The onlooker bees then obtain the information from the employer bees, and some of the best solutions are chosen to continue the search in this phase. The better solutions are identified by computing a selection probability $p_i$ for every individual solution. The probability $p_i$ is evaluated using (6).
$$p_i = \frac{fit(X_i)}{\sum_{j=1}^{N} fit(X_j)} \qquad (6)$$
where $fit(X_i)$ is the fitness value of $X_i$, which is computed using (7).
$$fit(X_i) = \begin{cases} \dfrac{1}{1 + f(X_i)}, & f(X_i) \ge 0 \\ 1 + |f(X_i)|, & \text{otherwise} \end{cases} \qquad (7)$$
where $f(X_i)$ is the value of the objective function for $X_i$. For every onlooker bee in the swarm, a better solution $X_i$ is chosen based on its $p_i$ value. A new offspring $V_i$ is created using (3) and (4), and the better solution of $X_i$ and $V_i$ is retained.
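The fitness and selection probability used by the onlooker bees can be illustrated with a short sketch. It assumes the standard ABC fitness mapping for a minimisation objective; the function names and the sample objective values are hypothetical.

```python
def fitness(objective_value):
    """Map an objective value to a fitness value (larger is better)."""
    if objective_value >= 0:
        return 1.0 / (1.0 + objective_value)
    return 1.0 + abs(objective_value)


def selection_probabilities(objective_values):
    """Normalise the fitness values so that they sum to one."""
    fits = [fitness(v) for v in objective_values]
    total = sum(fits)
    return [f / total for f in fits]


# Hypothetical objective values for three solutions.
probs = selection_probabilities([0.0, 1.0, 3.0])
```

Solutions with lower objective values receive higher probabilities, so onlooker bees concentrate on them. This per-solution probability is the quantity that the improved algorithm later replaces with neighborhood selection.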

Search phase of the scout bee
Based on (5), each new solution is compared with its parent solution. The neighborhood search is successful when $V_i$ is better than $X_i$ and unsuccessful when $V_i$ is worse than $X_i$. The neighborhood search is monitored using a failure counter $trial_i$. When the search is successful, $trial_i$ is reset to 0, and when the search fails, $trial_i$ is incremented by 1. The update of $trial_i$ is given in (8).
$$trial_i = \begin{cases} 0, & f(V_i) < f(X_i) \\ trial_i + 1, & \text{otherwise} \end{cases} \qquad (8)$$
When the value of $trial_i$ is greater than the limit, the respective solution $X_i$ is abandoned. The abandoned $X_i$ is replaced using (9).
$$X_{i,j} = Low_j + rand(0,1) \cdot (Up_j - Low_j) \qquad (9)$$
The upper and lower bounds are denoted as $Up_j$ and $Low_j$, respectively.
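The failure counter and the random replacement of an abandoned solution can be sketched as follows. This is an illustrative sketch only: the function names, the abandonment limit of 3, the bounds, and the sequence of search outcomes are hypothetical values chosen for the example.

```python
import random


def update_trial(trial, improved):
    """Reset the failure counter on success, otherwise increment it."""
    return 0 if improved else trial + 1


def scout_reinitialize(lower, upper, rng=random):
    """Draw a new solution uniformly inside the per-dimension bounds."""
    return [lo + rng.random() * (up - lo) for lo, up in zip(lower, upper)]


rng = random.Random(1)
limit = 3   # hypothetical abandonment limit
trial = 0
# Hypothetical outcomes of successive neighborhood searches for one solution.
for improved in [False, False, True, False, False, False, False]:
    trial = update_trial(trial, improved)

# The solution is replaced only once the counter exceeds the limit.
solution = scout_reinitialize([0.0, -1.0], [1.0, 1.0], rng) if trial > limit else None
```

The single success in the sequence resets the counter, so abandonment is triggered only by an uninterrupted run of failures.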

Improved artificial bee colony optimization
The ABC optimization algorithm is improved using the neighborhood selection method. The conversion of the ordinary ABC optimization into the improved ABC (IABC) involves three changes: neighborhood selection replaces probability selection, altered search approaches are introduced, and the search phase of the scout bee is improved using a neighborhood radius.

Selection of neighborhood
Neighborhood selection is developed and applied instead of probability selection. The solutions in the colony are arranged in a ring, on which the k-neighborhood methodology is applied, where $k$ is the neighborhood radius. The k-neighborhood of a solution consists of $2k + 1$ solutions, and the neighborhood radius satisfies the condition $1 \le k \le \frac{N-1}{2}$. For every individual solution $X_i$ in the swarm, a worthy solution is selected from its k-neighborhood. In the proposed methodology, a selection probability therefore does not need to be computed for every individual solution.
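The ring-shaped k-neighborhood can be sketched as follows. The sketch assumes 0-based positions on the ring, a minimisation objective, and that the "worthy" solution is the one with the lowest objective value in the neighborhood; the function names and sample values are hypothetical.

```python
def k_neighborhood(i, k, n):
    """Indices of the 2k+1 solutions around position i on a ring of size n."""
    if not 1 <= k <= (n - 1) // 2:
        raise ValueError("k must satisfy 1 <= k <= (n - 1) / 2")
    # The modulo wraps around the ends so that the solutions form a ring.
    return [(i + offset) % n for offset in range(-k, k + 1)]


def best_in_neighborhood(i, k, objective_values):
    """Index of the solution with the lowest objective value near position i."""
    neighbors = k_neighborhood(i, k, len(objective_values))
    return min(neighbors, key=lambda idx: objective_values[idx])


# Hypothetical objective values for a swarm of seven solutions.
values = [5.0, 2.0, 9.0, 1.0, 7.0, 3.0, 8.0]
ring = k_neighborhood(0, 2, len(values))
best = best_in_neighborhood(0, 2, values)
```

The upper bound on the radius ensures that the 2k+1 neighbors never exceed the swarm size, so no solution appears twice in a neighborhood.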

Altered search approach
The k-neighborhood methodology is used in an altered search approach for the employer bees. As described in the previous section, $X_i$ is replaced by a worthy solution selected from its k-neighborhood. The employer bees therefore need not search the neighborhood of every solution $X_i$, and the search is performed for the selected solution only. The altered search approach is given in (10), which uses the best solution obtained from the k-neighborhood. According to (4) and (10), a new solution is obtained. For the $i$-th onlooker bee, the k-neighborhood selects the solution as represented in (11).

Altered search phase using scout bee
For each solution $X_i$, the status is tracked by the failure counter $trial_i$. When $trial_i \ge limit$, the solution $X_i$ is abandoned, and a new solution is created as a substitute for the abandoned one. When the search space of the scout bees shrinks, a random solution is included to widen the search region. To replace the abandoned $X_i$, three candidate solutions $Y_1$, $Y_2$, and $Y_3$ are produced, and the best of the three is chosen. $Y_1$ is created randomly, as in the ABC algorithm described in (9). $Y_2$ is created from the k-neighborhood using its best solution, denoted here as $X_{nb}$, as expressed in (14).
$$Y_{2,j} = X_{nb,j} + rand(0,1) \cdot (X_{r1,j} - X_{r2,j}) \qquad (14)$$
where $j = 1, 2, \ldots, D$, and $X_{r1}$ and $X_{r2}$ are two solutions chosen at random from the swarm.
For $Y_3$, $X_{nb}$ is chosen as the best neighbor in the k-neighborhood of the abandoned solution. $Y_3$ is generated according to (15), in which $X_j^{min}$ and $X_j^{max}$ denote $\min\{X_{i,j}\}$ and $\max\{X_{i,j}\}$, respectively, with $i \in [1, N]$ and $j \in [1, D]$; they represent the boundaries used by the scout bee. Once $Y_1$, $Y_2$, and $Y_3$ are created, the best of the three is chosen to replace the abandoned $X_i$. Thus, the best features are selected using the proposed IABC optimization algorithm. The fitness evaluation in IABC is performed based on the neighborhood selection method and is evaluated using (16).
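The construction of the second scout candidate and the final best-of choice can be sketched as follows. This is a hedged illustration: it covers only the random candidate and the difference-vector candidate of (14), leaves out the third candidate because its formula (15) is not reproduced here, draws a fresh random number per dimension (an assumption), and treats the objective as minimised. All function names and sample data are hypothetical.

```python
import random


def candidate_from_neighbor_best(neighbor_best, swarm, rng=random):
    """Perturb the neighborhood best with a random difference vector."""
    r1, r2 = rng.sample(range(len(swarm)), 2)  # two distinct random solutions
    return [
        nb + rng.random() * (a - b)
        for nb, a, b in zip(neighbor_best, swarm[r1], swarm[r2])
    ]


def pick_replacement(candidates, objective):
    """Return the candidate with the lowest objective value."""
    return min(candidates, key=objective)


def sphere(x):
    """Hypothetical test objective: sum of squares."""
    return sum(v * v for v in x)


rng = random.Random(2)
swarm = [[rng.uniform(-5, 5) for _ in range(3)] for _ in range(5)]
y1 = [rng.uniform(-5, 5) for _ in range(3)]                       # random candidate
y2 = candidate_from_neighbor_best(min(swarm, key=sphere), swarm, rng)
replacement = pick_replacement([y1, y2], sphere)
```

A third candidate can be appended to the list passed to `pick_replacement` without changing the selection logic.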

Classification
The features of the high-resolution remote sensing images obtained from the feature selection process are passed to the classification stage, where the various land scenes are classified. The classification of land scenes is performed using a multiclass support vector machine (MSVM). The features selected using IABC are given as input to obtain an accurate classification of the remote sensing images acquired from the datasets discussed previously. The MSVM is created by combining multiple binary SVMs in the classification process. The MSVM constructs an optimal hyperplane and classifies the inputs in a linear pattern. MSVM verifies the input pattern and performs classification, which provides better classification results in lower dimensions.
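One common way to combine multiple binary SVMs is the one-vs-rest scheme, sketched below. The text does not state which combination scheme is used, so one-vs-rest is an assumption, as are the function name, the hand-set hyperplanes, and the two-dimensional feature vectors; no training is performed here.

```python
def one_vs_rest_predict(features, classifiers):
    """Return the label whose binary linear classifier scores highest.

    classifiers maps a label to the (weights, bias) of an already trained
    binary linear SVM that separates that label from all the others.
    """
    def score(weights, bias):
        return sum(w * x for w, x in zip(weights, features)) + bias

    return max(classifiers, key=lambda label: score(*classifiers[label]))


# Hypothetical, hand-set hyperplanes for three scene labels.
classifiers = {
    "beach": ([1.0, -1.0], 0.0),
    "forest": ([-1.0, 1.0], 0.0),
    "airport": ([0.5, 0.5], -1.0),
}
label = one_vs_rest_predict([2.0, 0.5], classifiers)
other = one_vs_rest_predict([0.0, 3.0], classifiers)
```

Each binary classifier contributes one signed distance to its hyperplane, and the scene label with the largest value is returned.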

RESULTS AND ANALYSIS
This section provides the results and analysis of this research. The results are divided into performance analysis and comparative analysis, presented in the following sections. In the performance analysis, the efficiency of the classifier and the efficacy of the optimization algorithms, including the proposed approach, are evaluated. In the comparative analysis, the performance of the proposed approach is compared with that of existing approaches.

Performance analysis
The performance of the MSVM classifier with feature selection is compared with the existing classifiers, namely k-nearest neighbor (KNN), naïve Bayes (NB), and support vector machine (SVM). The performance analysis of the classifiers with feature selection is presented in Table 1. Table 1 shows that feature selection using IABC with the MSVM classifier performs better than the existing classifiers KNN, NB, and SVM. The MSVM classifier combines multiple binary SVMs during the classification process and performs classification in a linear pattern, which explains its better classification in lower dimensions. The MSVM attained an accuracy of 96.96%, which is higher than that of KNN (94.23%), NB (95.28%), and SVM (95.55%). The performance of the classifiers with feature selection is graphically represented in Figure 3.
Table 2 shows the classification performance of the MSVM classifier without the feature selection process. The performance of MSVM is reduced without feature selection, since feature selection supplies a reduced set of features and eases the classification process. Even so, the MSVM classifier performs well compared with the existing classifiers KNN, NB, and SVM. The analyses from Tables 1 and 2 demonstrate the significance of the feature selection process in providing better classification accuracy. The performance of the classifiers without feature selection is graphically represented in Figure 4.

Performance of optimization algorithms
In this research, the feature selection process is performed using the IABC optimization algorithm. IABC is an advanced variant of ABC in which neighborhood selection is used instead of probability selection. Here, the performance of IABC optimization is compared with existing optimization algorithms, namely particle swarm optimization (PSO), ant colony optimization (ACO), fruit fly optimization algorithm (FOA), and ABC optimization, based on accuracy, sensitivity, specificity, F1 score, and error rate. Table 3 provides the performance of the various optimization algorithms.
From Table 3, it is seen that the IABC optimization algorithm performs better than the other optimization algorithms. The IABC achieved better results owing to its improved search strategies and selection using the neighborhood method. It achieved a classification accuracy of 96.40% and a lower error rate of 3.6%. The performance of the optimization algorithms is graphically represented in Figure 5.
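The evaluation parameters named above follow from the counts of a confusion matrix, as the sketch below shows for a single class treated as positive. The counts are hypothetical and were chosen only so that the accuracy equals 96.40%; they are not taken from the experiments. The function name is also hypothetical, and zero denominators are not handled.

```python
def binary_metrics(tp, fp, tn, fn):
    """Compute common evaluation measures from confusion-matrix counts."""
    total = tp + fp + tn + fn
    accuracy = (tp + tn) / total
    sensitivity = tp / (tp + fn)          # true positive rate (recall)
    specificity = tn / (tn + fp)          # true negative rate
    precision = tp / (tp + fp)
    f1 = 2 * precision * sensitivity / (precision + sensitivity)
    return {
        "accuracy": accuracy,
        "sensitivity": sensitivity,
        "specificity": specificity,
        "f1": f1,
        "error_rate": 1.0 - accuracy,     # complement of accuracy
    }


# Hypothetical counts for illustration only.
m = binary_metrics(tp=482, fp=18, tn=482, fn=18)
```

The error rate is the complement of the accuracy, which is why an accuracy of 96.40% corresponds to an error rate of 3.6%.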

Comparative analysis
This section compares the proposed IABC-CNN with existing methodologies for classifying land scenes in remote sensing data, namely the multi-augmented attention-based convolutional neural network (MAA-CNN) [21] and the deep feature aggregation framework driven by graph convolutional network (DFAGCN) [22]. The comparison of the proposed IABC-CNN with the existing MAA-CNN [21] and DFAGCN [22] is presented in Table 4.
Table 4 shows that the proposed IABC-CNN exhibits better performance than the existing methodologies MAA-CNN [21] and DFAGCN [22]. The CNN model (VGG-16) is used for feature extraction, and the IABC algorithm is used for selecting the optimal features from those extracted by the CNN model. Selecting among the extracted features reduces redundancy and increases the learning speed, which aids in better classification of land scenes in HRRS images. The IABC-CNN provides a classification accuracy of 96.40%, which is higher than that of MAA-CNN (91.67%) and DFAGCN (88.50%).

CONCLUSION
Remotely sensed images play a significant role in monitoring environmental conditions, disaster mitigation, and other remote sensing applications. Due to complex backgrounds, poor imaging conditions, and similarities between scenes, it is difficult to classify remote sensing images of various land surfaces. In this research, the input images are obtained from three well-known datasets: AID, NWPU-RESIS45, and the UCM dataset. The input images are pre-processed using the normalization technique, which adjusts the pixel intensities. The pre-processed images undergo feature extraction using the VGG-16 model, where features are efficiently extracted from the image data. Feature selection is then performed using the proposed IABC algorithm, which selects the significant features, and finally the remote sensing land scene images are classified using the MSVM classifier. The experimental results show that the proposed IABC-CNN model delivers better performance than the existing MAA-CNN and DFAGCN, providing a classification accuracy of 96.40%, while MAA-CNN and DFAGCN achieved 91.67% and 88.50%, respectively. In the future, this research can be extended by using deep learning models to improve classification accuracy when categorizing land surfaces.

Figure 1. Process involved in the classification of remote sensing land scenes

Figure 3. Graphical representation of classifiers with feature selection

Figure 4. Graphical representation of classifiers without feature selection

Figure 5. Graphical representation of the performance of the optimization algorithms

Table 2. Performance analysis of various classifiers without feature selection

Table 3. Performance of optimization algorithms