Analysis of combined approaches of CBIR systems by clustering at varying precision
http://ijece.iaescore.com

2021. The image retrieval system is used to retrieve images from an image database. Two types of image retrieval techniques are commonly used: content-based and text-based. One well-known image retrieval technique that extracts images in an unsupervised way is the cluster-based image retrieval technique. In cluster-based image retrieval, all visual features of an image are combined to achieve a better retrieval rate and precision. The objective of this study was to develop a new model by combining three traits of an image, i.e., color, shape, and texture. The color-shape and color-texture models were compared against a threshold value at various precision levels. A union of the newly developed model with the color-shape and color-texture models was formed to find the retrieval rate, in terms of precision, of the image retrieval system. The experiments were conducted on the COREL standard database, and it was found that the union of the three models gives better results than image retrieval with the individual models. The newly developed model and the union of the given models also give better results than the existing system named cluster-based retrieval of images by unsupervised learning (CLUE).


INTRODUCTION
The number of digital images on the world wide web (WWW) has been estimated at hundreds of millions. This creates a need for new techniques for better image retrieval and efficient storage. Content-based image retrieval (CBIR) aims at developing methods that support efficient browsing and searching of vast digital image records based on automatically derived image traits [1], [2]. The visual contents of every image in the database are extracted and described by multidimensional attribute vectors, and from these attribute vectors an image feature database is constructed. Figure 1 depicts the organization of a content-based image retrieval system. The similarity between the query image and the images stored in the image feature files (database) is evaluated, and the retrieval is executed with the aid of an image indexing method [3], [4].
For such problems, unsupervised learning is used where unlabeled data must be organized. The system recognizes items based on distance (similarity) measures [5]. Objects that are similar to each other are grouped together (into a cluster), while items that differ are divided into different classes. In 2005, Chen et al. proposed an unsupervised cluster-based image retrieval technique known as CLUE [6]. Existing CBIR methods can be categorized into two variants: region-based image retrieval methods and full-image retrieval methods. In full-image retrieval methods, the features are extracted from the entire image without segmenting it into regions. In region-based methods, prior to feature extraction the image is segmented into separate regions [7]-[9]. A few of the existing CBIR methods are: i) Blobworld, a region-based image retrieval method developed by Carson et al. with the computer vision group at UC Berkeley [10]; it segments the image into blobs using an EM algorithm based on the texture and color traits of the pixels; ii) the earth mover's distance (EMD) method, a color-based image retrieval method based on a multi-dimensional scaling factor; the distance function used in this method is known as the earth mover's distance [11]; iii) NaTra, developed at the University of California, Santa Barbara; in this method, images are usually segmented into six to twelve non-overlapping equal regions [12]; iv) PicSOM, a region-based image retrieval system developed in the Laboratory of Computer and Information Science, Helsinki University of Technology; in this system, the features are stored in a hierarchical structure that employs a self-organizing map (SOM) [13]; v) semantics-sensitive integrated matching for picture libraries (SIMPLIcity), developed by Wang et al. at Stanford University; it partitions the picture into 4x4-pixel blocks and extracts a feature vector for each [14]; vi) unified feature matching (UFM), a region-based image retrieval method developed by [15].

RESEARCH METHOD
In this work, a CBIR system is developed that combines three visual features: color, shape, and texture. It is based on a mixture of visual characteristics with an unsupervised form of learning. For example, color characteristics are evaluated using a color histogram and color moments [16]. The first color moment can be taken as the average color in the image, and it can be determined by (1) [17]:

E_i = (1/N) Σ_{j=1..N} P_ij (1)
Here, N is the total number of pixels in the image, and P_ij is the value of the jth pixel in the ith color channel of the picture.
The second color moment is the standard deviation (σ), obtained by taking the square root of the variance of the color distribution:

σ_i = sqrt( (1/N) Σ_{j=1..N} (P_ij − E_i)^2 ) (2)
Here, E_i is the mean of the ith color channel of the picture. Skewness (s_i) is the third color moment. It measures how asymmetric the color distribution is, and in this way it gives knowledge about the shape of the color distribution. It can be calculated by (3):

s_i = ( (1/N) Σ_{j=1..N} (P_ij − E_i)^3 )^(1/3) (3)

Shape features are considered and captured in terms of edge images calculated using gradient vector flow fields, and the images are partitioned into segments or regions.
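For reproducibility, the three color moments can be computed channel-wise with a few lines of NumPy. This is a minimal sketch of equations (1)-(3); the function name and the toy image are illustrative, not taken from the paper.

```python
import numpy as np

def color_moments(image):
    """First three color moments per channel of an H x W x C image:
    mean (E_i), standard deviation (sigma_i), and skewness (s_i)."""
    pixels = image.reshape(-1, image.shape[-1]).astype(float)  # N x C
    mean = pixels.mean(axis=0)                           # E_i, eq. (1)
    std = np.sqrt(((pixels - mean) ** 2).mean(axis=0))   # sigma_i, eq. (2)
    skew = np.cbrt(((pixels - mean) ** 3).mean(axis=0))  # s_i, eq. (3)
    return np.concatenate([mean, std, skew])

# Toy 2x2 RGB image (illustrative data, not from the COREL database).
img = np.array([[[255, 0, 0], [0, 255, 0]],
                [[0, 0, 255], [255, 255, 255]]], dtype=np.uint8)
features = color_moments(img)
print(features.shape)  # (9,): three moments for each of three channels
```

Concatenating the nine per-channel moments yields a compact color descriptor for each database image.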

In image processing, gradient vector flow (GVF), introduced by [18], is often used to determine the number of blocks (regions) in an image. Let f(x, y) be an edge map defined on the image domain. The GVF field is the vector field v(x, y) = (u(x, y), v(x, y)) that minimizes the energy functional (4):

E = ∫∫ μ(u_x^2 + u_y^2 + v_x^2 + v_y^2) + |∇f|^2 |v − ∇f|^2 dx dy (4)

Texture features are used to segment images into blobs (regions) of interest and to categorize those blobs [19]. Texture features are calculated by multi-resolution filtering techniques and statistical Tamura features [20]. The texture measure is defined by (5) [21].
Here, W = 2w + 1 denotes the observation window size. The auto-correlation of an image can be used to detect repetitive textural patterns. The auto-correlation equation is given as (6) [21]. The mathematical model of the color-shape CBIR system is given by the combination of (1) and (4).
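The auto-correlation measure referenced in (6) can be sketched as follows; the exact normalization used in the paper may differ, so this is an illustrative version that divides the shifted product sum by the image's total energy.

```python
import numpy as np

def autocorrelation(gray, dx, dy):
    """Normalized auto-correlation of a 2-D grey-scale image at shift
    (dx, dy): correlate the image with a shifted copy of itself and
    divide by the total image energy."""
    g = gray.astype(float)
    h, w = g.shape
    base = g[:h - dy, :w - dx]
    shifted = g[dy:, dx:]
    return (base * shifted).sum() / (g * g).sum()

# Vertical stripes with a 2-pixel period: strong self-similarity at a
# shift equal to the period, none at half the period.
stripes = np.tile([1.0, 0.0], (8, 4))     # 8x8 alternating columns
print(autocorrelation(stripes, 2, 0))     # 0.75 (aligned with the period)
print(autocorrelation(stripes, 1, 0))     # 0.0  (stripes misaligned)
```

Peaks of this function over shifts (dx, dy) reveal the dominant repetition period of a texture.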
The mathematical model of the color-texture CBIR system is given by the combination of (1) and (6).
The mathematical model of the color-shape-texture CBIR system is given by the combination of (1), (4), and (5).

GRAPH PARTITIONING METHOD OF CBIR
The initial stage of the CBIR process is to split the image into segments. To obtain an image segment, the edges of the query (input) image must be identified: the image is first converted to grey-scale, and then an edge detection algorithm is applied. The second step is feature extraction. Through these two phases, an image is divided into many clusters (regions) [16].
The target images that are closest to the query image, as indicated by similarity measures, are chosen as the neighbors of the query image. The two preprocessing stages that distinguish the unsupervised CBIR method from other CBIR methods, namely the selection of neighboring target images and the clustering of images (the significant parts of the unsupervised CBIR method, CLUE [6]), are depicted in Figure 2.
A weighted undirected graph G = (V, E) characterizes a set of images, where V is the set of nodes that represent the images, i.e., V = {1, 2, …, n}, and E is the set of edges formed between each pair of nodes. In a particular graph G = (V, E) with affinity matrix w, a straightforward way to evaluate the cost of dividing the images (nodes) into two disjoint sets S and T (S ∩ T = Φ and S ∪ T = V) is the total weight of the edges that connect the two sets; this cost is known as a partition [22]:

partition(S, T) = Σ_{i ∈ S, j ∈ T} w_ij
Here, w_ij = exp(−d(i, j)^2 / (2σ^2)), where d(i, j) is the distance between the feature vectors of images i and j; w_ij can also be interpreted as a measure of the similarity between the two images. Bipartitioning the graph so as to minimize the value of the partition yields the minimum partition, given in [23]. The normalized graph partition criterion, Npart, is defined as (12).
An unbalanced graph partition results in a large Npart cost.
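The partition cost described above can be illustrated with a small sketch; the affinity matrix and node sets below are made-up examples, not data from the paper.

```python
import numpy as np

def cut_cost(w, S, T):
    """Cost of partitioning a graph with affinity matrix w into two
    disjoint node sets S and T: the total weight of the edges that
    connect the two sets."""
    return sum(w[i, j] for i in S for j in T)

# Made-up 4-node graph: nodes {0, 1} are tightly connected, so are
# {2, 3}; only weak links cross the two groups.
w = np.array([[0.0, 0.9, 0.1, 0.0],
              [0.9, 0.0, 0.0, 0.1],
              [0.1, 0.0, 0.0, 0.9],
              [0.0, 0.1, 0.9, 0.0]])
print(cut_cost(w, [0, 1], [2, 3]))  # 0.2, the natural, cheap partition
print(cut_cost(w, [0, 2], [1, 3]))  # 1.8, cuts through both tight groups
```

Minimizing this cost over all bipartitions favors splits that sever only weak cross-group links.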
Using the generalized eigenvalue problem (D − W)y = λDy, Shi and Malik built a model for solving the normalized cut of graph and image segmentation problems [23].
Here, W is a square affinity matrix of size n × n and D = Diag[d_1, d_2, …, d_n] is a diagonal matrix with d_i = Σ_j w_ij. In a given graph representation of images G = (V, E) with affinity matrix w, suppose the image clusters are {K_1, K_2, …, K_m}, which form a partition of V, i.e., K_i ∩ K_j = Φ for i ≠ j. Then the representative node (image) of K_i is the node j ∈ K_i that maximizes Σ_{k ∈ K_i} w_jk. Here, w_jk is the affinity value between the jth and kth nodes, and m is the number of clusters; this sum can likewise be seen as a measure of within-cluster similarity. Graph partitioning is illustrated by an example below. First, note that the number of clusters is not fixed; it depends on a condition. In this work, the condition applied is that if a cluster has fewer than 100 images, that cluster cannot be partitioned further.
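The Shi-Malik normalized-cut step can be sketched by converting the generalized eigenproblem (D − W)y = λDy into a standard symmetric one; this is a minimal illustration, not the paper's implementation, and the toy affinity matrix is made up.

```python
import numpy as np

def normalized_cut_bipartition(w):
    """Bipartition a graph via the Shi-Malik normalized cut: solve
    (D - W)y = lambda * D * y by reducing it to a standard symmetric
    eigenproblem, then split nodes by the sign of the eigenvector
    belonging to the second-smallest eigenvalue."""
    d = w.sum(axis=1)                      # degrees d_i = sum_j w_ij
    d_inv_sqrt = np.diag(1.0 / np.sqrt(d))
    L = np.diag(d) - w                     # graph Laplacian D - W
    L_sym = d_inv_sqrt @ L @ d_inv_sqrt    # symmetric normalized Laplacian
    vals, vecs = np.linalg.eigh(L_sym)     # eigenvalues in ascending order
    y = d_inv_sqrt @ vecs[:, 1]            # second-smallest eigenvector
    return y >= 0                          # boolean cluster labels

# Toy affinity structure: two tight pairs with weak cross links.
w = np.array([[0.0, 0.9, 0.1, 0.0],
              [0.9, 0.0, 0.0, 0.1],
              [0.1, 0.0, 0.0, 0.9],
              [0.0, 0.1, 0.9, 0.0]])
labels = normalized_cut_bipartition(w)
print(labels)  # nodes 0 and 1 end up on one side, 2 and 3 on the other
```

Applying this bipartition recursively, until a cluster falls below the size condition above, yields the cluster tree used in the retrieval steps.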
Step 1: Suppose a total of 100 images are in the database, as shown in Figure 3 (N = 100); these are the preprocessed target images, and these nodes (images) are arranged with respect to the query image.
Step 2: First, search for the group of neighboring target images with respect to the query image using the nearest neighbors method (NNM).
Step 3: Next, create a weighted undirected graph that includes the query image as well as its neighboring target images using (11), and evaluate the sum of the total weights for all target images.
Step 4: To pick a matched target node, use (15) to find the node with the highest sum of attribute values among the 100 target nodes in each cluster.

Step 5: Criterion for choosing the cluster to partition: select the cluster that has the maximum number of nodes (images), then repeat step 4 (select the matched node for this cluster, and so on).
Step 6: Stopping criterion: the condition in this implementation was that if a cluster had more than 100 images, the cluster would be divided into two sub-clusters.
Step 7: Retrieval of relevant images: leaf clusters are retrieved from left to right (that is, by in-order traversal); once the first 100 images are obtained, the program stops and is re-initiated, and so on.
Step 8: Finally, store each collection of images in a file for all queries (for all iterations) and manually count the precision at different levels of k, with k = 10, 20, 30, up to 100 [24].
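The precision-at-k counting in step 8 can be sketched as follows; the image IDs and class layout below are hypothetical.

```python
def precision_at_k(retrieved, relevant, k):
    """Fraction of the top-k retrieved images that belong to the query's
    class (the relevant set). Image IDs here are arbitrary integers."""
    return sum(1 for img in retrieved[:k] if img in relevant) / k

# Hypothetical ranked list for a query whose class holds images 0-99.
retrieved = [3, 7, 250, 12, 480, 55, 61, 300, 88, 14]
relevant = set(range(100))
print(precision_at_k(retrieved, relevant, 5))   # 0.6 (3 of top 5 relevant)
print(precision_at_k(retrieved, relevant, 10))  # 0.7 (7 of top 10 relevant)
```

Evaluating this at k = 10, 20, …, 100 produces the precision curves reported in the results.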
In the color-shape and color-texture CBIR methods, the feature values are saved in the stored-feature files after adding the values of the color, shape, and texture visual features of an image, with a minimum 70% value of each feature in each image [16], [24]. The same applies to the newly developed color-shape-texture CBIR method: 70% of the texture features, 70% of the color features, and 70% of the shape features of an image are fused, and the fused feature values are stored in the files. Finally, the union of all three methods is taken by normalizing the values between 0 and 1 at different precision levels, as shown in Figure 4.
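The 70% fusion-and-normalization step can be sketched as below. Note that the paper does not state exactly which 70% of each feature set is kept, so selecting the top-valued components is an assumption here.

```python
import numpy as np

def fuse_features(color, shape, texture, keep=0.7):
    """Keep 70% of each feature vector, concatenate the parts, and
    min-max normalize the result to [0, 1]. Selecting the top-valued
    components as the kept 70% is an assumption, not the paper's rule."""
    parts = []
    for vec in (color, shape, texture):
        vec = np.asarray(vec, dtype=float)
        n_keep = max(1, int(round(keep * vec.size)))
        parts.append(np.sort(vec)[::-1][:n_keep])   # strongest components
    fused = np.concatenate(parts)
    lo, hi = fused.min(), fused.max()
    return (fused - lo) / (hi - lo) if hi > lo else np.zeros_like(fused)

# Toy feature vectors of different lengths and scales.
fused = fuse_features([0.2, 0.8, 0.5, 0.9], [1.5, 0.3, 2.0], [10.0, 4.0])
print(fused.min(), fused.max())  # 0.0 1.0 after normalization
```

The min-max normalization puts the three heterogeneous feature scales on a common [0, 1] range before the union is taken.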

RESULTS AND DISCUSSION
The experiments were performed in MATLAB 9.5 with the general-purpose COREL image database, which contains 10 different classes of images; each class has 100 images of resolution 256x384, for a total of approximately 1,000 images, as shown in Table 1 [25]. The experiment provides a random choice that gives the user a random arrangement of images from the image database. At present, only the top 25 results are presented because of space restrictions. In this work, precision is evaluated at a given cut-off rank of 0.7 (threshold of 70%).
In these experiments, the same feature extraction technique is used as given in [6]. The Euclidean distance is used as the similarity measure for computing the similarity between the query and target images.
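The Euclidean-distance ranking of target images against a query can be sketched as follows; the 2-D feature vectors are hypothetical.

```python
import numpy as np

def rank_by_euclidean(query, targets):
    """Rank target feature vectors by Euclidean distance to the query
    feature vector; a smaller distance means a more similar image."""
    q = np.asarray(query, dtype=float)
    t = np.asarray(targets, dtype=float)
    dists = np.sqrt(((t - q) ** 2).sum(axis=1))
    order = np.argsort(dists)        # indices, nearest target first
    return order, dists[order]

# Hypothetical 2-D feature vectors for one query and three targets.
order, dists = rank_by_euclidean([0.1, 0.9],
                                 [[0.8, 0.2], [0.1, 0.8], [0.5, 0.5]])
print(list(order))  # [1, 2, 0]: target 1 is closest to the query
```

The returned order directly gives the ranked list from which precision at k is computed.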
In the experiments, each image from each of the 10 image classes is considered as a query image, giving a total of 1,000 query images. The top k results are chosen from the CBIR methods to calculate precision, i.e., precision at k. The results of each of the three approaches (color-shape, color-texture, and color-shape-texture) are shown in Figures 5(a)-(c). After that, the union of all three approaches is performed by normalizing the values in the range between 0 and 1. The experiments report the average precision at 100 for all three CBIR approaches, and the varying precision levels of the three approaches (color-shape, color-texture, and color-shape-texture) together with their union; these are graphically represented in Figures 6-8, respectively. One query image is taken from the flower class (image ID 600), as shown in Figure 5, with the corresponding results listed in Tables 2-4, respectively. Figure 7 shows that, taking the average over all classes, the union-based system is better than the other CBIR approaches. Figure 8 shows that at smaller precision values the union of the three approaches (color-shape, color-texture, and color-shape-texture) is almost 1, i.e., almost a 100% retrieval rate, but as the precision level increases the retrieval rate gradually decreases.

Figure 6. Comparison of the color-shape, color-texture, and color-shape-texture CBIR systems and the union of these three with CLUE, on the average precision at 100 for each class of the image database
Figure 7. Comparison of the average over all classes of the different CBIR approaches with the union of the color-shape, color-texture, and color-shape-texture approaches

CONCLUSION AND FUTURE DIRECTION
In this paper, a new CBIR system has been developed by combining three traits of an image, i.e., color, shape, and texture. The experimental results of four unsupervised CBIR approaches at varying precision levels, namely CLUE, color-shape, color-texture, and color-shape-texture, have been compared. It is experimentally found that the developed CBIR model combining three visual features produces better results than the two CBIR methods combining two visual features (color-shape and color-texture), as well as than the existing method CLUE. It is also experimentally found that the union of all three approaches produces better performance at varying precision levels of k. It is observed that the quality of the clusters relies on the choice of the graph partitioning algorithm. In the future, other clustering algorithms can also be tested for possible performance enhancement.