Modified CSLBP

Image hashing is an efficient way to handle the digital data authentication problem. An image hash is a compact, high-quality summary of image features. In this paper, a modified center symmetric local binary pattern (CSLBP) image hashing algorithm is proposed. Unlike CSLBP, which produces a 16 bin histogram, Modified CSLBP generates an 8 bin histogram without compromising quality, which yields a compact hash. It has been found that uniform quantization of a histogram with more bins results in more precision loss. To reduce quantization loss, modified CSLBP generates two 4 bin histograms. Uniform quantization of a 4 bin histogram results in less precision loss than that of a 16 bin histogram. The first histogram represents the nearest neighbours and the second represents the diagonal neighbours. To enhance discrimination power, weight factors are used during histogram generation. For the nearest and the diagonal neighbours, two local weight factors are used: the Standard Deviation (SD) and the Laplacian of Gaussian (LoG). Standard deviation represents the spread of the data and captures local variation from the mean. LoG is a second order derivative edge detection operator that detects edges well in the presence of noise. The proposed algorithm is resilient to various kinds of attacks. The proposed method is tested on a database of malicious and non-malicious images using benchmarks such as NHD and ROC, which confirm the theoretical analysis. The experimental results show good performance of the proposed method for various attacks despite the short hash length.


INTRODUCTION
Over the last decade, there have been tremendous developments and advances in digital media such as image, audio and video. Various image editing tools are also easily available for modifying original content. Intentionally or unintentionally, these editing operations might change data maliciously. To deal with such problems, blind and non-blind approaches exist to authenticate the original content. Blind approaches do not need any extra information to detect a change in the original content, while non-blind approaches need some piece of information to determine the authenticity of the data. Watermarking and hashing fall under the category of non-blind techniques. Image hashing represents the image in an abstract form, obtained by extraction, compression and quantization of important features. In image hashing, unlike watermarking, the generated image hash is not inserted in the image data; rather, it is stored in the image header. Therefore the original content of the image remains intact. As the hash is stored separately in the image, it must be compact in length. To identify whether a content-change or a content-preserving operation has been applied to the original data, the hash code of the original image stored in the image header is compared with the hash of the modified image. If the difference between the compared hash codes exceeds the set threshold, it indicates a malicious operation. Apart from compact size, the other desirable property of a hash is discrimination power, that is, the ability to distinguish between content-preserving and content-change operations [1][2][3].
The extracted image features are large in size due to the high dimensional nature of the image. In order to restrict the hash to a small size, it is necessary to extract quality features at various levels (local, semi-global and global) and in various domains, and to store them in quantized form. The proposed hashing method, 'Modified CSLBP', extracts texture details from an image. Center Symmetric Local Binary Pattern (CSLBP) is a texture descriptor used for hashing [4]. CSLBP covers the entire local region with only four pairs, which results in a 16 bin histogram. In addition to the advantage of a small histogram, CSLBP captures structural changes in strength and gives rotational invariance. The proposed method uses CSLBP in a modified form based on the position of the neighbours. It covers the entire local region and represents it with only an 8 bin histogram, made up of two 4 bin histograms, which gives a compact and high-quality image hash code. Quantization loss on a 16 bin histogram (CSLBP) is greater than quantization loss on a 4 bin histogram, so the proposed method reduces quantization loss by quantizing 4 bin histograms. CSLBP uses only sign information. The proposed method improves discrimination capability by combining local weight factors, namely Standard Deviation (SD) and Laplacian of Gaussian (LoG), with the sign information.
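For reference, the commonly cited definition of CSLBP can be written as below. The symbols g_p, P and T follow the notation used later in this paper; the neighbour indexing (opposite neighbours are P/2 positions apart) is the usual convention and is assumed here rather than taken from this text.

```latex
\mathrm{CSLBP}_{R,P,T} = \sum_{p=0}^{P/2-1} s\!\left(g_p - g_{p+P/2}\right) 2^{p},
\qquad
s(x) =
\begin{cases}
1, & x > T \\
0, & \text{otherwise}
\end{cases}
```

With P = 8 there are four center-symmetric pairs, so the code takes 2^4 = 16 distinct values, which is where the 16 bin histogram comes from.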
The backbone of image hashing is quality feature extraction. Identifying structural changes is important at both the global and the local level. By combining local and global features, methods are capable of detecting image forgery as well as locating the counterfeit area of the image. In the following approaches, global and local features are extracted and used jointly. A local feature from salient object detection using the spectral residual model is combined with a global feature from DWT-SVD (Discrete Wavelet Transform-Singular Value Decomposition) [5]. A local feature from saliency detection is combined with a global feature from ring partition on projected gradient non-negative matrix factorization (PGNMF) [6]. Shape is detected by Zernike moments as a global feature, while position and texture are detected by salient point detection as a local feature [7]. Zernike moments represent the global feature, and Haralick texture, which extracts 14 local statistics, represents the local texture feature [8]. Global Zernike moments are combined with the local MOD-LBP feature [9]. A Radon transformed image has both local and global features: invariant moments from the Radon coefficients represent the global feature, and statistical measures such as the zero-order moment, variance, singular value and DC component form the local features [10]. DCT (Discrete Cosine Transform) as a global feature is combined with local feature extraction using least-squares line (LSL) fitting of Discrete Wavelet Transform (DWT) coefficients [11]. A DCT global feature and a Gray Level Co-occurrence Matrix (GLCM) local feature are used in combination [12].
Frequency domain methods are quite popular in hashing, as transformed coefficients are invariant to various geometric attacks. DCT is applied to a Radon transformed image, and various statistical features are extracted from the AC components to generate the hash [13]. The Fourier-Mellin Transform (FMT) is applied to an image to get translation invariance. The Fourier transform is then applied to the log-polar coordinates of the FMT transformed image to obtain rotation and scale invariance, and the resulting coefficients are used to obtain the hash [14]. Content-change coefficients are generated by applying DWT first, followed by the Radon transform [15]. The sign components of the DCT coefficients carry information about textures and edges, which is utilized in hash formation [16]. SVD is applied to a contourlet HMT transformed image to select the most efficient components, followed by randomization to generate the final hash [17].
Methods based on matrix factorization provide an efficient way of separating the components that carry the most important information. NMF is applied twice on pseudo-random sub-images of the original image. This method distinguishes between malicious and non-malicious attacks but fails for local region forgery [18]. NMF is performed on the luminance component of a pseudo-randomly re-arranged input image. The hash is constructed based on the concept that the relationship between adjacent entries in the NMF coefficient matrix is basically invariant to content-preserving image operations [19].
Other approaches use various spatial and statistical features. SIFT and the Harris detector detect locally stable, robust feature points. These points are embedded into shape-contexts-based descriptors [20]. Local robust SIFT feature points of the original image and its attacked version are found, and these points are matched using a distance vector [21].
Texture extraction is a very popular approach to image hashing. Textural changes are an efficient way to discriminate between malicious and non-malicious activities. Various approaches are available for texture detection. In particular, Local Binary Pattern (LBP) is a popular texture descriptor which extracts texture details at the local level and binds them at the semi-global level through a histogram. The problem associated with LBP is that the histogram generated for a local region of size 3×3 has 256 bins [22]. There are many variants of LBP, such as MBP, ILBP, RLBP and DLBP, which capture texture strength in different ways. LBP variants are also available for color images. The main drawback of LBP and its variants is the large number of histogram bins, which eventually affects the final size of the descriptor. To achieve a short hash length, CSLBP is a suitable option for hashing. The method of Davarzani uses CSLBP with a weight factor. Four weight factors are generated from the magnitude differences of the four cross-symmetric pairs of CSLBP. The drawback of this method is that the hash size is increased by 4 times. Also, the weight factor contributes very little to enhancing discrimination power [23].
In our previous approaches, we found that CSLBP can be made more robust for discrimination if a local weight factor is utilized during CSLBP histogram construction. The local weight factor captures local strength, which is bound into the histogram. In our AQ-CSLBP, SDQ-CSLBP, CoCQ-CSLBP and LoGQ-CSLBP approaches, the average of magnitude difference, standard deviation, correlation coefficient and Laplacian of Gaussian are used as the local weight factor, respectively [24][25][26][27]. All of these methods compress the 16 bin CSLBP histogram to an 8 bin histogram using the flipped difference concept [28]. Without a weight factor, the discrimination power of Q-CSLBP is less desirable.
The proposed method covers the local region of size 3×3 using two histograms of 4 bins each: one histogram covers the two opposite pairs and the other covers the two cross-diagonal pairs. Therefore the first and second histograms together have 8 bins. Another advantage is that uniform quantization of a 4 bin histogram incurs a smaller loss than uniform quantization of a 16 bin histogram. The rest of this paper is organized as follows: Section 2 gives a detailed explanation of the proposed modified CSLBP hashing method. Section 3 discusses the experimental results and analysis. We present our conclusions in Section 4.

PROPOSED METHOD
The proposed method is designed for gray scale images, which are mainly characterized by texture and shape. The input image is resized to 256×256 using bilinear interpolation. This is done for experimental purposes and for comparative result analysis. In the pre-processing step, the input image is smoothed with a Gaussian filter. Gaussian filtering makes the input image robust to content-preserving manipulations and reduces the disturbance caused by manipulations such as noise and lossy compression. For the LoG weight factor, the gradient image is generated from the input image.
After pre-processing, the modified CSLBP is applied to the entire image. For the modified CSLBP calculation, the local region size is confined to 3×3. After the modified CSLBP, each image pixel is represented by two values, each in the range 0 to 3. The first value is generated from the nearest neighbours and the second from the diagonal neighbours. A center pixel gc has eight neighbours, as shown in Figure 1(a). The neighbours are classified as nearest and diagonal neighbours, as shown in Figure 1(b) and 1(c) respectively. Equations (1) and (2) represent the modified CSLBP for the nearest and the diagonal neighbours, where T is a non-negative value used to extract texture from an uneven surface; gc is the center pixel; gp are the neighbours of the center pixel; P is the number of neighbours of the center pixel; gp+(P/4) is sign function of MCSLBP; MCSLBP-N and MCSLBP-D are the modified CSLBP for the nearest and the diagonal neighbours respectively. In the modified CSLBP, the pixel value ranges from 0 to 3 for each neighbour type. As in CSLBP, all four cross-symmetric pairs are covered. Unlike CSLBP, however, the pairs are not all combined in one 16 bin histogram. Instead, two different histograms of four bins each are generated by separating the neighbours. The resulting modified CSLBP histogram has 8 bins, which is a 50% saving in hash code length. Two weight factors, Standard Deviation (SD) and Laplacian of Gaussian (LoG), are used for the nearest and the diagonal neighbours. The SD weight factor is calculated from the original image, while the LoG weight factor is derived from the gradient image.
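The two codes described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the neighbour ordering g0 to g7, the pairing of opposite neighbours, the bit order and the strict comparison against T are all assumed, since Figure 1 and Equations (1) and (2) are not reproduced here. The helper names are hypothetical.

```python
# Assumed neighbour layout of a 3x3 patch (hypothetical ordering):
#   g3 g2 g1
#   g4 gc g0
#   g5 g6 g7
NEAREST_PAIRS = ((0, 4), (2, 6))    # horizontal and vertical opposite pairs
DIAGONAL_PAIRS = ((1, 5), (3, 7))   # the two diagonal opposite pairs


def neighbours(patch):
    """Return g0..g7 of a 3x3 patch in the assumed counter-clockwise order."""
    return [patch[1][2], patch[0][2], patch[0][1], patch[0][0],
            patch[1][0], patch[2][0], patch[2][1], patch[2][2]]


def sign(x, t):
    """Thresholded sign function: 1 if the difference exceeds t, else 0."""
    return 1 if x > t else 0


def pair_code(g, pairs, t):
    """Pack one thresholded difference per pair into a 2-bit code (0..3)."""
    return sum(sign(g[a] - g[b], t) << k for k, (a, b) in enumerate(pairs))


def mcslbp_codes(patch, t=0.1):
    """Return (nearest code, diagonal code) for one 3x3 patch."""
    g = neighbours(patch)
    return pair_code(g, NEAREST_PAIRS, t), pair_code(g, DIAGONAL_PAIRS, t)


if __name__ == "__main__":
    patch = [[90, 80, 70],
             [60, 50, 40],
             [30, 20, 10]]
    print(mcslbp_codes(patch))
```

Each code uses two pairs, so it takes one of four values; keeping the nearest and diagonal codes apart is what gives two 4 bin histograms instead of one 16 bin histogram.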
Here SDN and SDD are the Standard Deviation weight factors of the nearest and the diagonal neighbours respectively; gi is the set of observations of the particular neighbours; and g bar is the mean of the observations of the particular neighbours.
The Laplacian of an image highlights regions of rapid intensity change and is therefore often used for edge detection. If the Laplacian filter is applied directly to a noisy image, the result is an edge image with many small edges that are of little use. The Laplacian is therefore often applied to an image that has first been smoothed with a Gaussian smoothing filter in order to reduce its sensitivity to noise. The LoG response will be zero in areas where the image has a constant intensity. However, in the vicinity of an intensity change, the LoG response will be positive on the darker side and negative on the lighter side. This indicates a reasonably sharp edge between two regions of uniform but different intensities. The Laplacian of Gaussian filter detects horizontal and vertical boundaries as well as boundaries in other orientations. The 2D Laplacian of Gaussian (LoG) function centered on zero and with Gaussian standard deviation sigma (σ) has the form.
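In its standard textbook form, with the sign convention that matches the positive response on the darker side described above, the zero-centered 2D LoG can be written as follows. The equation number used in the paper is not known, so none is assigned here.

```latex
\mathrm{LoG}(x, y) = -\frac{1}{\pi \sigma^{4}}
\left[ 1 - \frac{x^{2} + y^{2}}{2\sigma^{2}} \right]
e^{-\frac{x^{2} + y^{2}}{2\sigma^{2}}}
```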
where σ is the standard deviation, and x and y are the spatial coordinates of the image. The amount of smoothing can be controlled by varying the value of the standard deviation. In the proposed method, the LoG of the input image is calculated to generate the gradient image. The weight factor is determined by taking the average of the LoG gradient information of the nearest and the diagonal neighbours respectively. For example, a pixel Gc has 8 gradient neighbours, G0 to G7, and LoGN and LoGD are the LoG weight factors of the nearest and the diagonal neighbours respectively. The final weights for the nearest and the diagonal neighbours are given by (9) and (10).
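A rough sketch of the four local quantities discussed above is given below. Several details are assumptions, not taken from the paper: the indices chosen for nearest and diagonal neighbours, the use of the population standard deviation, and the kernel radius. The rule that combines SD and LoG into the final weights of (9) and (10) is not reproduced in this text, so the sketch deliberately stops at the four separate values. All function names are hypothetical.

```python
import math
from statistics import mean, pstdev

NEAREST = (0, 2, 4, 6)    # assumed indices of horizontal/vertical neighbours
DIAGONAL = (1, 3, 5, 7)   # assumed indices of diagonal neighbours


def log_value(x, y, sigma):
    """Zero-centered 2D Laplacian of Gaussian evaluated at (x, y)."""
    r2 = (x * x + y * y) / (2.0 * sigma * sigma)
    return -1.0 / (math.pi * sigma ** 4) * (1.0 - r2) * math.exp(-r2)


def log_kernel(sigma=0.9, radius=2):
    """Sample the LoG on a (2*radius+1)^2 grid; the radius is an assumption."""
    span = range(-radius, radius + 1)
    return [[log_value(x, y, sigma) for x in span] for y in span]


def local_weights(g, grad):
    """Return (SD_N, SD_D, LoG_N, LoG_D) for one pixel.

    g    -- the 8 intensity neighbours g0..g7
    grad -- the 8 LoG gradient neighbours G0..G7
    """
    sd_n = pstdev([g[i] for i in NEAREST])      # population SD (assumed)
    sd_d = pstdev([g[i] for i in DIAGONAL])
    log_n = mean([grad[i] for i in NEAREST])    # plain average of LoG values
    log_d = mean([grad[i] for i in DIAGONAL])
    return sd_n, sd_d, log_n, log_d


if __name__ == "__main__":
    print(local_weights([60, 30, 20, 10, 40, 70, 80, 90],
                        [1, 2, 3, 4, 5, 6, 7, 8]))
```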
Here WN and WD are the weight factors of the nearest and the diagonal neighbours respectively. After the modified CSLBP has been calculated, the histogram is constructed at the sub-block level. For every sub-block, two histograms of 4 bins each are generated. While constructing the modified CSLBP histogram, a histogram bin is not incremented by one as in the CSLBP histogram; instead, the bin is incremented by the weight factor. The modified CSLBP histograms for the nearest and the diagonal neighbours are built with this weighted update. If the image is manipulated maliciously, the weight factors of the original image and its modified version will not be the same. This difference captures the perceptual characteristics needed for hashing. For content-preserving operations, the image hashes of the original and the content-preserving modified image differ, but the difference between the hash codes remains within the limit of the set threshold. If the modified CSLBP histogram is constructed without a weight factor, the discrimination power, which contributes to the success rate, is low. A histogram constructed with a weight factor captures perceptual information at the local level and identifies the changed area of an image.
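The weighted update described above might look like the following sketch for one sub-block. It assumes that the per-pixel codes (0 to 3) and the per-pixel weights for each neighbour type have already been computed; the function name and the flat-list input format are hypothetical.

```python
def weighted_histograms(codes_n, codes_d, weights_n, weights_d):
    """Build the two 4 bin histograms of one sub-block and join them.

    Each bin is incremented by the pixel's weight instead of by one.
    Returns an 8-element list: nearest bins first, then diagonal bins.
    """
    hist_n = [0.0] * 4
    hist_d = [0.0] * 4
    for code, weight in zip(codes_n, weights_n):
        hist_n[code] += weight
    for code, weight in zip(codes_d, weights_d):
        hist_d[code] += weight
    return hist_n + hist_d


if __name__ == "__main__":
    print(weighted_histograms([0, 1, 1, 3], [2, 2, 0, 3],
                              [0.5, 1.0, 2.0, 0.25], [1.0, 1.0, 1.0, 1.0]))
```

With all weights equal to one, this reduces to an ordinary count histogram, which is the unweighted case the text compares against.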
Uniform quantization is applied separately to each histogram to generate a binary hash. In uniform quantization, the step size between adjacent quantized levels is fixed. All the sub-blocks are processed in this manner, and the quantized hash codes of all sub-blocks are concatenated to generate the final hash of the image. On the receiver side, binary hashes can be compared efficiently using the Hamming distance. If the Hamming distance is less than the set threshold, the manipulation is treated as content-preserving; otherwise it is treated as a content-change manipulation.
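One possible reading of the quantization step is sketched below. The number of bits per bin, the scaling by the histogram maximum and the assumption of non-negative bin values are illustrative choices only; the paper does not state them in this text. The function names are hypothetical.

```python
def quantize_uniform(hist, bits=2):
    """Map each bin to one of 2**bits equally spaced levels.

    The histogram is scaled by its own maximum (an assumption), so the
    step size between adjacent levels is fixed at max(hist) / 2**bits.
    """
    levels = 2 ** bits
    top = max(hist)
    if top <= 0:
        return [0] * len(hist)
    step = top / levels
    return [min(int(value / step), levels - 1) for value in hist]


def to_bits(quantized, bits=2):
    """Expand quantized levels into a flat list of bits, MSB first."""
    return [(q >> k) & 1 for q in quantized for k in reversed(range(bits))]


def sub_block_hash(hist_n, hist_d, bits=2):
    """Quantize the two 4 bin histograms separately and join the bits."""
    return (to_bits(quantize_uniform(hist_n, bits), bits)
            + to_bits(quantize_uniform(hist_d, bits), bits))


if __name__ == "__main__":
    print(sub_block_hash([0.0, 1.0, 2.0, 4.0], [4.0, 0.0, 0.0, 0.0]))
```

The final image hash would then be the concatenation of `sub_block_hash` outputs over all sub-blocks.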

EXPERIMENTAL RESULTS AND ANALYSIS
In image hashing authentication, robustness to content-preserving operations and sensitivity to content-change operations are important properties to be evaluated. These two properties are evaluated using two benchmarks: Normalized Hamming Distance (NHD) and Receiver Operating Characteristics (ROC). Both benchmarks are suitable for binary classification, that is, authentic versus non-authentic. NHD measures how much the hash changes under both content-preserving and content-change operations. ROC checks the discrimination capability of hashing methods.

Experimental setup
From the original database, two databases are created, namely malicious and non-malicious. For analysis purposes, a total of 36 images are taken from the Matlab directory and the internet. To compare performance with other methods, all images are set to a uniform standard size of 256×256. For every image, a total of 61 attacks are applied, as specified in Table 1. Some of the attacks are content-preserving while others are content-change. The last column of Table 1 specifies the acronyms for the various attacks. The parameters used in the modified CSLBP calculation are as follows. The input image is divided into non-overlapping sub-blocks of size 3×3, i.e. R = 1 and P = 8, where P is the number of neighbours around the center pixel. T is the non-negative threshold for texture extraction and is set to 0.1. The gradient image (G) is generated by applying the LoG operator to the input image, with σ = 0.9. For histogram generation, the sub-block size is set to 32×32. This sub-block size gives a good balance between hash size and discrimination capability.
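As a quick check of what these settings imply, the number of histogram bins per image can be worked out as below. The count of bits assigned to each bin after quantization is not stated in this text, so only the bin totals are compared.

```latex
N_{\text{blocks}} = \left(\frac{256}{32}\right)^{2} = 64,
\qquad
N_{\text{bins}}^{\text{modified CSLBP}} = 64 \times 8 = 512,
\qquad
N_{\text{bins}}^{\text{CSLBP}} = 64 \times 16 = 1024
```

The ratio 512/1024 corresponds to the 50% reduction claimed for the modified CSLBP.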

Perceptual robustness test
The perceptual robustness measure indicates content preservation. It ensures that the original image and its attacked version are visually similar. Such modifications are categorized as non-malicious operations, and the attacked version is accepted as an authentic image. To check for visual similarity, the normalized Hamming distance is used. The Hamming distance is a simple XOR operation: two hashes, one from the original image and the other from its attacked version, are XORed to get the Hamming distance. The Hamming distance is normalized to simplify analysis. A threshold TNHD is set for the Normalized Hamming Distance (NHD). For authentic images, the NHD between the original image and its attacked version is less than TNHD, and for non-authentic images it is greater than the set threshold. TNHD is different for every method. For modified CSLBP, TNHD is 0.14, as shown in Figure 3. Observations: With this threshold, the method distinguishes almost cleanly between authentic and non-authentic images, except for JPEG non-authentic images. The difference between the minimum and maximum NHD is also large: the minimum is 0.03 and the maximum is 0.31.
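The comparison described above is simple enough to sketch directly. The function names are hypothetical, the hashes are assumed to be equal-length lists of bits, and the example hashes are made up for illustration; the default threshold is the TNHD value of 0.14 reported for the modified CSLBP.

```python
def normalized_hamming_distance(hash_a, hash_b):
    """XOR two equal-length bit lists and divide the count by the length."""
    if len(hash_a) != len(hash_b):
        raise ValueError("hashes must have the same length")
    return sum(a ^ b for a, b in zip(hash_a, hash_b)) / len(hash_a)


def is_authentic(hash_original, hash_received, threshold=0.14):
    """Accept the image when the NHD stays below the threshold."""
    return normalized_hamming_distance(hash_original, hash_received) < threshold


if __name__ == "__main__":
    original = [0, 1, 1, 0, 1, 0, 0, 1]
    attacked = [0, 1, 0, 0, 1, 0, 0, 1]   # made-up example, one bit differs
    print(normalized_hamming_distance(original, attacked))
    print(is_authentic(original, attacked))
```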

NHD results with comparative methods
The proposed method is compared with existing methods, Method I to Method VII, as listed in Table 4. The results clearly show that Method I and Method III satisfy perceptual robustness. Method I implements the CSLBP texture operator and generates a 16 bin histogram. Method III is the same as Method I, except that the histogram is compressed from 16 bins to 8 bins using the flipped difference concept. Method II, implemented by Davarzani, has poor perceptual robustness, as it fails to distinguish between content-change and content-preserving operations. This method uses the magnitude of the difference of the cross-symmetric pairs of CSLBP as the weight factor. For each pair, a separate 16 bin histogram is generated. This results in a 64 bin histogram and subsequently increases the resulting hash size. Methods IV to VII are our previous approaches, in which we achieved perceptual robustness as well as discrimination capability. Methods IV to VII all generate an 8 bin histogram using the flipped difference concept. Although the flipped difference concept compresses the histogram, its overall discrimination power is low. To enhance this discrimination power, various weight factors are utilized during CSLBP construction. In our proposed approach, the CSLBP equations are arranged according to neighbour position, which gives a 50% reduction in histogram bins without compromising quality and without compression.

Discrimination test
The Receiver Operating Characteristic (ROC) curve is used to display the performance of a binary classification algorithm at various threshold settings. TPR and FPR indicate robustness and discrimination, respectively. The area under the ROC curve is a measure of how well a parameter can distinguish between two diagnostic groups (authentic/non-authentic), and accuracy is measured by this area. Table 5 shows TPR and FPR for the proposed method. Observations: For compressed image hashing, the success rate is 89%. For almost all attacks, the proposed method 'Modified CSLBP' shows better discrimination capability. It shows only average discrimination capability for the JPEG attack, as JPEG non-authentic images have a smooth visual appearance.
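A small sketch of how ROC points and the area under the curve could be computed from NHD values is shown below. It assumes that an image pair is accepted as authentic when its NHD is below the threshold, so TPR is the accepted fraction of content-preserving pairs and FPR is the accepted fraction of content-change pairs. The NHD values in the example are made up for illustration and are not results from this paper; the function names are hypothetical.

```python
def roc_points(nhd_preserving, nhd_changed, thresholds):
    """Return one (FPR, TPR) point per threshold."""
    points = []
    for t in thresholds:
        tpr = sum(d < t for d in nhd_preserving) / len(nhd_preserving)
        fpr = sum(d < t for d in nhd_changed) / len(nhd_changed)
        points.append((fpr, tpr))
    return points


def area_under_curve(points):
    """Trapezoidal area under the ROC points, sorted by FPR."""
    pts = sorted(points)
    return sum((x1 - x0) * (y0 + y1) / 2.0
               for (x0, y0), (x1, y1) in zip(pts, pts[1:]))


if __name__ == "__main__":
    preserving = [0.03, 0.05, 0.10, 0.20]   # illustrative values only
    changed = [0.12, 0.25, 0.30, 0.31]      # illustrative values only
    pts = roc_points(preserving, changed, [0.0, 0.14, 1.0])
    print(pts, area_under_curve(pts))
```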
On average over the database, TPR is 0.89. If the weight factor is not utilized, TPR is close to 0.82, which shows that the discrimination power of the hashing algorithm can be improved with the help of the local weight factor; see Figure 4 to Figure 15. Figure 2 to Figure 12 show that the proposed modified CSLBP is quite robust to almost all types of attack, with good discrimination capability. Performance is average only for the decrease contrast and JPEG quality factor attacks. For Gaussian noise and rotation attacks, performance is better than that of existing methods and of our previously proposed image hashing methods.

CONCLUSION
We have proposed the modified CSLBP image hashing method with weight factors. The original CSLBP is modified according to neighbour location. Modified CSLBP generates two 4 bin histograms for each sub-block. With Modified CSLBP, the resulting hash code is 50% more compact than with the original CSLBP. Quantization loss is decreased when quantization is applied to a 4 bin histogram. Discrimination power is enhanced by using local weight factors, namely standard deviation and LoG. Desirable characteristics of hashing, such as compact length, quality features and good discrimination power, are achieved by the proposed method. The proposed method is robust to a variety of attacks, as shown by the NHD and ROC results.