Super resolution image reconstruction via dual dictionary learning in sparse environment

ABSTRACT


INTRODUCTION
Super-resolution (SR) is the process of recovering a high-resolution (HR) image from one or more low-resolution (LR) input images. Many areas, such as satellite imaging, high-definition television (HDTV), microscopy, traffic surveillance, military and security monitoring, medical diagnosis, and remote sensing, require good-quality images for accurate analysis. The known variables in the LR images are fewer than the unknown variables in the HR image, a sufficient number of LR images is generally not available, and the blurring operators are unknown. Hence, SR reconstruction is an ill-posed problem. Many regularization techniques have been proposed to solve this ill-posed problem [1], [2].
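The imbalance between knowns and unknowns can be made concrete with a small sketch of the degradation model. This is illustrative only: it uses a 3×3 box blur rather than the Gaussian blur used later in the paper, and the function name `degrade` is our own.

```python
import numpy as np

def degrade(hr, scale=2):
    # Blur with a 3x3 box kernel (a stand-in for the paper's Gaussian blur),
    # then down-sample by keeping every `scale`-th pixel.
    padded = np.pad(hr, 1, mode='edge')
    blurred = np.zeros_like(hr, dtype=float)
    for i in range(hr.shape[0]):
        for j in range(hr.shape[1]):
            blurred[i, j] = padded[i:i+3, j:j+3].mean()
    return blurred[::scale, ::scale]

hr = np.arange(64, dtype=float).reshape(8, 8)
lr = degrade(hr)
print(hr.size, lr.size)  # 64 16
```

With a scale factor of two, a single LR observation provides only a quarter as many measurements as there are HR unknowns, which is why the inversion needs regularization.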
The present work aims to recover the SR version of an image from a single LR image. In conventional dictionary learning, one dictionary is trained on LR image patches and another on HR image patches, and the HR image is recovered using sparse representation. With this approach, it is difficult to recover the high-frequency details completely because of the limited size of the dictionary. To overcome this problem, the high frequency to be recovered can be treated as a combination of main high frequency (MHF) and residual high frequency (RHF).
The proposed method comprises two dictionary learning levels; it is a two-layer algorithm in which the high-frequency details are estimated step by step using distinct dictionaries. First, the MHF is recovered through main dictionary learning, which narrows the gap in the frequency spectrum. Afterwards, the RHF is reconstructed through residual dictionary learning, which narrows the gap further. The method is analogous to coarse-to-fine recovery and yields better results. Orthogonal matching pursuit (OMP) is used for generating the sparse representation coefficients of the patches, and the K-means singular value decomposition (K-SVD) algorithm is used for training the dictionaries. This paper is arranged as follows. Section 2 revisits related work on dictionary learning. Section 3 introduces sparse coding and dictionary learning concepts. Section 4 presents the mathematical basics of dictionary learning. Section 5 discusses the proposed method of SR from dual dictionary learning. Section 6 depicts the experimental evaluation and summarizes the results. Section 7 concludes the paper.

RELATED WORK
Dictionary learning is one of the important approaches to single-image super-resolution [3]. Dictionary learning for SR was introduced by Yang et al. [4], in which two dictionaries were jointly trained, one for LR image patches and the other for HR image patches. Zhang et al. [5] developed a computationally efficient method by replacing the sparse recovery step with a matrix multiplication. He et al. [6] used a Bayesian method employing a beta process prior for learning the dictionaries, which was more consistent between the two feature spaces. Bhatia et al. [7] proposed a technique that used coupled dictionary learning, utilizing example-based super-resolution for high-fidelity reconstruction. Yang et al. [8] presented regularized K-SVD for training the dictionary and employed regularized orthogonal matching pursuit (ROMP) to obtain the sparse representation coefficients of the patches. Ahmed et al. [9] discussed coupled dictionaries in which groups of clustered data are designed based on the correlation between data patches, which improves the recovery of fine details. Dictionary learning methods use a large number of image features for learning, and their performance degrades for complex images. This limitation was overcome by Zhao et al. [10] by combining deep learning features with the dictionary technique. Since it is difficult to represent different images with a single universal dictionary, Yang et al. [11] introduced a fuzzy clustering and weighted method to overcome this limitation. Deeba et al. [12] proposed integrated dictionary learning, in which residual image learning is combined with the K-SVD algorithm; wavelets are used, which yields better sparsity and structural details of the image. Huang and Dragotti [13] addressed single-image super-resolution with a deep dictionary learning architecture in which the dictionaries are divided into a synthesis model and an analysis model: high-level features are extracted by the analysis dictionaries, and the regression function is optimized by the synthesis dictionary. Each method aimed to improve the reconstructed super-resolution image further, using different algorithms and approaches.

SPARSE CODING AND DICTIONARY LEARNING
Sparse coding is a learning method for obtaining a sparse representation of the input. Any signal or image patch can be represented as a linear combination of only a few basic elements, each of which is known as an atom; a collection of many atoms forms a dictionary. A high-dimensional signal can be recovered from only a few linear measurements, provided that the signal is sparse. Most natural images admit a sparse representation. If an image is not sparse, it can be sparsified with predefined dictionaries such as the discrete cosine transform (DCT), discrete Fourier transform (DFT), wavelets, contourlets, and curvelets. However, such dictionaries are suitable only for particular classes of images. Learning the dictionary instead of using a predefined one greatly improves performance [14], since the dictionary is tuned to the input images or signals.
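The sparse-coding step described above can be sketched with a toy orthogonal matching pursuit. This is a minimal illustration, not the paper's implementation: the signal is deliberately 1-sparse so that a single greedy step recovers it exactly.

```python
import numpy as np

def omp(D, y, k):
    # Greedy OMP: pick the atom most correlated with the residual,
    # then re-fit all selected atoms by least squares.
    residual, support = y.astype(float).copy(), []
    coef = np.zeros(0)
    for _ in range(k):
        support.append(int(np.argmax(np.abs(D.T @ residual))))
        coef, *_ = np.linalg.lstsq(D[:, support], y, rcond=None)
        residual = y - D[:, support] @ coef
    alpha = np.zeros(D.shape[1])
    alpha[support] = coef
    return alpha

np.random.seed(0)
D = np.random.randn(8, 20)
D /= np.linalg.norm(D, axis=0)   # overcomplete dictionary of unit-norm atoms
y = 2.0 * D[:, 5]                # a 1-sparse signal: one atom, weight 2
alpha = omp(D, y, k=1)
print(np.allclose(D @ alpha, y))  # True
```

Because atom 5 correlates perfectly with `y` while every other unit-norm atom correlates strictly less, the greedy step selects the right atom and the least-squares fit returns the exact coefficient.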
Several dictionary learning algorithms are available, namely the method of optimal directions (MOD), K-SVD, stochastic gradient descent, the Lagrange dual method, and the least absolute shrinkage and selection operator (LASSO). The dictionary update is simple in MOD. K-SVD performs better than MOD but has higher computational complexity for updating the atoms. Stochastic gradient descent is fast compared to MOD and K-SVD and, unlike K-SVD, works well with a small number of training samples. The advantage of the Lagrange dual method is its lower computational complexity. LASSO solves the ℓ1 minimization more efficiently; it minimizes the least-squares error, which yields the globally optimal solution. Based on the sparsity-promoting function, sparse coding methods are classified into three types: a) ℓ0-norm methods, b) ℓ1-norm methods, and c) non-convex sparsity-promoting functions [15].
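The simplicity of the MOD dictionary update mentioned above can be seen in a few lines: with the sparse codes held fixed, the best dictionary in the least-squares sense is a single pseudo-inverse product. The data below are random stand-ins, not real training patches.

```python
import numpy as np

# One MOD iteration: with sparse codes A fixed, the dictionary minimizing
# ||Y - D A||_F^2 is the least-squares solution D = Y A^+.
np.random.seed(1)
Y = np.random.randn(8, 50)                                      # training patches (columns)
A = np.random.randn(20, 50) * (np.random.rand(20, 50) < 0.15)   # sparse codes
D0 = np.random.randn(8, 20)                                     # initial dictionary
D1 = Y @ np.linalg.pinv(A)                                      # MOD update
err0 = np.linalg.norm(Y - D0 @ A)
err1 = np.linalg.norm(Y - D1 @ A)
print(err1 <= err0)  # True: the update can only reduce the fit error
```

K-SVD refines this idea by updating one atom at a time together with its coefficients, which costs more per iteration but usually converges to a better dictionary.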

MATHEMATICAL BASICS OF DICTIONARY LEARNING
Let D ∈ ℝ^(n×K) be an overcomplete dictionary of K atoms (K > n). If a signal x ∈ ℝ^n can be represented as a sparse linear combination with respect to D, then x can be written as x = Dα₀, where α₀ ∈ ℝ^K is a vector with very few non-zero elements. Usually, only a few measurements y are made from x, as in (1) [4]:

y = Lx = LDα₀ (1)

where L is a projection operator modelling blurring and down-sampling. Two coupled dictionaries are utilized: D_l for LR patches and D_h for HR patches. The sparse representation of an LR patch is obtained with respect to D_l, and these sparse coefficients are used to recover the corresponding HR patch from D_h. For the SR of a test image, the learnt dictionaries are applied to it: the sparse coefficients of the LR image are obtained and used to select the atoms in the dictionary that are most appropriate for the patches.
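The coupled-dictionary idea can be sketched as follows: a code inferred from the LR dictionary is transferred unchanged to the HR dictionary. This is a toy setup with random dictionaries and a 1-sparse code, so one greedy step recovers the code exactly; all names are illustrative.

```python
import numpy as np

np.random.seed(2)
K = 24
Dl = np.random.randn(6, K);  Dl /= np.linalg.norm(Dl, axis=0)   # LR dictionary D_l
Dh = np.random.randn(16, K); Dh /= np.linalg.norm(Dh, axis=0)   # HR dictionary D_h

alpha_true = np.zeros(K); alpha_true[5] = 2.0   # sparse code shared by both spaces
y_lr = Dl @ alpha_true                          # observed LR patch
x_hr = Dh @ alpha_true                          # HR patch we want to recover

# Infer the code from the LR dictionary (one greedy step suffices for a
# 1-sparse code), then apply it to the HR dictionary.
j = int(np.argmax(np.abs(Dl.T @ y_lr)))
alpha = np.zeros(K)
alpha[j] = Dl[:, j] @ y_lr                      # unit-norm atom, so this is the LS fit
x_hat = Dh @ alpha
print(np.allclose(x_hat, x_hr))  # True
```

The key assumption, as in the coupled training of [4], is that LR and HR patches share the same sparse code, so recovering it from the LR observation is enough to synthesize the HR patch.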

PROPOSED METHOD
The proposed method consists of two stages: the dictionary learning stage and the image synthesis stage. In the dictionary learning stage, dual dictionaries are trained: the main dictionary (MD) and the residual dictionary (RD). The image super-resolution stage takes an input image and performs super-resolution using the model trained in the previous stage.

Dictionary learning stage
Two dictionaries, named the main dictionary and the residual dictionary, are learnt using sparse representation [16]. The process is illustrated in Figure 1, where {q_k} denote the sparse representation vectors [5]. Here, the assumption is made that each HR high-frequency patch can be recovered approximately as the product of the high-frequency main dictionary and its sparse vector q_k. Hence, the HMD can be obtained by minimizing the mean approximation error.
Next, the residual dictionary is trained as follows. Using the main dictionary and the LR training images, the HR MHF image is obtained. Adding it to the HR low-frequency image gives an HR temporary image, which contains more detail than the low-frequency image; subtracting the temporary image from the original HR training image yields the HR RHF image. The residual dictionary is then trained from the temporary image and the RHF image. Together, the MD and RD are called the dual dictionaries.
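The decomposition behind the dual dictionaries is a simple identity: the high-frequency layer splits into the part the main dictionary recovers and the residual left for the second dictionary. The sketch below uses rounding as a stand-in for bicubic low-pass filtering and assumes, purely for illustration, that the main dictionary recovers 80% of the high frequency.

```python
import numpy as np

np.random.seed(3)
Y_hr = np.random.rand(8, 8)        # an HR training image
Y_lf = np.round(Y_hr, 1)           # stand-in for its bicubic low-frequency version
hf = Y_hr - Y_lf                   # full high-frequency layer to be learnt
mhf = 0.8 * hf                     # part the main dictionary recovers (assumed 80%)
rhf = hf - mhf                     # residual left for the residual dictionary
temp = Y_lf + mhf                  # the HR temporary image
print(np.allclose(temp + rhf, Y_hr))  # True: LF + MHF + RHF restores the HR image
```

Whatever the main dictionary fails to recover is, by construction, exactly what the residual dictionary is trained to supply, which is why the two-layer scheme narrows the spectral gap in two steps.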

Image super-resolution stage
In this stage, an input LR image is converted into an estimated high-resolution image, as in Figure 2. It is assumed that the input LR image was produced from an HR image by the same blurring and down-sampling used in the learning stage. First, the input LR image is interpolated by the bicubic method, which yields the HR low-frequency image. The HR MHF image is then obtained from this low-frequency image and the MD: the low-frequency image is filtered with the same high-pass filters used in the learning stage, patches are extracted, and OMP is employed to obtain the sparse vectors {q_k} as in (6).
High-resolution patches {p̂_k} are generated as the product of the HMD and the vectors {q_k}, as in (5). Let R_k be an operator that extracts the patch at location k from the HR image. The HR MHF image is then constructed by solving the minimization problem that makes every extracted patch R_k agree with its estimate p̂_k.
The above optimization problem admits a closed-form least-squares solution, given by (8), which amounts to averaging the overlapping patch estimates at each pixel.
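The averaging interpretation of the least-squares aggregation can be checked directly: summing the patch estimates into an accumulator and dividing by the per-pixel overlap count solves min_Y Σ_k ||R_k Y − p̂_k||². In this sketch the "estimates" are exact crops of a known image, so the reconstruction is exact.

```python
import numpy as np

H = W = 6; P = 3
img_true = np.arange(36, dtype=float).reshape(6, 6)
patches, positions = [], []
for i in range(H - P + 1):
    for j in range(W - P + 1):
        patches.append(img_true[i:i+P, j:j+P])   # here: exact patch "estimates"
        positions.append((i, j))

# Least-squares aggregation = per-pixel average of overlapping estimates.
num = np.zeros((H, W)); den = np.zeros((H, W))
for p, (i, j) in zip(patches, positions):
    num[i:i+P, j:j+P] += p
    den[i:i+P, j:j+P] += 1.0
Y = num / den
print(np.allclose(Y, img_true))  # True
```

With noisy patch estimates the same formula blends the overlaps, which is what suppresses blocking artifacts at patch boundaries.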
Afterwards, the high-resolution temporary image is generated by summing the MHF image with the low-frequency image. Next, using the residual dictionary and the temporary image, a similar reconstruction synthesizes the RHF image. Finally, the estimated HR image is generated by adding the RHF image to the temporary image.

EXPERIMENTAL RESULTS
The results of the proposed method are discussed in this section. Based on [17], various dictionary sizes were tried, and it was observed by trial and error that a size of 500 atoms yielded the best results. Hence, the number of atoms in both the main dictionary and the residual dictionary is set to 500. The number of atoms used in the representation of each image patch is set to 3 [18], [19]. Too large or too small a patch size tends to yield smoothing or unwanted artifacts [20]; hence, the image patch size is taken as 9×9, with an overlap of one pixel between adjacent patches. The down-sampling scale factor is set to two, and a 5×5 Gaussian filter is used for blurring. A convolution function is used to extract features. Experiments are conducted on the MATLAB R2018a platform. The dictionaries are trained with the K-SVD dictionary training algorithm, and the trained main dictionary and residual dictionary are stored as .mat files. The experiments are carried out on two standard datasets, Set 5 and Set 14. The test images of Set 5 are shown in Figure 3.
The different stages of obtaining the super-resolution image from the LR image are depicted in Figure 4, taking the 'man' image as an example. The input image of size 512×512 is shown in Figure 4(a). The HR low-frequency image, obtained by bicubic interpolation of the low-resolution image, is shown in Figure 4(b). The HR MHF image obtained using the main dictionary is shown in Figure 4(c), the HR RHF image in Figure 4(d), and the final super-resolution image in Figure 4(e). It can be noticed that the SR image has fewer visual artifacts and sharper details. Table 1 lists the peak signal-to-noise ratio (PSNR) and structural similarity index measure (SSIM) values for the images of Set 5, Table 2 for ten images of Set 14, and Table 3 for ten images of the B100 dataset. The results of the proposed method are compared with state-of-the-art SR algorithms: Table 4 lists PSNR values and Table 5 lists SSIM values for various methods and the proposed method, for scale factor ×2 on the Set 5 and Set 14 datasets. From Tables 4 and 5, it can be observed that the proposed method is superior to the other methods in terms of quantitative results; for example, the proposed dual dictionary learning method attains SSIM values of 0.9614 and 0.9213, against 0.9587 and 0.9124 for VDSR [29]. Visual results for the Set 5 images are evaluated in Figure 5.
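For reference, the PSNR metric tabulated above is computed as follows. This is a standard definition, not code from the paper; the function name is our own.

```python
import numpy as np

def psnr(ref, test, peak=255.0):
    # PSNR = 10 * log10(peak^2 / MSE), reported in dB.
    mse = np.mean((np.asarray(ref, float) - np.asarray(test, float)) ** 2)
    return float('inf') if mse == 0 else 10.0 * np.log10(peak ** 2 / mse)

a = np.full((4, 4), 100.0)
b = a.copy(); b[0, 0] = 110.0          # one pixel off by 10 grey levels
print(round(psnr(a, b), 2))            # 40.17
```

SSIM, the second metric in the tables, instead compares local luminance, contrast, and structure statistics, which is why the two metrics can rank methods differently.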

CONCLUSION
This paper presented a method for SR based on dual dictionary learning and sparse representation. The method reconstructs lost high-frequency details by utilizing main dictionary learning and residual dictionary learning. The qualitative results given in the experimental section demonstrate that the SR image obtained is of higher quality, and the improved PSNR of 38.64 dB for the Set 5 dataset and 34.52 dB for the Set 14 dataset, compared with other methods, confirms the quantitative improvement.
Figure 1 depicts the training stage. Initially, a set of HR training images is collected. To derive an LR image, an HR training image is blurred and then down-sampled. Bicubic interpolation of this LR image yields the HR low-frequency image, and subtracting the HR low-frequency image from the original HR training image generates the HR high-frequency image. Afterwards, the MD is constructed from two coupled sub-dictionaries, called the low-frequency main dictionary (LMD) and the high-frequency main dictionary (HMD). Training data are built from pairs of patches: high-frequency patches extracted from the HR high-frequency image, and low-frequency patches extracted from the images obtained by filtering the HR low-frequency image with high-pass filters.

Figure 1. Process of the dictionary learning stage

Figure 2. Process of the image super-resolution stage

Figure 5. Low-resolution and high-resolution images of baby, bird, butterfly, head, and woman: (a) LR image of baby, (b) LR image of bird, (c) LR image of butterfly, (d) LR image of head, (e) LR image of woman, (f) HR image of baby, (g) HR image of bird, (h) HR image of butterfly, (i) HR image of head, and (j) HR image of woman

Here x denotes an HR image patch and y its LR counterpart. If D_h is overcomplete, x = D_h α is underdetermined in the unknown coefficients α; hence y = L D_h α is even more underdetermined. It can be proved that the sparsest solution α₀ to this equation is unique. Hence, the sparse representation of an HR image patch x can be recovered from the LR image patch y.

Table 1. PSNR and SSIM for images of Set 5

Table 2. PSNR and SSIM for images of Set 14

Table 4. Benchmark results: average PSNR for scale factor ×2 on the Set 5 and Set 14 datasets

Table 5. Benchmark results: SSIM for scale factor ×2 on the Set 5 and Set 14 datasets