Efficient denoising approach based on Eulerian video magnification for colour and motion variations

Digital video magnification acts as a computer-based microscope that reveals changes in recorded videos that are too subtle for the human eye. The technology can be employed in medical, biological, mechanical and physical applications. The Eulerian approach is the most popular in video magnification. However, amplifying the subtle changes in a video also amplifies the subtle noise. This paper proposes an approach to reduce the amplified noise in magnified video for both types of amplified change, colour and motion. The proposed approach processes the video produced by the Eulerian algorithm, whether linear or phase based, in order to cancel noise. It uses a wavelet denoising method to localize the frequencies of the noise distributed over the different frequency bands. Subsequently, the energy of the coefficients at the localized frequencies is attenuated by reducing the amplitude of these coefficients. The experimental results show the superiority of the proposed approach over conventional linear and phase-based Eulerian video magnification in terms of the quality of the resulting magnified videos. This allows videos to be amplified by a larger amplification factor, so that several new applications can be added to the list of uses of Eulerian video magnification. Furthermore, the processing time does not increase significantly: the increment is less than 3% of the overall processing time compared with conventional Eulerian video magnification.


INTRODUCTION
It is difficult for the human eye to perceive small-amplitude changes around us because of our limited spatio-temporal sensitivity [1]. These changes may contain useful information that can be used in many applications, especially in the field of biomedicine. For example, it is difficult for a human to see the arterial pulse in different parts of the human body, but the movement can be magnified to measure heart rate and pulse length [2]. As another example, blood circulation causes invisible changes in skin colour that can be amplified to measure heart rate [3,4].
Because of the important applications mentioned above, many studies on video magnification have been proposed. The first study, by Liu et al. [5], proposed a motion magnification technique based on the Lagrangian perspective to amplify subtle motion in a video sequence in order to detect interesting mechanical behaviour. However, the algorithm is computationally expensive because it relies on optical flow and feature tracking algorithms. Moreover, noise in the video sequence is significantly amplified. In order to reduce complexity, Hao et al. [6] proposed an efficient method, Eulerian video magnification (EVM), which has become one of the standard methods in video magnification. The method follows the Eulerian perspective, in which properties of a fluid voxel, such as velocity and pressure, are tracked as they evolve over time.
Two main types of EVM exist, depending on the multi-scale decomposition method: linear based and phase based. In linear-based EVM (LB-EVM) [6], Laplacian pyramid decomposition is applied to decompose a source video into multiple spatial scales, followed by temporal filtering of specific frequency bands. The outputs of the temporal filter are then amplified by a magnification factor and added back to the original decomposed bands. Finally, the processed frames are reconstructed by collapsing the Laplacian pyramid. Although LB-EVM succeeds in amplifying motion and colour changes in video clips and eliminates the need for a costly optical flow calculation [5], it supports only small magnification factors at high spatial frequencies, and the noise level increases linearly with the magnification factor. In addition, when colour is magnified, some unwanted movement is also magnified. To solve these problems, Wadhwa et al. [7] proposed a new Eulerian method based on complex steerable pyramids [8] and inspired by phase-based optical flow methods [9]. The phase-based EVM (PB-EVM) method supports larger magnification factors. However, it is more complex than LB-EVM and therefore requires a significantly longer execution time. In general, the acceptable accuracy of EVM has led to its use in several applications, such as material engineering, mechanical engineering and human health care [10][11][12].
In order to reduce execution time, a new pyramid, called the Riesz pyramid, was introduced in [13,14]. Liu et al. [15] proposed a post-processing method to improve LB-EVM, called enhanced EVM (E2VM). The efficient motion magnification system (EMMS), which depends on wavelet decomposition, was developed to improve processing speed [16]. This method improves the speed of implementation and reduces noise. However, it supports only relatively small magnification factors. This paper proposes an enhanced approach for LB-EVM and PB-EVM that significantly reduces the noise of the magnified video. The proposed approach can also attenuate unwanted subtle motion in the case of colour magnification. The proposed method is superior to conventional EVM methods in terms of magnified video quality. It uses a wavelet transform to detect and remove noise from the magnified video frames.
The rest of the paper is organized as follows. Section 2 provides background information about LB-EVM and PB-EVM and briefly describes the principles of wavelet-based denoising. Section 3 explains the proposed approach. The simulation results and discussion are given in Section 4. Finally, conclusions are presented in Section 5.

BACKGROUND
Linear-based Eulerian video magnification
Small movements can be amplified through computer processing [5,17] by temporal processing that, as in optical flow, relies on a first-order Taylor series expansion [18]. This technique is named LB-EVM, and its processing is linear. The input video frames are decomposed into multiple spatial bands using the full Laplacian pyramid [6,19,20]. The Laplacian pyramid is a data structure in which the image is successively downsampled to lower sample densities until no further downsampling is possible. A temporal filter is then applied to extract the frequency bands of interest, which are multiplied by the desired magnification factor. Subsequently, the magnified bands are combined with the frames that were input to the temporal filter. Finally, the resulting magnified frames are reconstructed by recovering the original scale from the multiple scales.
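The temporal part of this pipeline can be illustrated on the intensity of a single pixel over time. The following is a minimal sketch, not the implementation used in this paper: it assumes an ideal band-pass filter computed with a plain discrete Fourier transform, omits the Laplacian pyramid entirely, and uses the hypothetical names `ideal_bandpass` and `magnify_pixel`. In LB-EVM the same operation would be applied to every pixel of every pyramid level.

```python
import cmath
import math


def dft(x):
    """Plain O(n^2) discrete Fourier transform of a real sequence."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
            for k in range(n)]


def idft(spectrum):
    """Inverse DFT, returning the real part of each sample."""
    n = len(spectrum)
    return [sum(spectrum[k] * cmath.exp(2j * math.pi * k * t / n)
                for k in range(n)).real / n
            for t in range(n)]


def ideal_bandpass(x, fps, f_lo, f_hi):
    """Keep only the temporal frequencies in [f_lo, f_hi] Hz (assumed ideal filter)."""
    n = len(x)
    spectrum = dft(x)
    for k in range(n):
        # Bins k and n - k hold the same frequency magnitude for a real signal.
        freq = min(k, n - k) * fps / n
        if not (f_lo <= freq <= f_hi):
            spectrum[k] = 0
    return idft(spectrum)


def magnify_pixel(x, fps, f_lo, f_hi, alpha):
    """Linear Eulerian step: add alpha times the band-passed signal to the input."""
    band = ideal_bandpass(x, fps, f_lo, f_hi)
    return [xi + alpha * bi for xi, bi in zip(x, band)]
```

For a pixel whose intensity oscillates at a frequency inside the pass band, the oscillation amplitude is scaled by (1 + α) while the mean intensity is left unchanged, because the DC component is rejected by the filter.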
The main disadvantage of this method is that it fails as the magnification factor increases, because the original noise increases linearly with the magnification factor [21]. Thus, this method is efficient in magnifying colour changes when the magnification factor is small. Figure 1 shows the working mechanism of LB-EVM.
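The linear growth of noise can be seen from the first-order analysis behind LB-EVM. The following is a sketch in one spatial dimension, following the usual Eulerian derivation. The symbols f (image profile), δ(t) (small displacement), B (output of the temporal band-pass filter), n (noise) and n_B (band-passed noise) are notation assumed here for illustration, and the displacement is assumed to lie entirely inside the pass band.

```latex
\begin{align}
I(x,t) &= f\bigl(x+\delta(t)\bigr) + n(x,t)
        \approx f(x) + \delta(t)\,\frac{\partial f}{\partial x} + n(x,t), \\
B(x,t) &= \delta(t)\,\frac{\partial f}{\partial x} + n_B(x,t), \\
\tilde{I}(x,t) &= I(x,t) + \alpha\,B(x,t)
        \approx f\bigl(x+(1+\alpha)\,\delta(t)\bigr) + n(x,t) + \alpha\,n_B(x,t).
\end{align}
```

Under these assumptions the displacement is magnified by (1 + α), but the band-passed noise term is also multiplied by α, which is the linear noise growth described above.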

Phase-based Eulerian video magnification
LB-EVM supports only relatively small magnification factors, because noise is greatly amplified as the magnification factor increases. For these reasons, a phase-based motion processing method was developed in [7], based on complex steerable pyramids [8,9,22]. PB-EVM is inspired by motion without movement [23] and phase-based optical flow [9]. The basis functions of the transform are similar to Gabor wavelets [24].
This phase-based technique improves on the LB-EVM method: it supports larger magnification factors and has much better noise performance. Because the linear method amplifies temporal brightness changes, the noise amplitude is amplified linearly. In contrast, the phase-based method modifies the phases, not the amplitudes, so the amplitude of spatial noise does not increase linearly. The method multiplies the phase differences by the magnification factor, which amplifies hidden movements. The steerable pyramids rely on Fourier analysis to decompose the image into sub-bands and their phases. The main drawback of this method is the long processing time [21]. Figure 2 shows the working mechanism of PB-EVM [7].
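The difference between amplifying amplitudes and amplifying phases can be illustrated on a single complex sub-band coefficient. The sketch below is a simplified illustration under stated assumptions, not the PB-EVM algorithm of [7]: it measures the phase change against one reference frame instead of applying a temporal band-pass filter to the phase, and the function name `magnify_phase` is hypothetical.

```python
import cmath


def magnify_phase(ref, coeff, alpha):
    """Amplify the phase change of `coeff` relative to `ref` by a factor (1 + alpha).

    ref, coeff: complex sub-band coefficients at the same position in a
    reference frame and in the current frame (assumed inputs).
    The magnitude of the coefficient is left untouched.
    """
    # Phase difference between the current and the reference coefficient,
    # wrapped to (-pi, pi] by cmath.phase.
    delta = cmath.phase(coeff * ref.conjugate())
    # Rotate the coefficient by alpha * delta; a pure rotation keeps |coeff|.
    return coeff * cmath.exp(1j * alpha * delta)
```

Because the operation is a pure rotation in the complex plane, the coefficient magnitude, and hence the amplitude of any noise it carries, is not scaled by α. This is the property the text attributes to the phase-based method.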

Wavelet denoising methods
In order to reduce the noise of EVM, wavelet-based denoising is used in this paper. Image denoising by wavelets consists of the following main stages: 1) apply the wavelet transform, 2) estimate a threshold, 3) apply the threshold, and 4) apply the inverse wavelet transform. Figure 3 shows the block diagram of the wavelet denoising method.
The wavelet transform separates the approximation, horizontal, vertical and diagonal parts of the image. The original image is divided into four sub-bands, LL, HL, LH and HH, by applying horizontal and vertical filters. The LL sub-band gives the approximation, or average, of the original image. The other three sub-bands hold the detail wavelet coefficients. At the first level, the HL1, HH1 and LH1 sub-bands represent the detail coefficients, while the LL1 sub-band holds the low-frequency (approximation) coefficients [25,26]. A two-level decomposition is obtained by further decomposing the LL1 sub-band, as shown in Figure 4. By thresholding the detail wavelet coefficients, the image is denoised while its fundamental features are maintained.
After decomposition, the wavelet threshold is applied to the selected wavelet coefficients. Wavelet thresholding is a signal estimation technique that exploits the properties of the wavelet transform to denoise the signal. The basic threshold types are hard and soft thresholding. In hard thresholding, the new wavelet coefficient values (Cn) are determined by (1): they are set to the original coefficient values (C) if these values are greater than the threshold (ε), and to zero otherwise [27]. This method produces many artificial noise points at the edges of the image, resulting in image distortion.
In soft thresholding, the thresholds are produced based on the visually interesting features of the image [28]. It can overcome the shortcomings of the hard threshold algorithm, so the processed results are relatively smooth. The soft threshold function is given by (2). In our proposed method, soft thresholding is used to analyze the performance of the denoising system for different levels of discrete wavelet transform (DWT) decomposition, because it leads to less severe distortion of the object of interest than the other thresholding techniques. Finally, the inverse wavelet transform is applied to obtain the reconstructed image.
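The four stages above can be sketched on a one-dimensional signal. This is an illustrative sketch under several assumptions, not the denoiser used in this paper: it uses the Haar wavelet because it fits in a few lines, it works on a 1-D list instead of a 2-D frame, the threshold `eps` is passed in by the caller instead of being estimated, and the soft threshold is written in its textbook form, sign(C)·max(|C| − ε, 0), which is assumed to correspond to (2). All function names are hypothetical.

```python
import math


def hard_threshold(c, eps):
    """Keep the coefficient if its magnitude exceeds eps, otherwise zero it."""
    return c if abs(c) > eps else 0.0


def soft_threshold(c, eps):
    """Shrink the coefficient magnitude by eps, zeroing anything below eps."""
    return math.copysign(max(abs(c) - eps, 0.0), c)


def haar_forward(signal):
    """One Haar level: (approximation, detail) of an even-length list."""
    s = 1.0 / math.sqrt(2.0)
    approx = [(signal[i] + signal[i + 1]) * s for i in range(0, len(signal), 2)]
    detail = [(signal[i] - signal[i + 1]) * s for i in range(0, len(signal), 2)]
    return approx, detail


def haar_inverse(approx, detail):
    """Invert one Haar level."""
    s = 1.0 / math.sqrt(2.0)
    out = []
    for a, d in zip(approx, detail):
        out.extend([(a + d) * s, (a - d) * s])
    return out


def wavelet_denoise(signal, levels, eps):
    """Decompose, soft-threshold the detail coefficients, and reconstruct."""
    if len(signal) % (2 ** levels) != 0:
        raise ValueError("signal length must be divisible by 2**levels")
    approx = list(signal)
    details = []
    for _ in range(levels):                      # stage 1: wavelet transform
        approx, detail = haar_forward(approx)
        # stages 2-3: the threshold eps is given; apply it to the details only
        details.append([soft_threshold(c, eps) for c in detail])
    for detail in reversed(details):             # stage 4: inverse transform
        approx = haar_inverse(approx, detail)
    return approx
```

With `eps = 0` the function reconstructs its input, and with a positive `eps` small detail coefficients are removed while the approximation coefficients are left untouched.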

PROPOSED APPROACH
This section describes the overall proposed approach. The approach uses the same algorithm as conventional LB-EVM and PB-EVM. However, an important post-processing stage is added to overcome the problem of noise magnification in the magnified video frames. This results in a significant improvement in the quality of the magnified video. The steps of the proposed approach are as follows.
The video file is read in AVI format, and all video frames are converted from RGB space into NTSC (YIQ) space. The Y component carries the luminance information; I and Q carry the chrominance information. The YIQ colour system is designed to take advantage of the characteristics of the human response to colour. This step is done by applying (3).
The next step is to apply the spatial filter. For the LB-EVM method, Laplacian pyramid decomposition is applied to the Y layer of each video frame in order to decompose the source frames into different spatial bands. In PB-EVM, steerable pyramid decomposition is applied to each layer (Y, I and Q) of the video frames individually. The decomposition factorizes the video frames into scaled images at different decomposition levels. The steerable pyramid [8] is a transform that analyzes an image based on spatial scale, orientation and position.
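The colour-space conversion at the start of this stage can be sketched per pixel as follows. The coefficients are the commonly used NTSC RGB-to-YIQ matrix and its rounded inverse; they are assumed here to correspond to (3) and (4), and the function names are hypothetical. Pixel values are taken to be floats in [0, 1].

```python
# Commonly used NTSC matrices (assumed; rounded to three decimals).
RGB_TO_YIQ = (
    (0.299, 0.587, 0.114),
    (0.596, -0.274, -0.322),
    (0.211, -0.523, 0.312),
)
YIQ_TO_RGB = (
    (1.0, 0.956, 0.621),
    (1.0, -0.272, -0.647),
    (1.0, -1.106, 1.703),
)


def apply_matrix(matrix, pixel):
    """Multiply a 3x3 matrix by a 3-component pixel."""
    return tuple(sum(w * v for w, v in zip(row, pixel)) for row in matrix)


def rgb_to_yiq(pixel):
    """Convert one (R, G, B) pixel to (Y, I, Q)."""
    return apply_matrix(RGB_TO_YIQ, pixel)


def yiq_to_rgb(pixel):
    """Convert one (Y, I, Q) pixel back to (R, G, B), up to rounding error."""
    return apply_matrix(YIQ_TO_RGB, pixel)
```

A grey pixel maps to zero chrominance (I = Q = 0), which is why the luminance-only processing of LB-EVM described above operates on the Y layer.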
The bands resulting from the previous step are passed to the temporal filter, which passes only the frequency bands of interest for amplification. Subsequently, the amplification process is applied to the filtered frames by multiplying the frequency bands output by the temporal filter by the amplification factor. Then, the amplified filtered frames are combined with the unfiltered frames. In order to reduce the amplified noise in each frame resulting from the previous stage, the wavelet-based denoising process is applied. Daubechies type 4 is used as the wavelet function with five levels of decomposition, and soft thresholding is applied in the denoising stage. Finally, Laplacian or steerable pyramid reconstruction is applied to the denoised frames, depending on the type (LB-EVM or PB-EVM), and the reconstructed magnified frames are converted from YIQ space into RGB space to recover the original colours of the video. This step is done by applying (4). The result is the final processed video. Figure 5 shows the working mechanism of the proposed approach.
Five source videos, shown in Figure 6, are used in our tests. All the videos are in AVI format. They are: the baby, with dimensions 960×544×3, 301 frames and a frame rate of 30 fps; the eye, with dimensions 1152×896×3, 120 frames and a frame rate of 500 fps; the camera, with dimensions 512×384×3, 1001 frames and a frame rate of 300 fps; the face, with dimensions 528×592×3, 301 frames and a frame rate of 30 fps; and the guitar, with dimensions 432×192×3, 300 frames and a frame rate of 600 fps.
In order to verify the superiority of the proposed approach over the conventional approaches in terms of video quality, we measure the magnified video quality and the execution time of both the proposed and the conventional approaches using the same computer and the same input videos. Several evaluation functions are used to measure the video quality:
a. Peak signal-to-noise ratio (PSNR): The measurement is obtained according to (5) by dividing the square of the maximum pixel intensity by the mean square error (MSE) of each video frame. Subsequently, the PSNR values of all video frames are averaged to obtain the final PSNR. Here I and Ia are the original and the amplified frames, respectively, and M and N are the frame dimensions.
b. MAXERR: the maximum absolute squared deviation of the input video from the output video.
c. L2RAT: the ratio of the squared norm of the output video to that of the input video.
d. BRISQUE: The BRISQUE algorithm assesses perceived quality using a model based on natural images with subjective ratings instead of a reference image.
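The PSNR measurement in item (a) can be sketched as follows. The sketch assumes the standard definition, PSNR = 10·log10(MAX² / MSE) with MSE taken as the mean of (I − Ia)² over the M×N pixels, which is assumed to be the content of (5). It also assumes 8-bit frames stored as nested lists, and the function names are hypothetical.

```python
import math


def mse(frame, amplified):
    """Mean square error between two M x N frames given as nested lists."""
    m, n = len(frame), len(frame[0])
    total = sum((frame[r][c] - amplified[r][c]) ** 2
                for r in range(m) for c in range(n))
    return total / (m * n)


def psnr(frame, amplified, max_value=255.0):
    """PSNR in dB for one frame pair; infinite when the frames are identical."""
    err = mse(frame, amplified)
    if err == 0:
        return math.inf
    return 10.0 * math.log10(max_value ** 2 / err)


def video_psnr(frames, amplified_frames, max_value=255.0):
    """Average the per-frame PSNR over the whole video."""
    values = [psnr(f, a, max_value) for f, a in zip(frames, amplified_frames)]
    return sum(values) / len(values)
```

A higher value indicates that the processed frames deviate less from the source frames.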
We carried out the tests using an octave-bandwidth pyramid for PB-EVM, and IIR, FIR, Butterworth and ideal band-pass temporal filters. Several tests were performed on the five videos shown in Figure 6.
a. For the first video (baby, shown in Figure 6(a)), Figure 7 shows sample frames (frames 1, 20, 45 and 60) of the source, the frames magnified using conventional LB-EVM and the frames magnified using the proposed approach based on LB-EVM at magnification factor α=20. Figure 8 shows the same sample frames of the source, the frames magnified using conventional PB-EVM and the frames magnified using the proposed approach based on PB-EVM at magnification factor α=200. The superiority of the proposed approach over the conventional ones in noise reduction is clear. The proposed approach overcomes the problem of linear noise magnification in conventional LB-EVM and also reduces noise significantly compared with conventional PB-EVM for large magnification factors. Table 1 shows the experimental results of the LB-EVM and PB-EVM methods for the proposed and conventional approaches.
b. For the second video (camera, shown in Figure 6(b)), Table 2 shows the experimental results of the LB-EVM and PB-EVM methods for the proposed and conventional approaches.
c. For the third video (guitar, shown in Figure 6(c)), FIR is used as the temporal filter, and α takes the values {40, 50, 60} for LB-EVM and {40, 60, 120} for PB-EVM. The temporal filter pass band is {72-92 Hz} for both LB-EVM and PB-EVM. Sigma is set to 2 in all the tests. Table 3 shows the experimental results of the LB-EVM and PB-EVM methods for the proposed and conventional approaches.
d. For the fourth video (face, shown in Figure 6(d)), an ideal band-pass filter is used as the temporal filter, and α takes the values {50, 60, 100, 150, 200} using LB-EVM. The pass band of the temporal filter is {0.83333-1 Hz}. This video is used to examine the ability of the proposed approach to detect and magnify colour variation while detecting and attenuating movement variation. This is done with LB-EVM by increasing the number of spatial decomposition levels to 7. In our experiments, increasing the number of decomposition levels increases the detection of colour variations while decreasing the movement, which we want to attenuate because it is not of interest. Based on our experimental tests, we conclude that for videos with colour variations the linear-based method is better at reducing unwanted motion in the magnification process. Figure 9 shows sample frames of the source, the frames magnified using conventional LB-EVM and the frames magnified using the proposed approach based on LB-EVM at magnification factor α=200. The figure clearly shows that the frame quality of the proposed approach is better than that of the conventional one. Table 4 shows the experimental results for the proposed and conventional LB-EVM methods.
e. For the fifth video (eye, shown in Figure 6(e)), FIR is used as the temporal filter, and α takes the values {65, 75, 85, 120, 200} using the PB-EVM method. The temporal filter pass band is {30-50 Hz}. Sigma is set to 4 in all the tests. Table 5 shows the experimental results of the PB-EVM method for the proposed and conventional approaches.
In all the tests reported in the tables, the proposed approach is clearly superior in terms of magnified video quality. Furthermore, at high magnification factors the proposed approach resists the noise, whereas the noise in conventional LB-EVM increases linearly, which causes the conventional method to fail as α increases.
Although the proposed approach yields a large improvement in the quality of the magnified videos, the processing time does not increase significantly: the increment is less than 3% of the entire execution time using the same software resources.

CONCLUSION
This paper has presented an efficient approach to reduce the noise of videos magnified by EVM. The proposed method employs wavelet transforms as a denoising tool and adds a post-processing stage to conventional LB-EVM and PB-EVM. The experimental results show the superiority of the proposed approach over conventional linear and phase-based Eulerian video magnification in terms of the quality of the magnified videos. This allows videos to be amplified by a larger amplification factor, so that new important hidden movements or colour variations can be detected. The processing time does not increase significantly; the increment is less than 3% of the overall execution time compared with conventional EVM. Furthermore, increasing the number of spatial decomposition levels in the proposed approach eliminates the unwanted movement in colour variation magnification that causes distortion in the magnified videos.