Object detection and tracking using fast principal component pursuit and Kalman filter

Received May 24, 2019; Revised Oct 15, 2019; Accepted Oct 27, 2019. The detection and tracking of moving objects has attracted a lot of attention because of its many computer vision applications. This paper proposes a new algorithm that combines several methods for identifying, detecting, and tracking an object, in order to develop an effective and efficient system for a range of applications. The algorithm has three main parts: the first for background modeling and foreground extraction, the second for smoothing, filtering and detecting moving objects within the video frame, and the last for tracking and prediction of the detected objects. In the proposed work, moving objects are detected from video data by Fast Principal Component Pursuit (FPCP). A median filter, which performs well as a noise-reduction filter, is then applied. A Fast Region-based Convolutional Neural Network (Fast R-CNN) is used to refine the spatial identification of objects and their areas. The detected objects are then tracked by a Kalman filter. Experimental results show that our algorithm adapts to different situations and outperforms many existing algorithms.


INTRODUCTION
The process of tracking is one of the main tasks in computer vision [1]. It plays an important role in many areas of research, such as moving-object recognition, analysis of human and non-human activity, 3D representation, vehicle mobility, and others. Object tracking is most common in automated monitoring applications, because a single human operator cannot manage the monitored area, especially when the number of cameras increases. Likewise, in medical applications the operator cannot always analyse the video captured by the device. It is also used in security systems, traffic management systems, and elsewhere. A tracking system can track single or multiple moving objects in different environments [2]. In general, an object detection and tracking system includes several stages, such as background subtraction, object detection, and object tracking, as shown in the block diagram in Figure 1.
In many image processing and computer vision applications, background modeling, also known as foreground detection, is an important step: it extracts the foreground of the image from the background for subsequent processing (such as object selection and identification). The foreground contains the most important areas of the picture, the objects, such as humans, cars, text, etc. This stage may come after a preprocessing phase, which may include image noise reduction, and before later treatment such as morphological processing [3]. The most common way to detect moving objects in videos is background subtraction. The basic principle is to detect moving objects from the difference between the current frame and a reference frame, often called the "background image" or "background model". Some common
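The frame-differencing principle above can be sketched in a few lines. The snippet below is an illustrative pure-Python sketch, not the paper's FPCP method: a pixel is marked as foreground when it differs from the background model by more than a threshold, and the threshold value of 25 is a made-up example setting.

```python
def subtract_background(frame, background, threshold=25):
    """Binary foreground mask: 1 where |frame - background| exceeds threshold."""
    return [[1 if abs(p - b) > threshold else 0
             for p, b in zip(f_row, b_row)]
            for f_row, b_row in zip(frame, background)]

# Static grey background and one frame with two bright "moving" pixels.
background = [[10, 10, 10],
              [10, 10, 10],
              [10, 10, 10]]
frame      = [[10, 200, 10],
              [10, 210, 10],
              [10, 10, 12]]

mask = subtract_background(frame, background)
# mask == [[0, 1, 0], [0, 1, 0], [0, 0, 0]]
```

In practice the reference frame is updated over time; a fixed background as above is the simplest possible model.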

RELATED WORKS
Over the last few decades, many researchers have proposed algorithms for detecting and tracking objects. In this section, we review some of these algorithms related to the proposed system. In [6], the motion edge is extracted in log-polar coordinates, then a gradient operator is employed to compute the optical flow directly in every motion region, and the object is tracked. In the work proposed in [7], the active background is reconstructed and the object size is determined as a preliminary task in order to extract and track the object in the foreground. In the method of [8], object detection is done by a Gaussian Mixture Model (GMM) and tracking is done by a Kalman filter. In this method, object detection is determined based on the size of the foreground, so errors occur in delineating the object: for example, the object and its shadow are merged into one object, or two adjacent vehicles are represented as a single object. The algorithm developed in [9] combines optical flow and motion-vector estimation for object detection and tracking. The detection and tracking system in [10] depends on optical flow for detection, while object tracking is done by blob analysis.
Prabhakar et al. [11] present a moving-object tracking system using morphological processing and blob analysis, which is able to distinguish between a car and a pedestrian in the same video. In [12], the foreground is extracted from the background using a multiple-view architecture; after that, motion data and editing schemes are used to detect the moving objects, and finally the centre of gravity of the moving object is detected and used to trace the object with a Kalman filter. In the method of [13], moving objects are represented as groups of spatial and temporal points using a 3D Gabor filter, which performs spatial and temporal analysis of the video sequence; the points are then joined using a Minimum Spanning Tree. The technique described in [14] is split into three stages: a foreground segmentation stage using a Mixture of Adaptive Gaussians model, a tracking stage using blob detection, and an evaluation stage that includes classification according to the extracted features. The work proposed in [15] merges background subtraction with low-rank techniques for effective object detection. Finally, the algorithm suggested in [16] employs background subtraction and K-means clustering to detect and track the moving object. After exploring the published research on object detection and tracking, it was found that this is a complex task because of the many dynamic elements involved, such as whether the camera is moving or static, random changes in the speed of the object, the intensity of light and darkness, etc.

METHODOLOGIES (MATHEMATICAL BACKGROUND)
Fast principal component pursuit
FPCP was recently suggested as a powerful alternative to Principal Component Analysis (PCA). The method has been used in various applications, including foreground/background modelling, data analysis of text or video, and image processing. Principal Component Pursuit (PCP) was initially formulated as [17]:
min_{L,S} ||L||_* + λ ||S||_1    s.t.    D = L + S                                (1)

where D ∈ R^{m×n} is the observed matrix, ||L||_* is the nuclear norm of the matrix L (i.e. the sum of its singular values, Σ_i σ_i(L)) and ||S||_1 is the l1 norm of the matrix S. Numerous variants of (1) have been derived by interchanging penalties and constraints, so that (1) becomes:

min_{L,S} (1/2) ||L + S − D||_F^2 + λ ||S||_1    s.t.    ||L||_* ≤ t              (2)

When the constraint ||L||_* ≤ t is active it behaves as an equality constraint, so it is suggested that the algorithm fix the rank directly rather than relax it through the nuclear norm. A natural way to solve the resulting rank-constrained problem is alternating minimization:

L_{k+1} = argmin_{rank(L)=t} ||L + S_k − D||_F^2                                  (3)

S_{k+1} = argmin_S (1/2) ||L_{k+1} + S − D||_F^2 + λ ||S||_1                      (4)

This adjustment avoids the delicate initial selection of the parameter λ; since in background modelling the component L typically has very low rank, in practice there is no difficulty in selecting an appropriate value for t. Sub-problem (3) can be solved by taking a partial Singular Value Decomposition (SVD) of (D − S_k) truncated to rank t, while (4) can be solved by element-wise shrinkage (soft thresholding) of (D − L_{k+1}) with threshold λ. The background of a video is assumed to lie in a low-rank subspace, while the moving objects in the foreground vary smoothly in the spatial and temporal directions [18]. The proposed method integrates the Frobenius norm and the l1 norm into a unified framework for simultaneous noise reduction and detection: the Frobenius norm exploits the low-rank property of the background, while the contrast is enhanced by the l1 norm.
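For intuition, the alternating L-step and shrinkage S-step can be sketched in pure Python for the rank-1 case, with the partial SVD replaced by a simple power iteration. This is only an illustrative toy version under that simplification, not the paper's LMSVD-based implementation; the matrix D and the value λ = 1.0 are made-up example data.

```python
def top_singular_triplet(A, iters=100):
    """Leading singular triplet (sigma, u, v) of A via power iteration."""
    m, n = len(A), len(A[0])
    v = [1.0] * n
    for _ in range(iters):
        u = [sum(A[i][j] * v[j] for j in range(n)) for i in range(m)]   # u = A v
        w = [sum(A[i][j] * u[i] for i in range(m)) for j in range(n)]   # w = A^T u
        norm = sum(x * x for x in w) ** 0.5
        v = [x / norm for x in w]
    u = [sum(A[i][j] * v[j] for j in range(n)) for i in range(m)]
    sigma = sum(x * x for x in u) ** 0.5
    u = [x / sigma for x in u]
    return sigma, u, v

def shrink(x, lam):
    """Element-wise soft thresholding: the solution of the S sub-problem."""
    return (abs(x) - lam if abs(x) > lam else 0.0) * (1 if x >= 0 else -1)

def fpcp_rank1(D, lam=1.0, iters=10):
    """Alternate the rank-1 L-update (3) and the shrinkage S-update (4)."""
    m, n = len(D), len(D[0])
    S = [[0.0] * n for _ in range(m)]
    for _ in range(iters):
        R = [[D[i][j] - S[i][j] for j in range(n)] for i in range(m)]
        sigma, u, v = top_singular_triplet(R)          # partial "SVD" of D - S_k
        L = [[sigma * u[i] * v[j] for j in range(n)] for i in range(m)]
        S = [[shrink(D[i][j] - L[i][j], lam) for j in range(n)] for i in range(m)]
    return L, S

# Rank-1 "background" of constant intensity 10 plus one sparse "moving" outlier.
D = [[10.0, 10.0, 10.0],
     [10.0, 15.0, 10.0],
     [10.0, 10.0, 10.0]]
L, S = fpcp_rank1(D)
# S concentrates on the centre outlier; L stays close to the flat background.
```

Real implementations keep the rank t larger than one and use an efficient truncated SVD, but the alternation between the two sub-problems is exactly as above.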

Noise filtering
Digital video frames are often corrupted by several kinds of noise, depending on the prevailing conditions. Some of this noise is very disturbing when it alters the intensity of the video frames: it corrupts pixels at random, driving them to one of two extreme levels, relatively low or relatively high, compared to adjacent pixels [11]. Consequently, it is necessary to apply filtering techniques that are able to handle various types of noise; here a median filter is used. Morphological operations are then performed to extract important image features for the representation and description of region shapes. We use morphological closing and erosion, respectively, to remove parts of the road and other unwanted artefacts. The morphological closing fills many small holes and joins separate pixels into one large connected object, provided that the shape of the object is not destroyed. Closing with a structuring element B is defined as a dilation followed by an erosion:

P • B = (P ⊕ B) ⊖ B

where the matrix P, which contains the moving-object information, is obtained from the detection process, ⊕ denotes dilation and ⊖ denotes erosion. An integral part of the morphological dilation and erosion operations is a flat structuring element: a binary structuring element, either 2-D or multi-dimensional, in which true pixels are included in the morphological computation and false pixels are not. The centre pixel of the structuring element, called the origin, identifies the pixel in the image being processed [11].
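As a concrete illustration of these two steps, the sketch below gives a pure-Python 3x3 median filter and a binary morphological closing (dilation followed by erosion) with a 3x3 structuring element. It is a minimal sketch of the idea, not the paper's MATLAB implementation; border handling (borders kept for the median, treated as background for erosion) is a simplifying assumption.

```python
def median3x3(img):
    """3x3 median filter; border pixels are left unchanged."""
    h, w = len(img), len(img[0])
    out = [row[:] for row in img]
    for i in range(1, h - 1):
        for j in range(1, w - 1):
            vals = sorted(img[i + di][j + dj]
                          for di in (-1, 0, 1) for dj in (-1, 0, 1))
            out[i][j] = vals[4]                    # median of 9 values
    return out

def dilate(mask):
    """Binary dilation with a 3x3 structuring element."""
    h, w = len(mask), len(mask[0])
    return [[1 if any(mask[i + di][j + dj]
                      for di in (-1, 0, 1) for dj in (-1, 0, 1)
                      if 0 <= i + di < h and 0 <= j + dj < w) else 0
             for j in range(w)] for i in range(h)]

def erode(mask):
    """Binary erosion with a 3x3 structuring element (outside treated as 0)."""
    h, w = len(mask), len(mask[0])
    return [[1 if all(0 <= i + di < h and 0 <= j + dj < w and mask[i + di][j + dj]
                      for di in (-1, 0, 1) for dj in (-1, 0, 1)) else 0
             for j in range(w)] for i in range(h)]

def closing(mask):
    """Morphological closing: (P dilated by B) eroded by B."""
    return erode(dilate(mask))

# An impulse-noise pixel is removed by the median filter...
noisy = [[10, 10, 10], [10, 255, 10], [10, 10, 10]]
smooth = median3x3(noisy)
# ...and a small hole inside an object is filled by closing.
holed = [[0, 0, 0, 0, 0],
         [0, 1, 1, 1, 0],
         [0, 1, 0, 1, 0],
         [0, 1, 1, 1, 0],
         [0, 0, 0, 0, 0]]
closed = closing(holed)
```

After closing, the hole at the centre of the object is filled while the object outline is preserved, which is exactly the effect relied on before blob extraction.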

Fast region-based convolutional neural network
Convolutional neural networks (CNNs) have recently been very successful in a variety of computer vision tasks, especially those associated with detecting objects. In this research, we use a CNN to identify the objects of interest as a supervised learning task [19]. The Fast R-CNN object detector (Fast Regions with Convolutional Neural Networks) is used to locate detected objects, which are returned as a set of bounding boxes in which the detector has high confidence. The image is annotated with the detector's bounding boxes and the corresponding detection scores. These bounding boxes are called region proposals or object proposals: lists of rectangular boxes with some probability of containing an object. The framework of Fast R-CNN is shown in Figure 2 [20]. In applications where the computation is time-consuming, one can use blob analysis to eliminate regions that do not matter, based on specific spatial properties, and retain the relevant regions for further analysis. The object corresponding to a blob region is detected as a connected object and described by its bounding box.
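The blob-analysis step described above, turning a binary detection mask into bounding boxes while discarding insignificant regions, can be sketched as follows. This is a pure-Python sketch assuming 4-connectivity and a made-up minimum-area criterion; it is not the Fast R-CNN network itself, which requires a trained model.

```python
from collections import deque

def blobs_to_boxes(mask, min_area=2):
    """Label 4-connected blobs and return bounding boxes (rmin, cmin, rmax, cmax)
    for every blob whose pixel count reaches min_area."""
    h, w = len(mask), len(mask[0])
    seen = [[False] * w for _ in range(h)]
    boxes = []
    for i in range(h):
        for j in range(w):
            if mask[i][j] and not seen[i][j]:
                seen[i][j] = True
                queue, pixels = deque([(i, j)]), []
                while queue:                        # flood-fill one blob
                    r, c = queue.popleft()
                    pixels.append((r, c))
                    for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                        nr, nc = r + dr, c + dc
                        if 0 <= nr < h and 0 <= nc < w and mask[nr][nc] and not seen[nr][nc]:
                            seen[nr][nc] = True
                            queue.append((nr, nc))
                if len(pixels) >= min_area:         # drop insignificant blobs
                    rows = [p[0] for p in pixels]
                    cols = [p[1] for p in pixels]
                    boxes.append((min(rows), min(cols), max(rows), max(cols)))
    return boxes

# One 2x2 object and one isolated noise pixel: only the object gets a box.
mask = [[1, 1, 0, 0],
        [1, 1, 0, 1],
        [0, 0, 0, 0]]
boxes = blobs_to_boxes(mask)
# boxes == [(0, 0, 1, 1)]
```

The returned boxes play the role of the "bounded box" features handed on to the tracking stage.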

Kalman filter
Object tracking is the task of locating a detected object and building its path over time. In this work, the Kalman filter is used to track an object through the captured video sequence [21]. The Kalman filter is a linear approach that operates in two basic phases, prediction and correction (update). The prediction phase is responsible for estimating the next state and position of the current object, while the correction phase combines the actual measurement with the previous estimate to improve the trajectory: the object information detected in the previous frame is used to provide an estimate of the object's new position. The Kalman filter is able to estimate the tracked locations with minimal data on the location of the object. Initially, the state S_t and measurement X_t models are defined to predict the next location, as shown in Figure 3. The Kalman filter is a recursive algorithm that estimates the evolving state of a process as measurements of the process are made. It is linear when the evolution of the state follows a linear motion model and the measurements are linear functions of the state. The Kalman filter can accurately track motion through adaptive filtering based on the state-space model. A dynamic model of the target motion must be designed; here the constant-velocity (CV) model is used, which assumes that the velocity is constant during the sampling period. This model has been used in various applications for its simplicity and effectiveness [22]:

S_{t+1} = A S_t + B u_t + w_t,        X_t = H S_t + v_t

where A is the state-transition matrix, B converts the control input u_t, Q is the process-noise covariance, K is the Kalman gain, H is the measurement matrix, R is the measurement-noise covariance, and w_t and v_t are the process and measurement noise. The prediction of the next state S_{t+1} is made by combining the actual measurement with the previous estimate of the state S_t.
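A minimal sketch of this predict/correct cycle for a 1-D constant-velocity model is given below (state = [position, velocity], scalar position measurements). All noise settings and the toy measurement sequence are made-up for illustration; the tracker in the paper works on 2-D image coordinates.

```python
def kalman_cv_1d(measurements, dt=1.0, q=1e-3, r=0.01):
    """Track a 1-D position with a constant-velocity Kalman filter."""
    x = [measurements[0], 0.0]          # initial state: first position, zero velocity
    P = [[1.0, 0.0], [0.0, 1.0]]        # initial state covariance
    for z in measurements[1:]:
        # --- predict: x = A x, P = A P A^T + Q, with A = [[1, dt], [0, 1]] ---
        x = [x[0] + dt * x[1], x[1]]
        P = [[P[0][0] + dt * (P[0][1] + P[1][0]) + dt * dt * P[1][1] + q,
              P[0][1] + dt * P[1][1]],
             [P[1][0] + dt * P[1][1],
              P[1][1] + q]]
        # --- correct: measurement z of the position only, H = [1, 0] ---
        S = P[0][0] + r                  # innovation covariance
        K = [P[0][0] / S, P[1][0] / S]   # Kalman gain
        y = z - x[0]                     # innovation
        x = [x[0] + K[0] * y, x[1] + K[1] * y]
        P = [[(1 - K[0]) * P[0][0], (1 - K[0]) * P[0][1]],
             [P[1][0] - K[1] * P[0][0], P[1][1] - K[1] * P[0][1]]]
    return x

# Noise-free object moving one pixel per frame: the filter should settle near
# position 9 and velocity 1 after the last frame.
state = kalman_cv_1d([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])
```

Even though the filter starts with zero velocity, the correction step pulls the velocity estimate onto the true motion within a few frames, which is why the CV model copes well with short-term occlusion.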

THE PROPOSED SYSTEM
In this paper, an object detection and tracking algorithm combining two well-known computer vision techniques, Fast Principal Component Pursuit (FPCP) and the Kalman filter, is introduced. FPCP is used in the object detection phase: it provides faster and more accurate object detection than other methods such as background subtraction. FPCP does not provide the motion path; instead, it supplies knowledge about the orientation of the object and its motion in vector form. The proposed FPCP procedure is shown in Algorithm 1. The algorithm has a simple and clear structure. Since in video background modeling the component L usually has very low rank, we suggest a simple procedure for estimating an upper limit on the rank r: we estimate the contribution of each new singular vector, and if this contribution is small enough, we stop increasing the value of r. In our experimental simulations, this matches the rank estimate made by the inexact ALM. The algorithm uses LMSVD (Limited Memory Singular Value Decomposition), based on [23]. Our computational results also show that the proposed algorithm has a low memory footprint, so it can be applied in real-time video applications; it is faster than state-of-the-art methods and delivers results of comparable quality.
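The stopping rule for the rank estimate can be sketched as follows. This is an illustrative interpretation, growing r until the next singular value's relative contribution falls below a tolerance; the tolerance value is a made-up example, not the paper's setting.

```python
def estimate_rank(singular_values, tol=0.05):
    """Grow the rank r until the next singular value contributes less than
    `tol` relative to the energy captured so far."""
    captured = 0.0
    for r, sigma in enumerate(singular_values, start=1):
        captured += sigma
        if r < len(singular_values) and singular_values[r] / captured < tol:
            return r                     # next direction adds too little
    return len(singular_values)

# Two strong background directions followed by near-noise values: r = 2.
rank = estimate_rank([10.0, 9.0, 0.1, 0.05])
```

Because the background component of a surveillance video is dominated by a handful of strong directions, such a rule typically stops after a very small r, which is what keeps the partial SVD cheap.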
The proposed system is shown in Figure 4, which presents the stages of the system in detail: background separation, object detection and feature extraction. First, the video is taken by a stationary camera. A video is only a series of cascading frames, so the detection procedure must first find the moving objects in these frames. The algorithm converts the video into a two-dimensional matrix to simplify the mathematical calculations and to reduce computation time and memory requirements. The processing includes noise removal. The status of every pixel is then tested by Fast R-CNN and clustered to detect the object. Finally, the tracking stage is initialized and the tracker is updated in every new frame. This system has many features, including the ability to track more than one object and a fast response to changes in speed and scale. Figure 4 shows the original video converted to individual frames for processing in the subsequent stages. The foreground extracted by FPCP detection shows holes and noise in the frame; to clean and smooth the frame, morphological processing is applied. The foreground frame after the morphological process is shown, along with the final output frame, which includes the tracked objects with bounding boxes.

RESULTS AND PERFORMANCE ANALYSIS
The proposed algorithm was implemented in MATLAB (R2018b), and the experiments were performed on an MSI GV63 computer with an Intel Core i7-8750H CPU, an NVIDIA RTX 2060 6 GB GPU, a 256 GB SSD + 1 TB HDD and 16 GB RAM. The system has three stages: foreground detection, filtering and tracking. The proposed algorithm detects the moving objects accurately and keeps track of their appearance across the video frames. Video data in any format can be used as input to the proposed work, and good results have been obtained in various conditions: indoor, outdoor, light traffic and dense traffic. To evaluate the efficiency of the proposed algorithm, several sampled videos from different environments were used; the experimental outcomes are given in Figure 5. The first column, Figure 5(a), shows the sampled frames of the video; the second column, Figure 5(b), shows the clean foreground frames extracted by FPCP detection; and the third column, Figure 5(c), shows the detected and tracked objects marked with bounding boxes. Table 1 reports the mean execution times (sec) of the proposed system when run on 365 sampled video frames in MATLAB R2018b. Accuracy is a measure of the performance of the object tracking system; the precision of the detection and tracking system can be calculated using the following formula:

Accuracy = (total number of objects detected by the system) / (total number of actual objects in the video)

The proposed algorithm has been tested on different video inputs and compared against different methods to evaluate its accuracy. The accuracy of the proposed tracker on different input scenes was compared with that of other tracking systems, as shown in Table 2; the detection and tracking accuracy rate of the proposed algorithm is 100%. The results are visually acceptable and validate this multi-object tracking method. We conclude that the proposed algorithm remains competitive: some results are close to those of other methods, with slight degradation in some cases due to the cost of certain complex calculations. The detection precision of the suggested algorithm is also compared with other well-known methods; the comparison shows the efficient performance of the proposed method on the selected frames shown in Figure 6. We compared the proposed algorithm with the most representative algorithms, for different frame sizes and settings of the tested videos, using both gray-scale and colour video sequences; the results of the proposed algorithm were comparable with those of the other algorithms. To evaluate the visual performance, we compared the proposed algorithm with three algorithms. The examined videos contain different background scenes and multiple moving objects, both outdoors and indoors (pedestrians, vehicles, etc.). We chose the following representative methods for comparison: (1) Background Subtraction (BG SUB) [15], (2) the GMM method [8], and (3) Optical Flow (OF) [10]. Visual results on the tested videos, covering individual and grouped pedestrians, small dynamic backgrounds, and multiple traffic scenes, are shown in Figure 6; the proposed algorithm is closest to the Ground Truth (GT).
Some of the tested algorithms classify foreground objects as background. The main reason is that parts of the object remain static in the video; the proposed algorithm overcomes this effect, and its detection is clearly better than that of the other algorithms.
To further examine the robustness and efficiency of the proposed system, a second comparison was made between the Kalman filter tracker and the Mean Shift tracker [24], as shown in Figure 7. The Mean Shift algorithm performs a non-parametric density estimation to find the image region that is most similar to the colour histogram of the object in the current frame. Mean Shift tracking iteratively maximizes appearance similarity by comparing the object histogram with the histogram of the window around the current object location. The algorithm can track nonlinearly moving objects in real time and copes well with object deformation and rotation. However, Mean Shift does not use the object's direction and speed information, so the object is easily lost when there is interference in the surrounding environment (such as lighting changes and scattering), and it cannot deal with scale differences and clutter [25]. To achieve a good tracking effect, it is extremely important to improve the performance of the tracking algorithm by choosing a distinctive feature to create the object model, so that the characteristics of the object and the background are clearly different. Table 3 lists some advantages and disadvantages of both methods.
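For illustration, the core Mean Shift iteration, moving a window to the weighted centroid of the similarity weights under it until it stops, can be sketched as follows. This is a toy pure-Python sketch on a made-up weight map standing in for the histogram back-projection used in practice; window radius and grid size are arbitrary example values.

```python
def mean_shift(weights, start, radius=2, max_iters=20):
    """Shift a (2*radius+1)-square window to the weighted centroid of the
    weights under it until the window centre stops moving."""
    h, w = len(weights), len(weights[0])
    r0, c0 = start
    for _ in range(max_iters):
        num_r = num_c = total = 0.0
        for i in range(max(0, r0 - radius), min(h, r0 + radius + 1)):
            for j in range(max(0, c0 - radius), min(w, c0 + radius + 1)):
                num_r += weights[i][j] * i
                num_c += weights[i][j] * j
                total += weights[i][j]
        if total == 0:
            break                        # no support under the window
        nr, nc = round(num_r / total), round(num_c / total)
        if (nr, nc) == (r0, c0):
            break                        # converged to a mode
        r0, c0 = nr, nc
    return r0, c0

# Similarity weights peaked at (7, 7); starting at (5, 5) the window climbs
# onto the peak.
weights = [[0.0] * 10 for _ in range(10)]
weights[7][7] = 5.0
for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
    weights[7 + dr][7 + dc] = 1.0
centre = mean_shift(weights, (5, 5))
# centre == (7, 7)
```

The sketch also shows the weakness noted above: if the object moves farther than the window radius between frames (no overlap with the peak), `total` is zero and the tracker has no gradient to follow, whereas a Kalman predictor would still extrapolate the motion.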

CONCLUSION
The provided algorithm was implemented in MATLAB, and frames of different sizes were processed on an MSI computer with an Intel Core i7-8750H. The results obtained show the advantage of the proposed algorithm over existing object detection and tracking methods. The proposed algorithm was compared with existing algorithms on video files, and the results show the efficiency of the proposed work. The proposed system can process gray-scale and colour videos in a few seconds per frame in a GPU-enabled MATLAB implementation, and the proposed algorithm can adapt to background changes. Object detection and tracking is a main and challenging task in many computer vision applications, such as monitoring, car parking, routing, and automation. This algorithm offers several benefits, such as detection and tracking of multiple objects in different environments. A limitation is that no single method produces perfect results, because accuracy is influenced by factors such as the low resolution of the captured video, changes in weather, etc. In the future, we hope to expand our work to the detection and tracking of objects in overcrowded scenes, under severe lighting contrast, and in real-time scenes.