Three-dimensional structure-from-motion recovery of a moving object with noisy measurements

ABSTRACT


INTRODUCTION
Research on structure and motion estimation (also known as structure from motion) has progressed rapidly, stimulated by recent breakthroughs in computer vision, the advent of digital photography, and augmented reality [1][2][3][4][5][6]. This progress has the potential to substantially broaden the use of the structure-from-motion technique across a variety of applications, for example the growing use of unmanned aerial vehicles for remote surveying in numerous ecological domains [7]. Wide-reaching marine assessments using this technique have recently become possible in some cases, as in [8, 9] with drone-based applications. The structure-from-motion technique can be used for topographic data collection in field and laboratory studies [10], and as a means of digital preservation for documenting archaeological excavations, cultural material, and architecture [11]. Structure from motion is also a good low-cost alternative for generating high-resolution topography [12] where light detection and ranging data are unaffordable or scarce. Recently, in agriculture [13], unmanned aerial systems (UAS) based on the structure-from-motion technique have shown great potential as remote-sensing platforms for obtaining detailed crop features. The structure-and-motion field of research is concerned with recovering the 3-D geometry of a dynamic scene (the structure) when observed through a moving camera (the motion). Structure from motion basically involves three main steps: first, extracting features in images and matching these features between images; then, modeling the camera-object relative motion; and finally, recovering the 3-D structure using the estimated motion and features. In view of the above literature, several works have addressed structure estimation with observer-based approaches in which full velocity feedback of the calibrated camera is available.
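The final recovery step can be illustrated with classical two-view triangulation. The sketch below (hypothetical projection matrices and point, not taken from this paper) uses the linear DLT method: each image observation contributes two linear constraints on the homogeneous 3-D point, and the SVD extracts the null vector.

```python
import numpy as np

def triangulate(P1, P2, u1, u2):
    """Linear (DLT) triangulation of one point from two views.
    P1, P2: 3x4 projection matrices; u1, u2: normalized image coords (x, y)."""
    A = np.vstack([
        u1[0] * P1[2] - P1[0],
        u1[1] * P1[2] - P1[1],
        u2[0] * P2[2] - P2[0],
        u2[1] * P2[2] - P2[1],
    ])
    _, _, Vt = np.linalg.svd(A)
    X = Vt[-1]                    # null vector of A = homogeneous 3-D point
    return X[:3] / X[3]           # de-homogenize

# Hypothetical setup: reference camera and a camera translated along x.
P1 = np.hstack([np.eye(3), np.zeros((3, 1))])
P2 = np.hstack([np.eye(3), np.array([[-1.0], [0.0], [0.0]])])
X_true = np.array([0.5, 0.2, 4.0])
h = np.append(X_true, 1.0)
u1 = (P1 @ h)[:2] / (P1 @ h)[2]   # ideal noise-free projections
u2 = (P2 @ h)[:2] / (P2 @ h)[2]
X_hat = triangulate(P1, P2, u1, u2)
```

With noise-free projections, the estimate matches the true point exactly; in practice the matched features carry measurement noise, which motivates the observer-based treatment in this paper.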
For instance, the authors of [14] designed a nonlinear observer to estimate an unmeasurable state, the depth, with known dynamics; the observer was validated experimentally on a mobile robot with an on-board camera. The authors of [15] introduced a nonlinear observer for the particular case of feature points on an object moving with constant velocities and validated it in many practical scenarios. In [16], however, a nonlinear observer is defined to recover structure and motion under less restrictive assumptions on the object's motion. A reduced-order nonlinear observer is presented in [17] to estimate the range from a moving camera to a feature point in a static scene. Furthermore, a design of full-order observers based on nonlinear contraction theory and synchronization is given in [18], where the angular and linear velocity measurements are also noisy.
Knowledge of the camera motion parameters is required in all of the references cited above. Various studies on structure-from-motion estimation are also available where the camera motion is not known. Starting with [19], sliding-mode observers were presented to estimate the motion parameters and the structure of a moving object with the aid of a charge-coupled device (CCD) camera. The advantage of the proposed observers is that both rigid and affine motion parameters, constant or time-varying, can be estimated correctly. In the same context, [20] introduced a nonlinear reduced-order observer that requires only one camera linear velocity to estimate a stationary object seen by a calibrated camera. The methods described in [21, 22] present nonlinear observers based on the Robust Integral of the Sign of the Error (RISE) method to estimate the unknown distance between the camera and the object together with the moving camera's velocities. This problem was also investigated in [23], where a nonlinear reduced-order observer is proposed to recover the feature-point depth and the camera linear velocity; only the camera's angular velocity is assumed to be known. The authors of [24] described a new approach based on the extended Kalman filter to simultaneously recover the camera pose and the structure of non-rigid extensible surfaces; to extend the problem to a deformable domain, they modeled the object's surface mechanics by means of Navier's equations. A recent paper [25] addresses the case where a novel full-order observer is designed to estimate the unknown motion parameters and feature depth in the presence of measurement noise; the observer is derived from a differentiator based on the sliding-mode technique.
This paper tackles the problem of motion and structure recovery for a class of systems consisting of a moving camera observing a moving object. The motions are naturally formulated in a continuous-time setting, and the motion parameters are all assumed to be time-varying. The 3-D position is estimated using a set of image data observed through a dynamic camera with varying focal length. The contributions of this paper are, first, an analysis of the extent to which a scheme can be developed that is guaranteed to converge while observing a single point under unknown object motion. In addition, for a more thorough treatment, this paper extensively validates the approach for both static and dynamic objects in the presence of measurement noise.
The remainder of this paper is organized as follows: necessary preliminaries and the state-dynamics formulation are presented in Section 2. Section 3 presents the design of the nonlinear unknown input observer (NLUIO) to estimate the structure of a feature point, where an LMI-based formulation is developed to prove asymptotic convergence. In Section 4, simulation results demonstrate the robustness of the approach in the presence of measurement noise. Finally, concluding remarks are drawn in Section 5.

STATE DYNAMICS FORMULATION
In this section, an overview of the perspective relationships and basic kinematics is given, modeling a camera that moves while observing a moving object. Most of the concepts can be found, for example, in [21] and [26]. Consider the scenario in Figure 1, where the motion of a single moving object is viewed by a moving camera undergoing rotation and translation. The position of a feature point on the object can be expressed in the reference frame through X, Y and Z, the unknown Euclidean coordinates of the feature point in the camera's inertial frame. The state x3(t), associated with the axis perpendicular to the camera's image plane, is the inverse of an unmeasurable distance.
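As a minimal sketch, assuming the standard inverse-depth parameterization x1 = X/Z, x2 = Y/Z, x3 = 1/Z common in the cited structure-from-motion observers, the mapping between the Euclidean coordinates and the observer state can be written as:

```python
import numpy as np

def feature_state(X, Y, Z):
    """Map Euclidean coordinates of a feature point to the observer state.
    x1, x2 are the measurable normalized image coordinates; x3 = 1/Z is
    the unmeasurable inverse depth to be estimated."""
    if Z <= 0:
        raise ValueError("point must lie in front of the camera")
    return np.array([X / Z, Y / Z, 1.0 / Z])

def recover_point(x):
    """Invert the map: once x3 is estimated, X, Y and Z follow directly."""
    x1, x2, x3 = x
    Z = 1.0 / x3
    return np.array([x1 * Z, x2 * Z, Z])
```

This is why estimating the single unmeasurable component x3 suffices to recover the full 3-D position from the measured image coordinates.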
The dynamics of the moving scene with unknown object motion can be written in the compact state-space form

ẋ(t) = A x(t) + f(x, t) + H v(t),    y(t) = C x(t),    (5)

where x(t) is the state, v(t) is the unknown input collecting the object motion parameters, y(t) is the measured output, f is a nonlinear function of the state, and A, C and H are matrices of appropriate dimensions. An observability assumption on the camera and the feature point is required for the overall formulation of the NLUIO.
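A state-space system driven by an unknown motion input can be simulated directly. The sketch below (illustrative A, H and unknown input d(t), not the paper's values, and omitting any nonlinear term for brevity) integrates a system of the form ẋ = A x + H d(t) with a forward-Euler scheme:

```python
import numpy as np

def simulate(A, H, d, x0, dt, steps):
    """Forward-Euler integration of x' = A x + H d(t)."""
    x = np.array(x0, dtype=float)
    traj = [x.copy()]
    for k in range(steps):
        x = x + dt * (A @ x + H @ d(k * dt))
        traj.append(x.copy())
    return np.array(traj)

# Illustrative stable dynamics with a scalar unknown input on x3.
A = np.array([[-1.0, 0.0, 0.0],
              [0.0, -1.0, 0.0],
              [0.0, 0.0, -0.5]])
H = np.array([[0.0], [0.0], [1.0]])
d = lambda t: np.array([0.1 * np.sin(t)])   # hypothetical object motion
traj = simulate(A, H, d, [1.0, -1.0, 0.5], 0.01, 1000)
```

The unforced components decay while the component driven by the unknown input does not, which is precisely why the observer must decouple that input rather than ignore it.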

OBSERVER FORMULATION
In this section, an asymptotically converging NLUIO is constructed whose state follows the state of the dynamic system given in (5) as closely as possible, even in the presence of an unknown input. For the remainder of the study, the following conditions [29] are assumed to be satisfied: H is a full-column-rank matrix, where q is the number of unknown inputs.
Under the above conditions, the NLUIO for the system represented by (5) takes the form (8), where x̂(t) is an estimate of x(t) and z(t) is the observer state, and where K and E are gain matrices of suitable dimensions designed subsequently.
The estimation error between the system (5) and the NLUIO (8) is defined in (10). Substituting the system output of (5) into the error equation (10), the dynamic error ė(t) takes the form (11). Then, substituting (5) and (8) into (11), the error dynamics can be expressed as (12). To obtain the observer matrices, the following steps are carried out. First, using (9), the error dynamics in (12) reduce to (13). The condition in (13) can be rewritten as (14), after which a solution for the matrix E exists via the generalized inverse, as in (15). Finally, substituting E into (9) leaves K and Y as the only unknowns; the following section presents a theorem that gives a sufficient condition for choosing them.
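The generalized-inverse step for E can be sketched numerically. The decoupling condition used below, (I + EC)H = 0 solved by E = -H (CH)^+, is the classical unknown-input-observer construction and is offered as an assumption, with illustrative matrices rather than the paper's:

```python
import numpy as np

# Illustrative matrices satisfying rank(C @ H) == rank(H) == q.
C = np.array([[1.0, 0.0, 0.0],
              [0.0, 1.0, 0.0]])
H = np.array([[1.0], [0.0], [0.0]])      # full column rank, q = 1

# Generalized-inverse (pseudoinverse) solution decoupling the unknown input.
E = -H @ np.linalg.pinv(C @ H)

# (I + E C) H = 0: the unknown input no longer drives the error dynamics.
residual = (np.eye(3) + E @ C) @ H
```

Once E satisfies this condition, the unknown-input term drops out of the error dynamics, leaving only K and Y to be chosen via the stability condition.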

LMI sufficient condition
Theorem: The error e(t) converges asymptotically to zero, for any initial value e(0), provided the NLUIO in (8) satisfies the condition in (17), where λmin and λmax are the minimum and maximum eigenvalues of P. Expanding the Lyapunov candidate function (18) along the error equation (14) yields the expression in (19). Note that there is no systematic way to obtain the adjustable NLUIO parameters directly from the condition (10) and the theorem's expression (17); they are therefore reformulated as LMIs.
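The role of λmin and λmax is the standard quadratic bound λmin(P)·||e||² ≤ e'Pe ≤ λmax(P)·||e||² on the Lyapunov candidate. A quick numerical illustration (P here is an arbitrary positive-definite matrix, not the paper's solution):

```python
import numpy as np

rng = np.random.default_rng(0)
M = rng.standard_normal((3, 3))
P = M @ M.T + 3.0 * np.eye(3)            # symmetric positive definite

lam = np.linalg.eigvalsh(P)              # sorted ascending
lam_min, lam_max = lam[0], lam[-1]

e = rng.standard_normal(3)               # arbitrary error vector
V = e @ P @ e                            # Lyapunov candidate V = e' P e
n2 = e @ e                               # squared Euclidean norm of e
```

These bounds are what let a decreasing V(e) be translated into convergence of the error norm itself.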

A^T (I - FC)^T P + P (I - FC) A + A^T C^T G^T P + P G C A + C^T P + P C < 0
The variables P1 = PY and P2 = PK are introduced to simplify the resolution of the nonlinear matrix inequalities. Exponential convergence to the object coordinates is then achieved.

LMI formulation
For the NLUIO synthesis, the inequality in (22) is transformed using Schur's complement; the resulting LMIs (23) are then solved for feasible P, K and Y.
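Schur's complement replaces a quadratic term by a larger linear block: for symmetric blocks, [[Q, S], [S', R]] ≺ 0 if and only if R ≺ 0 and Q - S R⁻¹ S' ≺ 0. A numerical check of this equivalence with generic matrices (not the paper's (22)-(23)):

```python
import numpy as np

def is_neg_def(M):
    """Negative definiteness via eigenvalues of the symmetric part."""
    return bool(np.all(np.linalg.eigvalsh((M + M.T) / 2) < 0))

Q = -2.0 * np.eye(2)
S = np.array([[0.5], [0.1]])
R = np.array([[-1.0]])

block = np.block([[Q, S], [S.T, R]])     # the LMI-sized linear block
schur = Q - S @ np.linalg.inv(R) @ S.T   # the equivalent quadratic condition
```

This is the trick that turns the nonlinear matrix inequality into a convex feasibility problem solvable by standard semidefinite-programming tools.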

RESULTS AND DISCUSSIONS
In contrast with previous research, which assumes noise-free measurements and demands prior knowledge of the object and camera motion, the proposed method assumes that the object velocity is unknown. In the following, the performance of the NLUIO is validated through different numerical simulations in the presence of measurement noise, for both static and dynamic scenes. The current simulation results are restricted to tracking a single point feature. Two different object motion models are considered, and the proposed NLUIO performance is evaluated for both cases. Whereas the usual frame rate of a monocular camera is 30 frames/s, the NLUIO is valid for a continuous-time system. For the simulation results, SIMULINK is used with a fixed sampling period. Since the initial target feature point is not known to the NLUIO, the system and the observer start from different initial conditions; the initial condition for the observer and the matrices A, C and H are given below. Note that the third component x3 of the state corresponds to the unmeasurable distance between the camera and the moving object; clearly, estimating the three-dimensional Euclidean coordinates yields the distance estimate. RMS error values obtained with the proposed NLUIO under different levels of measurement noise are compared to demonstrate the proposed method.
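The RMS error metric reported below can be computed per state component over the simulated trajectory. A generic sketch (the trajectories and the constant 0.1 offset on x3 are illustrative, not the paper's data):

```python
import numpy as np

def rms_error(x_true, x_hat):
    """Component-wise root-mean-square estimation error over a trajectory.
    x_true, x_hat: arrays of shape (num_samples, num_states)."""
    return np.sqrt(np.mean((x_true - x_hat) ** 2, axis=0))

# Illustrative trajectories: estimate off by a constant 0.1 in x3 only.
t = np.linspace(0.0, 10.0, 1001)
x_true = np.stack([np.sin(t), np.cos(t), 1.0 / (2.0 + 0.1 * t)], axis=1)
x_hat = x_true.copy()
x_hat[:, 2] += 0.1
e = rms_error(x_true, x_hat)             # per-component RMS errors e1, e2, e3
```

Reporting the error per component, as done here, makes it visible that noise on the motion parameters affects mainly the unmeasured depth component x3.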

Static scene
In this case, the object and camera velocity parameters are chosen as given below. Figure 3 shows the structure estimation of the object position in the single-camera images. Next, the measurement of Vc shown in Figure 5 is assumed to be corrupted by adding band-limited white Gaussian noise (BLWGN) with 5% power, a correlation time of 0 and an infinite covariance. Figure 6 shows the structure estimation of the static object position in the single-camera images in the presence of measurement noise, and the error in the position estimation of the static object is described in Figure 7. Only the third component of the RMS error changes, to e3 = 0.0816. In addition to the previous measurement noise on Vc, the robustness of the proposed observer is validated by adding band-limited white Gaussian noise with 5% power to the object velocity. Figure 8 shows the structure estimation of the object coordinates with noisy object and camera velocities, and Figure 9 describes the corresponding position estimation error. The NLUIO then yields uniformly asymptotically convergent estimates of the three-dimensional Euclidean coordinates of the feature point. In the presence of noise in the motion parameters, the estimated state x3 is corrupted directly by the noise source; therefore, the third component of the RMS error increases to e3 = 0.2048.
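The noise injection can be emulated in discrete time by adding zero-mean Gaussian samples to the velocity signal. The sketch below approximates SIMULINK's band-limited white noise block; the 5% noise power is taken from the text, while the signal itself and the power-fraction interpretation are illustrative assumptions:

```python
import numpy as np

def add_white_noise(signal, power_fraction, rng):
    """Corrupt a signal with zero-mean Gaussian noise whose power is a
    given fraction of the signal's own average power."""
    signal_power = np.mean(signal ** 2)
    sigma = np.sqrt(power_fraction * signal_power)
    return signal + rng.normal(0.0, sigma, size=signal.shape)

rng = np.random.default_rng(42)
t = np.linspace(0.0, 10.0, 10001)
v_c = np.sin(t)                            # illustrative camera velocity
v_noisy = add_white_noise(v_c, 0.05, rng)  # 5% noise power, as in the text
```

Feeding v_noisy instead of v_c to the observer reproduces the kind of corrupted-velocity scenario evaluated in Figures 5 through 9.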

Dynamic scene
In this case, only the performance in the presence of measurement noise on both the camera and object velocities is studied, using the same noise values as before. The object and camera velocities are chosen as given below. Figure 10 presents the structure estimation of the dynamic object position, and Figure 11 shows the position estimation error with noisy camera velocity. These results demonstrate that the proposed NLUIO-based object structure estimation method achieves satisfactory performance even with noisy camera velocities; the observer gives good estimates under a significant level of noise, even for a changing scene. The RMS error values are e1 = 0.1578, e2 = 0.0789 and e3 = 0.0816. Figure 12 shows the structure estimation of the dynamic object position with noisy object and camera velocities, and Figure 13 shows the corresponding estimation error. Here again, only the third component of the RMS error changes, to e3 = 0.0992. However, the presence of noise on both the camera and object velocities can significantly degrade the performance of the NLUIO. It follows that practical situations call for an even more robust nonlinear observer for the considered problem.

CONCLUSION
A robust NLUIO has been designed for a nonlinear camera-object system, and the stability of the error dynamics has been demonstrated for estimating structure from motion of a feature point. A sufficient condition for the existence of the designed nonlinear observer has been derived and, to facilitate the NLUIO design, formulated in terms of LMIs. This paper extensively validates the proposed method for both static and dynamic scenes. The simulation results are promising, although more remains to be done to assess the performance of the proposed method against measurement noise. Interesting directions for future research include testing the method with experimental data and considering the trajectory of an object moving along a plane.