Obstacle detection for autonomous systems using stereoscopic images and bacterial behaviour

This paper presents a low-cost strategy for real-time estimation of the position of obstacles in an unknown environment for autonomous robots. The strategy is intended for use in autonomous service robots, which navigate in unknown and dynamic indoor environments. In addition to human interaction, these environments are characterized by a design created for human beings, which is why our developments seek morphological and functional similarity to the human model. We use a pair of cameras on our robot to achieve stereoscopic vision of the environment.

INTRODUCTION Active robotic sensors have today become high-performance tools with broad acceptance at the commercial and military levels [1,2]. These are embedded systems equipped with sensors that provide specific primary data, from which a real-time processor produces information relevant to the tasks of the robot [3]. This kind of sensor has promoted research in information-driven strategies for the development of tasks with robots, as well as the implementation of algorithms for digital signal processing and control schemes oriented to these sensors [4].
When faced with the design of motion strategies for autonomous robotic systems, these sensors prove to be very convenient, and even fundamental [5,6]. When environments are dynamic (a typical problem for service robots), it is necessary for the robot to be able to identify nearby obstacles in real time [7,8]. Unstructured environments are more complex due to their dynamics and the lack of knowledge of identifiable characteristics. In addition, not all obstacles are the same, which means that the behavior of the robot in front of each of them must be different and appropriate in each case.
Among the minimum capabilities that a robot must have is the ability to determine its relative size and dimensions in the environment. In other cases, it is also necessary to know the height of an obstacle in order to define interaction strategies (picking up a bottle from a table, for example). Depending on the application, it is possible to use different kinds of sensors, but those capable of providing visual information are the ones that provide the most relevant information [9]. In this sense, systems with two cameras turn out to be more advantageous than systems with a single camera [8], since they provide information on the depth and orientation of the obstacle [4], [10]-[12]. Digital cameras, as fundamental elements of optical sensors, have been used extensively for robotic arm motion control. The camera provides the required feedback information about the position of the objects to be manipulated. This strategy is known as Visual Servoing or Vision-Based Robot Control (VS) and is characterized by using the image of a camera as feedback information [13]. The aim is to support the robot's decision making with eyes that take optical information from its own perspective and in parallel (separated by a certain distance) [11]. The distance between the robot and the obstacle can be determined from the disparity between the obstacle positions in both images and the focal length of the cameras [14]. The field of vision can be increased considerably by adding a hyperboloid or conical mirror in front of the camera lenses, which provides an omnidirectional view to the cameras [15].
The reconstruction of 3D models from 2D perspectives (stereoscopic vision) is a strategy inspired by animal biology that allows the collection of three-dimensional information from the navigation environment. However, the process of generating 3D models is computationally expensive [16] and requires good camera calibration, making it very difficult to implement in real time on embedded systems [17]. In addition, the generation of 3D models is highly dependent on the quality of the two-dimensional images, which are strongly affected by lighting conditions [18]. The computation of the distance to the obstacle takes into account the angular distance, the distance between the cameras, and the pixels of the images [7,11]. However, in many applications it is not necessary to rebuild the entire environment, which considerably reduces the computational requirement [19]. In fact, the human brain does something similar when processing information from the eyes, focusing only on a portion of the entire image that the eye detects. This information can then be processed to find specific shapes [20,21].
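As a brief, hedged illustration of this distance computation (the standard relation for rectified, parallel cameras, with illustrative parameter values rather than this paper's calibration), the depth of a point matched in both images follows from the pixel disparity, the camera baseline, and the focal length expressed in pixels:

def depth_from_disparity(x_left_px, x_right_px, baseline_m=0.28, focal_px=800.0):
    """Estimate the depth (meters) of a point matched in the left and right images.

    Assumes rectified, parallel cameras: depth = focal * baseline / disparity.
    The baseline and focal length defaults are illustrative, not a calibration.
    """
    disparity = x_left_px - x_right_px  # larger disparity means a closer point
    if disparity <= 0:
        return float("inf")  # point at (or beyond) infinity, or a bad match
    return focal_px * baseline_m / disparity

# Example: a point at x = 420 px in the left image and x = 364 px in the right image
# gives a disparity of 56 px and an estimated depth of 0.28 * 800 / 56 = 4.0 m.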
There are two strategies for estimating the distance to the obstacle in stereoscopic vision: active methods and passive methods [10,22]. In the first case, the sensor system sends signals to the obstacle, such as visible light or laser signals, which are then detected and analyzed [11]. The ability of these sensors to establish distances is superior to human vision, but they are also costly and complex to implement, and they have unresolved problems. For example, the laser delivers the distance of a single point. In fact, these methods do not determine the exact 3D positions of all points of the obstacle. Another negative aspect is their speed: they are too slow for real-time operation [23]. On the other hand, passive methods estimate the location of the obstacle from the images of the environment captured by cameras [19]. They use digital processing on the images to estimate the distance. This passive strategy has the additional advantage of working with different setups (cameras, light conditions, and embedded hardware). It should be clarified, however, that there are two problems that cannot be solved with this strategy: occlusions and overlapping of objects [24].
For these solutions to be practical, they must be suitable for mass deployment, and for this, low cost and high performance are essential [18,23]. In this sense, processing algorithms must have a very low computational cost in order to reduce processing time and hardware cost, while still demonstrably solving the problem. This paper attempts to address some of the critical problems of the strategy while maintaining a low computational cost, in particular reducing the impact of lighting on image quality and improving the matching between 2D image points.
The main idea of our strategy is to identify points on obstacles by means of a movement in the images inspired by bacterial interaction. These points are mapped onto the projection planes of the environment in order to establish the distance to the obstacle, all without the need to modify the environment [12]. The firmware used to control the hardware setup, as well as the data acquisition and processing, is written in Python. We detail the methods and algorithms used for image processing and for the estimation of the distance to obstacles. The results presented are the product of real laboratory tests carried out on our robot. Our proposed bio-inspired algorithm for three-dimensional obstacle reconstruction and the resulting motion control scheme have a number of advantages over other methods that directly control the entire nonlinear system or rely on dynamic programming for planning [25].

PROBLEM FORMULATION
We want an autonomous robot with low resource consumption to be able to identify obstacles in an unknown environment. In this sense, we define our robot in a workspace W ⊂ R³. The robot has two cameras that form a stereoscopic vision system. This system is located at r(t) ∈ R³ and has orientation R(t) ∈ SO(3), where SO(3) denotes the special orthogonal group of dimension three, both with respect to a global frame of reference for every instant t ≥ 0.
To determine the position of the obstacles with respect to the robot, we define a relative frame of reference with respect to the axis of the two cameras, as shown in Figure 1. We denote the two cameras as the left camera (L_c) and the right camera (R_c). The centers of L_c and R_c are located at (−0.14, 0, 0) and (0.14, 0, 0) in the relative reference frame. The distance between the cameras is b = 0.14 + 0.14 = 0.28 m.
We denote by p_i(t) the position of obstacle i with respect to the frame of reference relative to the cameras. The cameras produce two parallel images at instant t containing the location information of p_i(t). However, obstacles are not points; they are volumes whose surfaces are made up of a large number of points. We do not want to determine the position of every point of the obstacles. Instead, we want to identify the position of a small group of points that will, ideally, move onto the surfaces of the obstacles.

We define a population of m bacteria in the space in which the robot may encounter obstacles when moving forward, as shown in Figure 1. The initial position of each bacterium is random but known. From the images of the two cameras, we can establish trigonometric relationships for the three-dimensional position of each bacterium. If the bacteria are on the surface of the obstacle, then we can determine the distance to these points of the obstacle, as depicted in Figure 2. We propose a search algorithm (obstacle search) in which bacteria move three-dimensionally according to local information detected in their 2D projections. In addition, the algorithm is accelerated according to bacterial Quorum Sensing (QS), i.e., large populations of bacteria in a region make that region more attractive to other bacteria.

The m bacteria (or agents), all identical to each other, move in W searching for areas of great interest to them (for example, in search of food). The value of a given position is determined from local readings (local interaction with the medium) evaluated from its projections on the 2D images. Each bacterium is defined by its position in the environment (2):

V_i = p,   p ∈ R³   (2)

where p is a point in three-dimensional space. The population density is evaluated using the distance between bacteria (3):

d_ij = ‖V_i − V_j‖   (3)

defined as the distance between bacteria V_i and V_j, calculated with an appropriate norm. The function used to evaluate the value of the region where the bacterium is found in the left and right projections considers the similarity of the pixels neighboring the bacterium in the two projections, as depicted in Figure 2. In the mathematical expression (4), (x_L, y_L) and (x_R, y_R) are the coordinates of the left and right projections of the current bacterium, L_(x_L+i, y_L+j) is the grey value of the left image at pixel (x_L+i, y_L+j) (and similarly for the right image), N is the neighborhood around the projection of each bacterium, and |∇(M)| is the Sobel gradient norm on the left and right projections (to penalize uniform regions).
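The exact form of the evaluation function (4) is not reproduced here; the following is a minimal Python sketch of a function of this kind, assuming greyscale images, a square correlation window, and a combination in which the sum of absolute grey-level differences appears in the denominator and the Sobel gradient norm in the numerator. The array names, window size, and this particular combination are illustrative assumptions.

import numpy as np
from scipy import ndimage

def region_value(left_img, right_img, xL, yL, xR, yR, half_window=3):
    """Score a bacterium from its two projections (illustrative form, not the paper's eq. (4)).

    High when the grey-level neighborhoods around the left and right projections match,
    low for dissimilar or uniform (textureless) regions.
    """
    h = half_window
    H, W = left_img.shape
    inside = (h <= yL < H - h and h <= xL < W - h and
              h <= yR < right_img.shape[0] - h and h <= xR < right_img.shape[1] - h)
    if not inside:
        return 0.0  # the projection falls outside one of the images
    patch_L = left_img[yL - h:yL + h + 1, xL - h:xL + h + 1].astype(float)
    patch_R = right_img[yR - h:yR + h + 1, xR - h:xR + h + 1].astype(float)
    # Sum of absolute grey-level differences over the correlation window N.
    dissimilarity = np.abs(patch_L - patch_R).sum()
    # Sobel gradient norm on both patches, penalizing uniform regions.
    grad = (np.hypot(ndimage.sobel(patch_L, axis=0), ndimage.sobel(patch_L, axis=1)).sum() +
            np.hypot(ndimage.sobel(patch_R, axis=0), ndimage.sobel(patch_R, axis=1)).sum())
    return grad / (1.0 + dissimilarity)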

Bacterial QS is activated if the population density within a region is greater than a threshold value T, called the quorum threshold; this is the parameter that defines whether or not the quorum has been reached. The behaviors of the bacteria (their search in the environment) are coordinated by the following rule: if the bacterium V_k ∈ W is located near the bacterium V_i ∈ W, i.e. (5):

‖V_k − V_i‖ < h₁   (5)

and the number of bacteria within the sphere of radius h₂ centered at V_k is greater than T, then the value of the region increases for V_i.
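A hedged sketch of how this quorum rule might be applied to the whole population; the radii, threshold, and bonus values, and the additive form of the value increase, are illustrative assumptions rather than the paper's exact update:

import numpy as np

def quorum_bonus(positions, values, h1=0.3, h2=0.3, T=100, bonus=0.1):
    """Increase the value of bacteria located near a crowded region (illustrative QS rule).

    positions: (m, 3) array of bacterium positions V_i in the camera-relative frame.
    values:    (m,) array of region values f(V_i); a copy with the QS bonus is returned.
    """
    positions = np.asarray(positions, dtype=float)
    new_values = np.array(values, dtype=float)
    # Pairwise distances d_ij = ||V_i - V_j||.
    d = np.linalg.norm(positions[:, None, :] - positions[None, :, :], axis=-1)
    # Number of bacteria inside the sphere of radius h2 around each V_k (excluding V_k itself).
    crowd = (d < h2).sum(axis=1) - 1
    for k in np.where(crowd > T)[0]:    # V_k has reached the quorum
        near = np.where(d[k] < h1)[0]   # bacteria V_i located near V_k
        new_values[near] += bonus       # their region becomes more attractive
    return new_values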

RESEARCH METHOD
We initialize the bacterial population randomly within the field of action of the robot (red dotted line in the top view of Figure 1: 3 m along the x-axis, 2 m of depth on the z-axis, and 2 m of height above the ground). The coordinates of each bacterium are defined with respect to the frame of reference relative to the cameras. The population size was treated as a tunable performance parameter, with values between 10 and 1000.
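A minimal sketch of such an initialization. The bounds below assume that the 3 m span in x is centered on the robot and that, with the cameras 0.5 m above the ground, the 0-2 m height range corresponds to y between −0.5 m and 1.5 m in the camera-relative frame; these bounds are assumptions for illustration.

import numpy as np

def init_population(m=100, seed=None):
    """Randomly place m bacteria in the robot's field of action (camera-relative frame).

    Assumed bounds: x in [-1.5, 1.5] m, z in [0, 2] m ahead of the robot,
    and y in [-0.5, 1.5] m (0 to 2 m above the ground, cameras at 0.5 m).
    """
    rng = np.random.default_rng(seed)
    x = rng.uniform(-1.5, 1.5, m)
    y = rng.uniform(-0.5, 1.5, m)
    z = rng.uniform(0.0, 2.0, m)
    return np.column_stack((x, y, z))  # one row per bacterium: (x_i, y_i, z_i)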
The cameras are located on the robot at a height of 0.5 m from the ground. The origin of the frame of reference relative to these cameras is at this height, midway between the two cameras. The positive x-axis points to the right side of the robot, the positive z-axis points in the direction of advance of the robot, and the positive y-axis points upward.
The images of L_c and R_c are scaled to 800 × 600 pixels. The projection of each bacterium i onto the images is determined with the following equations (the position (0, 0) of the image is at the upper-left corner):

Left image:  x_p = 400 + 800 (x_i + 0.14) / z_i
Right image: x_p = 400 + 800 (x_i − 0.14) / z_i

where (x_i, y_i, z_i) is the three-dimensional coordinate of bacterium i, and (x_p, y_p) is the two-dimensional coordinate of the bacterium projected onto the image. The value of the area adjacent to each bacterium in each projection is determined by (4). The bacteria move within the limited space according to this function.
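A hedged Python sketch of this projection, assuming a pinhole model with the 800-pixel scale factor from the equations above, a principal point at the image center, and an analogous (assumed, not given in the text) expression for the vertical coordinate y_p:

def project(xi, yi, zi, focal_px=800.0, half_baseline=0.14, cx=400.0, cy=300.0):
    """Project a bacterium at (xi, yi, zi) onto the left and right 800x600 images.

    Returns ((xL, yL), (xR, yR)) pixel coordinates, or None if the bacterium lies
    behind the cameras. The y_p expression is an assumption analogous to x_p.
    """
    if zi <= 0:
        return None  # behind the image planes: outside the field of vision
    x_left = cx + focal_px * (xi + half_baseline) / zi
    x_right = cx + focal_px * (xi - half_baseline) / zi
    y_p = cy - focal_px * yi / zi  # the image origin is at the upper-left corner, so y grows downward
    return (x_left, y_p), (x_right, y_p)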
If the bacterium is on the obstacle surface, then it will have similar neighboring pixels in both projections, as shown in Figure 2 (the illumination affects both cameras equally), and the function will assign a high value to the position of the bacterium. The more the neighboring pixels differ, the lower the value the function assigns. The position of each bacterium is updated along the gradient in search of high values (movement of the bacteria). The QS forces the bacteria that are slow to find the obstacle surface to move towards the large groups of bacteria. A bacterium that does not appear in any of the projections obtains the lowest position value (it is outside the robot's range of vision).
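A hedged sketch of one iteration of this movement, reusing the project and region_value sketches above: each bacterium climbs a finite-difference estimate of the gradient of its region value. The step size, the gradient estimation, and the zero value assigned to out-of-view bacteria are illustrative assumptions.

import numpy as np

def step_population(positions, left_img, right_img, step=0.05, eps=0.02):
    """Move each bacterium one step up the gradient of its region value (illustrative).

    positions: (m, 3) array of bacteria in the camera-relative frame.
    Bacteria whose projections fall outside the images keep the lowest value (0)
    and are left for the quorum-sensing rule to pull towards large clusters.
    """
    def value(p):
        proj = project(p[0], p[1], p[2])
        if proj is None:
            return 0.0  # outside the robot's range of vision
        (xL, yL), (xR, yR) = proj
        return region_value(left_img, right_img, int(xL), int(yL), int(xR), int(yR))

    new_positions = positions.copy()
    for k, p in enumerate(positions):
        grad = np.zeros(3)
        for axis in range(3):  # central finite differences of the value function
            dp = np.zeros(3)
            dp[axis] = eps
            grad[axis] = (value(p + dp) - value(p - dp)) / (2 * eps)
        norm = np.linalg.norm(grad)
        if norm > 0:
            new_positions[k] = p + step * grad / norm  # move towards higher region values
    return new_positions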

RESULT AND ANALYSIS
We evaluate the performance of the strategy with different configurations, varying the bacteria population, the QS threshold, and the correlation window used in the denominator of the evaluation function. A larger number of bacteria allows for reconstructing larger portions of the obstacles without significantly influencing the computational cost of the algorithm. The QS threshold reduces the convergence time up to values of about 100; above this value it has no significant effect. The most important effect was observed for the size of the correlation window of the function, which greatly affects the bacteria's ability to locate the obstacle. Large values improve the behavior but considerably increase the computational cost. Figures 3 and 4 show the result of one of the laboratory tests.

We performed more than 50 laboratory tests with different obstacles and more or less constant lighting conditions for a human indoor environment (natural lighting during the day and LED lighting at night). The distances from the objects to the robot were established in a straight line between 0.3 and 2 m. The accuracy of the distance values determined by the optical sensor was established by comparison with the actual value, measured in the setup with a tape measure. These results were related to the distance of the obstacle. Figure 5 shows these percentages of accuracy with respect to the estimated distance.

Our intention is to use the strategy to identify obstacles in the environment and, with this information, coordinate the movement of the robot. The proposed motion planning strategy, based on the detection and stereoscopic identification of obstacles, considers three elements: capture and pre-processing of images, determination of obstacles, and application of motion policies according to the fed-back information, as shown in Figure 6.
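As a hedged illustration of how these three elements could be chained in a control loop (the function names and the skeleton itself are assumptions, not the paper's implementation):

def navigation_cycle(capture_stereo_pair, detect_obstacles, motion_policy, robot):
    """One cycle of the motion planning loop: capture, detect, act (illustrative skeleton)."""
    # 1. Capture and pre-process the images of the two cameras.
    left_img, right_img = capture_stereo_pair()
    # 2. Determine the obstacles: run the bacterial search and keep the converged points.
    obstacle_points = detect_obstacles(left_img, right_img)
    # 3. Apply the motion policy according to the fed-back obstacle information.
    command = motion_policy(obstacle_points)
    robot.execute(command)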

CONCLUSION
Considering the problem of motion planning for small autonomous robots in unknown environments, particularly for service robots with direct and continuous interaction with human beings, we propose a low-computational-cost stereoscopic vision strategy that allows autonomous navigation in dynamic environments. Service robots perform their tasks in unknown indoor environments with a high probability of constant change in the location of obstacles and people. Stereoscopic vision systems make it possible to establish the three-dimensional location of obstacles with precision and therefore provide complete information for the design of navigation strategies. However, their computational cost is high, making it impossible to use them in real time on moderate-performance platforms. Our strategy proposes a local reconstruction of a finite set of points on the obstacles in the environment, which guarantees low cost and high performance. We computed about 100 points corresponding to the surfaces of the obstacles. These points are identified using an uninformed search algorithm inspired by bacterial interaction. The bacteria, evaluated through their 2D projections in the cameras, move in three-dimensional space looking for similar neighboring regions in their projections. The algorithm converges with most bacteria on the obstacles. In the experiments carried out, we verified distance-estimation accuracies above 95% and a low computational consumption, making the strategy useful for embedded implementations. Future development of the scheme includes improvements in the determination of obstacle surfaces using larger bacterial populations, and reductions in convergence time through the use of the Quorum Sensing (QS) model.

ACKNOWLEDGEMENT
This work was supported by Universidad Distrital Francisco José de Caldas and the Centre for Scientific Research and Development (CIDC) through project 1-72-578-18. The views expressed in this paper are not necessarily endorsed by Universidad Distrital Francisco José de Caldas or the CIDC. The authors thank the research groups ARMOS and SIE and their research seedbeds for the evaluation carried out on prototypes of the ideas and strategies proposed in this paper. The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.