Lane marking detection using simple encode decode deep learning technique: SegNet

ABSTRACT


INTRODUCTION
Vehicle collisions are one of the primary causes of disability and sudden death throughout the world. According to the World Health Organization (WHO), around 1.2 million people have died and around 50 million have been injured worldwide in road accidents, which also causes substantial economic loss to nations. The rate of traffic deaths is increasing worldwide, as illustrated in Figure 1. Moreover, Malaysia ranks third among Asian countries in the rate of fatal road accidents, with around 7,152 deaths [1].
Intelligent inspection has become one of the most significant research topics in artificial intelligence. With improvements in computer vision technology and intelligent inspection, implementation has become more accessible, which has driven rapid development of autonomous vehicles and can thereby save thousands of lives [2]. Because it saves driving time and reduces deaths from road accidents, autonomous driving has become one of the most important research areas in recent years [3]. Consequently, the advanced driver assistance system (ADAS), which is built on understanding, safety, and awareness of the traffic environment around the vehicle, has become one of the most popular research topics [4]. Recent high-end vehicles offer many advanced features, such as vehicle-to-vehicle communication, semi-autonomous driving, self-parking, lane markings detection, vehicle-to-infrastructure communication, and lane-departure systems [5]. According to the US National Highway Traffic Safety Administration, lane markings detection is a preliminary requirement for all the autonomy features of ADAS under occlusion scenarios [6]. Lane markings detection is also one of the most powerful technologies for road scene interpretation in autonomous vehicles [7], and knowing the position of the lanes makes it easier to avoid sudden lane changes and collisions [8]. Furthermore, lane markings detection matters not only for lane keeping but also for following the traffic regulations that lane markings convey [9].
Although research on lane markings detection is not new, it remains challenging under different circumstances and conditions [10], because it is affected by factors such as occlusion, fog, rain, sunlight, shadow, and variation in illumination [11]. Many computer vision techniques and intricate image processing approaches have been used for lane markings detection, such as [12] and [13]. However, because they often rely on handcrafted and highly specialized features, these systems perform poorly owing to their computational complexity and their inability to cope with intricate environmental conditions [11].

RELATED WORK
Autonomous cars are equipped with many advanced support features, such as adaptive cruise control (ACC), lane departure warning, and emergency braking, to reduce deaths in sudden road accidents, and researchers are therefore showing keen interest in this research area [14]. Although lane marking detection is considered a primary research topic for autonomous cars, it is difficult and challenging under varying conditions and effects [15]. Consequently, researchers have recently been trying to detect lane markings more precisely [16]. Human errors during lane changes cause the deaths of thousands of innocent people. A plethora of image processing techniques can be employed to detect lanes, such as the Hough transform [17], template matching [18], and edge detection [19], all of which rely on low-level, texture, and color features. However, these conventional techniques are not appropriate for lane marking detection because of variations in appearance, position, place, and light intensity, and because of vehicles that occlude the lane markings [20]. Because lane markings may exhibit texture, distinguishing features such as LBP [21] and Haar-like features [22] have been combined with classifiers such as SVM [23] and AdaBoost [24], but with poor results. Most current methods are slow [25], while the fastest ones [26] are not precise enough for practical use. Recently, many researchers have employed different deep neural network techniques for lane marking detection [27]. Tabelini et al. introduced PolyLaneNet to detect lane markings in real-time applications. However, it obtained around 93.36% accuracy, which is lower than that of the existing methods [28]. Yoo et al. used E2E-LMD for row-wise lane classification, but it is adversely affected by curves and occlusion, and its accuracy of around 96.02% is also lower than that of other techniques [29]. Hoe et al. applied an E-Net architecture to lane marking detection to make the model stable for an arbitrary number of lanes and achieved around 96.29% [11], whereas Pizzati et al. obtained around 95.24% accuracy by experimenting with cascaded CNNs [30]. In [31][32][33][34], different convolutional neural network techniques have been employed, but they have relatively high system complexity, higher computational cost, and overfitting problems.
The proposed method provides a simple encode-decode model with lower computational complexity for lane marking detection. The model is based on the SegNet architecture and has been trained on around 13K lane marking image frames. The dataset covers different intricate environmental conditions, such as rain, cloud, changes of light, and straight and curved lanes. The trained model has also been tested on a lane marking video from Udacity's SDCND Advanced Lane Lines project and achieved high accuracy.

RESEARCH METHOD
The proposed method is a simple encode-decode deep learning technique for lane markings detection based on the SegNet architecture, which is very significant for semantic segmentation [27]. The layout of the proposed architecture is shown in Figure 2. A dataset consisting of lane marking images under different environmental conditions has been used to train the model. The dataset is a collection of image frames from 720p video clips and contains around 12,764 images. Of these, 17.4% are clear night views, 16.4% are rainy morning views, and 66.2% are cloudy afternoon views. In addition, around 26.5% show straight roads, 30.2% show curved roads, and 43.3% show very curvy roads. A data generator has also been examined for shifting the channels so that it can slightly shift the shadows. The annotated dataset has been collected from [35], and the model has been tested on a roadside lane marking video from Udacity's SDCND Advanced Lane Lines project [36]. The annotated dataset is fed into the encoder section of the SegNet architecture. The encoder section consists of nine consecutive convolution layers that extract the feature maps, with three max-pooling layers of kernel size (2, 2). In addition, six de-convolution layers have been applied in the decoder section, followed by a fully connected layer.
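The resolution change introduced by the (2, 2) max-pooling layers of the encoder, and its reversal by up-sampling in the decoder, can be illustrated with a minimal sketch. It is not the authors' implementation: the helper names `max_pool_2x2` and `upsample_2x`, the toy 4x4 feature map, and the choice of nearest-neighbour up-sampling are assumptions made only for illustration.

```python
def max_pool_2x2(fmap):
    """2x2 max pooling with stride 2 on a 2-D list (an odd trailing row/column is dropped)."""
    rows, cols = len(fmap) // 2, len(fmap[0]) // 2
    return [
        [
            max(
                fmap[2 * r][2 * c], fmap[2 * r][2 * c + 1],
                fmap[2 * r + 1][2 * c], fmap[2 * r + 1][2 * c + 1],
            )
            for c in range(cols)
        ]
        for r in range(rows)
    ]


def upsample_2x(fmap):
    """Nearest-neighbour up-sampling (assumed): repeat every value in a 2x2 block."""
    out = []
    for row in fmap:
        wide = [value for value in row for _ in range(2)]  # duplicate each column
        out.append(wide)
        out.append(list(wide))  # duplicate the row
    return out


# Hypothetical single-channel feature map.
feature_map = [
    [1, 3, 2, 0],
    [4, 2, 1, 1],
    [0, 0, 5, 6],
    [1, 2, 7, 8],
]
pooled = max_pool_2x2(feature_map)  # 4x4 -> 2x2
restored = upsample_2x(pooled)      # 2x2 -> 4x4
```

The restored map has the original size but not the original detail, which is one reason an encode-decode model pairs up-sampling with learned de-convolution layers.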
Three up-sampling layers have been employed along with the de-convolution layers to increase the resolution of the feature maps. Ten dropout layers have been used across both sides of the architecture to reduce overfitting. The input to the architecture has been normalized by batch normalization so that the model trains faster on consistently scaled data. The model has been trained with a learning rate of 0.2, a batch size of 128, and 100 iterations. Strides of (1, 1) with valid padding have also been used in the model. The ReLU function has been applied as the activation function for all the convolution layers, whereas sigmoid has been used in the last fully connected convolution layer. The model has been compiled with the Adam optimizer and a binary cross-entropy (BCE) loss function according to equation 1.
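$$L_{BCE} = -\left[\, y \log(\hat{y}) + (1 - y) \log(1 - \hat{y}) \,\right] \qquad (1)$$
where $y$ is the actual output and $\hat{y}$ is the predicted output.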
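As a worked illustration of the binary cross-entropy loss, the sketch below evaluates it for a handful of hypothetical pixel labels and sigmoid outputs. The function name `binary_cross_entropy`, the sample values, the clipping constant `eps`, and the averaging over samples are assumptions for illustration; the paper does not state how the loss is reduced over pixels.

```python
import math


def binary_cross_entropy(y_true, y_pred, eps=1e-7):
    """Mean of -[y*log(p) + (1-y)*log(1-p)] over all samples."""
    total = 0.0
    for y, p in zip(y_true, y_pred):
        p = min(max(p, eps), 1.0 - eps)  # clip so that log() never receives 0
        total += y * math.log(p) + (1 - y) * math.log(1 - p)
    return -total / len(y_true)


# Hypothetical ground-truth lane mask values and predicted probabilities.
labels = [1, 0, 1, 0]
probs = [0.9, 0.1, 0.8, 0.3]
loss = binary_cross_entropy(labels, probs)
print(f"BCE loss: {loss:.4f}")
```

Confident, correct predictions (0.9 for a lane pixel, 0.1 for background) contribute little to the loss, while the less certain ones (0.8 and 0.3) dominate it.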

RESULTS AND DISCUSSION
The annotated dataset has been fed into the designed SegNet architecture to detect the lane markings, and the result has been evaluated in terms of accuracy by equation 2. Since accuracy alone cannot be considered a reliable performance metric, other performance parameters such as false positive, false negative, and F1 score are also used to evaluate the research work more reliably. The model has been compiled with the Adam optimizer, a learning rate of 0.001, strides of (1, 1), valid padding, and binary cross-entropy loss. The ReLU and sigmoid functions have been applied as the activation functions for the consecutive convolution layers and the last fully connected layer, respectively. The model has been trained and tested on the Google Colab online platform using its default Tesla K80 GPU.
$$\text{Accuracy} = \frac{TP + TN}{TP + FN + FP + TN} \qquad (2)$$
where $TP$, $FN$, $FP$, and $TN$ are true positive, false negative, false positive, and true negative, respectively.
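The relation between the confusion-matrix counts and the reported metrics can be sketched as follows. The paper does not spell out its F1 formula, so the standard definition (harmonic mean of precision and recall) is assumed here; the function name `classification_metrics` and the counts are hypothetical and are not taken from the experiments.

```python
def classification_metrics(tp, fn, fp, tn):
    """Accuracy, precision, recall, and F1 score from confusion-matrix counts."""
    total = tp + fn + fp + tn
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


# Hypothetical pixel counts: lane pixels are rare compared with background.
metrics = classification_metrics(tp=90, fn=10, fp=5, tn=895)
for name, value in metrics.items():
    print(f"{name}: {value:.4f}")
```

With these counts the accuracy is 0.985 while the F1 score is only about 0.923, which shows why accuracy alone can look optimistic when background pixels dominate.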
The overall result of the model is shown in Figure 3, which includes the accuracy and the corresponding loss. The model achieved an accuracy of 96.38% with a loss of only 1.45%. Because the per-epoch loss of the proposed model is minimal, the model is capturing the actual features of lane markings from the input dataset, and the possibility of false detection is therefore also minimal. The other performance parameters for the proposed method are a false positive of 0.0311, a false negative of 0.0201, and an F1 score of 0.9620. The final result of the proposed method is displayed in Table 1. The loss per epoch and accuracy per epoch for training and validation are shown in Figure 4. Because the validation and training accuracy are close to each other, the proposed model shows little over-fitting, which also lowers the probability of false detection. The performance of the proposed method is also compared with some recent lane marking detection methods in Table 2. Table 2 shows that the proposed method is superior to the other existing deep learning models for lane marking detection: it achieves the highest accuracy and F1 score among the listed works, as well as the lowest false positive and false negative values.

Table 2. Comparison of the proposed method with existing lane marking detection methods
Method                  Accuracy (%)   False positive   False negative   F1 score
Pizzati et al. [30]     95.24          0.0942           0.033            -
Hoe et al. [11]         96.29          0.0321           0.0428           -
Mamidala et al. [27]    96.10          -                -                0.9445
Yoo et al. [29]         96.02          0.0722           0.0218           -
Tabelini et al. [28]    93.36
Proposed method         96.38          0.0311           0.0201           0.9620

The proposed method has been tested on Udacity's SDCND Advanced Lane Lines video and achieved an accuracy of 96.38%.
In addition, the proposed method uses a simple encode-decode deep learning model with fewer weights, which indicates lower computational complexity. Each training step takes around 428 ms, which reflects the simplicity of the model. Several sample input and output images are shown in Figure 5 (see appendix) to visualize the outcome of the proposed method. The left side of Figure 5 shows the sample input images, and the right side shows the corresponding output images. Figure 5 indicates that the model can detect road lane markings precisely, consistent with its comparatively high accuracy relative to the other existing methods. The proposed method is therefore expected to have a significant impact on lane marking detection.
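The step time of around 428 ms can be turned into a rough training-time estimate. The sketch below assumes that all 12,764 images are used in every epoch (the paper does not state a training/validation split) and that the 100 iterations correspond to 100 epochs; the function name `training_time_estimate` is hypothetical.

```python
import math


def training_time_estimate(num_images, batch_size, seconds_per_step, epochs):
    """Return (steps per epoch, seconds per epoch, total seconds)."""
    steps = math.ceil(num_images / batch_size)  # a partial last batch still costs a step
    epoch_seconds = steps * seconds_per_step
    return steps, epoch_seconds, epoch_seconds * epochs


# Dataset size, batch size, and step time as reported; epoch count is an assumption.
steps, epoch_s, total_s = training_time_estimate(12764, 128, 0.428, 100)
print(f"{steps} steps/epoch, {epoch_s:.1f} s/epoch, {total_s / 60:.1f} min in total")
```

Under these assumptions an epoch has 100 steps and takes about 42.8 s, so the full training run would take roughly 71 minutes on the reported hardware.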

CONCLUSION
As lane markings detection is a preliminary requirement of ADAS, it is essential for researchers to develop advanced models for it. In this research article, the proposed method uses a simple deep learning model based on the SegNet architecture to detect lane markings under different environmental conditions. A large dataset covering intricate environmental conditions such as straight lanes, curved lanes, cloud, rain, and low light has been used to train the model, which has been tested on Udacity's SDCND Advanced Lane Lines video. The proposed architecture achieved higher accuracy and F1 score with lower computational complexity, fewer false positives and false negatives, less overfitting, and a minimal loss of 1.45%. Hence, the proposed method is a more accurate architecture for lane marking detection, which has outperformed the state-of-the-art methods in terms of accuracy and can have a positive impact on this research area. The result might be improved further by using a larger dataset containing more complex environmental conditions so that the model can learn more.