An index based road feature extraction from LANDSAT-8 OLI images

Road feature extraction from the remote sensing images is an arduous task and has a significant role in various applications of urban planning, updating the maps, traffic management, etc. In this paper, a new band combination (B652) to form a road index (RI) from OLI multispectral bands based on the spectral reflectance of asphalt, is presented for road feature extraction. The B652 is converted to road index by normalization. The morphological operators (Top-hat or Bottom-hat) uses on RI to enhance the roads. To sharpen the edges and for better discrimination of features, shock square filter (SSF), is proposed. Then, an iterative adaptive threshold (IAT) based online search with variational min-max and markov random fields (MRF) model are used on the SSF image to segment the roads and non-roads. The roads are extracting by using the rules based on the connected component analysis. IAT and MRF model segmentation methods prove the proposed index (RI) able to extract road features productively. The proposed methodology is a combination of saturation based adaptive thresholding and morphology (SATM), and saturation based MRF (SMRF), applied to OLI images of several urban cities of India, producing the satisfactory results. The experimental results with the quantitative analysis presented in the paper.

INTRODUCTION Road feature extraction from remote sensing (RS) images is a challenging and one of the most intensive research topic. Roads are very crucial for transportation, providing many ways of utility for human civilization. The research of road extraction has vital significance for surveying, updating the maps, urban planning, on-line traffic management, geographical information system (GIS), global positioning system (GPS) based road transport and so forth. In the absence of automatic extraction methods, manual road drawing from RS images requires great effort in terms of cost and time.
RS images provides information for various objects of the earth, based on the spatial and spectral properties. In low resolution (LR) images the roads look like curvilinear structure. From LR imagery, road feature extraction is always a difficult task, mainly in urban places due to the presence of trees, multistory buildings, fly-overs and their shadows are major obstacles irrespective of spatial resolution and sensors. Automatic road extraction from RS images is evolving and most of the approaches are limited in providing solutions with low to medium accuracy. This is due to the factors affecting the imaging conditions like environment (seasonal

PROPOSED METHODOLOGY
The flow diagram of the road extraction proposed methodology using OLI images shown in Figure 1. The thick dotted (Green) outline box represents the saturation based adaptive thresholding and morphology (SATM) method, and thin dotted outline boxes represent the saturation based MRF (SMRF) method. The methodology consists of six stages, and each step described in the following sub-sections.
The proposed methodology is a combination of SATM and SMRF and describes the selection of multispectral band combination for road index given in sections 2.1 and 2.2 In section 2.4, describes the image enhancement by using the SSF filter. The segmentation techniques used in the paper given in 2.5 to segment an image into road and non-road features and road features alone extracted by using the rules based on the shape parameters 2.6.

Band selection criteria for road index
The OLI data is open source and available from USGS earth explorer. The various spectral bands present in the OLI sensor listed in Table 1. Compared to the usage of the panchromatic image for feature extraction, the multispectral image contains more information in several spectral bands which used for advantage. Each band in a multispectral image corresponds to a band of wavelengths in visible, IR, and near infra-red (NIR) regions. From the available multi-spectral bands of OLI, we propose a band combination to form a road index (RI) for road feature extraction.
The reflectance of asphalt is more in NIR, short wave infra-red (SWIR) in [4]. USGS Spectroscopy Lab provided, the spectral reflectance of various minerals, liquids, vegetation, etc. Asphalt road spectral values are available from https://speclab.cr.usgs.gov/. Figure 2 shows the spectral variation from 0.35 µm to 2.5 µm. Spectral reflectance of asphalt feature and other features from the OLI image shown in Figure 3. From this, we can observe the asphalt reflectance variation is almost matching with the provided spectra as shown in Figure 2 in the bands NIR and SWIR.
In visible band region (0.40 µm to 0.70 µm) asphalt has low reflectance compared to NIR and SWIR regions (0.85 µm to 2.5 µm). In the spectral range of 1.8 µm to 2.5 µm, which is above the SWIR wavelength, the reflectance of the asphalt roads is varying. From this, it concludes that asphalt roads have a favorable spectral response in NIR and SWIR wavelengths.
The OLI sensor has different bands of wavelengths given in Table 1. According to Spectral Reflectance of asphalt road from Figures 2 and 3, the suitable bands for road detection in OLI are NIR, SWIR1, and SWIR2 bands.The NIR (0.85 µm -0.88 µm) and SWIR1 (1.57 µm -1.65 µm) present in OLI have narrow spectral bandwidth with good reflectance for road features (linearly increasing). In SWIR2 (2.11 µm -2.29 µm) from 2.11 µm to 2.2 µm reflectance of the roads is increasing and in the range 2.2 µm to 2.3 µm it is decreasing result in uncertainty in feature recognition. Additionally, the SWIR band can also distinguish the moisture content of soil and vegetation. The blue and green bands are having very low reflectance of asphalt road in the entire wavelength range. From the available OLI multispectral bands, the bands BLUE, NIR, SWIR1 used to form an RGB image. These bands proved to be the better combination for road extraction in the section 3.1. The bands B6, B5 and B2 are stacked, as RGB (B652) image. A fusion of B652 image with the Panchromatic band (B8) carried out using Brovey transform. This method [28] is chosen as it is faster and does not affect the band spectral relationship for road features as shown in Figure 4 with very fine spatial details.

Road index: Saturation
HSI color model most widely used in digital color image processing and developed by [29]. The road index (RI) component derived as saturation (S) formulae from the selected band combination B652, and the roads represented in uniform and with low values compared to surrounding features due to minimum reflectance value from three bands is dividing with the summation of all band values. In the RI (or S) image, roads show high discrimination from the other features. S image has normalized values within the range of [0,1]. Here, S is used (as termed in the paper) as road index (RI) and defined as the following 1 for B652.

Morphological transforms
The MM is a set theory approach consisting of union, intersection and complement. Top-hat (TH) transform used to enhance the objects smaller than the structuring element. It preserves sharp peaks and improves contrast. There exist two types of TH transform. White TH or TH transform, which highlights the 'Bright' features and defined as subtraction of morphological opening (•) image from the initial (before opening) image. Black TH or Bottom TH (simply bot-hat (BH)) transform defined as dual as the difference between morphological closing (•) image and the input image. It highlights the 'Dark' features that are smaller than the structuring element.
Where, S th represents the TH applied on RI image , B is structuring element 'disk' with (suitable) radius, and S bh represents BH. TH and BH transforms are uses in various image processing steps in pre or post-processing of the feature extraction process. BH transformation applied to the RI image, which highlights the roads and suppresses the remaining features.

Shock square filter
Shock filter is one of the classes of morphological enhancement method and given by [30] as Here, I represent any gray-scale image. The Laplacian of I i.e., ( I) is act as edge detector. The I replaced by (G I) in [31]. In this notation G is Gaussian and is convolution.
Here, S bh is BH of S image as derived in (3). The shock filter used to sharpen edges but forms maze-like structures in regions [32]. To eliminate this, we proposed a filter, which is a combination of Shock filter [31,30] followed by a modified shock filter. This smooth the regions as well as keep the sharp edges of features.
Here, the motive is to sharpen the edges of the image and to smooth the regions (inside the objective feature).
The direction of shocks measured by using the hyperbolic tangent function T sh described in [32] on the E, which is edge detector. The modified shock filter I sh is where ( I) replaced by Weickert's coherence shock filter [33]. It is one of the best edge detectors and multiplied by the Road index (S). In this way, the regions smoothed along with sharp edges in the image.
where λ designed for tuning the sharpness of curve function; hence, it set as the value 6 (fixed). E is an edge detector, as described in Weickert's coherence shock filter [33]. Let I = S bh , then The edge detection is performed on the shock filtered image by using (5). An-isotropic Shock filter [32] is represented as D(I bh ) and E(I bh ) are dilation and erosion of BH image. SSF is able to get sharp edges and enhance the feature objects. Also, it is useful in removing the connected noise to the feature. Without SSF, the extracted roads are having connected noise like building and spike ridges. By using SSF, it is producing better discrimination as shown in Figures 5a and 5b, i.e., roads in the image are enhanced and sharpen edges (highlighted in red color boxes). To suppress the other elevated features by the SSF, the TH transform is applied on the SSF image as aforementioned in section 2.3.
where I sh is SSF image and B is structuring element 'disk' with suitable radius.

Segmentation methods
An IAT algorithm is an edge-based image thresholding method and edges updated by iterative using the stochastic gradient descent optimization method based on the variational min-max principle [26] performed on the I T H image to get the segmented binary image. It helps in the extraction of roads accurately. This process termed as SATM.
A model, MRF [27] is also used for the segmentation of road features on I T H image, when there is no prior information about the model parameters. MRF depends on the neighborhood information for its probability distribution. The parameters estimated by using the expectation-maximization (EM) algorithm. The segmenting classes are known, and the label image generated by using the optimal threshold on TH, i.e., TH of SSF image. This process termed SMRF. MRF model provides the solution with maximum a posterior (MAP) [34] estimation by the maximizing label class and minimizing using the posterior energy function. The parameters (mean and variation) estimated by using the EM algorithm [35].

Connected component analysis
In the final step, by using the connected component analysis (CCA) [36], the segmented binary image is labeled. For each label, the shape parameters like extent (Ex), eccentricity (Ec) computed [37]. The extent defined as a ratio number of pixels in the label to the bounding box of the same label. The label whose Ex value is ≤ 0.12 and also Ec value is ≥ 0.99 considered as road label for extraction of elongated roads. The road features in OLI images look like curvilinear, elliptical, etc., for this reason, for each label, the shape features like shape index (SI) and density (D) computed. The SI with high values at the same time the D has moderate values of the label taken as road features from the labeled image.

3.
DATA SETS Data sets of OLI covering various urban areas used to extract the roads, which are level-1 precision and terrain (L1TP) corrected used for road extraction. In this study, four OLI data sets covering three major Ì ISSN: 2088-8708 urban areas of Hyderabad, Chennai, Bangalore, and one data set covering the rural areas of Rajasthan used to evaluate the performance of the proposed methodology. The used scene id along with their date of pass for each chosen area given in Table 2. All images mentioned in the paper are top of atmosphere (TOA) corrected. The size of the image represents the subset scene image. 3.1. Analysis on various band combinations As given in section 2.1 for road feature extraction, the chosen band combination is B652. In this section, the various band combinations of OLI experimented with the proposed methodology. A few interest band combinations are B652, B653, B654, B752, and B762 selected for analysis and shown in Figure 6 for importance in road extraction. The band combination B432 is a natural color image, used in this methodology. B432 combination was not able to produce even major roads as well. By using the proposed methodology, B432 is not useful for road detection from OLI.
The band combinations B542 and B654 are false color composite (FCC) images. In the B4 band, the concrete has high, and asphalt has low reflectance with each other, due to which converted S image has mixed features (building and roads). The B542 and B654 are helpful in the extraction of major roads only but not able to recognize other roads. The B654 combination image was shown in Figure 6i and corresponding results are RI index image, Optimal Threshold from I T H image and SATM result shown in Figures 6j, 6k and 6l respectively.
The B752 band combination, not able to produce urban area roads of a few major roads also missing. Due to B7 has uncertainty in the reflectance of concrete and asphalt. From Figure 3, B7 has a low reflectance of asphalt and almost equal to band B2. In conversion to S, most of the road values mixed with the barren and built-up areas, the segmentation methods are not able to separate the road features. The corresponding results of B752 are shown in Figures 6m, 6n, 6o and 6p.
The band combination B762 is able to detect the major roads, and few in urban area roads using the proposed method. Because of the asphalt and concrete reflectance are higher in bands B7, B6 and low in B2. The S of B762 is also useful in road detection, but only a few roads are missing. These missed roads detected by B652. Also, the B653 band combination able to extract roads, but B652 detects more roads with good accuracy and smoothness. This can be observed in Figure 6t, 6h and 6d. Based on the experimental results and analysis, the chosen band combination is B652 for road extraction.
One of the widely used methods for the selection band combination is the optimal index factor (OIF), which depicts the correlation of bands that have the highest to lowest OIF band combinations. By using this, it observed that the band combined with both SWIR1 and NIR bands have the highest OIF values. The spectral correlation plot between NIR and SWIR bands with the corresponding optical image shown in Figure 7 by using ENVI. From this, the water bodies have the highest correlation as well as vegetation. Also, it observed that asphalt has a high correlation from the other features, especially in urban areas. From this, both the bands are most useful to form the RI index.
According to OIF, the B651, B652, and B654 band combinations have the highest OIF values from all combinations. In the two (B651 and B652) combination, the B2 band is a regular band in all types of sensors, and B1 is a rarely using sensor in RS. Hence, the chosen band combination is B652. The various band combinations of original images with the RI index and corresponding results of optimal threshold and SATM shown in Figure 6.

EXPERIMENTAL RESULTS AND DISCUSSION
The proposed methodology is applied to sub-scene images of a full scenes, as mentioned in Table 2. All these sub-scene images are multispectral band combination of B652, as proposed in section 2.1 and based on analysis in section 3.1. The small scenes of sub-scenes Figure 8a (Chennai) and Figure 9a (Hyderabad) chosen for proposed methodology verification, has the roads and as well as water bodies. The proposed methodology consists of two methods are SATM and SMRF. One method SMRF applied to Figure 8a, for extraction of roads, but observed that water body edges also extracted. SATM applied to Figure 9a, also observed that water body edges are present along with roads. A method proposed for the water body and water canal extraction [38] used to remove the water body edges from the images.

Removal of water body edges
As mentioned above, by using the proposed methodology, the water body edges are also extracted as roads. Due to the water body (lakes, reservoirs and canals) boundaries are man-made structures, by concrete, gravel, and stone. These water body edges removed by the proposed method MNDWI2 of water bodies in the same images and dilated. That is, MNDWI2 [38] index used to extract the water bodies and water canals from the same image. From this method, extracted water bodies from the images shown in Figures 8c and 9c (SR is 30 m). After extraction of these features, image interpolated by cubic convolution to the SR of 15 m and dilated by the structure element 'disk' with radius 2 to match with edges. Wherever water body edges present (if water body positions are matches in both images 8b and 8c) are removed by using logical operators. Water body edges removed image shown in Figure 8d and similar procedure applied for Figure 9. Further results shown in subsection 4.2 used the same methodology for the removal of water bodies edges from the resultant images of the proposed methodology.

Results and discussion
The proposed methodology performed on OLI data sets given in Table 2. The sub-scenes of data sets taken as which covers the urban areas and one sub-scene (Rajasthan) with arid type rural area. The different steps involved in the proposed methodology resultant image shown in figures.
As mentioned in section 2.1 the proposed new multi-spectral band combination B652 sub-scene images of full scenes as shown in Table 2 are shown in Figures 10a, 11a and 12a. All the mages mentioned in the paper are pan-sharpened by the Brovey method with SR of 15 m. These images are converted into indexed RI image by using the 1 and are shown in Figures 10b, 11b and 12b. These results illustrate the proposed index (RI) image, and the roads having low values at the same time differentiating from the other features. To enhance the road feature, BH is applied and elevates road features as well as suppress other features.  10c, 11c, and 12c, illustrates the proposed filter (After applying the BH and SSF), sharpen the road edges as well as smooth the road regions.
SSF images segmented by IAT, Otsu, and MRF, which segment as the road and non-road feature. From these segmented images, road features alone extracted by using CCA and are shown in Figures 10d, 10e, and 10f. As described in section 4.1 the water body's edges also extracted as roads due to the features are like roads and those water bodies removed from segmented images. Figure 10g describes the proposed methodology of the combined results of SATM and SMRF. In combining the results, we can observe that the roads little more than individuals. Also observed, roads which are freeways and arterial roads extracted with high accuracy. Similarly, road features are extracted from the remaining images and given in Figures 11, 12 and 13. SATM results are shown in Figures 11d and 12d.  Figures 11e and 12e represents the SMRF results. The final results (Combined images) of proposed methodology are shown in Figures 11f and 12f. In OLI, pan-sharpened images, each pixel represents 15 m on the ground. Generally, road widths are 7 m (two-lane), 15 m (4 lanes), 30 m, 45 m and vary. Highway roads have 4 to 6 (or more) lanes with a varying width from 15 m to 45 m, which have smooth curves represents one to 3 pixels in the image. District (Arterial) roads are mostly 2 or 4 lanes with paved or unpaved roads varying width from 8 m to 20 m. Minor (residential) roads are mostly irregular with less than 7 m width.
By using the proposed methodology results illustrates in the Figures 10, 11 and 12, and the roads width > 22 m (1.5 pixels) are extracted consistently. The roads with 7 m (1/2 pixel) width are also extracted based on contrast and separation of road and background. These images show the proposed methodology extracting the roads with good accuracy. Figure 13a represents a sub-scene of the rural area (residential type) roads of Rajasthan with a band combination of B652. RI and enhanced images are shown in Figures 13b and 13c. In the arid type of conditions, roads highlighted due to high asphalt reflectance and observed the width of the roads is varied from 7 m to 3 m shown in Figure 13. These types of roads also are extracted efficiently by the proposed method, as shown in Figure 13d. The road widths less than 7 m also extracted in such arid areas due to obstacles not present beside roads. From the results, it observed that more than 15 m width of roads extracted with 100 percent, when not present multi-storey buildings and trees beside the road. The roads width less than 30 m, which are present in between the buildings and trees, pixel reflectance changes for roads due to having the high reflectance of buildings and trees, and not able to identify as the road in such dense urban areas.
The observed gaps in between the roads because of not enough width of roads, not consistent and urban area roads with less than 30 m. Gaps with less than 3 pixels in between roads connected, by opening and closing operators of morphology used. Gaps less than 10 pixels in between roads connected by the circle method, that is, at every endpoint placed a circle ( with diameter 10) and thinned by morphological Operation.
Where two endpoints of circles are connected, there itself connect, and remaining are left. The proposed method with gaps filled results converted into vector form by using ArcGIS 10.2, and these results used for validation. The accuracy of the proposed methodology in terms of length of extracted roads using proposed one of method SATM and Reference lengths(manually drawn on OLI test case images with QGIS) given in Table 3. The accuracy of SATM method for sub-scenes given in Table 4. Also, the length of extracted roads by the proposed methodology (SATM + SMRF) in Table 5 and accuracy presented in in Table 6 for the same. From these tables, the index based methods SATM and proposed methodology able to extract the road features with an overall accuracy of 85%.

Validation
Reference roads are obtained by digitizing the test images [7], to evaluate the road feature extraction framework, at a 1:50,000 scale. This reference utilized as ground truth for an evaluation of the extracted roads. Also, the extracted roads are converted into the vector format using ArcGIS and spikes with a length of 150 meters (10 pixels) removed. The reference layer is a buffer layer with a width of 15 m. The extracted roads fit within the reference roads network called as true positive (TP). Similarly, false positive (FP) and false negative (FN) calculated as in [39]. These Three metrics used to assess the accuracy of extracted road networks and calculated precision (P), recall (R), and F1-score [40]. Overlay of extracted roads of TP on the reference roads shown in Figure 14. Roads features extracted using RI from OLI images for the urban areas of Hyderabad, Chennai, and Bangalore with an overall F1-score of 85%.

. Comparison
Here, a classification algorithm i.e., pixel-based support vector machine (SVM) [41] classification with RBF kernel and polynomial order 2 is used to extract roads. Accuracy of the proposed methodology is comparing with the extraction of roads using the SVM on the same (B652) band combination. By using the SVM method, only major (width) roads get detected, due to limitation of SVM misclassification. The results of SVM and SATM images are shown in Figure 15. SVM method was able to classify the roads where roads are clear and remaining gaps are occurred due to the misclassification with building type. The accuracy of road feature extraction using the SVM and SATM methods given in Table 7.

CONCLUSION
In this paper, we proposed a methodology for road feature extraction based on road index using IAT and MRF (SATM and SMRF) from RS images. This methodology applied for the OLI images of the Indian urban areas. From the results, it observed that the extracted road widths are regularly greater than 30 m and airport runways with accurate. Also, observed that, in results, discontinuity is present in the road networks due to insufficient road width, presence of trees, and multi-storey buildings. These small gaps in between roads connected by the morphological operations, circle method that is, placing a circle at every endpoint of extracted roads and thinned. The remaining large gaps are not able to connect and consider for future work. The methodology brings advantages of extraction of the freeways and arterial roads by using OLI (LR images) instead of the HR images. Hence, the proposed methodology with RI is producing satisfactory results from OLI images. This methodology can extend for other sensors like Sentinel 2.

ACKNOWLEDGMENT
We wish to express our sincere gratitude to Dr. Shantanu Chowdhary, Director, National Remote Sensing Centre for their encouragement and guidance in bringing out this publication.