Device to evaluate cleanliness of fiber optic connectors using image processing and neural networks

Received Aug 19, 2020; Revised Dec 18, 2020; Accepted Jan 13, 2021
This work proposes a portable, handheld electronic device that measures the cleanliness of fiber optic connectors via digital image processing and artificial neural networks. Its purpose is to reduce the subjectivity of the visual inspection done by human experts. Although devices with this purpose already exist, they tend to be cost-prohibitive and take advantage of neither image processing nor artificial intelligence to improve their results. The device consists of an optical microscope for fiber optic connector analysis, a digital camera adapter, a single-board computer, an image processing algorithm, a neural network algorithm and an LCD screen for equipment operation and results visualization. The image processing algorithm applies grayscale histogram equalization, Gaussian filtering, Canny filtering, the Hough transform and region of interest segmentation, and then extracts radiometric descriptors that serve as inputs to the neural network. Validation consisted of comparing the results of the proposed device with those obtained by agreeing human experts via visual inspection. Results yield an average Cohen's Kappa of 0.926, which implies a very satisfactory performance by the proposed device.


INTRODUCTION
Throughout the ages, humanity has faced the engineering challenge of transporting large quantities of information from one place to another, and has addressed it through diverse means. During the 1930s, the introduction of coaxial and multipair cable technologies enabled signal transport for many years. Later on, increasing telecommunications demand and digitalization trends resulted in the development of a new technology that took off during the 1960s, known as fiber optics [1]. Meanwhile, other data transport technologies appeared, such as microwave and satellite communications, which use atmospheric properties for data transfer. However, each technology has its own operating conditions. For example, microwaves are severely affected by weather conditions and have limited capacity depending on the requirement [2]. Fiber optic communication, in turn, requires that the connectors used to join the cables are completely clean and free of any kind of dirt to ensure the quality of the services. Otherwise, there will be a negative impact on the transmitted signal, which will impair the performance of the link [3,4].
With this in mind, industry has developed technology both for identifying dirty connectors and for cleaning them. IEC 61300-3-35:2015 is the standard that sets the quality requirements for fiber optic connector terminations [5]. The main problem to solve is therefore to develop a device that includes robust image processing algorithms to evaluate optical fiber connectors with high precision and at low cost. The scientific literature contains some proposals to solve this problem. Duffy et al. [6] study automated solutions to evaluate and clean fiber optic connectors. Rehman and Mozaffar [7] use wavelets to evaluate the dirtiness of connectors, while Filipenko [8] uses the interference method to analyze them. Commercially, companies such as VIAVI Solutions [9] and EXFO [10] manufacture and sell inspection and evaluation equipment to verify the state of the fiber optic connector. The proposals in the scientific literature do not combine image processing with neural networks, which can improve the method's precision. Likewise, the proposal by Duffy et al. uses a programmable logic controller (PLC), which is not the best alternative for a portable solution. Commercial solutions, on the other hand, require large investments that may not be viable for small and medium-sized telecommunications network companies.
In the face of these problems and drawbacks, the present work proposes a low-cost, portable device capable of evaluating and identifying the dirtiness of fiber optic connectors using an optical microscope, a single-board computer, open-source tools, image processing algorithms and neural networks. The device shows near-perfect agreement with expert evaluation, with a Cohen's Kappa agreement index of 0.926. Furthermore, the equipment is well-suited to fieldwork requirements due to its portability.
Figure 1 shows the block diagram of the proposed algorithm. The monocular of the microscope has an adapter fixed to it, which safely holds the digital camera that takes the RGB image of the fiber optic connector. The following paragraphs describe the image processing steps and algorithms employed to achieve the desired results.

Optical microscope
The optical microscope is a manual device that allows visualization of the connector's ferrule and contact area. Figure 2 shows the parts of a manual microscope. The eyepiece permits visualization of the fiber through an optical zoom of up to 400X. The focus control adjusts the manual focus to improve image definition. The light source entrance lets the user see whether the illumination LED is on or off.

Camera-microscope adapter and prototype device
The camera-microscope adapter is a single piece, 3D printed in black PLA. This design is exclusive to the proposed solution and creates a dark enclosure that keeps the illumination stable and constant during imaging. Figure 3 shows a schematic of the adapter. The top piece is a removable lid that covers the digital camera, while the rectangular hole is the entrance for an HDMI connector to the camera. Figure 4 shows the camera installed in the rectangular area of the adapter. Figure 4(c) shows the HDMI cable connected to the camera. Figure 4(d) shows the bottom view, in which the camera lens can be seen.
The adapter must meet some minimum requirements to be compatible with the monocular of the microscope. Per the manufacturer, the minimum focal distance is 3.6 mm. The designed adapter achieves a focal distance of 8 mm and can capture images of the full fiber for subsequent analysis. Figure 5 shows the adapter, the microscope, the assembled ensemble and the labeled parts. Likewise, Figure 5(d) shows the complete box that contains the processor, the touch screen and the program that performs the evaluation. The complete prototype has the following characteristics:
- Easy to connect, as seen in Figure 5: the camera sends the image to the processor through the HDMI cable, and all the data processed in the single-board computer are displayed on the touch screen in real time.
- Portable: the solution is lightweight, easy to handle and easy to transport.
- Powered by an AC-DC power adapter, with the option of fitting an internal battery.
Because of the small size of the dirt particles, the chosen image resolution was 3280x2464 pixels. This translates into a spatial resolution of 0.0625 μm² per pixel, an appropriate size to adequately evaluate the objects. The default illumination intensity level was used, since it is supplied by the microscope and kept stable by the adapter's enclosure. Figure 6(a) shows an example of an image acquired with the ensemble.
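As a quick consistency check on these figures, the following sketch converts the reported area per pixel into a pixel side length and a field of view. The function names are illustrative and are not taken from the authors' software; the 250-pixel cladding radius used in the last computation is the value reported later for the circle detection step.

```python
import math

def pixel_side_um(area_um2_per_pixel):
    """Side length of a square pixel, in micrometres."""
    return math.sqrt(area_um2_per_pixel)

def field_of_view_um(width_px, height_px, area_um2_per_pixel):
    """Width and height of the imaged area, in micrometres."""
    side = pixel_side_um(area_um2_per_pixel)
    return width_px * side, height_px * side

side = pixel_side_um(0.0625)                 # 0.25 um per pixel side
fov = field_of_view_um(3280, 2464, 0.0625)   # (820.0, 616.0) um
cladding_diameter_um = 2 * 250 * side        # 125.0 um
print(side, fov, cladding_diameter_um)
```

Under these assumptions, a cladding radius of 250 pixels corresponds to a diameter of 125 μm, which matches the nominal cladding diameter of standard telecom fiber.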

Image processing and descriptors
The following section details the steps taken to analyze the acquired image and obtain the three selected image descriptors, which are used to train the neural network and determine the level of cleanliness or dirt in the regions of interest.

Image cropping
The image acquired in step 2.3 is cropped to the fiber core and cladding areas (the area of interest for processing). The resulting image has a smaller size, and its digitized primary components of light are expressed as $I_R(x,y)$, $I_G(x,y)$ and $I_B(x,y)$. The values of these components are 8-bit integers in [0, 255], in accordance with the true color format.

Grayscale conversion
In order to reduce computational load without diminishing the algorithm's performance, the image from Figure 6(b) is converted into an 8-bit grayscale image, according to (1) [11].
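A minimal sketch of this conversion is given below. It assumes that (1) is the usual luminance weighting (the ITU-R BT.601 weights 0.299, 0.587 and 0.114, which OpenCV's RGB-to-gray conversion also uses); the exact coefficients of (1) are not stated in the text. The function name `to_gray` is illustrative.

```python
def to_gray(rgb_image):
    """Convert rows of (R, G, B) tuples into rows of 8-bit gray values.

    Assumption: BT.601 luminance weights; the paper's equation (1)
    may use different coefficients.
    """
    return [
        [int(round(0.299 * r + 0.587 * g + 0.114 * b)) for (r, g, b) in row]
        for row in rgb_image
    ]

sample = [[(255, 255, 255), (0, 0, 0)],
          [(255, 0, 0), (0, 0, 255)]]
gray = to_gray(sample)
print(gray)
```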

Histogram equalization
Image $I_1(x,y)$, shown in Figure 7(a), goes through a histogram equalization process [12]. The resulting image is labeled $I_2(x,y)$ and is shown in Figure 7(b). Histogram equalization showed better results than enhancement via histogram stretching, because it improved the later segmentation between the cladding region and the rest of the image.
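The equalization step can be sketched as follows, using the standard cumulative-histogram mapping for 8-bit images. This is a plain-Python illustration of the technique, not the authors' implementation, and the function name `equalize` is an assumption.

```python
def equalize(gray, levels=256):
    """Histogram-equalize an image given as rows of integers in [0, levels)."""
    hist = [0] * levels
    for row in gray:
        for value in row:
            hist[value] += 1

    # Cumulative distribution of gray levels.
    cdf, running = [], 0
    for count in hist:
        running += count
        cdf.append(running)

    total = cdf[-1]
    cdf_min = next(c for c in cdf if c > 0)
    if total == cdf_min:          # single gray level: nothing to spread
        return [row[:] for row in gray]

    # Map each level so that the cumulative histogram becomes roughly linear.
    lut = [max(0, round((c - cdf_min) * (levels - 1) / (total - cdf_min)))
           for c in cdf]
    return [[lut[value] for value in row] for row in gray]

equalized = equalize([[50, 50, 50], [100, 150, 200]])
print(equalized)
```

In this small example the four gray levels present in the input are spread over the full 0 to 255 range, which is the contrast gain that helps the later segmentation.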
In this case we consider $\sigma = 1$ to obtain satisfactory results. The filtering process is expressed through the convolution between $I_2(x,y)$ and the mask $h_1(x,y)$:
$$I_3(x,y) = I_2(x,y) * h_1(x,y)$$
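The filtering step can be illustrated with the sketch below, which builds a normalized Gaussian mask with a standard deviation of 1 and convolves it with an image. The 5x5 mask size and the border handling (edge replication) are assumptions, since the text does not state them.

```python
import math

def gaussian_kernel(size=5, sigma=1.0):
    """Normalized 2-D Gaussian mask (size is an assumed value)."""
    half = size // 2
    kernel = [[math.exp(-(x * x + y * y) / (2 * sigma * sigma))
               for x in range(-half, half + 1)]
              for y in range(-half, half + 1)]
    total = sum(map(sum, kernel))
    return [[v / total for v in row] for row in kernel]

def convolve(image, kernel):
    """2-D convolution with replicated borders.

    The Gaussian mask is symmetric, so no kernel flip is needed.
    """
    rows, cols = len(image), len(image[0])
    half = len(kernel) // 2
    out = [[0.0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            acc = 0.0
            for i, kernel_row in enumerate(kernel):
                for j, weight in enumerate(kernel_row):
                    rr = min(max(r + i - half, 0), rows - 1)
                    cc = min(max(c + j - half, 0), cols - 1)
                    acc += weight * image[rr][cc]
            out[r][c] = acc
    return out

kernel = gaussian_kernel(5, 1.0)
flat = [[100] * 6 for _ in range(6)]
smoothed = convolve(flat, kernel)
```

Because the mask is normalized, a uniform region keeps its gray level, while isolated noisy pixels are averaged with their neighbors.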

Image thresholding
Image $I_3(x,y)$ is thresholded in order to obtain the cladding segmentation mask, according to (4): any pixel with a value higher than the threshold $T_1 = 104$ is set to 255. This threshold was chosen because it yielded the best results for different types of fibers and levels of cleanliness. Figure 8(a) illustrates the thresholded output image, $I_4(x,y)$.
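A minimal sketch of this thresholding rule follows. Setting the remaining pixels to 0 is an assumption (it is the usual choice for a binary mask, but the text only states what happens above the threshold), and the function name is illustrative.

```python
def threshold(image, t=104):
    """Binary mask: 255 where the pixel is strictly above t, else 0 (assumed)."""
    return [[255 if value > t else 0 for value in row] for row in image]

mask = threshold([[104, 105], [0, 255]])
print(mask)
```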

Canny filtering
Canny filtering is applied to image $I_4(x,y)$ to obtain the circle of the fiber cladding. Canny filtering involves smoothing with a Gaussian kernel, obtaining the edge strength through Sobel filtering, calculating and quantizing the edge direction, non-maximum suppression and thresholding with hysteresis [15]. In this case, two hysteresis thresholds (an upper threshold of 110 and a lower threshold of 50) are set to obtain the shapes and borders of the objects in the thresholded image. The result of this filtering is $I_5(x,y)$, shown in Figure 8.
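Of the Canny stages listed above, the role of the two thresholds is the least intuitive, so the sketch below illustrates only the final hysteresis stage. It assumes an edge-strength map has already been computed and thinned; the function name and the use of 8-connectivity are assumptions rather than details given in the text.

```python
def hysteresis(strength, low=50, high=110):
    """Keep strong pixels, plus weak pixels 8-connected to a strong one."""
    rows, cols = len(strength), len(strength[0])
    edges = [[0] * cols for _ in range(rows)]

    # Seed with every pixel at or above the upper threshold.
    stack = [(r, c) for r in range(rows) for c in range(cols)
             if strength[r][c] >= high]
    for r, c in stack:
        edges[r][c] = 255

    # Grow edges through neighbours that reach the lower threshold.
    while stack:
        r, c = stack.pop()
        for dr in (-1, 0, 1):
            for dc in (-1, 0, 1):
                nr, nc = r + dr, c + dc
                if (0 <= nr < rows and 0 <= nc < cols
                        and not edges[nr][nc] and strength[nr][nc] >= low):
                    edges[nr][nc] = 255
                    stack.append((nr, nc))
    return edges

strength = [[0, 60, 120, 60, 0],
            [0, 0, 0, 0, 0],
            [70, 0, 0, 0, 0]]
edges = hysteresis(strength)
print(edges)
```

The isolated pixel with strength 70 is discarded because it never touches a strong edge, whereas the two pixels with strength 60 survive because they are attached to the pixel with strength 120.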

Circle detection via Hough transform
The Hough transform for circle detection [16] is applied to image $I_5(x,y)$. In this case a 3-dimensional accumulator was used, storing the two coordinates of the center and the radius. In the implementation of the Hough transform, the parameter "dp" was set to 3, in order to support the search for a large circle with low distortion levels. Bradski [17] defines this parameter as the resolution of the accumulator image, which enables the creation of an accumulator with a lower resolution than the original image. Essentially, a larger "dp" value translates into a smaller accumulator matrix. The chosen value resulted in successfully detecting the circle that belongs to the cladding, and greatly reduced the false detection of circles. A larger "dp" value might have reduced the number of detected circles to zero in some cases. The transform also requires setting a minimum distance between the centers of the circles. If this distance is too small, false circles may be detected. If it is too large, circles may go undetected. It is known that the circle to be detected has a radius of 250 pixels, so this value serves as the starting point for the following computation.
If the distance were chosen to be less than or equal to 250, most detected circles would be superimposed over each other. If it were chosen to be 500 (the circle diameter), some circles tangent to the true circle would be detected. Figure 6 shows that the cladding is almost in the middle of the image and that the region of interest is not large, so circles lost outside of this region would not be relevant for this process. Thus, the distance between the centers of circles is defined to be 1000, which reduced the quantity of possible circles in the selection process. The results of the transform are the circle radius $r$ and the coordinates of the circle center $(x_c, y_c)$. With these coordinates and the circle radius, the detected circles are drawn over the original image such that the regions of interest are defined. Figure 9 shows examples for three cases: a clean fiber (Figure 9(a)), a dirty fiber (Figure 9(b)) and a very dirty fiber (Figure 9(c)).
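To make the roles of the 3-dimensional accumulator and of "dp" concrete, the sketch below implements a small circle Hough transform in plain Python. It is a simplified illustration and not the implementation used in this work (the parameter names in the text resemble those of OpenCV's `HoughCircles`, but that is an inference). It returns only the single strongest circle, so no minimum distance between centers is needed, and the synthetic test circle is hypothetical.

```python
import math

def hough_circle(edge_points, width, height, radii, dp=1, n_angles=72):
    """Return (cx, cy, r, votes) for the strongest circle.

    Votes are kept in a sparse 3-D accumulator indexed by (cx, cy, r).
    With dp > 1 the grid of candidate centres is dp times coarser than
    the image, which shrinks the accumulator.
    """
    acc_w, acc_h = width // dp + 1, height // dp + 1
    angles = [2 * math.pi * k / n_angles for k in range(n_angles)]
    votes = {}
    for x, y in edge_points:
        cells = set()  # each edge pixel votes at most once per cell
        for r in radii:
            for a in angles:
                cx = round((x - r * math.cos(a)) / dp)
                cy = round((y - r * math.sin(a)) / dp)
                if 0 <= cx < acc_w and 0 <= cy < acc_h:
                    cells.add((cx, cy, r))
        for cell in cells:
            votes[cell] = votes.get(cell, 0) + 1
    (cx, cy, r), count = max(votes.items(), key=lambda item: item[1])
    return cx * dp, cy * dp, r, count

# Synthetic edge map: 36 points on a circle of radius 12 centred at (30, 30).
points = [(round(30 + 12 * math.cos(math.radians(t))),
           round(30 + 12 * math.sin(math.radians(t))))
          for t in range(0, 360, 10)]
cx, cy, r, count = hough_circle(points, 60, 60, radii=range(10, 15))
print(cx, cy, r, count)
```

With a coarser grid (a larger `dp`), neighboring candidate centers fall into the same cell, so the accumulator is smaller and the located center is less precise.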
Since the fiber cleanliness must also be evaluated around the outer neighborhood of the cladding, the outer ring was also segmented. Thus, mask $M_3(x,y)$ was computed according to (7),
where $r_2$ is defined according to (8).
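The two regions of interest can be sketched as binary masks built from the detected center and radius. The definition of the outer radius in (8) is not reproduced in the text, so `r_outer` is passed in as a free, hypothetical parameter, and the function name is illustrative.

```python
def region_masks(width, height, cx, cy, r, r_outer):
    """Return (cladding_mask, ring_mask) as rows of 0/1 values.

    cladding: pixels within distance r of the centre.
    ring: pixels farther than r but within r_outer (assumed definition).
    """
    cladding = [[0] * width for _ in range(height)]
    ring = [[0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            d2 = (x - cx) ** 2 + (y - cy) ** 2
            if d2 <= r * r:
                cladding[y][x] = 1
            elif d2 <= r_outer * r_outer:
                ring[y][x] = 1
    return cladding, ring

cladding_mask, ring_mask = region_masks(7, 7, cx=3, cy=3, r=1, r_outer=3)
cladding_area = sum(map(sum, cladding_mask))
ring_area = sum(map(sum, ring_mask))
print(cladding_area, ring_area)
```

Multiplying each mask by the grayscale image leaves zeros outside the region, which is why the later histograms ignore zero-intensity pixels.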

Obtaining descriptors
These steps evaluate the segmented regions $I_6(x,y)$ and $I_7(x,y)$ to compute the first two descriptors: the cladding variance and the ring variance. They also evaluate image $I_1(x,y)$ to find the third descriptor: the quantity of contours in the image. The first step consists of calculating the histograms of $I_6(x,y)$ and $I_7(x,y)$ without considering pixels with zero intensity (black pixels), so that the histograms contain only gray pixels. Figures 11 and 12 show these histograms.
Figure 11. Segmented cladding histogram
Figure 12. Segmented ring histogram
These histograms serve to calculate the variance [18] of each region (cladding or ring). These variances are defined as $\sigma^2_{\text{cladding}}$ and $\sigma^2_{\text{ring}}$ according to (10),
where $x_i$ represents a pixel value of a region, $N$ the number of pixels of the region and $\bar{x}$ the average value of the region (cladding or ring). The variances of each region are radiometric descriptors that make it possible to detect the level of dirt present: a region with a high level of dirt will also have a high variance. The third descriptor is the number of contours ($N_c$) identified in the $I_1(x,y)$ image. This descriptor was chosen because the number of contours increases significantly when the dirt level is high, which helps to differentiate between levels of dirt or cleanliness.
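The variance descriptor can be sketched as below. The sketch assumes the population form of the variance (division by the number of pixels); the text does not show whether (10) divides by the number of pixels or by one less. Zero-valued pixels are skipped, as described for the histograms.

```python
def region_variance(region):
    """Variance of the non-zero pixels of a masked region.

    Assumption: population variance (divide by N).
    """
    pixels = [v for row in region for v in row if v != 0]
    if not pixels:
        return 0.0
    mean = sum(pixels) / len(pixels)
    return sum((v - mean) ** 2 for v in pixels) / len(pixels)

variance = region_variance([[0, 10, 20],
                            [0, 30, 0]])
print(variance)
```

A uniform (clean) region gives a variance near zero, while dark or bright dirt particles widen the histogram and raise the value.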
To count the contours, image $I_1(x,y)$ is processed with Canny filtering as before, which yields image $I_8(x,y)$. The difference is in the hysteresis thresholds used in the Canny filtering step: an upper threshold of 200 and a lower threshold of 10. These hysteresis values widen the threshold range so that more borders are found in the whole image, instead of only in the cladding region. This translates into searching for more borders and shapes that could correspond to dirtiness. Figure 13 shows the output image $I_8(x,y)$.
The next step consists of finding the quantity of contours in image $I_8(x,y)$. Figure 14 shows a flowchart of this process: Suzuki's algorithm [19] analyzes image $I_8(x,y)$, discards redundant points and produces a list with the coordinates of all the relevant contours. Then, the integer variable $N_c$ is calculated from this list's length.
Figure 13. Image $I_8(x,y)$
Figure 14. Flowchart for contour quantification
Figure 15 illustrates the basic steps of Suzuki's algorithm. The algorithm scans the image from left to right, labels edges as outer or hole edges and establishes the hierarchy between the discovered pixels. This repeats for each image row, from top to bottom. The algorithm considers contours as continuous and smooth curves. Finally, these steps yield the following 3 descriptors: $\sigma^2_{\text{cladding}}$, $\sigma^2_{\text{ring}}$ and $N_c$.
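As a rough illustration of how an edge image turns into a single integer descriptor, the sketch below counts 8-connected groups of edge pixels with a flood fill. This is a deliberately simpler stand-in and is not Suzuki's border-following algorithm: it does not separate outer borders from hole borders, so its count can differ from the contour count used in this work.

```python
def count_edge_components(edges):
    """Count 8-connected groups of non-zero pixels in a binary edge image."""
    rows, cols = len(edges), len(edges[0])
    seen = [[False] * cols for _ in range(rows)]
    count = 0
    for r in range(rows):
        for c in range(cols):
            if not edges[r][c] or seen[r][c]:
                continue
            count += 1            # new, unvisited group found
            seen[r][c] = True
            stack = [(r, c)]
            while stack:          # flood fill the whole group
                cr, cc = stack.pop()
                for dr in (-1, 0, 1):
                    for dc in (-1, 0, 1):
                        nr, nc = cr + dr, cc + dc
                        if (0 <= nr < rows and 0 <= nc < cols
                                and edges[nr][nc] and not seen[nr][nc]):
                            seen[nr][nc] = True
                            stack.append((nr, nc))
    return count

edge_image = [[255, 255, 0, 0],
              [0, 0, 0, 255],
              [0, 0, 0, 255],
              [255, 0, 0, 0]]
n_groups = count_edge_components(edge_image)
print(n_groups)
```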

Identification of the cleanliness level by neural networks
A trained neural network model was used to identify the cleanliness level of the fiber optic connectors [20]. The model inputs are the three previously calculated descriptors, and the output is a 3-element array that indicates the cleanliness of the fiber optic connector. Figure 16 illustrates how the descriptors are input into the neural network and how the model outputs the binary array described above. The dataset consisted of 99 pictures of fiber optic connectors, evenly distributed into 3 groups of 33 photos, one for each possible output. The dataset was further divided into training and evaluation sets:
- Training set (70% of the total dataset)
- Evaluation set (30% of the total dataset)
The objective was to achieve a precision of at least 90%. To achieve this, the network was configured with the following characteristics:
- Network size: Sánchez [21] mentions that the importance of the hidden layer size to the neural network model fit has been stated many times in the scientific literature, but no conclusive results have been demonstrated. Nonetheless, there are plenty of guiding criteria to start building such models. Some of the criteria considered in this work are described by Del Carpio [22]:
  a) Quantity of hidden layers: more layers add complexity to the network and make the training slower. If data are linearly separable, hidden layers are not necessary, but in this case, data are not linearly separable.
  b) Quantity of neurons in each hidden layer: these neurons are related to the input variables, such that this quantity must not be larger than twice the input size. If this is not enough to achieve acceptable results, more neurons are added to the output layer. Another criterion is directly related to the final adjustment error and desired precision.
  These criteria guided the model building and simulation, which was done with TensorFlow (a free and open-source software library for dataflow and differentiable programming developed by Google). This simulation helped in defining the final network size: 3 hidden layers with 12, 4 and 3 neurons, which improved the loss during training and testing. Scikit-learn [23] explains that each neuron in the hidden layer transforms the values from the previous layer with a weighted linear summation, followed by an activation function. The output layer receives the values from the last hidden layer and transforms them into output values.
- Alpha=0.06, for overfitting correction. This value contributes to keeping the precision above 90% by avoiding overfitting. Scikit-learn uses this parameter as a regularization term that penalizes overfitting by restricting the weight magnitude.
- ReLU (rectified linear unit) activation function for the hidden layers.
- Training algorithm 'adam'. Scikit-learn defines 'adam' as a stochastic gradient-based optimizer, which can automatically adjust the amount by which parameters are updated based on adaptive estimates of lower-order moments.
- Maximum iterations=11000
With these hyperparameters, the model achieved a training precision of 97% and a testing precision of 92%. Figure 17 presents the flowchart of the training and testing algorithm. The first step is building the network model, defining neurons, layers and iterations. Next, training takes place, and the loss is analyzed at the end: if it is lower than 0.3, the model is tested; otherwise, training resumes. The 0.3 loss threshold ensured that, by the end of the training, precision was at least 90%. Finally, if precision is higher than 91%, the model is saved; otherwise, training resumes.
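The network shape described above (3 inputs, hidden layers of 12, 4 and 3 neurons, 3 outputs) can be sketched as a plain-Python forward pass. The weights below are random and untrained, so the prediction is meaningless; the purpose is only to show the weighted summation, the ReLU activation and the 3-element output. All function names and the input values are hypothetical, and picking the largest output score to build the binary array is an assumption about how the output is encoded.

```python
import random

LAYER_SIZES = (3, 12, 4, 3, 3)  # inputs, three hidden layers, outputs

def make_layers(sizes, seed=0):
    """Random (untrained) weights and zero biases for each layer."""
    rng = random.Random(seed)
    layers = []
    for n_in, n_out in zip(sizes, sizes[1:]):
        weights = [[rng.uniform(-1, 1) for _ in range(n_in)]
                   for _ in range(n_out)]
        layers.append((weights, [0.0] * n_out))
    return layers

def dense(inputs, weights, biases):
    """Weighted linear summation of the previous layer's values."""
    return [sum(w * x for w, x in zip(row, inputs)) + b
            for row, b in zip(weights, biases)]

def predict(descriptors, layers):
    """Forward pass with ReLU on hidden layers; returns a one-hot list."""
    values = list(descriptors)
    for index, (weights, biases) in enumerate(layers):
        values = dense(values, weights, biases)
        if index < len(layers) - 1:           # hidden layers only
            values = [max(0.0, v) for v in values]
    best = values.index(max(values))
    return [1 if i == best else 0 for i in range(len(values))]

layers = make_layers(LAYER_SIZES)
# Hypothetical, already-scaled descriptors: two variances and a contour count.
output = predict([0.5, 0.2, 0.1], layers)
print(output)
```

The hyperparameter names quoted in the text (alpha, 'adam', maximum iterations) correspond to options of scikit-learn's `MLPClassifier`, which would handle training; the sketch here covers inference only.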

RESULTS AND DISCUSSION
First, the resulting device qualifies as a low-cost device. The proposed device also permits high-quality image acquisition, which addresses the issues registered in previous solutions and proposals [7]. To test the accuracy of the device, two human experts evaluated the cleanliness or dirtiness of 27 SC/UPC fiber optic connectors using survey-type forms. This evaluation was performed by visual inspection of images acquired with the optical microscope. For validation, however, only cases where both experts agreed in their evaluation were considered.
Furthermore, the images of the same connectors were evaluated by the proposed algorithm (the output of the neural network algorithm). Cohen's Kappa index [24] measures the level of agreement between two or more observers, and its values are interpreted according to Table 1 [25,26]. The level of agreement between the experts and the proposed device is therefore defined by the magnitude of the index K [27]. Figure 18 shows the flowchart to compute K. A first agreement evaluation of the 27 connectors found 6 for which the human experts gave differing results, due to subjective perception. After eliminating those 6 conflicting elements, the evaluation set was reduced to 21 connectors (i.e., the device was evaluated 21 times); the results are indicated in Table 2. Next, Table 3 shows the confusion matrix between the expert evaluation and the proposed algorithm's evaluation. These results are shown as proportions in Table 4. The metric K [24] is computed according to (11).
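The computation of K from a confusion matrix can be sketched as follows, using the standard definition of Cohen's Kappa: the observed agreement minus the agreement expected by chance, divided by one minus the expected agreement. The 3x3 matrix in the example is hypothetical and is not the data of Table 3.

```python
def cohen_kappa(matrix):
    """Cohen's Kappa from a square confusion matrix of counts.

    K = (p_o - p_e) / (1 - p_e), where p_o is the observed agreement
    and p_e the agreement expected by chance.
    """
    total = sum(map(sum, matrix))
    observed = sum(matrix[i][i] for i in range(len(matrix))) / total
    row_totals = [sum(row) for row in matrix]
    col_totals = [sum(col) for col in zip(*matrix)]
    expected = sum(r * c for r, c in zip(row_totals, col_totals)) / total ** 2
    return (observed - expected) / (1 - expected)

# Hypothetical counts (rows: expert label, columns: device label).
example = [[5, 1, 0],
           [1, 4, 1],
           [0, 1, 5]]
kappa = cohen_kappa(example)
print(round(kappa, 3))
```

In this hypothetical case 14 of 18 evaluations agree, but a third of the agreement would be expected by chance, so K is about 0.667 rather than 0.778.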

CONCLUSION
A low-cost device capable of evaluating the cleanliness of fiber optic connectors with results similar to those of human experts has been successfully developed. The microscope adapter achieved stable illumination and safely holds the digital camera. The image processing algorithm computes adequate variables, which is reflected in Cohen's Kappa index values close to perfect agreement. Nonetheless, when analyzing results individually, the index for dirty connectors is 0.529, which is classified as Moderate Agreement. Although this can be considered an acceptable result, it will be necessary to enhance training so that the agreement value can be higher for this particular class. The algorithm takes 3 seconds to output the image classification. The digital camera microscope adapter is practical and innovative. The proposed system can be implemented in companies and organizations that already have the optical microscope, further reducing the system cost and increasing market acceptance. The algorithm might fail when analyzing fiber optic connectors with a broken or damaged ferrule. Since this case is not considered in the algorithm design, the result might not correspond to the connector being clean, dirty or very dirty. Opportunities for improvement include developing a case that can withstand the rigors of field work and complementing the algorithm with stages that distinguish what type of dirt is present and perhaps carry out an automatic cleaning using electronic tools.