Semi-automatic model to colony forming units counting

ABSTRACT


INTRODUCTION
The study of microorganisms is one of the most extensive fields of scientific research worldwide. Bacteria are a particular class of microorganisms and are used, for example, in the pharmaceutical and food industries, in medical diagnostics, and in quality control of drinking water [1]-[5]. A recurrent tool in the study of bacteria is the direct counting of colony forming units (CFU) in petri dishes. This process is used to track bacterial growth under specific conditions and to estimate the number of viable bacteria in the sample, which is a bacterial suspension on a culture medium in a petri dish. However, many laboratories still count bacterial colonies manually, which is a problem because it consumes considerable time and effort from researchers and laboratory staff and, as shown by Hallas and Monis [6], can introduce inter-operator variability and data entry errors. There is therefore particular interest in automating CFU counting using digital image processing and computer vision systems.
Commercial automated CFU counters are currently available, for example, ProtoCOL 3 [7], BioSpot® Colony Counters [8], and PhenoBooth [9]. In general, these systems employ specialized image acquisition hardware, which is advantageous because it gives total control over the position and illumination of the scene during image acquisition, providing desirable conditions for the counting software. Such systems are expected to perform very well, but they are very costly for small laboratories and institutions. Brugger et al. [10] and Chiang et al. [11] also presented prototypes that integrate specialized hardware and software to control the position and lighting of the sample. In the first case, the whole image is first segmented using a dynamic threshold based on Otsu's method [12], then an adaptive segmentation finds an appropriate threshold for each pixel, and finally aspect ratios and a Bayes classifier are used to separate connected colonies. In the second case, a bottom-hat transform corrects the non-uniform illumination in the image, Otsu's method is again used for segmentation, and a watershed transform [13], [14] divides the overlapping colonies. Zhu et al. [15] present a system based on images captured under near-infrared light that also uses thresholding and the watershed transform for segmentation.
In general, combined software and hardware arrangements that control illumination can be called active; by contrast, systems that do not use external lighting are called passive. Passive systems are attractive because, without the hardware components, they are low-cost, and some are even free. Accordingly, different authors have proposed distinct methodologies to automate or semi-automate the process based on passive systems [16]-[21], and others, such as Bray et al. [22], adapt general-purpose biological image analysis software to count CFU. In general, the first problem in automatic CFU counting systems is to separate the bacterial colonies from the petri dish background. The second problem is to ensure that the colonies are completely separated from one another. Thresholding and the watershed transform are recurrent techniques in these systems: the former is generally used to separate the petri dish from the image background and as an initial CFU segmentation, and the latter is used to divide the remaining clusters of colonies. Other techniques include the Hough transform [23] to detect the center of the dish or the circular shape of colonies; a color similarity metric in the hue-saturation-value (HSV) color space [24] for segmentation of color images; and mathematical morphology, such as dilation, erosion, and combinations of both [25], to segment binary images. Moreover, different development environments and programming languages are used, for example, MATLAB, MIT App Inventor 2, C++, or Java.
Thus, there are different systems and alternatives for automating the counting of colony forming units. However, new contributions remain significant because segmentation and identification become more complex when the petri dish contains a high number of colony forming units. Correct counting is even more difficult when laboratories use different culture methods or kinds of agar, or different types of lighting during image acquisition, which can mean that some variables are not accounted for in the programmed algorithms. The main aim of this work is therefore to present an alternative method to estimate the number of CFU from a digital image of a petri dish with the sample. The method uses a predefined mathematical model and some color information about the CFU present in the image to perform the count. It is an assisted method and can be seen as semi-automated, but it shows good behavior with few steps.
Figure 1 presents an example of a typical petri dish with a sample of CFU; in this case, a blood agar plate with Escherichia coli ATCC 25922. In general, CFU of the same type have similar color characteristics in the image. This fact is important because it is information to be taken into account in the segmentation process, and it is similar to the information used by other programs and methodologies. In this work, the color information is used to fit a quadratic function, as shown in Figure 2. In this model, it is understood that the main components of the same object have the same color characteristics, and that these change smoothly from the center to the edges. The model is built for conventional cameras in the three-channel red, green, and blue (RGB) color space. As a reference, Figure 2 presents an example taking x1 as any one of the color channels (red, green, or blue), x2 as another one, and the remaining channel as a constant value.
At its center, the function gives its maximum response, which has been scaled so that it does not take values greater than 1. This model was built from a predefined model, which is shown in Figure 3 for a single variable and, in that case, follows the relation:

METHOD
where x represents the input variable and y is the response, which takes the value zero for x = 2. This model was selected to avoid an abrupt change in the response and so that, when translating to real values, a coded value of 1 in the input variable corresponds to the standard deviation calculated for a sample of CFU pixels taken from the image. The coded value 0 for any input variable (red, green, or blue) corresponds to the mean value calculated for the same sample. The model in (1) is translated to a central composite design (CCD) for three input variables, as shown in Figure 4, to obtain a second-order regression model:
$$y = \beta_0 + \beta_1 X_1 + \beta_2 X_2 + \beta_3 X_3 + \beta_{11} X_1^2 + \beta_{22} X_2^2 + \beta_{33} X_3^2 + \beta_{12} X_1 X_2 + \beta_{13} X_1 X_3 + \beta_{23} X_2 X_3 \qquad (2)$$
The regression coefficients are found with the expression:
$$\beta = (M^{T} M)^{-1} M^{T} y^{*} \qquad (3)$$
where $y^{*}$ is the response vector according to (1) for each input variable X1, X2, and X3 at the points of the central composite design; M is the matrix formed by the corresponding values of X1, X2, and X3 given in Table 1, arranged according to the regression coefficients of (2); and $M^{T}$ is the transpose of M. Figure 2 is obtained from the regression coefficients, (3), and the model, (2), keeping one input variable constant.
As noted above, this method needs a sample of RGB values from the image. The methodology therefore requires the user to manually choose six random points that correspond to CFU in the image, as shown in Figure 5. Six points are enough when assuming a standard deviation of ten digital levels, a confidence level of 90%, and an error of 7 digital levels; this is consistent with the usual sample-size estimate n = (zσ/E)² = (1.645 × 10 / 7)² ≈ 5.5, rounded up to 6. The number of digital levels in the image depends on its radiometric resolution; for a conventional 8-bit resolution, the number is 256.
The values ±1 in Table 1 for X1, X2, and X3 correspond to the factorial points of the CCD shown in Figure 4, the value 0 to the central point, and the values ±1.682 to the axial points.
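As an illustration of how the coefficients in (2) can be obtained from a three-factor central composite design, the following Python sketch builds the 15 design points (8 factorial, 6 axial at ±1.682, 1 central), forms the matrix M, and solves the least-squares normal equations. The response function `assumed_response` is a hypothetical placeholder: relation (1) and the way the three one-variable responses are combined are not recoverable from this text, so a quadratic with maximum 1 at the center and value 0 at a coded distance of 2 is assumed purely for demonstration. All function names are illustrative and are not taken from the authors' C++ implementation.

```python
from itertools import product

ALPHA = 1.682  # axial distance of the three-factor CCD, as stated for Table 1


def ccd_points(alpha=ALPHA):
    """Return the 8 factorial, 6 axial, and 1 central point of the design."""
    factorial = [tuple(float(v) for v in p) for p in product((-1, 1), repeat=3)]
    axial = []
    for axis in range(3):
        for value in (-alpha, alpha):
            point = [0.0, 0.0, 0.0]
            point[axis] = value
            axial.append(tuple(point))
    return factorial + axial + [(0.0, 0.0, 0.0)]


def model_row(x1, x2, x3):
    """Terms of the second-order model (2), in the order of its coefficients."""
    return [1.0, x1, x2, x3, x1 * x1, x2 * x2, x3 * x3, x1 * x2, x1 * x3, x2 * x3]


def assumed_response(x1, x2, x3):
    # ASSUMPTION: placeholder for relation (1), which is missing in the text.
    # It equals 1 at the center and 0 at a coded distance of 2.
    return 1.0 - (x1 * x1 + x2 * x2 + x3 * x3) / 4.0


def solve(a, b):
    """Solve a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    aug = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(aug[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (aug[r][n] - tail) / aug[r][r]
    return x


def fit_coefficients(points, responses):
    """Least-squares fit via the normal equations (M^T M) beta = M^T y."""
    m = [model_row(*p) for p in points]
    k = len(m[0])
    mtm = [[sum(row[i] * row[j] for row in m) for j in range(k)] for i in range(k)]
    mty = [sum(row[i] * y for row, y in zip(m, responses)) for i in range(k)]
    return solve(mtm, mty)


points = ccd_points()
beta = fit_coefficients(points, [assumed_response(*p) for p in points])
print([round(b, 4) for b in beta])
```

Because the placeholder response is itself a second-order polynomial, the fit reproduces it up to floating-point error; with a different relation (1), the same procedure would return the best quadratic approximation over the design points.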
For each red, green, and blue input, the value is coded with the expression:
$$X = \frac{x_{real} - x_{mean}}{x_{std}}$$
where $x_{real}$ is the real value of a color channel (red, green, or blue) for a pixel in the image; $x_{mean}$ is the mean value computed for the six chosen points in this channel; and $x_{std}$ is the standard deviation of these points in the channel.
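The coding step can be sketched in Python as follows. The six RGB triples in `picked` are hypothetical values chosen only for illustration, and the use of the sample standard deviation (n − 1 in the denominator) is an assumption, since the text does not say which estimator is used.

```python
from statistics import mean, stdev


def channel_stats(samples):
    """Return a (mean, standard deviation) pair for each of the R, G, B channels."""
    channels = list(zip(*samples))
    # ASSUMPTION: sample standard deviation; the paper does not specify it.
    return [(mean(c), stdev(c)) for c in channels]


def code_pixel(pixel, stats):
    """Map real channel values to coded values: 0 at the mean, 1 at one deviation."""
    coded = []
    for value, (m, s) in zip(pixel, stats):
        if s == 0:
            raise ValueError("zero deviation: choose points with some color variation")
        coded.append((value - m) / s)
    return tuple(coded)


# Hypothetical RGB values for six points picked on colonies.
picked = [
    (182, 160, 151),
    (175, 158, 149),
    (190, 166, 155),
    (185, 161, 150),
    (178, 157, 147),
    (188, 164, 153),
]
stats = channel_stats(picked)
print(code_pixel((180, 162, 150), stats))
```

A pixel whose channels equal the sample means is coded as (0, 0, 0), which is the center of the design and the point of maximum model response.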
The segmentation process applies (2) to all pixels in the image: pixels for which the model response is greater than zero are labeled as CFU, and the others are discarded. At this point, the image is binarized, and a morphological opening is applied to help separate connected CFUs. The process can thus differentiate between a CFU pixel and a pixel belonging to another object in the image. Then, a new labeling step connects pixels of the same object (CFU) using connected component labeling with the eight nearest neighbors, so that each object is identified and distinguished from the others. In addition, the method computes the mean size of all objects labeled as CFU and ignores every object smaller than 15% of that mean size. Finally, the count is obtained from the object labels in the image. The method was implemented in the C++ programming language using the OpenCV library. Agar plate samples with different culture media and kinds of bacteria were tested.
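The thresholding, labeling, and size-filtering steps can be sketched in plain Python as below. This is only an illustration on a small grid of model responses, not the authors' C++/OpenCV implementation; the morphological opening is omitted to keep the sketch short, and the names `label_components` and `count_cfu` are hypothetical.

```python
from collections import deque


def label_components(mask):
    """Return the pixel count of every 8-connected object in a boolean grid."""
    rows, cols = len(mask), len(mask[0])
    seen = [[False] * cols for _ in range(rows)]
    sizes = []
    for r in range(rows):
        for c in range(cols):
            if not mask[r][c] or seen[r][c]:
                continue
            seen[r][c] = True
            queue = deque([(r, c)])
            size = 0
            while queue:
                y, x = queue.popleft()
                size += 1
                # Visit the eight neighbors (the center is already marked as seen).
                for dy in (-1, 0, 1):
                    for dx in (-1, 0, 1):
                        ny, nx = y + dy, x + dx
                        if (0 <= ny < rows and 0 <= nx < cols
                                and mask[ny][nx] and not seen[ny][nx]):
                            seen[ny][nx] = True
                            queue.append((ny, nx))
            sizes.append(size)
    return sizes


def count_cfu(response, min_fraction=0.15):
    """Count objects whose size is at least min_fraction of the mean object size."""
    mask = [[value > 0 for value in row] for row in response]  # response > 0 is CFU
    sizes = label_components(mask)
    if not sizes:
        return 0
    threshold = min_fraction * sum(sizes) / len(sizes)
    return sum(1 for s in sizes if s >= threshold)


# Toy response grid: two 4x4 colonies and a single-pixel speck.
demo = [[-1.0] * 12 for _ in range(6)]
for r in range(1, 5):
    for c in range(1, 5):
        demo[r][c] = 0.5
    for c in range(6, 10):
        demo[r][c] = 0.5
demo[0][11] = 0.2

print(count_cfu(demo))
```

In the toy grid the mean object size is 11 pixels, so the single-pixel speck falls below the 15% threshold and only the two colonies are counted.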

RESULTS AND DISCUSSION
The model was obtained as:
An example is shown in Figure 6(b), where, for illustration, the pixels labeled as CFU by the process are highlighted in different colors; in this case the system computed 127 CFU and the manual count registered 131, an agreement of 96.9%. The response for Escherichia coli ATCC 25922 on the blood agar sample shown in Figure 1 gave 993 computed CFU against 947 counted manually, an agreement of 95.36%. Figure 8 shows good behavior of the system despite the large number of colonies in the image. Tests with other organisms have also shown high efficacy of the model: for a sample of Staphylococcus aureus on blood agar, the system computed 41 CFU against 43 counted manually (95.34%), and for Escherichia coli on MacConkey agar, 89 CFU against 96 counted manually (92.7%). Another example on MacConkey agar, with 983 CFU computed by the system and 997 counted manually (98.59%), shows good behavior with a large number of CFU present in the image. As seen in Figures 7 and 8, with an appropriate choice of the six CFU points the method behaves well for images with many or few CFUs in the sample. However, the segmentation clearly depends on the chosen points, and an adequate response is achieved only when the chosen points correspond to CFU in the sample. Thus, this method can be used to count CFU from a digital image and presents no implementation problems, but the selection of the six points must be done correctly in practice; it is therefore reasonable to describe it as a semi-automatic system.
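The agreement percentages above are consistent with dividing the smaller count by the larger one. This is an inference from the reported numbers, since the text does not state the formula, and the function name `agreement` is hypothetical. A short Python check under that assumption:

```python
def agreement(computed, manual):
    """Percentage agreement, assumed here to be smaller count / larger count."""
    if computed <= 0 or manual <= 0:
        raise ValueError("counts must be positive")
    return 100.0 * min(computed, manual) / max(computed, manual)


# (computed, manual) pairs reported in the text.
reported = [
    ("Figure 6(b) sample", 127, 131),
    ("E. coli ATCC 25922, blood agar", 993, 947),
    ("S. aureus, blood agar", 41, 43),
    ("E. coli, MacConkey agar", 89, 96),
    ("MacConkey agar, large count", 983, 997),
]
for label, computed, manual in reported:
    print(f"{label}: {agreement(computed, manual):.2f}%")
```

Under this assumption the computed values match the reported ones to within one unit in the last reported digit, which suggests that some of the published figures were truncated rather than rounded.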

CONCLUSION
This paper presents a method for counting colony forming units in agar plates from a digital image of the petri dish with the sample. The process uses a predefined second-order mathematical model and is based on the color agreement between different pixels of the same CFU sample. The method is assisted and behaves well for images with many or few CFUs in the sample, with an agreement higher than 90% compared to manual counting. The segmentation process depends on the RGB color information of six points corresponding to CFU in the sample. The user chooses these points and, for the method to work correctly, they must be placed accurately on CFU in the image. In future work, the standard deviation of the chosen points should be used to segment CFU with different colors in the same sample, and the sizes of the labeled objects in the image should be used to avoid eliminating small objects that correspond to CFU.

Yeison Alberto Garcés-Gómez
received a bachelor's degree in Electronic Engineering, and master's and Ph.D. degrees in Engineering from the Electrical, Electronic and Computer Engineering Department, Universidad Nacional de Colombia, Manizales, Colombia, in 2009, 2011, and 2015, respectively. He is a Full Professor at the Academic Unit for Training in Natural Sciences and Mathematics, Universidad Católica de Manizales, and teaches several courses such as experimental design, statistics, and physics. His main research focus is on applied technologies, embedded systems, power electronics, and power quality, as well as many other areas of electronics, signal processing, and didactics. He has published more than 30 scientific and research publications, among them more than 10 journal papers. He has worked as principal researcher on commercial projects and projects funded by the Ministry of Science, Tech and Innovation, Republic of Colombia. He can be contacted at email: ygarces@ucm.edu.co.