On data collection time by an electronic nose

We use electronic nose data of odor measurements to build machine learning classification models. The analysis focuses on determining the optimal measurement time, that is, the one leading to the best model performance. We observe that the most valuable information for classification is available in data collected at the beginning of the adsorption phase and at the beginning of the desorption phase of the measurement. We demonstrate that complex features extracted from the sensors' response give better classification performance than using only the raw values of the sensors' response, normalized by the baseline, as features. We use a group shuffling cross-validation approach to determine the reported models' average accuracy and standard deviation. This is an open access article under the CC BY-SA license.
On data collection time by an electronic nose (Borowik). ISSN: 2088-8708.

INTRODUCTION
Electronic noses (e-noses) [1]-[3] are artificial devices that consist of an array of gas sensors supported by machine learning pattern recognition techniques. One of the critical fields of application of e-noses is the food industry [4]-[10], for which the odor characteristics of products are one of the essential indications of product quality. The development and verification of machine learning methods for odor classification from e-nose measurement data are as important as the development of the sensors and sensor arrays that make up the e-nose hardware. Similar techniques can be employed regardless of the domain of application of the e-nose. One of the main obstacles in such development is the relatively long time needed to collect sufficient measurement data. To overcome this, one can use publicly available datasets, which have become well known as testbeds for machine learning modeling reported by multiple authors.
In the present study, we use a publicly available dataset [11] of odor measurements by an electronic nose. The same dataset was used in our previous research [12], in which we focused on feature extraction and selection, optimization of the number of sensors used, and the possibility of performing classification with a single-sensor electronic nose. The original study of this dataset [13] focused on the possibility of detecting spoilage odor after a very short exposure of the electronic nose to the odor sample, lasting a few seconds. Zhang and coworkers [14] used this dataset to demonstrate the performance of their proposed analytical algorithms.
In this report, we address a different optimization question, concerning the time of odor measurement. We are interested in how the classification accuracy depends on the odor measurement time. Recently, Rodriguez Gamboa and coworkers [15] examined several datasets and used deep learning and support vector machine models to demonstrate the potential of using only a part of the electronic nose measurement data for correct odor classification. The dataset used here was collected with a custom-made e-nose consisting of Taguchi-type MQ-series gas sensors. In recent years, many designs for low-cost electronic noses have been suggested, and several groups have proposed devices based on similar sensors [13], [16]-[22]. The findings presented in this report can therefore be relevant to other applications of similar devices.
Considerable research concerning e-nose data is focused on the extraction of the complex features describing curves of sensors' response to the gas exposure. However, there are also other reports, especially applying deep learning neural networks, in which raw measurement data are used. It is interesting to compare both approaches and demonstrate the influence of the dimensionality reduction by the principal components method.

2. METHODS AND PROCEDURES
2.1. Odor measurement
Rodriguez Gamboa and coworkers [13] presented measurements of wine odor at various spoilage stages. Twenty-two bottles of commercially available wines of different varieties and vintages from four producers from the São Francisco valley (Pernambuco, Brazil) were used. Thirteen randomly selected bottles were left open for six months, which gave the population of low-quality wines. Four randomly selected bottles were left open for two weeks before measurement and are considered average-quality wines. The remaining five bottles are labeled as high-quality wines. In addition to these samples, samples of ethanol diluted in distilled water at six different concentrations were used, which may be considered six additional measured bottles. This gives four categories of odor to be classified. In total, the dataset consists of measurements of 300 samples, each a collection of sensor responses of about 3300 points per sensor.
The e-nose developed at Universidade Federal Rural de Pernambuco [13] consists of six commercially available metal-oxide gas sensors produced by Hanwei Sensors (www.hwsensor.com). Two sensors of each type (MQ-3, MQ-4, MQ-6) were used in the device. During the measurement, the first 10 seconds were used to collect the baselines of the sensors' response while the e-nose was exposed to pure air. Then the prepared odor sample was pumped into the sensor chamber, and 80 seconds of the sensors' response during the adsorption phase were collected. After that, 90 seconds of the sensors' response during the desorption phase were collected while pure air was pumped into the sensor chamber. After the measurement, the e-nose was exposed to pure air for 10 minutes to purge the experimental setup.
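The measurement protocol above can be turned into sample-index ranges for each phase. The following is a minimal sketch, assuming the 18.5 Hz sampling frequency quoted for this dataset and a recording that starts exactly at the beginning of the baseline phase; the helper name `phase_slices` is hypothetical. The nominal count of 3330 samples it yields is close to the roughly 3300 points per sensor reported for the dataset.

```python
import numpy as np

# Phase durations in seconds, taken from the protocol described in the text.
BASELINE_S, ADSORPTION_S, DESORPTION_S = 10.0, 80.0, 90.0
SAMPLING_HZ = 18.5  # sampling frequency quoted for this dataset


def phase_slices(sampling_hz=SAMPLING_HZ):
    """Return index slices for the three phases (hypothetical helper)."""
    # Cumulative phase boundaries in seconds: 0, 10, 90, 180.
    edges_s = np.cumsum([0.0, BASELINE_S, ADSORPTION_S, DESORPTION_S])
    edges = np.round(edges_s * sampling_hz).astype(int)
    names = ("baseline", "adsorption", "desorption")
    return {
        name: slice(int(start), int(stop))
        for name, start, stop in zip(names, edges[:-1], edges[1:])
    }


phases = phase_slices()
total_points = phases["desorption"].stop  # nominal number of samples per sensor
```

With these slices, `signal[phases["adsorption"]]` would select the adsorption part of a one-dimensional sensor trace.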

Classification modeling
The measurement data are a series of sensors' responses expressed as their resistance $R$ over time. As the first step of data processing, the measurement data are divided by the baseline value of the sensor resistance, $R_0$, collected just before the exposure of the electronic nose to the measured odor sample. In Figure 1, we present an example of sensor signals collected during the measurement of an odor sample, with a schematic representation of the time span used for the extraction of modeling features. The signal is collected with a frequency of 18.5 Hz. To reduce the measurement noise, the response is averaged over 20 observations.
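The two preprocessing steps can be sketched as follows. This is an illustration under stated assumptions, not the authors' code: the baseline $R_0$ is taken here as the mean resistance over the baseline samples, and the 20-observation averaging is implemented as non-overlapping block averaging (a moving average would be an equally plausible reading). The function name and the synthetic signal are hypothetical.

```python
import numpy as np


def normalize_and_average(resistance, n_baseline, window=20):
    """Divide by the baseline and average over blocks of `window` samples.

    resistance: array of shape (n_samples, n_sensors).
    n_baseline: number of leading samples recorded in pure air.
    """
    resistance = np.asarray(resistance, dtype=float)
    # Assumption: R0 is the mean resistance of each sensor over the baseline.
    r0 = resistance[:n_baseline].mean(axis=0)
    relative = resistance / r0
    # Assumption: non-overlapping blocks; leftover samples at the end are dropped.
    n_blocks = relative.shape[0] // window
    trimmed = relative[: n_blocks * window]
    return trimmed.reshape(n_blocks, window, -1).mean(axis=1)


# Synthetic example: six sensors, resistance halves when the odor arrives.
raw = np.full((3330, 6), 50.0)
raw[185:] = 25.0
smoothed = normalize_and_average(raw, n_baseline=185)
```

After block averaging, the effective time step is 20 / 18.5 Hz, or about 1.08 seconds per point.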
Two approaches to extracting the features used for training the machine learning classification models were then employed. In the first, we used as modeling features just the magnitudes of the sensors' response relative to the baselines, $R/R_0$, and the inverse of these values, representing the sensors' conductance, $G/G_0$. In the second, we extracted complex features describing the sensors' response curves, e.g., the average value, maximum value, and maximum slope. The complete list of the features that we used for training classification models is presented in our previous report [12]. Since our study focuses on the dependence of the classification performance on the measurement time by the e-nose, the modeling features are calculated using only part of the available data, in the range from the beginning of gas exposure until the considered time. We represent this by the dashed region in Figure 1.
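A minimal sketch of time-limited feature extraction is given below. It computes only the three example features named in the text (average value, maximum value, maximum slope); the full feature list is in [12] and is not reproduced here. The function name, argument names, and the synthetic curve are hypothetical. The conductance ratio follows from $G = 1/R$, so $G/G_0 = R_0/R$.

```python
import numpy as np


def curve_features(relative_response, sample_period_s, exposure_start_s, end_time_s):
    """Features from the response between exposure start and a chosen end time.

    relative_response: array of shape (n_samples, n_sensors) holding R/R0.
    """
    t = np.arange(relative_response.shape[0]) * sample_period_s
    window = (t >= exposure_start_s) & (t <= end_time_s)
    r = relative_response[window]
    g = 1.0 / r  # G/G0 is the reciprocal of R/R0
    features = {}
    for name, curve in (("R", r), ("G", g)):
        slope = np.diff(curve, axis=0) / sample_period_s
        features[f"{name}_mean"] = curve.mean(axis=0)
        features[f"{name}_max"] = curve.max(axis=0)
        features[f"{name}_max_slope"] = slope.max(axis=0)
    return features


# Synthetic single-sensor curve: R/R0 falls linearly by 0.005 per second.
time_s = np.arange(101, dtype=float)
demo_curve = (1.0 - 0.005 * time_s)[:, None]
feats = curve_features(demo_curve, sample_period_s=1.0,
                       exposure_start_s=10.0, end_time_s=50.0)
```

Repeating the call with increasing `end_time_s` gives one feature set per considered measurement time, which is the quantity varied in the analysis.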
The odor samples were prepared from 28 bottles, and each of them was used for about ten measurements. Such an experimental procedure leads to a correlation between training observations. Hence, to obtain a reliable estimate of the classification models' performance, we applied a group shuffle cross-validation procedure, ensuring that all observations from a given bottle's odor measurements are kept together, either in the training set or in the validation set.
Two types of modeling techniques were applied: logistic regression with multinomial classification (LogReg) and support vector machine classification (SVC) with a radial basis function kernel and a one-vs-one multi-class scheme. For both algorithms, we performed two types of tests. In the first case, the modeling features, as described above, were used. In the second case, these input variables were transformed using the principal component analysis method, and only the six most important components were used as the modeling features (PCAReg, PCASVC). The prepared feature dataset was transformed using the standard scaler method.
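The evaluation scheme described above can be sketched with scikit-learn as follows. The data here are synthetic, and several settings are assumptions made for illustration only: the number of splits, the test fraction, the placement of the scaler before PCA, and the class labels assigned to bottles.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GroupShuffleSplit, cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Synthetic stand-in: 28 bottles, 10 measurements each, 12 features.
rng = np.random.default_rng(0)
n_bottles, per_bottle, n_features = 28, 10, 12
groups = np.repeat(np.arange(n_bottles), per_bottle)  # bottle id per observation
bottle_class = np.arange(n_bottles) % 4               # hypothetical class per bottle
y = bottle_class[groups]
X = rng.normal(size=(groups.size, n_features)) + y[:, None]


def svc():
    # SVC with RBF kernel; scikit-learn trains multi-class SVC one-vs-one.
    return SVC(kernel="rbf", decision_function_shape="ovo")


def logreg():
    return LogisticRegression(max_iter=1000)


models = {
    "LogReg": make_pipeline(StandardScaler(), logreg()),
    "SVC": make_pipeline(StandardScaler(), svc()),
    "PCAReg": make_pipeline(StandardScaler(), PCA(n_components=6), logreg()),
    "PCASVC": make_pipeline(StandardScaler(), PCA(n_components=6), svc()),
}

# Assumed settings: 20 random splits, a quarter of the bottles held out each time.
cv = GroupShuffleSplit(n_splits=20, test_size=0.25, random_state=0)
scores = {}
for name, model in models.items():
    accuracy = cross_val_score(model, X, y, groups=groups, cv=cv)
    scores[name] = (accuracy.mean(), accuracy.std())
```

Passing `groups` is what keeps all measurements of one bottle on the same side of each split; an ordinary shuffled split would let near-duplicate observations leak between training and validation data.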
We decided to use only these classical modeling techniques [23]. We disregarded more complex algorithms such as multilayer neural networks since the number of observations available for modeling is quite limited: in total, the dataset [11] contains measurements of 300 odor samples. Even though more flexible modeling techniques can provide more expressive classification models, their number of fitted parameters is much higher than in the applied methods. Aggarwal [24] (page 25) indicates that the total number of training data points should be at least 2 to 3 times larger than the number of parameters in the neural network, although the precise number of data instances depends on the specific model at hand. Hence, the simpler models that we applied should in principle be less prone to over-fitting. The modeling was performed in Python 3.7 with the scikit-learn module [25].
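To make the parameter-count argument concrete, the following sketch compares a multinomial logistic regression on six inputs with a small fully connected network. The network architecture (one hidden layer of 32 units) is a hypothetical example chosen only for illustration, and the rule of thumb is applied with the lower factor of 2.

```python
def logreg_param_count(n_features, n_classes):
    """Weights plus one intercept per class, as stored by a multinomial model."""
    return n_classes * (n_features + 1)


def mlp_param_count(layer_sizes):
    """Weights plus biases of a fully connected network."""
    return sum((n_in + 1) * n_out
               for n_in, n_out in zip(layer_sizes[:-1], layer_sizes[1:]))


n_samples, n_classes, n_inputs = 300, 4, 6    # six inputs, e.g. six PCA components
n_logreg = logreg_param_count(n_inputs, n_classes)
n_mlp = mlp_param_count([n_inputs, 32, n_classes])  # hypothetical small network

# Rule of thumb quoted in the text: data points >= 2 to 3 times the parameters.
logreg_ok = n_samples >= 2 * n_logreg
mlp_ok = n_samples >= 2 * n_mlp
```

Under these assumptions the logistic regression has 28 parameters and satisfies the rule with 300 samples, while even this small network has 356 parameters and does not.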

RESULTS AND DISCUSSION
In Figure 2, we present a comparison of the average cross-validation accuracy of various types of models as a function of the time, counted from the beginning of the sensors' exposure to the examined odor, from which data were used for model building. The two subfigures distinguish between the approaches to the extraction of modeling features. Figure 2(a) shows models built on the raw data of the sensors' response relative to the baseline, while Figure 2(b) shows models built using an extensive set of complex features [12].
The first observation from these results is that the logistic regression model exhibits the best accuracy. This holds for both of the considered modeling feature sets. Comparing Figures 2(a) and 2(b), we can also observe that models trained on complex features exhibit better classification accuracy than models trained on the raw values of the sensors' response. The significance of the feature extraction procedures developed by the e-nose research community is thus visible. Another important observation can be made from Figure 2(b). There is an abrupt increase in the model performance just at the beginning of the sensors' exposure to the studied odor. Similar behavior can be noticed at the start of desorption, when the sensors are again exposed to clean air. We deduce that precise measurement during these moments gives the information most relevant for odor classification.
Special care [26] in the design of an electronic nose is required to provide a rapid change of the sensors' exposure to different gases while ensuring repeatability of the measurement conditions. Szczurek et al. [27] and Staymates et al. [28] reported measurements in "sniffing" mode, in which the sensors are switched frequently between the studied odor and pure air; measurements restricted to the initial time of the sensors' action have also been reported [29]. In Figure 3, we present another comparison of models' performance as a function of the measurement time from which data are available for model building. We focus on logistic regression models and compare six types of modeling feature sets, which are a combination of two cases, as summarized in Table 1.
As already noted, the results presented in Figure 3(a) confirm that better classification performance can be achieved when complex features extracted from the sensors' response curves are used than when models are built on just the raw values of the normalized sensors' response. Another interesting observation in this figure is that the models whose features are based on the sensors' conductance $G$ also exhibit better performance, especially when the time of odor measurement by the electronic nose is reduced. If the model is built on the sensors' resistance $R$ data, a longer odor measurement is required, in particular one that includes the desorption phase of the sensors' response. This observation concerning models built on the resistance data is valid for both types of feature sets considered in the present study. In Figure 3(a), one can notice that both "R" curves exhibit a kind of saturation region. After the beginning of the desorption phase, at 100 seconds, the models' accuracy improves again. Figure 3(a) suggests that it may be enough to reduce the odor measurement time to about 30 seconds when complex features are extracted from resistance or resistance and conductance response curves. In that case, increasing the measurement time and measuring in the desorption regime do not lead to better classification performance.
Further insight comes from examining Figure 3, where results for the models trained during the group shuffle cross-validation procedure are presented. We can conclude that the odor measurement time should be in the range of 70-90 seconds (including 10 seconds of baseline measurement), which allows us to obtain more stable classification results. When the data extracted from the sensors' resistance values are included in the modeling, they introduce some additional noise, which only slightly reduces the classification accuracy but leads to less stable models. A reduction of the classification performance on new data may appear in this way. As one can notice by examining Figure 1 and the description of the measurement procedure in section 2.1, this optimal time of data collection is shorter than half of the measurement time given in [11], [13]. In other research [15], similar results have been found for measurements of other types of odors. The advantage of shortening the time of odor detection by an electronic nose is clear. However, one should keep in mind that this time is not directly related to the number of odor samples that the electronic nose device can measure in a given time. After the measurement, the device still needs purging and recovery of the sensors' base state in clean air, which takes much longer than the odor measurement itself.
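The last point can be illustrated with simple arithmetic. The sketch below assumes that each measurement cycle consists of the measurement itself followed by the 10-minute purge described in section 2.1, and compares the full 180-second protocol with a 90-second measurement; the function name is hypothetical.

```python
PURGE_S = 10 * 60  # purge time after each measurement, from the protocol


def samples_per_hour(measurement_s, purge_s=PURGE_S):
    """Number of odor samples that fit in one hour, including purging."""
    return 3600.0 / (measurement_s + purge_s)


full_protocol = samples_per_hour(180)  # 10 s baseline + 80 s adsorption + 90 s desorption
shortened = samples_per_hour(90)       # upper end of the suggested 70-90 s range
relative_gain = shortened / full_protocol - 1.0
```

Under these assumptions, halving the measurement time raises the throughput by only about 13%, because the purge dominates the cycle.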

CONCLUSION
In this paper, we presented machine learning classification models built on a publicly available dataset of e-nose measurements of spoilage odor. The research focused on determining the optimal odor measurement time by the e-nose, that is, the time needed to collect data for training a machine learning classification model with superior performance. We presented a comparison of various modeling features based on the sensors' resistance and conductance response. A group shuffling cross-validation approach was used for determining the reported models' average accuracy and standard deviation. We demonstrate that most of the information used by the models for classification is available firstly in the data from the beginning of the adsorption phase, that is, the sensors' exposure to the studied odor, and secondarily in the data from the beginning of the desorption phase, that is, the sensors' exposure to clean air after exposure to the studied gas. The performed analysis leads us to the following conclusions for the considered case: i) only complex features extracted from the sensors' conductance curves $G$ should be used for a classification model, ii) it is sufficient to use data measured during the gas adsorption phase only, and iii) the logistic regression algorithm should be used. The last conclusion concerns the recommended machine learning classification method. In many reports, the support vector machine is used as a gold standard for such applications. As we demonstrated, the best choice may depend on the considered application, and there are cases in which logistic regression shows superior performance.
ACKNOWLEDGEMENT
This work was supported by the National Centre for Research and Development under grant agreement BIOSTRATEG3/347105/9/NCBR/2017.