Case-based reasoning system for prediction of fuel consumption by haulage trucks in open-pit mines

ABSTRACT


INTRODUCTION
Mining is the process of obtaining useful minerals from the earth's crust at a profit [1]. Open-pit mining operations generally employ conventional methods to mine ore or waste [2]. The profitability of any mine is determined by the efficient management of the unit operations [3]. Research shows that truck haulage costs account for 26% of the production costs and that, together with shovels, trucks account for 40% of the total mining cost [4,5,6]. Diesel-operated dump trucks are commonly used to haul materials in most surface mining operations [7]. The mobile materials handling fleet often accounts for a sizeable share of both capital and operational costs. Fuel consumption is one of the primary operating costs associated with shovel-truck operations: fuel cost constitutes a significant part of the materials handling cost [8] and accounts for about 30% of total energy costs in surface mines. During idling, the truck produces nothing but the engine continues to run and consume fuel. Hence it is necessary to control fuel consumption by haulage trucks to reduce mining costs. The need for cost saving has motivated mine operators and governments to conduct several research studies on how to reduce energy consumption in the mining industry [9]. A study using case-based reasoning (CBR) showed that good driving style reduced fuel consumption by 12% for light-duty vehicles in urban areas [10]. However, CBR cannot make reliable predictions when the available data are misleading or irrelevant [11]. Hence, methods that lessen the impact of irrelevant or misleading features are required so that CBR predictions are reliable and accurate and CBR frameworks can benefit from improved feature sets.
The use of the CBR method could enable mine operators to predict the fuel consumed by each truck per trip and the total amount of fuel consumed per shift or per day by the truck fleet. This will assist in making logistical arrangements for fuel supply to the mine and in determining when the fuel consumption of any truck is becoming too high, so that remedial action can be taken. This study investigated the factors that affect fuel consumption by haul trucks in open pit mines and developed an algorithm for predicting the consumption of fuel by haul trucks. The CBR algorithm that was developed was used to predict the fuel consumption of randomly selected haulage trucks in an operating open pit mine.

BACKGROUND RESEARCH
Case-based reasoning is the process of dealing with a new problem based on the solutions of past problems [12,13]. Each time a new problem is solved, the new solution is retained and made available for future similar problems [14]. For accurate predictions to be made, the retrieved cases must be similar and close to the new problem [15,16].
The primary idea of CBR is that when a new prediction of fuel consumed per trip is required, a similarity measure is used to select the most similar past trips, and these are used to predict the amount of fuel that will be consumed. The performance of CBR in relation to other prediction methods has been shown to be encouraging [17]. The level of success of CBR depends upon the attributes of the dataset. Discontinuities in the underlying relationships between dependent and independent variables tend to make the CBR technique more effective [18]. Unfortunately, the CBR method is often exposed to misleading or irrelevant factors in the prediction exercise [11]. Therefore, all irrelevant or misleading features must be removed to reduce their effect on the predictions. This is done using feature subset selection (FSS), which determines the most favourable feature subsets for an accurate prediction [19]. In FSS, the factors that carry significant information regarding the output are considered relevant features. Thus, in assigning weights, it is best to allocate higher weight values to factors that carry significant information about the output. All CBR methods must assign weights to factors. For instance, a CBR technique that uses all features effectively assigns non-zero weights to all factors. FSS techniques assign weights of 0 or 1 to features, indicating exclusion or inclusion respectively [12]. FSS techniques can be categorised into two main groups: filter methods and wrapper methods. The filter method reduces the number of factors before training [20]. Thus, it is less computationally complex but less accurate [10]. The wrapper method is combined with the predictor to limit fitting errors [21]. Hence, the wrapper method has high fitting accuracy and high complexity, but the chosen factors generalise poorly to other conditions.
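The retrieval and feature-selection ideas described above can be illustrated with a minimal Python sketch. It is not the implementation used in this study: the Euclidean distance, the choice of k = 1, the leave-one-out validation, the unscaled features and the toy trip data (a payload feature and an unrelated feature) are all assumptions made for illustration, as are the function names.

```python
import math
from statistics import mean

def predict(cases, target, mask, k=1):
    """Predict fuel for `target` as the mean fuel of the k most similar cases.
    `cases` is a list of (features, fuel); `mask` holds the 0/1 feature weights."""
    def dist(a, b):
        return math.sqrt(sum(w * (x - y) ** 2 for w, x, y in zip(mask, a, b)))
    nearest = sorted(cases, key=lambda c: dist(c[0], target))[:k]
    return mean(fuel for _, fuel in nearest)

def loo_mar(cases, mask, k=1):
    """Leave-one-out mean absolute residual (assumed validation scheme)."""
    residuals = []
    for i, (features, fuel) in enumerate(cases):
        others = cases[:i] + cases[i + 1:]
        residuals.append(abs(fuel - predict(others, features, mask, k)))
    return mean(residuals)

def forward_selection(cases, k=1):
    """Greedy wrapper FSS: repeatedly switch on the feature that lowers the
    MAR most, and stop when no remaining feature improves it."""
    n = len(cases[0][0])
    mask = [0] * n
    best = float("inf")
    while True:
        trials = []
        for j in range(n):
            if mask[j] == 0:
                trial = mask[:j] + [1] + mask[j + 1:]
                trials.append((loo_mar(cases, trial, k), j))
        if not trials:
            break
        score, j = min(trials)
        if score >= best:
            break
        best, mask[j] = score, 1
    return mask, best

# Hypothetical trips: (payload, unrelated feature) -> fuel consumed.
trips = [
    ((100.0, 7.0), 50.0),
    ((120.0, 1.0), 60.0),
    ((140.0, 9.0), 70.0),
    ((160.0, 3.0), 80.0),
    ((180.0, 8.0), 90.0),
    ((200.0, 2.0), 100.0),
]
mask, best = forward_selection(trips)
print(mask, best)
```

In this toy dataset the fuel depends only on the first feature, so the wrapper keeps it (weight 1) and excludes the unrelated one (weight 0). Setting every entry of `mask` to 1 corresponds to traditional CBR with all features equally weighted.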

Dataset creation
In this paper, a dataset comprising dependent variables (DV) and independent variables (IV) is formed, as required by the CBR technique. The DV is the variable that must be estimated from a set of independent variables or parameters. Fuel consumption in light-duty vehicles has been successfully evaluated using CBR techniques [10]. In most studies, the DV and IVs of datasets are chosen by domain experts, and the choice of the best parameters does not necessarily lead to optimal results. Therefore, to derive the right independent variables, the factors that affect fuel consumption by haulage trucks are used.

Independent variables
From the literature, 19 parameters were identified as influencing fuel consumption by haulage trucks. Interviews and site visits were conducted, and data were collected from Komatsu and Barloworld, the service providers for haulage trucks at Debswana Orapa Mine. Table 1 (see Appendix) summarises the 19 key parameters identified by various researchers as determinants of fuel consumption by trucks in open pit mines. These were employed to predict fuel consumption by haulage trucks in this study.

Repeated measure design
A repeated measures design takes measurements on the same subject over time and under different conditions. It is characterised by having more than one measurement of at least one variable for each subject [34]. A repeated measures design was used in this research for three reasons: it reduces the variance of the estimates of fuel consumed per trip; fewer trips are needed for training, so that many analyses can be completed in a shorter period; and it permits monitoring of how the weights of the features change over time in both short-term and long-term situations.

Creation of datasets
Most researchers have relied on domain experts to select the dependent and independent variables for their datasets. There is little theoretical guidance to help select good variables [35], and domain experts may not always be available. Therefore, this work presents a variable selection approach that can be used by mine managers.

EVALUATION METHODS
The accuracy of various estimation techniques is used as the main criterion in analysing the value of CBR's estimation of the fuel consumption by haulage trucks. Shepperd and MacDonnell method (SMM) was used to reduce the irregularities between validated research results and to provide a basis for understanding results with a specific emphasis on continuous prediction systems [36]. This work also followed the methodology generally employed when predicting the Design Reality Gap scores for telecentres [12].
The evaluation of an estimation system is based on standardised accuracy (SA), which is an unbiased statistic, on the calculation of effect sizes, and on testing the likelihood of the results relative to the baseline technique of random prediction (guessing). According to the SMM, an estimation system, Ei, is validated over a dataset D using some accuracy statistic S as per the validation scheme V [37]. The SMM can also be used to compare competing estimation systems. For example, given two estimation systems, E1 and E2, and an accuracy statistic S, one must answer three basic questions: How do the systems perform against random guessing? Is the difference statistically significant? What is the effect size?

Performance against random guessing
A baseline of random guessing is established to find out whether the suggested method performs better than random guessing. Any theoretically sound system is expected to perform better than random guessing over time. If that is not the case, it is assumed that the predictor is not using the target case features in any useful way.

Significance testing
Mean absolute residual (MAR) is used as the accuracy statistic, S, for continuous estimation systems. In contrast to the mean magnitude of relative error, MAR is not based on ratios and is therefore unbiased. Unfortunately, MAR values cannot be compared across datasets because the residuals are not standardised, which also makes them difficult to interpret. Accordingly, a standardised accuracy measure (SA) has been proposed [36], in which accuracy is measured as the MAR relative to that of random guessing, E0. Hence, the SA for estimation system Ei is given by (1).
$$SA_{E_i} = \left(1 - \frac{MAR_{E_i}}{\overline{MAR}_{E_0}}\right) \times 100 \qquad (1)$$
where $\overline{MAR}_{E_0}$ is the mean MAR of a large number, normally 1000, of runs of random guessing. SA is a ratio representing how much better the estimation system, Ei, is than random guessing, E0. An SA value near zero would be discouraging, and a negative value would mean that the system performs worse than random guessing.
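As a rough illustration of how SA can be computed, the following Python sketch estimates the random-guessing MAR over 1000 runs and expresses the MAR of an estimation system relative to it, i.e. SA = (1 - MAR_Ei / mean MAR_E0) x 100. The random-guessing scheme shown (each trip is "predicted" with the actual value of another, randomly chosen trip) is an assumption, as are the function names and the toy data.

```python
import random
from statistics import mean

def mar(actual, predicted):
    """Mean absolute residual between actual and predicted values."""
    return mean(abs(a - p) for a, p in zip(actual, predicted))

def random_guess_mar(actual, runs=1000, seed=0):
    """Mean MAR over `runs` of random guessing, plus the per-run MARs.
    Assumed scheme: each trip is predicted with the actual value of a
    different trip drawn uniformly at random."""
    rng = random.Random(seed)
    n = len(actual)
    run_mars = []
    for _ in range(runs):
        guesses = []
        for i in range(n):
            j = rng.randrange(n - 1)
            if j >= i:
                j += 1  # skip the target trip itself
            guesses.append(actual[j])
        run_mars.append(mar(actual, guesses))
    return mean(run_mars), run_mars

def standardised_accuracy(mar_ei, mar_e0):
    """SA in percent: improvement of system Ei over random guessing E0."""
    return (1 - mar_ei / mar_e0) * 100

# Hypothetical fuel values per trip and predictions from some system Ei.
actual = [50.0, 60.0, 70.0, 80.0, 90.0, 100.0]
predicted = [52.0, 58.0, 73.0, 79.0, 88.0, 103.0]

mar_ei = mar(actual, predicted)
mar_e0, run_mars = random_guess_mar(actual)
sa = standardised_accuracy(mar_ei, mar_e0)
print(f"MAR_Ei={mar_ei:.2f}  mean MAR_E0={mar_e0:.2f}  SA={sa:.1f}%")
```

A system whose MAR equals the random-guessing MAR scores SA = 0, and one with a larger MAR scores a negative SA.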

Effect size
Effect size is a simple way of quantifying the difference between the methodologies. Research work has shown that the larger the effect size, the stronger the relationship between the two methodologies. To assess the effect size, a standardised measure, Glass's delta (∆), is used [36]. Glass's delta uses only the standard deviation of the control group and is used when the standard deviations differ significantly between the techniques. It is given by (2).
$$\Delta = \frac{MAR_{E_i} - \overline{MAR}_{E_0}}{s_{E_0}} \qquad (2)$$
where $s_{E_0}$ represents the sample standard deviation of the random guessing methodology. Glass's delta standardises the difference between the two estimation systems and expresses it in terms of the amount of variation in the accuracy statistic S. The standardised effect size is scale-free and its magnitude is considered small (≈0.2), medium (≈0.5) or large (≈0.8) [37]. ∆ is expressed in units of the standard deviation, so the effect is a decrease in the MAR of the fuel consumed per trip by that number of standard deviations.
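A short Python sketch of the effect-size calculation is given below. It assumes the form ∆ = (MAR_Ei - mean MAR_E0) / s_E0, where s_E0 is the sample standard deviation of the MARs obtained from the random-guessing runs; the function name and the five run values are hypothetical and serve only to show the arithmetic.

```python
from statistics import mean, stdev

def glass_delta(mar_ei, run_mars_e0):
    """Glass's delta of system Ei against random guessing E0, standardised
    by the sample standard deviation of the random-guessing MARs only."""
    return (mar_ei - mean(run_mars_e0)) / stdev(run_mars_e0)

# Hypothetical MARs from five runs of random guessing and from a system Ei.
run_mars_e0 = [34.0, 36.0, 38.0, 35.0, 37.0]
mar_ei = 35.0

delta = glass_delta(mar_ei, run_mars_e0)
print(f"delta = {delta:.3f}, magnitude = {abs(delta):.3f}")
```

With this sign convention a system that beats random guessing yields a negative ∆, so it is the magnitude that is compared against the small, medium and large reference values.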

EXPERIMENTS
Three estimation techniques, namely traditional CBR, case-based reasoning using forward sequential selection (CBR-FSS) and the Naïve method, were used to predict fuel consumption by haulage trucks in an operating open pit mine in Botswana. These techniques have been utilised by other researchers and constitute a range of potential methodologies for case-based fuel consumption estimation [37,38]. The Naïve method uses the sample mean to estimate the fuel consumed on a new trip [36,37], while the CBR method uses all the parameters, equally weighted, to estimate the fuel consumed. The Naïve technique may be considered a baseline technique and has been applied since the 1990s [39]. The CBR-FSS method was preferred in this research because it has been shown to perform better than the other techniques employed in earlier research [37]. CBR-FSS gives weights of 0 to irrelevant parameters and 1 to relevant parameters. Unlike CBR, this technique therefore excludes all features that are irrelevant to fuel consumption prediction.

Limitations of study
This study used data obtained from Debswana Orapa Mine only. Therefore, the results may not be applicable to all mines because of their different locations and operating conditions. Another limitation is that the data do not include all the parameters identified from the literature, such as weather. Earlier studies have identified the parameters in Table 2 as those that influence fuel consumption by haulage trucks in open pit mines [22,40,41,42,43].

Table 2. Parameters that influence fuel consumption by haulage trucks in open pit mines
2. Extreme braking: Occurs when the braking distance required to stop is shorter than expected. [44]
3. Excess RPM: RPM is the measure of how many times an engine turns in a minute. Fuel consumption is lower at low RPM because friction losses are lower. [45]
4. Traffic conditions: Increasing levels of congestion lead to lower average speeds, longer travel times and increased delays at loading and dumping points. Hence the flow of road traffic has a great effect on fuel consumption. [46,47]
5. Road geometry: Road geometry has a great impact on overall energy consumption. [48]
6. Driver's age: Older drivers have been found to drive less aggressively than younger drivers. [49]
7. Truck age: Older vehicles show a greater shortfall in fuel efficiency than newer vehicles. [42]
8. Driver's experience: A skilled driver can reduce fuel consumption by more than 10% compared to an inexperienced driver by reducing the need for braking. [43]
9. Fuel-consuming vehicle accessories: These are heating, air-conditioning and other accessories that consume fuel. [45]

EXPERIMENTAL RESULTS
All the statistical analyses in this research work are based on the absolute residuals from the selected methodologies. The absolute residual results are summarised in Table 3, while Figure 1 shows the residual distribution of the results. The SMM was used to evaluate the results from the various estimation techniques. The mean absolute residuals were 14.24, 20.24 and 24.33 for the CBR-FSS, CBR and Naïve techniques respectively, compared to 35.97 obtained from random guessing. The results for the three methods relative to random guessing, in terms of standardised accuracy (SA) and effect size (∆), are summarised in Table 4 [39]. The results in Table 4 show that when the CBR-FSS, CBR and Naïve techniques are used to predict fuel consumption by haul trucks per trip in open pit mines, the predictions are 60.42%, 43.73% and 32.37% better, respectively, than those based on random guessing. The effect size of CBR-FSS relative to guessing can be described as medium, while those of the CBR and Naïve techniques may be considered small.
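The reported SA values can be checked directly from the reported mean absolute residuals using SA = (1 - MAR_Ei / MAR_E0) x 100. The short Python calculation below uses only the figures quoted above; the variable names are illustrative.

```python
# Mean absolute residuals as reported in the text.
reported_mar = {"CBR-FSS": 14.24, "CBR": 20.24, "Naive": 24.33}
mar_random = 35.97  # MAR obtained from random guessing

# Standardised accuracy (percent) of each technique relative to guessing.
sa = {name: (1 - m / mar_random) * 100 for name, m in reported_mar.items()}

for name, value in sa.items():
    print(f"{name}: SA = {value:.2f}%")
```

The values recomputed from the rounded MARs agree with the reported 60.42%, 43.73% and 32.37% to within about 0.01 percentage points, a difference that is presumably due to rounding of the published residuals.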

CONCLUSION
In this paper, it has been shown that the CBR-FSS technique can be used to predict the fuel consumption per trip of haulage trucks in open pit mines without any expert knowledge. The results also show that when the CBR-FSS, CBR and Naïve techniques are used to predict fuel consumption by haul trucks in open pit mines, the predictions are 60.42%, 43.73% and 32.37% better, respectively, than those based on random guessing. Furthermore, the standardised accuracy values show that predictions using CBR-FSS are 16.69 and 28.05 percentage points better than those of the CBR and Naïve techniques respectively. The effect size of CBR-FSS relative to guessing can be described as medium, while those of the CBR and Naïve techniques may be considered small.
It is concluded that using CBR prediction techniques to predict fuel consumption in open pit mines could be more cost-effective than using rough estimations to predict outputs or engaging experts to do extensive evaluations for the mine. It is acknowledged that the predictions made in this work were based on datasets from one operating mine. Accordingly, the predictions of fuel consumption by haulage trucks may be applicable only to that mine's setting, and larger datasets with more features are needed to make the findings applicable to all mines irrespective of their location and operating conditions.

Table 1. Parameters identified as determinants of fuel consumption by trucks in open pit mines
Weather conditions: Ambient conditions refer to external conditions such as wind, temperature and barometric pressure. These affect a vehicle's fuel consumption as they influence the engine operation. They might also affect the driver's behaviour, as the driver must adjust his driving pattern accordingly. [24,25]
3. Idle time: Refers to running a vehicle's engine without moving the vehicle. When the duration of idling is longer than 10 s, an engine consumes more fuel than it would if it were restarted.
Speed: The rate of fuel consumption increases with an increase in speed. Aggressive driving can increase fuel consumption dramatically, by up to 24%.
Gradient: Rolling resistance of the haul trucks varies with road conditions. A haul road that is dry and hard-packed keeps fuel costs and tyre wear to a minimum. [31,32]
8. Payload: Fuel consumption increases with an increase in the gross weight at which a truck operates. [33]