Software aging prediction – a new approach

ABSTRACT


INTRODUCTION
The performance degradation caused by software aging has affected various types of computing systems, including virtualized cloud systems [1], web servers [2], [3], clusters [4] and online transaction processing systems [5]. Software aging has also affected spacecraft systems [6] and military systems [7]. In critical applications, the impact may be loss of life. Software aging happens because of unreleased file handles, data corruption, memory fragmentation, memory leaks, storage space fragmentation and round-off error accumulation. Software aging reduces the performance of cloud-based systems because of the complexity with which they are built. Such a system consists of servicing components and an execution environment. The system's boundary separates it from its environment, and its services are directed towards the surrounding environment [8]. In complex systems like the cloud, various levels such as the application level and the operating system level are prone to software aging [9]. Operating system-level effects are non-released memory, file handles, and sockets. Application-level effects include non-terminated threads, round-off errors, and data file corruption. To mitigate the effects of software aging in cloud systems, it is very important to estimate the optimal time to trigger rejuvenation. Previous works have attempted to predict the time of software aging using threshold-based, statistics-based and machine learning approaches. Machine learning algorithms make it possible to predict the failure time of the system. The aging indicators used to estimate resource exhaustion include memory and central processing unit (CPU) usage [10].
In this work, software aging is predicted using a new approach in which a virtual machine's current resource utilization status is fed to a machine learning model that classifies virtual machines as healthy, aging-prone or aged, using a new method based on the k-nearest neighbor (k-NN) algorithm. Static thresholding and adaptive thresholding methods are used for aging prediction. Once the virtual machines are classified, rejuvenation is to be initiated for aging-prone and aged virtual machines. The rejuvenation process cleans up the system's internal state and brings the system back to its original state by removing the accumulated errors. The time at which rejuvenation is initiated is called the rejuvenation trigger time. In this work, the time to trigger rejuvenation is forecasted using the new method.

RELATED WORK
Based on the type of algorithm used, machine learning approaches are categorized into two types: classification approaches and regression approaches. In the classification method, the system status is classified as either stable or unstable. System failure can be forecasted using a regression method. The procedure is explained in [10]. Yan and Guo [11] developed a mechanism that forecasts software aging using a machine learning algorithm. Data was collected from a live commercial web server and pre-processed. A feature selection algorithm was applied to identify a subset of the model parameter set. A time series model was used to predict the selected parameters. The model for predicting software aging was built using machine learning algorithms. Sensitivity analysis was done to analyze how heavily outcomes depend on input variables. The method was applied to an IIS web server. The experiment results indicated that the proposed method predicts software aging in the early phase of the system development life cycle.
Alonso et al. [12] compared various regression algorithm families such as linear regression, regression trees and hybrids. The researchers compared these algorithms in various scenarios involving various aging concepts. The experiments indicated that the hybrid version, i.e., M5P, which combines linear regression and decision trees, performed better. Bugs in the software, such as unreleased threads or memory leaks, caused resource exhaustion, leading to aging phenomena. The model included linear piecewise models (i.e., a reasonable number of linear patches) capturing various aging slopes and speeds. In one of the previous works, three machine learning algorithms were used along with time series models for the prediction of software aging in web applications [13]. The three machine learning algorithms used were decision trees, the naïve Bayes classifier and a neural network model. The researchers built models relating several system variables to aging trends such as throughput and number of connections. This was based on the observation that aging phenomena can be approximated by a piecewise linear model. The models in this work were trained using data samples collected in preliminary experiments. The resulting model was able to predict the time-to-exhaustion (TTE) of system resources under different conditions. Alonso et al. [14] compared a large set of algorithm families, namely decision trees, linear discriminant analysis/quadratic discriminant analysis (LDA/QDA), random forests, support vector machines, naïve Bayes and k-NN, for state prediction in the context of a three-tier J2EE system. Andrzejak and Silva [15] compared four classification methods: ZeroR, decision tree, naïve Bayes, and support vector machines. These algorithms were compared under a constant software aging injection rate, considering one aging indicator metric, i.e., memory consumption. The results indicated that all classification methods performed similarly.
Jia et al. [16] used multiple linear regression to perform a detailed analysis and predict web server parameters. In the first step, the system was loaded using a stress testing tool and the collected data was pre-processed. In the second step, the resource consumption trend was generated using a time series model. In the third step, a feature selection algorithm was used to select the best subset to be used as input parameters of the algorithm. In the fourth step, the aging process was analyzed and predicted using multiple linear regression. In the final step, the feasibility of the algorithm was evaluated using evaluation metrics. The results indicated that this algorithm could predict the aging process within the allowable error range.
Liu and Meng [17] designed a method for predicting software aging that used an artificial bee colony algorithm. It achieves better prediction accuracy by optimizing a back propagation (BP) neural network. The experiment results showed that the predicted software aging trend is more accurate than that of the traditional BP neural network. The proposed method also converges faster and produces more stable prediction results.
From the previous works, it can be observed that the concept of software aging is gaining importance. Researchers are trying to predict the software aging time in order to trigger rejuvenation and avoid its impact. Similar works related to software aging prediction have been reported in the literature. There is scope for improvement or for alternative methods to predict software aging. Considering these points, an attempt has been made in the proposed work to find a new approach to predict software aging. The proposed k-NN based method performs better compared to similar previous works.
ISSN: 2088-8708 Software aging prediction – a new approach (Shruthi Parashivamurthy) 1775

MOTIVATION
The motivation for this research is that software aging is an emerging research area and machine learning is an emerging technology trend. Contributing to software aging research is satisfying work, as the area has been gaining momentum in recent years. The power of machine learning algorithms can be applied to achieve the objective. As the usage of cloud-based applications is increasing, it is the responsibility of the service provider to provide uninterrupted services to satisfy users. Including a module that avoids the impact of software aging on the platform on which the application is hosted will make the service provider trustworthy. In this regard, the intended research also helps cloud service providers.

THE PROPOSED MODEL
Most of the services hosted on the cloud run in a virtualized environment. A virtualized environment includes various layers such as physical hardware, virtual machine, virtual machine monitor and applications running on a virtual machine. Figure 1 shows the typical cloud platform. [...] these resources can also be considered as aging indicators. In this work, CPU consumption and memory consumption metrics are used to build the prediction model. For the prototype, metrics collected from a virtual machine (VM) are used; the same technique can also be applied to a virtual machine monitor. The prediction model has been built using the following strategy.

a. Identification of VM status using three methods
- Static threshold: In the live environment, previous resource usage data was captured to know when the system was affected by software aging. The CPU and memory usage values at the time the system failed were taken as the static threshold values. At a certain point in time, the status and resource utilization of various VMs are captured to build the data set. The scatter graph is plotted using these values. The status of a VM is identified by finding its nearest neighbors.
- Adaptive threshold of CPU usage: The CPU usage history of the k nearest neighbors is captured. The interquartile range (IQR) statistical method is applied to find the adaptive threshold. The nearest neighbors are labeled based on the adaptive threshold value. The statuses are healthy, aging-prone, and aged.
- Adaptive threshold of memory usage: The memory usage history of the k nearest neighbors is captured and IQR is applied to find an adaptive threshold.

b. Prediction of software aging
- Once the aging-prone VMs are identified, their nearest aged neighbors are found.
- The resource utilization trend of the aged VMs is found and, based on this, the time required for the aging-prone VMs to reach the aged state is predicted.

Table 1 shows the steps followed for software aging prediction using the k-NN based method.

Table 1 (continued)
| No | Step |
|---|---|
| | Find out the time taken for the aged VM to reach the current status from aging-prone status. |
| | End for |
| 20 | Find out the average time taken by the k aged VMs to reach aged status from aging-prone status. |
| 21 | On the basis of the obtained average time, forecast the status of aging-prone VMs. |

In step 13, the value of s is taken as 1.5 for the following reason. When John Tukey introduced the box-and-whisker plot in 1977, he chose 1.5×IQR as the demarcation line for outliers [18]. This has worked well, so researchers have continued using that value ever since.
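The role of s = 1.5 can be illustrated with a short Python sketch. This is a minimal example under stated assumptions, not the authors' implementation: the threshold is assumed to take the form of Tukey's upper fence, Q3 + s × IQR; the function name adaptive_threshold and the sample usage values are hypothetical; and quartiles are computed with the standard library's inclusive method, which may differ from the quartile convention used in the steps of Table 1.

```python
from statistics import quantiles


def adaptive_threshold(history, s=1.5):
    """Return Q3 + s * IQR for a usage history (assumed form of the threshold)."""
    # quantiles() sorts the data internally and returns [Q1, median, Q3] for n=4.
    q1, _median, q3 = quantiles(history, n=4, method="inclusive")
    iqr = q3 - q1
    return q3 + s * iqr


# Hypothetical usage percentages for seven previous days (not the paper's Table 2).
usage = [10, 20, 30, 40, 50, 60, 70]
print(adaptive_threshold(usage))  # 100.0 (Q1 = 25, Q3 = 55, IQR = 30)
```

With a larger s the fence moves further from the bulk of the data, so fewer usage spikes are flagged; 1.5 is the conventional compromise cited above.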
The concept has been implemented in the Python scripting language. Python is widely used by researchers because of its many libraries, including libraries and frameworks for machine learning. It is platform-independent and has a wide user community, which makes it a common first choice for research.

k-nearest neighbor algorithm
The k-NN algorithm is a supervised machine learning algorithm. It is commonly applied to solve classification problems. Its usefulness is shown by the number of applications built on it. In this research work, the k-NN algorithm is used to classify an entity of the virtualized environment as aged, aging-prone, or healthy.

Cluster creation
Figure 3 shows the sample dataset used for plotting the scatter graph. The scatter graph is plotted using the dataset to form the clusters. The rows used also included outliers. Outliers, in this work, are aging indicator values that fall outside the usual range of the other points. They arise from an unexpected spike in resource usage that is not actually a result of software aging. Outliers have been handled, and missing values are filled with suitable values. Clusters are formed based on the x and y values, in this case the CPU and memory consumption status. If the resource consumption reaches 80%, the VM is considered aged because service delivery will be affected. If either the CPU or the memory value reaches 70%, it is considered aging-prone. The static thresholds are based on our observation and are also found in previous works [19].
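The static thresholds described above can be sketched as a small labeling function. This is an illustrative sketch only: the function and constant names are hypothetical, and it assumes that a VM counts as aged when either metric reaches 80% and that "reaches" means greater than or equal to the limit.

```python
AGED_LIMIT = 80.0         # static threshold for the aged status (percent)
AGING_PRONE_LIMIT = 70.0  # static threshold for the aging-prone status (percent)


def static_label(cpu, mem):
    """Label a VM from its CPU and memory usage percentages."""
    # Assumption: either metric reaching a limit is enough to assign the status.
    if cpu >= AGED_LIMIT or mem >= AGED_LIMIT:
        return "aged"
    if cpu >= AGING_PRONE_LIMIT or mem >= AGING_PRONE_LIMIT:
        return "aging-prone"
    return "healthy"


print(static_label(85, 40))  # aged
print(static_label(72, 50))  # aging-prone
print(static_label(30, 40))  # healthy
```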
Figure 4 shows the scatter graph, which has been plotted to visualize the clusters formed. Each cluster indicates a group of VMs with similar status: healthy, aging-prone, or aged.

Query point
Once the model is built, the status of any VM can be obtained by providing its CPU and memory utilization percentages. This input is called the query point. The nearest neighbors are found by calculating the Euclidean distance, given in (1):

$$d(p, q) = \sqrt{(q_1 - p_1)^2 + (q_2 - p_2)^2} \tag{1}$$

where $p_1$ and $p_2$ are the Cartesian coordinates of the point $p$, $q_1$ and $q_2$ are the Cartesian coordinates of the point $q$, and $d$ is the distance between $p$ and $q$. After calculating the Euclidean distance, the nearest neighbors are found. The status of the majority of the neighbors indicates the status of the requested VM. The model, which is built on the k-NN algorithm, returns the status as one of three options: healthy, aging-prone, or aged. The value of k, the number of neighbors to be considered, must be provided.
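The query-point lookup can be sketched in a few lines of Python. This is a minimal sketch and not the authors' code: the function name knn_status and the sample dataset are hypothetical, each row is assumed to be a (CPU %, memory %, status) tuple, and ties between statuses are resolved in favor of the status of the closer neighbor.

```python
from collections import Counter
from math import dist  # Euclidean distance between two points (Python 3.8+)


def knn_status(dataset, query, k=3):
    """Return the majority status among the k rows nearest to the query point."""
    # Sort rows by Euclidean distance between (cpu, mem) and the query point.
    ranked = sorted(dataset, key=lambda row: dist(row[:2], query))
    labels = [status for _cpu, _mem, status in ranked[:k]]
    # most_common() keeps first-seen order among equal counts, so ties go to
    # the status of the nearer neighbor.
    return Counter(labels).most_common(1)[0][0]


# Hypothetical labeled VMs: (CPU %, memory %, status).
vms = [
    (20, 30, "healthy"), (25, 35, "healthy"),
    (72, 60, "aging-prone"), (75, 65, "aging-prone"),
    (85, 90, "aged"), (88, 82, "aged"),
]
print(knn_status(vms, (74, 62), k=3))  # aging-prone
```

Choosing an odd k reduces the chance of ties, although with three possible statuses a tie can still occur.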

Adaptive threshold method
In this method, the resource usage history of the k nearest neighbors is captured. The IQR statistical method is applied to find the adaptive threshold. The nearest neighbors are labeled based on the adaptive threshold value. The statuses are healthy, aging-prone, and aged. Figure 5 shows the sample dataset. For example, consider seven data points. These points are a virtual machine's resource consumption percentages on previous days (represented as T1 to T7 in the program). Table 2 shows the values.
$$\text{threshold} = 6.8 + (1.5 \times 0.8) = 6.8 + 1.2 = 8$$

The obtained status is tabulated, and the query point is labeled accordingly, as shown in Table 3. The status of a VM is calculated using three methods: the static threshold, the adaptive threshold using the CPU metric, and the adaptive threshold using the memory usage metric. The query point is labeled according to the mode of the k points in the three evaluations. If all three statuses are different, the static threshold status is used. A screenshot of the program execution is given in Figure 6.

Based on the trend observed in the nearest 3 aged VMs, the time each of them took to go from aging-prone to aged is identified. Table 5 shows the resource usage of the 3 virtual machines. Hence, current aging-prone VMs take 6 days to become aged. Figure 7 shows a screenshot of the program execution.
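The rule for combining the three statuses (majority wins, with the static threshold status as the fallback when all three differ) can be sketched as follows. The function name combine_statuses is hypothetical, and the sketch assumes that each of the three methods has already produced one status for the query point.

```python
from collections import Counter


def combine_statuses(static, adaptive_cpu, adaptive_mem):
    """Return the final status of the query point from the three methods."""
    status, votes = Counter([static, adaptive_cpu, adaptive_mem]).most_common(1)[0]
    # With three votes, a status seen at least twice is the mode; otherwise all
    # three differ and the static threshold status is used.
    return status if votes >= 2 else static


print(combine_statuses("aged", "aging-prone", "aging-prone"))  # aging-prone
print(combine_statuses("healthy", "aging-prone", "aged"))      # healthy
```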

Rejuvenation
Software rejuvenation is a technique that refreshes the system and brings it back to a healthy state. The rejuvenation process is triggered for aging-prone VMs to improve service availability. Actions are triggered based on the classification, as depicted in Table 6.

Aging status: Aged
Action: Rejuvenation is triggered immediately. The system returns to a healthy state after rejuvenation.

Evaluation of the proposed method
As this work is a new k-NN based approach, the performance of the k-NN classifier has been compared with that of similar classifiers, decision tree and naïve Bayes, on the same data set. The result indicates that the k-NN algorithm performs better than the decision tree. The execution result is tabulated in Table 7.

| Reference | Previous work | This work |
|---|---|---|
| [20] | Instead of fixed thresholds, the method regularly adjusts the thresholds by taking feedback information from the running process into account. Recommended an adaptive threshold for aging detection. | Adaptive thresholding is a part of the overall software aging prediction strategy in this research work. |
| Ahamad [21] | Found the reasons for and effects of software aging. Concluded that it is impossible to stop software aging, but it is possible to reduce its speed and progress. | The software aging prediction method employed in this work enables rejuvenation to reduce the speed and progress of aging accumulation. |
| Liu et al. [22] | A monitoring agent in every VM collects metrics (CPU usage and free memory available) to detect the aging severity. It is an intrusive method. | The tool used for collecting the metrics related to software aging is non-intrusive in this work. The NMS tool captures metrics without adding overhead. |
| Yan [23] | Operating system parameters and database parameters in the running phase are collected using a built-in Windows counter without disturbing the running system. Used an IIS web server, which is platform-specific. | The model used in this work can be deployed on any platform, such as Windows or Linux. It is not platform-specific. |
| Cui et al. [24] | The rate of aging is higher in virtual machines. | The proposed model is built for a cloud platform, which is a virtualized environment. This justifies the chosen platform. |
| Umesh et al. [25] | Software aging forecasting using a time series model. Not a recommended method if there is a spike in resource usage. | The proposed model uses static and adaptive techniques, which eliminates this concern. |
| Umesh and Seinivasan [26] | Used different methods for forecasting. The weightage given to different techniques is not acceptable in all scenarios. | The model proposed in this work improves the prediction accuracy. |

CONCLUSION
In this work, an attempt has been made to forecast software aging. During the testing phase of software development, the application can be tested for all types of probable issues, but a problem like software aging must be dealt with at runtime. As the accumulation of errors, lock contention, and data corruption lead to this problem, its impact is a loss for the service provider, who will lose customers. Reduced performance and decreased reliability are other negative impacts of software aging. Even if all proactive measures are taken, software aging cannot be prevented; it can only be managed. The only available solution is to predict the future status and pre-emptively rejuvenate the system. In this work, aging forecasting is done using the new method. This research work can be a contribution to the area of software aging and rejuvenation.

Figure 7. Prediction part of random execution

Table 1. Steps for software aging prediction using k-NN based method
| No | Step |
|---|---|
| 1 | Load the dataset, which consists of CPU usage and memory usage percentages. |
| 2 | Determine the value of k, which indicates the chosen number of neighbors. |
| 3 | Calculate the Euclidean distance between the query example and the current point for each point in the dataset. Add this attribute to the dataset. |
| 4 | Sort the dataset in ascending order of Euclidean distance (smallest to largest). |
| 5 | Pick the first k rows from the sorted dataset. |
| 6 | Get the labels of the selected k entries. |
| 7 | Return the mode of the k labels. |
| 8 | Sort the CPU and memory utilization history of the k points in ascending order. |
| 9 | Find the median of the CPU entries. |

Table 3. Status of query point from three methods

Prediction of software aging
As mentioned in the algorithm, the nearest k aged VMs are identified. Here, the procedure for one aged VM is given. The resource usage of the aged VM on the days before it became aged is found. The resource usage data is tabulated in Table 4.
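The forecasting step (average time for the k aged VMs to go from aging-prone to aged) can be sketched as below. This is a hedged illustration, not the program shown in the figures: the function names, the use of a single usage percentage per day, the 70% and 80% limits as the status boundaries, and the sample histories are all assumptions, and the values are not those of Table 4 or Table 5.

```python
from statistics import mean


def days_to_age(daily_usage, prone_limit=70.0, aged_limit=80.0):
    """Days between first reaching the aging-prone limit and the aged limit."""
    # Assumes the history crosses both limits; next() raises StopIteration otherwise.
    prone_day = next(i for i, u in enumerate(daily_usage) if u >= prone_limit)
    aged_day = next(i for i, u in enumerate(daily_usage) if u >= aged_limit)
    return aged_day - prone_day


def forecast_days(histories):
    """Average aging time over the nearest aged VMs."""
    return mean(days_to_age(h) for h in histories)


# Hypothetical daily usage percentages of three aged neighbors.
histories = [
    [62, 71, 74, 77, 81],          # 3 days from aging-prone to aged
    [66, 70, 72, 75, 78, 80],      # 4 days
    [69, 72, 73, 75, 77, 79, 83],  # 5 days
]
print(forecast_days(histories))  # 4
```

The returned average is then used as the expected number of days before a currently aging-prone VM becomes aged, which in turn suggests when rejuvenation should be triggered.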

Table 4. Resource usage of one virtual machine

Table 5. Resource usage of 3 virtual machines

Table 6. Aging status and actions

Details of previous research works on software aging prediction are tabulated in Table 8. It can be observed that the proposed model of software aging prediction addresses the drawbacks of the previous works.

Table 7. Performance comparison

Table 8. Comparison of similar research works