Hybrid swarm and GA based approach for software test case selection

ABSTRACT


INTRODUCTION
In this era of technology, the hardware and software industries are growing together at a very fast pace to meet the growing demand for smart devices. Smart gadgets have become so deeply embedded in our lives that it is hard to imagine a future without them. The software embedded in these devices plays a crucial role in delivering the intended functionality and a good user experience. This scenario poses many challenges for software developers, who must fulfill the quality needs of the end user. Software testing is a crucial and unavoidable step in achieving this. Test cases play a very important role in the testing process: they verify functionality and detect faults. In critical systems, a software failure can cost many lives. Moreover, development paradigms have evolved a long way from the traditional procedural approach to a modular, component based approach. Component based software engineering (CBSE) [1] emerged in the late 1980s and has been growing since then. It works on the principle of reusability, and the software is developed in small chunks called components. Each component has some set of functionality and interacts with other components through interfaces. Components provide a black box view of their functionality. Commercial off the shelf (COTS) components are gaining popularity with time. Since exhaustive testing is impractical, it becomes necessary to select a promising suite of test data that is capable of providing high fault coverage. Ant Colony Optimization (ACO) [2] and Genetic Algorithm (GA) [3] are search based techniques inspired by nature and natural phenomena. Swarm intelligence has inspired solutions to many search based optimization problems. These are meta-heuristic techniques that are problem independent and can work with incomplete knowledge. In contrast to heuristics, meta-heuristics introduce randomness into the search, which helps avoid getting stuck in local optima.
ACO has been widely used to solve NP-hard optimization problems in a reasonable amount of time. It is inspired by the behavior of real ants searching for food in their natural habitat and traversing an intelligent path discovered through group behavior.

One ant follows another, and ants communicate through a chemical substance called pheromone. Pheromone undergoes two operations: deposition and evaporation. The path with the maximum pheromone deposition is likely to be followed by more ants and is finally chosen for commuting between the nest and the food source. This process inspired researchers to solve technical problems and to search for promising solutions by simulating artificial ants and pheromone levels. This is achieved by converting the problem into graph form.
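The deposition and evaporation operations described above can be illustrated with a minimal Python sketch. The function names, the evaporation rate `rho`, the fixed `deposit` amount, and the edge-keyed dictionary are illustrative assumptions, not details taken from this paper.

```python
import random

def choose_next(pheromone, current, neighbours, rng):
    """Pick the next node with probability proportional to the pheromone on the edge."""
    weights = [pheromone[(current, n)] for n in neighbours]
    return rng.choices(neighbours, weights=weights, k=1)[0]

def update_pheromone(pheromone, path, rho=0.1, deposit=1.0):
    """Evaporate pheromone on every edge, then deposit on the edges of one ant's path."""
    for edge in pheromone:
        pheromone[edge] *= (1.0 - rho)          # evaporation (rho is an assumed rate)
    for edge in zip(path, path[1:]):
        pheromone[edge] += deposit              # deposition (assumed constant amount)
    return pheromone

# Hypothetical three-edge graph with uniform initial pheromone.
trail = {("A", "B"): 1.0, ("A", "C"): 1.0, ("B", "D"): 1.0}
update_pheromone(trail, ["A", "B", "D"], rho=0.5, deposit=1.0)
```

After the update, the edges on the ant's path hold more pheromone than the unused edge, so later ants calling `choose_next` are more likely to follow the same path.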
On the other hand, GA is an evolutionary approach that searches for a promising set of solutions within a population. It is inspired by Darwin's theory of evolution and natural selection. The possible solutions of the problem are first encoded as chromosomes, and an initial population is created. Two operators, crossover and mutation, are applied to produce a new population. A fitness function is chosen to determine the effectiveness of each new generation. Generation after generation, an effective set of solutions gradually evolves through this process, satisfying the fitness function, until convergence is achieved.
We exploited the advantages of the above mentioned techniques to develop a hybrid approach, HACGA (Hybrid Ant Colony-Genetic Algorithm), that is capable of selecting promising test cases to reduce the size of the test suite without compromising efficiency or test coverage.
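As a rough illustration of the two GA operators, the sketch below assumes a bit-string chromosome in which each bit marks whether a test case is selected. The encoding, the one-point crossover, and the bit-flip mutation are common choices assumed here for illustration; the paper does not state which variants it uses.

```python
import random

def crossover(parent_a, parent_b, rng):
    """One-point crossover: swap the tails of two equal-length chromosomes."""
    point = rng.randrange(1, len(parent_a))
    child_a = parent_a[:point] + parent_b[point:]
    child_b = parent_b[:point] + parent_a[point:]
    return child_a, child_b

def mutate(chromosome, rate, rng):
    """Bit-flip mutation: flip each gene independently with probability `rate`."""
    return [1 - gene if rng.random() < rate else gene for gene in chromosome]

rng = random.Random(0)
first, second = crossover([1, 1, 1, 1], [0, 0, 0, 0], rng)
mutated = mutate(first, rate=0.25, rng=rng)
```

A fitness function would then rank `first`, `second`, and `mutated` to decide which chromosomes survive into the next generation.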

RELATED WORK
Soft computing based techniques have attracted researchers for many years because of their potential to deal with uncertainty and incomplete knowledge. Software testing of component based systems has also been influenced by these search based techniques, resulting in a vast body of literature. A few important studies from the last five years are summarized here. Abhishek Singh et al. [4] presented a modified genetic algorithm based technique for test case generation. They used particle swarm optimization (PSO) for fitness enhancement. Neha et al. [5] applied ACO to reduce the cost of regression testing and implemented it in C++. Traditional ACO has scarce initial pheromone; with that in mind, Shunkun Yang et al. [6] proposed improved pheromone deposition and update coefficients and compared the results with random testing and GA based testing. Various soft computing based techniques, such as neural networks and ant systems, are compared in [7] for software fault prediction. The authors of [8] utilized the potential of ACO to reduce test cases for object oriented systems and implemented their approach in MATLAB. Maunika et al. [9] exploited bee colony optimization for test case selection and to improve path coverage. The authors of [10] used a genetic algorithm for regression test suite prioritization and produced mutants for object oriented code. Wasiur Rhmann et al. [11] applied GA to improve test efficiency in the early stages of software development. They tried to improve the test coverage of activity diagrams created from design specifications. Researchers have also been attracted to the adaptive behavior of ACO, modifying the algorithm based on certain parameters to obtain better results in test case selection, as in [12][13][14][15].
Similarly, many researchers and practitioners have been attracted to genetic algorithms for software testing and have applied them at various phases of testing, as in [16,17].
A variant of GA, the bacteriologic algorithm (BA), is presented in [18]; it introduces a new memorization operator. Recognizing that there is always scope for improvement, researchers went one step further and developed hybrid techniques that combine two or more soft computing based techniques to further enhance optimization potential. One such study is presented in [19], which applies crossover between ants to reduce the regression testing cost. P. Gulia et al. [20] presented a review of soft computing based techniques for testing reusable components and concluded that GA and ACO are the prominent nature inspired techniques that have attracted researchers in recent years. The authors of [21] proposed a hybrid approach for test case selection using a fuzzy inference system and ACO. Bee colony optimization (BCO) has also attracted researchers, as in [22], where the authors implemented GA based BCO to automate various testing phases. Palak et al. [23] proposed an ACO based model for testing component based software and component interaction failures. To summarize, a vast literature is available in this field, which shows its industrial importance and coverage.

PROPOSED MODEL
In this section, a hybrid approach is proposed that combines the benefits of ACO and GA. First, the system under test (SUT) is converted into its component diagram. The main idea is to populate the system with some random ants, as in traditional ACO. Each ant, while moving to neighboring components in search of food, deposits some amount of pheromone on its path. The remaining ants follow the pheromone and choose the next hop on a probability basis, depositing more pheromone on it. A set of test cases is selected by this process, and GA is applied to further refine the test suite. The proposed approach, HACGA, which is given in Figure 1, is summarized here:
Input: Fault Matrix, Component Diagram
Step 1: Convert the system into a component diagram.
Step 2: Apply ACO over the component diagram.
Step 3: Apply the GA crossover and mutation operations to the result of ACO.
Step 4: Repeat steps 2 and 3 until the stopping criterion is met.
Output: Reduced set of test cases.
The stopping criterion in this research is based on two conditions: a limit on the total execution time of the test cases, and a check of whether all the faults have been covered.
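The following Python sketch shows one possible reading of Steps 2 to 4 and the two stopping conditions. It is a simplified illustration under several assumptions that are not taken from the paper: pheromone is kept per test case instead of per edge of the component diagram, the fitness ordering (more faults first, then less time) is invented for the example, `max_iter` is an added safeguard, and only the four fully legible rows of Table 1 are used as data.

```python
import random

# The four fully legible rows of Table 1: test -> (faults covered, execution time).
MATRIX = {
    "T1": ({1, 3, 9, 15}, 6),
    "T2": ({2, 7, 9, 12}, 5),
    "T3": ({1, 8, 13}, 4),
    "T4": ({6, 8, 11}, 5),
}

def covered(selection, matrix):
    """Union of the faults detected by the selected test cases."""
    faults = set()
    for t in selection:
        faults |= matrix[t][0]
    return faults

def cost(selection, matrix):
    """Total execution time of the selected test cases."""
    return sum(matrix[t][1] for t in selection)

def fitness(selection, matrix, budget):
    """Assumed fitness: more faults first, then less time; over budget is worst."""
    if cost(selection, matrix) > budget:
        return (-1, 0)
    return (len(covered(selection, matrix)), -cost(selection, matrix))

def aco_phase(matrix, pheromone, budget, n_ants, rng):
    """Step 2 (simplified): each ant picks tests in proportion to their pheromone."""
    all_faults = covered(matrix, matrix)
    selections = []
    for _ in range(n_ants):
        chosen, remaining = [], list(matrix)
        while remaining and covered(chosen, matrix) != all_faults:
            weights = [pheromone[t] for t in remaining]
            t = rng.choices(remaining, weights=weights, k=1)[0]
            remaining.remove(t)
            gains = matrix[t][0] - covered(chosen, matrix)
            fits = cost(chosen, matrix) + matrix[t][1] <= budget
            if gains and fits:
                chosen.append(t)
        selections.append(chosen)
    return selections

def ga_phase(selections, matrix, budget, rng, mutation_rate=0.1):
    """Step 3: crossover and mutation over the ant selections, keep the fittest."""
    tests = sorted(matrix)
    pool = [[1 if t in s else 0 for t in tests] for s in selections]
    children = []
    for a, b in zip(pool, pool[1:]):
        point = rng.randrange(1, len(tests))
        child = a[:point] + b[point:]
        children.append([1 - g if rng.random() < mutation_rate else g for g in child])
    decoded = [[t for t, g in zip(tests, c) if g] for c in pool + children]
    return max(decoded, key=lambda s: fitness(s, matrix, budget))

def hacga(matrix, budget, n_ants=5, max_iter=20, rho=0.1, seed=0):
    """Step 4: repeat ACO and GA until all faults are covered or iterations run out."""
    rng = random.Random(seed)
    pheromone = {t: 1.0 for t in matrix}
    all_faults = covered(matrix, matrix)
    best = []
    for _ in range(max_iter):
        ants = aco_phase(matrix, pheromone, budget, n_ants, rng)
        candidate = ga_phase(ants, matrix, budget, rng)
        if fitness(candidate, matrix, budget) > fitness(best, matrix, budget):
            best = candidate
        for t in pheromone:
            pheromone[t] *= (1.0 - rho)      # evaporation
        for t in best:
            pheromone[t] += 1.0              # deposition on the best selection so far
        if covered(best, matrix) == all_faults:
            break
    return best

selected = hacga(MATRIX, budget=20)
```

In these four rows every test case detects at least one fault that no other row detects, so full coverage of the faults they contain needs all four tests and 20 time units. With a smaller `budget`, the sketch returns the best selection it found within the time limit.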

METHODOLOGY
Before applying the proposed technique, a data structure is needed to hold the test data and relate each test case to the faults it covers. A fault matrix F(m,n) is an m by n matrix, where m is the number of test cases under consideration and n is the total number of faults. Each row i of this matrix is a sequence of 0s and 1s indicating the capability of test case Ti to cover a subset of the total faults. Various coding schemes can be used to populate the fault matrix, but we have used 0-1 encoding for simplicity. The proposed technique is applied to the fault matrix given in Table 1. The entry F i,j in the fault matrix at the intersection of row i and column j is determined using the following notation:
F i,j = 1 if test case Ti covers fault Fj, and F i,j = 0 otherwise.
We took fifteen faults and fifteen test cases into consideration, with the assumption that each test case detects at least one fault. The problem of test case selection can be viewed as selecting a subset of rows of the fault matrix that covers each column at least once with the minimum total execution time. First, traditional ACO and GA are applied to the given fault matrix, and the results are plotted in Figure 2. Then the proposed HACGA (Hybrid Ant Colony-Genetic Algorithm) approach is applied; it was found to perform better than the traditional techniques, and the results are shown in the next section.

Table 1. Fault matrix
Test Case  F1  F2  F3  F4  F5  F6  F7  F8  F9  F10  F11  F12  F13  F14  F15  Execution Time
T1         1   0   1   0   0   0   0   0   1   0    0    0    0    0    1    6
T2         0   1   0   0   0   0   1   0   1   0    0    1    0    0    0    5
T3         1   0   0   0   0   0   0   1   0   0    0    0    1    0    0    4
T4         0   0   0   0   0   1   0   1   0   0    1    0    0    0    0    5
T5         1

Table 2 shows the reduced fault matrix F'i,j, which contains the subset of rows for the test cases selected after applying HACGA. The graph in Figure 2 compares the proposed technique with the traditional techniques.
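The fault matrix and the set-cover view of test case selection described above can be sketched in Python as follows. Only rows T1 to T4 of Table 1 are used, since these are the rows that are fully available here; the dictionary layout and the helper names are illustrative assumptions.

```python
# Rows T1..T4 of Table 1 in 0-1 encoding (columns F1..F15).
FAULT_MATRIX = {
    "T1": [1, 0, 1, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 1],
    "T2": [0, 1, 0, 0, 0, 0, 1, 0, 1, 0, 0, 1, 0, 0, 0],
    "T3": [1, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 1, 0, 0],
    "T4": [0, 0, 0, 0, 0, 1, 0, 1, 0, 0, 1, 0, 0, 0, 0],
}
EXECUTION_TIME = {"T1": 6, "T2": 5, "T3": 4, "T4": 5}

def faults_covered(selection):
    """Return the 1-based indices of the columns covered by the selected rows."""
    n_faults = len(next(iter(FAULT_MATRIX.values())))
    return {j + 1 for j in range(n_faults)
            if any(FAULT_MATRIX[t][j] for t in selection)}

def total_time(selection):
    """Sum of the execution times of the selected test cases."""
    return sum(EXECUTION_TIME[t] for t in selection)

print(sorted(faults_covered(["T1", "T3"])))  # [1, 3, 8, 9, 13, 15]
print(total_time(["T1", "T3"]))              # 10
```

A selection solves the problem when `faults_covered` returns every column index; among such selections, the one with the smallest `total_time` is preferred.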
In Figure 2, traditional ACO and traditional GA are compared with the proposed HACGA approach in terms of the percentage of test cases that must be executed to achieve a higher percentage of fault detection. The results show that the HACGA based technique achieves 100% fault coverage by executing 33% of the test cases, while the traditional techniques underperform in this scenario. Figure 3 compares HACGA with the two traditional techniques on the basis of the total execution time needed to detect all the faults and the percentage saving in execution time. The graph shows that the proposed hybrid technique performs better than the traditional techniques and yields a greater time saving.

Test Case  F1  F2  F3  F4  F5  F6  F7  F8  F9  F10  F11  F12  F13  F14  F15  Execution Time
T1         1
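The two quantities compared in Figures 2 and 3 can be computed as in the short Python sketch below. The formulas are the usual definitions of a percentage share and a percentage saving and are assumed here, since the text does not spell them out; the function names are hypothetical.

```python
def percent_executed(selected_count, total_count):
    """Share of the test suite that is actually executed, in percent."""
    return 100.0 * selected_count / total_count

def percent_time_saved(selected_time, full_suite_time):
    """Execution time saved relative to running the full suite, in percent."""
    return 100.0 * (full_suite_time - selected_time) / full_suite_time

share = percent_executed(5, 15)
```

For instance, `percent_executed(5, 15)` is about 33.3, which matches the reported 33% if five of the fifteen test cases are selected; the count of five is inferred from the percentage and is not stated in the text.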

CONCLUSION
Test case selection is an important activity for reducing testing effort without compromising the quality of the software. In recent years, search based, nature inspired techniques have been evolving to select optimal test suites. This paper presents HACGA, a Genetic Algorithm based hybrid ACO technique that exploits the benefits of both. The main idea is to select, from a large pool, a subset of test cases that are more promising for efficient testing. These test cases are selected using the traditional ACO technique. To further improve effectiveness, we applied GA to the selected test cases to obtain a good mix of test cases and avoid getting stuck in local optima. The analysis showed that the proposed technique performs better than the traditional techniques and is capable of finding 100% of the faults with 33% of the test cases. Moreover, the proposed technique also reduces the overall test case execution time compared with the two traditional techniques. In future, it will be interesting to apply the proposed model to large scale testing problems to assess its scalability.
ISSN: 2088-8708  Hybrid swarm and GA based approach for software test case selection (Palak) 4903