Performance degradation assessment and VM placement policy in cloud

ABSTRACT


INTRODUCTION
The distributed architecture of datacenters supports an enterprise with enhanced computational capacity, storage and advanced applications. Virtualization is the technology used to enhance the capabilities of a datacenter. Usually a data center is transformed into a fully fledged cloud architecture mainly through the successful virtualization of machines in the processing, communication and storage domains [1]. The applications and services available in cloud computing are stored in data centers that are distributed across several geographic locations. Distributing the system load evenly across the data centers helps to achieve better performance, reduce response time and improve fault tolerance. To optimize the energy efficiency of a data center, the tasks running on under-utilized physical machines (PMs) are mapped onto other PMs of the data center and the under-utilized ones are then shut down [2]. This can be accomplished by migrating the virtual machines (VMs) from the overloaded servers to other servers at a reasonable migration cost. Placing a VM in a location distant from the data centers affects the application performance, so the network aspects between the data centers and the VM have to be considered when placing an application in a VM [3]. The VMs to be migrated are chosen from the overloaded host and moved to an under-utilized PM. While migrating the VMs, the cloud providers have to ensure the quality of service and comply with the Service Level Agreement (SLA) [4][5][6].
Traditional process migration was a complicated technique that involved relocating a process from one machine to another, whereas live VM migration transfers a VM from a source server to a destination server with minimal downtime [7][8][9]. The live migration process has six stages: (a) the pre-migration stage, in which resources are allocated to a selected remote machine; (b) the reservation stage, where setup is done for deploying the VM; (c) the iterative pre-copy stage, where the pages are repeatedly copied until only a minimum number of dirty pages remains; (d) the stop-and-copy stage, where the VM is stopped at the source and the last dirty pages are transferred to the destination, which results in downtime; (e) the commitment stage, where the VM no longer exists at the source and the resources held by the VM are released; (f) the activation stage, where the VM is resumed at the destination [10].
The two key performance factors in live migration techniques are the total migration time and the downtime. Total migration time is the total time required to migrate the VM from the source to the destination machine. The downtime is the amount of time during which the VM is not running [11]. The purpose of live virtual machine migration is to ensure uninterrupted service provisioning to the hosted applications during the migration process [12]. Live VM migration proceeds in two phases, the push phase and the stop-and-copy phase. In the push phase, the memory pages are transferred from the source server to the new location. Because the VM keeps running during the transfer, some of the memory pages may be modified, which is called dirtying of pages. The dirtied pages have to be identified and transferred again. The process is repeated iteratively, and the migration takes several rounds to complete. At some point, the time taken by a round and the number of pages left to transfer become very low. At this stage, the stop-and-copy phase takes place. The entire process of live VM migration affects the performance of the applications running in those VMs [7,13,14].
The work presented here estimates performance and cost based on the following four aspects: (1) the limit set for the maximum number of iterations; (2) the volume of pages transferred with respect to the memory size; (3) the comparison of the dirty rates of successive iterations together with the bandwidth usage; (4) the threshold set for dirty memory.
The primary contributions of this paper are:
a. An improved model for computing the number of iterations for VM migration that considers all the above-mentioned factors.
b. A parameter θ, incorporated into the model, that enables datacenters to decide whether to continue a highly inefficient, under-performing migration operation. Halting such migrations improves resource utilization, the efficiency of the migrations that succeed and the performance of the operations that are allowed to proceed. After the implementation of θ, the performance of migration jobs improved by up to 15%.
c. Simulations, carried out according to the methodology discussed here, that successfully estimated the performance and its variations and indicated any decline in performance when it occurred. This approach can enable data centers to take the measures needed to reorganize resources and contain the decline.
The rest of the paper is organized as follows: Section 2 describes the research method together with the related work. Section 3 presents the performance evaluation and results. Section 4 concludes the paper.

RESEARCH METHOD
A guest VM is placed into the PM that has the least completion time. If such a PM lacks the required resources, then either a direct placement or a migration-based placement technique can be adopted. In [15], the authors observe that in online VM placement, the VM for which the sum of migration overhead and completion time is minimum is chosen for migration. The number of VM requests is not known in advance. Hence, when the PMs are completely loaded, further VM requests should not be accepted. The migration cost from the overused PM to the active PM might be constrained by factors like bandwidth [16].
The VM to be migrated must communicate with the device drivers and network cards through Domain 0, which hosts the device drivers and controls the physical network cards [17]. The performance of migration depends on the VM memory size, the memory dirtying rate, the network transmission rate and the algorithm used for migration. The workload has to be distributed among all the host machines to enhance efficiency [18].

The migration process manages the state of resources, i.e., processor, memory, storage and I/O, during the transfer of a VM. This includes the memory allocated to and used by the VM, the memory requested by the application, the virtual disk sizes and the blocks used by the VM [19]. In [13,20] the number of iterations is calculated as the minimum of the threshold point and the maximum number of iterations set for migration. The threshold point is computed as a function of the ratio between the threshold and the memory allocated to the migrating VM. In [17], the authors have identified the dirty rate and the frequency of its occurrence as the important factors affecting the iteration time and downtime.

The migration process
A server (host) with a resource utilization of 80% or above is considered an over-utilized server. The workload of such an over-utilized host has to be distributed to hosts that are under-utilized. A server with a utilization of 20% or below is assumed to be an under-utilized server. The migration process evenly redistributes the workload among all the hosts, with a certain overhead. The process is pictorially depicted in Figure 1 and Figure 2. The migration cost is determined in terms of performance and energy.
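The utilization thresholds above can be sketched as a small classifier. This is a minimal illustration, not the paper's implementation: the function names are hypothetical, utilization is assumed to be a fraction in [0, 1], and treating exactly 80% as over-utilized is an assumption.

```python
from typing import Dict

OVER_UTILIZED = 0.80   # over-utilization threshold from the text
UNDER_UTILIZED = 0.20  # under-utilization threshold from the text


def classify_host(utilization: float) -> str:
    """Label a host by its resource utilization (fraction in [0, 1])."""
    if not 0.0 <= utilization <= 1.0:
        raise ValueError("utilization must be a fraction between 0 and 1")
    if utilization >= OVER_UTILIZED:      # assumption: 80% itself counts as over-utilized
        return "over-utilized"
    if utilization <= UNDER_UTILIZED:
        return "under-utilized"
    return "normal"


def classify_hosts(utilizations: Dict[str, float]) -> Dict[str, str]:
    """Classify every host in a name -> utilization mapping."""
    return {host: classify_host(u) for host, u in utilizations.items()}
```

Hosts labelled "over-utilized" would be the sources of migration and hosts labelled "under-utilized" the candidate destinations.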

Performance assessment and VM coupling
The performance of a system is defined as the amount of work done per unit time. The cost is the expense incurred due to the unutilized capability of the resources. The cost of migration is the amount of performance lost during migration. Performance has to be calculated (a) before migration, (b) during migration and (c) after migration. Comparing (a) and (b) gives the cost. Comparing (a) and (c) gives the enhancement. The difference between (b) and (c) gives the cost of implementing the migration. In this paper, we compare (a) and (b), that is, the cost before migration and the cost during migration. The virtual machines are migrated from one physical machine to another in the same domain or in a different domain. In either case, the processor is associated with the memory, I/O and storage.
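As a rough illustration of the comparison between (a) and (b), the sketch below treats the cost of migration as the performance lost while migrating and expresses it as a percentage of the pre-migration performance. The function names and the sample figures are hypothetical and are not taken from the paper's results.

```python
def migration_cost(perf_before: float, perf_during: float) -> float:
    """Performance lost during migration: (a) minus (b)."""
    return perf_before - perf_during


def degradation_percent(perf_before: float, perf_during: float) -> float:
    """Migration cost as a percentage of the pre-migration performance."""
    if perf_before <= 0:
        raise ValueError("perf_before must be positive")
    return 100.0 * migration_cost(perf_before, perf_during) / perf_before


# Hypothetical figures in Mbits/s, chosen only to show the arithmetic.
cost = migration_cost(1000.0, 400.0)          # 600.0
percent = degradation_percent(1000.0, 400.0)  # 60.0
```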
Consider the VMs VM0, VM1, …, VMo that are hosted at datacenter DC1 with m resources, where VMo represents the optimal number of VMs. When the number of VMs is greater than this optimal number, additional r resources are required to serve the VMs that exceed VMo. Processor coupling with resources can be achieved through migration, and the additional r resources can be acquired from other data centers to support all the VMs. Figure 3 pictorially represents the processor coupling with resources.
Figure 3. Processor coupling with resources helps to redistribute the VMs that need additional resources to other servers
Table 1 shows the processing units for memory, I/O and storage, assuming that the number of processing units is the same for all three domains. All processing units are assumed to be homogeneous and homotopic with the other processors, so they are mutually replaceable. Each VM has the structure of PM11 to PM1n, PIO21 to PIO2n and PS31 to PS3n, supporting equipotential capability in all parameters including capacity. The VM to be migrated can be coupled with any of these main processing domains, I/O processing domains and communication domains. Figure 4 shows the general picture of such a coupling of processing units, represented as PR.

The cost model
We assume that M VMs are to be migrated from PM1 (Physical Machine 1) to PM2 (Physical Machine 2). Pages get modified during migration, so the modified pages have to be recopied and the migration process takes place over several iterations [17,21].
Based on [13], in iteration 0 the whole memory of the VM is copied to the destination machine. Let $Amt_{0,j}$ represent the amount of memory copied during the 0th iteration of the migration of the jth VM, $VM_j$; let $PMbw_j$ represent the physical memory bandwidth allotted to $VM_j$; and let $V_{memsize}$ denote the memory size of $VM_j$. If the 0th iteration occurs between time $t_0$ and time $t_1$, then $Amt_{0,j}$ can be represented as shown in (1).
During the copy operation, some of the pages of the VM change. This is called dirtying of pages. The dirtied pages have to be verified and copied in subsequent iterations. In (2), $VMdir_j$ represents the portion of memory that is dirtied.
Once the copying process is over, the VM can be restarted at PM2. During the whole process the processor experiences a downtime, $T_{down,j}$, which can be calculated as shown in (3). In this equation, $P_{dir}$ is the page dirty rate, $P_{size}$ is the page size, $dur_{precopy}$ is the duration of the pre-copy, $VCPU_{context}$ is the context switch time of the virtual CPU and $VMresume_j$ is the time taken to resume $VM_j$ at the new physical machine.
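The total downtime for the M VMs can be calculated as shown in (4). The ratio given in (5) is an important ratio that decides the duration of the iteration. In the first iteration, the entire memory allocated to the VM, $V_{memalloc,j}$, is copied. By substituting $dur_{i,j} = t_{i+1} - t_i$, for $i = 0, \dots, N$, (1) and (2) can be represented as shown in (6) and (7) respectively:

$Amt_{i,j} = PM_{bw} \cdot dur_{i,j}$
$Amt_{i,j} = P_{size} \cdot P_{dir} \cdot dur_{i-1,j} = P_{size} \cdot P_{dir} \cdot Amt_{i-1,j} / PM_{bw}$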
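The per-iteration amounts can be worked out in closed form. The derivation below is a sketch under stated assumptions: it reads the recurrence as each iteration sending the pages dirtied during the previous one, it assumes a constant dirty rate and bandwidth, and the symbol $\lambda$ is introduced here for illustration only.

```latex
\lambda = \frac{P_{size}\, P_{dir}}{PM_{bw}}, \qquad
Amt_{0,j} = V_{memsize}, \qquad
Amt_{i,j} = \lambda\, Amt_{i-1,j} = \lambda^{i}\, V_{memsize}

dur_{i,j} = \frac{Amt_{i,j}}{PM_{bw}}, \qquad
\sum_{i=0}^{N} Amt_{i,j} = V_{memsize}\, \frac{1 - \lambda^{N+1}}{1 - \lambda}
\quad (\lambda \neq 1)
```

Under these assumptions the amount to copy shrinks geometrically when $\lambda < 1$, and does not shrink at all when pages are dirtied at least as fast as they are sent.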
The iterations stop when any one of four conditions holds:
C1: the dirty memory remaining reaches the threshold $h$, i.e., $Amt_{i,j} \le h$.
C2: the number of iterations reaches $Max_{itr}$, the maximum number of iterations set for pre-copy migration.
C3: the volume of pages transferred exceeds $Max_{memsize} \cdot VM_{memsize}$, where $Max_{memsize}$ is the maximum multiple of the memory size at which migration should be terminated and $VM_{memsize}$ is the memory size of the VM.
C4: the dirty rate of the current iteration, $cur_{dir}$, exceeds that of the previous iteration, $pre_{dir}$, and the total bandwidth, $Tot_{bw}$, exceeds the maximum bandwidth, $Max_{bw}$.
These conditions are represented through the parameter θ. The total number of iterations, N, is calculated as shown in (13): N is initialized to 0 and incremented iteratively until any one of the four conditions becomes true.
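The stopping rule can be sketched as a loop that increments N until one of the four conditions fires. This is a minimal illustration under assumptions that are not in the paper: the function and parameter names are hypothetical, each round is assumed to send the pages dirtied during the previous round, the conditions are checked in the order C1 to C4, and the last entry of `dirty_rates_pps` is reused once the list runs out.

```python
from typing import Sequence, Tuple


def count_precopy_iterations(
    vm_mem_bytes: float,
    page_size_bytes: float,
    dirty_rates_pps: Sequence[float],  # dirty rate per round, pages per second
    bandwidth_bytes_per_s: float,
    threshold_bytes: float,            # h
    max_itr: int,                      # maximum number of iterations
    max_mem_factor: float,             # maximum multiple of the memory size
    total_bw: float,
    max_bw: float,
) -> Tuple[int, str]:
    """Return (N, label of the condition that stopped the pre-copy loop)."""
    if bandwidth_bytes_per_s <= 0 or not dirty_rates_pps:
        raise ValueError("need a positive bandwidth and at least one dirty rate")
    amount = float(vm_mem_bytes)       # iteration 0 copies the whole memory
    transferred = 0.0
    prev_rate = None
    n = 0                              # N starts at 0
    while True:
        cur_rate = dirty_rates_pps[min(n, len(dirty_rates_pps) - 1)]
        duration = amount / bandwidth_bytes_per_s
        transferred += amount
        # Memory dirtied while this round was being copied.
        dirty = page_size_bytes * cur_rate * duration
        n += 1
        if dirty <= threshold_bytes:
            return n, "C1"
        if n >= max_itr:
            return n, "C2"
        if transferred > max_mem_factor * vm_mem_bytes:
            return n, "C3"
        if prev_rate is not None and cur_rate > prev_rate and total_bw > max_bw:
            return n, "C4"
        prev_rate = cur_rate
        amount = dirty                 # next round sends what was just dirtied
```

For example, a 1 GiB VM with 4096-byte pages, 2500 dirtied pages per second, a bandwidth of 125,000,000 bytes/s and a threshold of 100 MiB stops after one round on C1, because roughly 88 MB is dirtied during the first full copy.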
The migration cost for the jth VM can be calculated as shown below.
The total migration time, $TotalMig_{time}$, for the M VMs is estimated as given below.

$TotalMig_{time} = M \cdot Mig_{time}$
The cost before migration is represented using three CPU factors: ω represents the CPU factor for memory, α represents the CPU factor for I/O and ρ represents the CPU factor for storage. The cost during migration is represented separately. The performance before migration, represented as Perfjbefmig, and the performance during migration, represented as Perfjdurmig, are then calculated.

Algorithm design
The procedure for performance evaluation is described in the following algorithm.
Input: hostList, vms
Output: Performance variation
For each host in hostList
Step 1: Verify host utilization
b. New VM resource requirement + Existing resource utilization < 80%
Step 6: Repeat the iterations for migrating VM j to the identified PM until (13) is satisfied.
Step 9: Calculate performance variation as the difference between performance before migration and performance after migration.
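The placement check in the algorithm (new VM resource requirement plus existing utilization below 80%) can be sketched as a filter over candidate hosts. This is an illustration only: the function name, the dictionary layout and the use of fractions in [0, 1] are assumptions, and the steps of the algorithm that are not listed above are not reproduced.

```python
from typing import Dict, List

UTILIZATION_LIMIT = 0.80  # the 80% bound used in the placement check


def candidate_hosts(
    vm_demand: float,
    host_utilization: Dict[str, float],
    limit: float = UTILIZATION_LIMIT,
) -> List[str]:
    """Hosts that stay strictly below the limit after accepting the VM."""
    return [
        host
        for host, used in host_utilization.items()
        if vm_demand + used < limit
    ]


# Hypothetical hosts: only pm2 can take a VM needing 25% of a host.
hosts = {"pm1": 0.70, "pm2": 0.30, "pm3": 0.60}
feasible = candidate_hosts(0.25, hosts)
```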

PERFORMANCE EVALUATION
Several cloud computing environments support live migration of VMs. VMware vSphere migrates the virtual machine's state while the network-related details are retained. Live VM migration makes automatic optimization of virtual machines and hardware servicing possible without any interruption of normal operations [22]. In the Xen hypervisor, the process of memory dirtying and updating of memory continues until the estimated time for transferring the remaining pages equals the time the guest is paused for migration [23].
For our experiment, the simulation setup was built in CloudSim 3.0.3 [24]. The features of the VMs resemble those of Amazon EC2 instance types, but with single-core VMs. The model uses the Minimum Migration Time (MMT) policy, which selects the VM with the minimum migration time requirement [24,25].
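The MMT selection can be sketched as follows. This is not CloudSim code: the `Vm` class and `select_vm_mmt` function are hypothetical, the RAM sizes are illustrative, and estimating migration time as VM memory divided by available bandwidth is a common formulation of MMT that is assumed here rather than stated in the text.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass(frozen=True)
class Vm:
    name: str
    ram_mb: float


def select_vm_mmt(vms: Iterable[Vm], bandwidth_mb_per_s: float) -> Optional[Vm]:
    """Pick the VM with the smallest estimated migration time (RAM / bandwidth)."""
    if bandwidth_mb_per_s <= 0:
        raise ValueError("bandwidth must be positive")
    # min() with default=None handles an empty host gracefully.
    return min(vms, key=lambda vm: vm.ram_mb / bandwidth_mb_per_s, default=None)


# Hypothetical VMs on one overloaded host.
chosen = select_vm_mmt(
    [Vm("vm-a", 1740.0), Vm("vm-b", 613.0), Vm("vm-c", 870.0)],
    bandwidth_mb_per_s=125.0,
)
```

With a single shared bandwidth the choice reduces to the VM with the least memory, here `vm-b`.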
The values entered for the page size, bandwidth and threshold decide the number of iterations, the migration time and the downtime. The simulation is carried out by assigning values to the following parameters. The results are shown in the graph below.
a. The number of VMs is assumed to be 5, 10, 15 and 20.
b. PMbw, the bandwidth of the physical machine, is 1 Gbps.
c. The first iteration results in the dirtying of some pages, which have to be transferred in further iterations. The page dirty rate determines the amount of memory that has to be retransferred, which in turn adds to the downtime. On average the page dirty rate is Pdir = 2500 pps.
d. The page size Psize is 4 KB.
e. The threshold is the point beyond which no further iterations occur and the VM migration fails. We assume the value of the threshold h to be 100 MB.
Figure 5 shows the migration time and downtime as a function of the memory size ratio x. The pre-copy algorithm works effectively for x < 1. There is a substantial decline in performance when the dirtying rate increases. The CPU factor for memory was taken in the range of 100 to 1000 Mbits/s, for I/O in the range of 5 to 4200 Mbits/s and for storage in the range of 500 to 1000 Mbits/s. For a particular run of the algorithm, the cost and the performance obtained in Mbits/s are given in Table 2.
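With the parameter values listed above, the rate at which memory is dirtied can be compared with the rate at which it is sent. The sketch below is illustrative arithmetic only: the function name is hypothetical, 4 KB is taken as 4096 bytes and 1 Gbps as 10^9 bits per second, and the resulting ratio is not claimed to be the x of Figure 5.

```python
PAGE_SIZE_BYTES = 4 * 1024            # Psize = 4 KB, assumed to mean 4096 bytes
DIRTY_RATE_PPS = 2500                 # Pdir, pages per second
BANDWIDTH_BITS_PER_S = 1_000_000_000  # PMbw = 1 Gbps, assumed to mean 10**9 bit/s


def dirty_to_bandwidth_ratio(
    page_size_bytes: int, dirty_rate_pps: float, bandwidth_bits_per_s: float
) -> float:
    """Fraction of the link needed just to keep up with page dirtying."""
    if bandwidth_bits_per_s <= 0:
        raise ValueError("bandwidth must be positive")
    dirty_bits_per_s = page_size_bytes * 8 * dirty_rate_pps
    return dirty_bits_per_s / bandwidth_bits_per_s


ratio = dirty_to_bandwidth_ratio(
    PAGE_SIZE_BYTES, DIRTY_RATE_PPS, BANDWIDTH_BITS_PER_S
)
print(f"{ratio:.5f}")  # 0.08192
```

Under these assumptions dirtying consumes about 8% of the link, so each pre-copy round has far less data to send than the previous one.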
The cost before migration using the different CPU factors was calculated using (12). The simulation was performed by varying the number of VMs. Figure 6 shows the percentage of performance degradation during VM migration. The performance degraded by 40% to 75% for different variations of ω, ρ and α. The improvement in performance obtained by including the parameter θ is shown in Figure 7.
Figure 6. Performance degradation in %. The performance varies with the number of VMs and the variation in CPU factors associated with the resources
Figure 7. Improved performance with θ. On average the performance has improved by up to 15%

CONCLUSION
In a cloud environment, localized overloading of resources remains a challenge. Overloaded regions of resources such as memory, storage and CPU facilities can be relieved through VM migration. Here we presented a detailed study of various aspects of VM migration through improved models and calculated the cost of migration, which can be used for performance studies in cloud clusters. The analysis has shown the nature of the performance degradation, which is a vital factor in improving the overall performance of current cloud architectures. The model can be extended to accommodate other factors in order to find further optimal values that enable cost reduction.