A latency-aware max-min algorithm for resource allocation in the cloud

ABSTRACT

Cloud service providers face constantly fluctuating resource demands from users with differing priorities, yet most existing scheduling algorithms assume equal user priorities and ignore network latency. This paper proposes LAM, a latency-aware max-min algorithm for resource allocation in the cloud. LAM prioritizes users with the analytic hierarchy process (AHP) and assigns the resource with the maximum associated value to the task with the minimum total execution time, where total time includes network latency. The algorithm is evaluated on the Google cluster trace using the cloudSim toolkit and compared with Greedy-R, Greedy-P, FCFS, a dynamic resource allocation scheme and a preference-based resource allocation scheme. The results show that LAM achieves a higher task success ratio and near-optimal resource consumption under varying numbers of tasks and task arrival intervals, thereby improving infrastructure performance at the service provider's side.
INTRODUCTION
Cloud computing is a next-generation technology that realizes the vision of utility computing, in which computing resources are provided as a service [1,2] via an internet connection. According to NIST [3], it can be thought of as a model for enabling ubiquitous, convenient and on-demand access to a shared pool of resources. It has led to the growth of virtual and physical devices, which in turn has led to an escalation of data generated on a continuous basis. Such big data is generally characterized by high volume, variety, and velocity [4,5]. Virtualization techniques act as the backbone of cloud technology [6,7]. They involve constant allocation and reallocation of resources, which is done at the service provider level [8] and gives clients a view of infinite computing and storage capacity. Service providers [9] are constantly flooded with demands for high-performance computing and newer resources. This constant and varying demand for resources by clients drives the development of novel scheduling techniques for distributed systems such as the cloud. Until recently, meeting consumer demands has been the sole concern of service providers, with little attention paid to the optimum allocation of resources for the maximization of profits. As the number of cloud providers and consumers increases, so does the amount of resource wastage. Moreover, high service-level performance must be maintained alongside optimization of resource usage.
The existing scheduling algorithms for real-time tasks are based on aspects such as infrastructure, provisioning of resources for power management [10]-[17] and cost-efficient selection [18]. Unfortunately, none of these algorithms takes into account the fluctuating demands of clients or user priorities. In a cloud environment, scheduling algorithms have to address heterogeneity [19] at all levels, which is in itself a big challenge. Besides this, network latency is also an important factor: its presence degrades the crispness of system response and adversely affects factors such as energy consumption [20]. This work proposes LAM, a latency-aware algorithm for resource allocation in the cloud that also takes into account fluctuating demands and user priorities. User priorities are decided using the analytic hierarchy process (AHP). Furthermore, unlike previous works in the literature, LAM achieves better resource consumption and thereby more optimal allocation of resources.
SMI cloud [21] provides a framework for ranking and comparing different cloud services for cloud users. Other approaches, such as multi-criteria decision making (MCDM) [22], have also been explored for ranking and selecting cloud services. These initiatives facilitate decision making for cloud users, but few concrete efforts have been made to support decision making at the service provider level. Decision making on the service provider's side usually involves allocation of resources, distribution of workload, catering to the dynamic resource requirements of consumers and consumer prioritization. The work presented in this paper helps service providers make important resource allocation decisions through LAM.
The problem of resource allocation in a cloud environment involves two steps: identification of user requirements and mapping of user requests to the actual resources on the service provider's side. This paper focuses on the latter, that is, optimum utilization of resources. Cloud service providers must therefore adopt measures that ensure high availability of resources along with optimum utilization, where optimum utilization in this context means reduced total cost of ownership (TCO) and increased return on investment (ROI). This makes optimum allocation of resources a challenging and complex task. Moreover, the computational requirements of users are growing so fast that an increasingly large number of servers and resources is required to handle them. In particular, cloud service providers are required to allocate resources that satisfy quality of service (QoS) requirements specified via service level agreements (SLAs) while ensuring optimum allocation of resources, thereby leading to efficient utilization of computing resources.
The main objectives of this study are to present our vision, discuss resource allocation and develop an algorithm for resource allocation in the cloud environment [23,24], so that cloud computing can be adopted as a more sustainable technology leading to technological advancements and better utilization of resources for the coming generations. The major contributions of this work are:
a. Modeling of the proposed system based on parameters such as the machines available, user priority, memory, bandwidth and CPU usage (fraction of CPU used), in addition to a few others.
b. Formulation of the resource allocation problem.
c. Development of an algorithm for efficient allocation of resources in the cloud, which not only helps in optimum allocation of resources but can also be implemented readily without any special requirements.
d. Evaluation and validation of the proposed algorithm by comparing it, through extensive simulations, with five existing algorithms: Greedy-R, Greedy-P, FCFS, a dynamic resource allocation scheme [25] and a preference-based resource allocation scheme [18]. The results show that the proposed algorithm achieves better resource consumption with an increasing number of tasks and arrival interval, and hence helps improve infrastructure performance.
The remainder of this paper is organized as follows. Section 2 reviews work in the literature related to the proposed algorithm. Section 3 describes the models used in this paper and states the resource allocation problem. Section 4 describes our proposed algorithm, LAM. Section 5 evaluates the performance of LAM based on the number of tasks and the task arrival interval. Finally, section 6 concludes the paper and outlines future work.

RESEARCH METHOD
2.1. Literature survey
Resource allocation has always received a great deal of attention from the research community, and algorithms and methods related to it have been rigorously evaluated. Resource allocation in cloud computing is a non-cooperative problem, as consumers who wish to access the same resources are competitors and thus reluctant to cooperate with each other [26]. A vast number of the proposed solutions generally target execution time as the solution objective, but the optimum allocation of resources is rarely a concern, as the available resources are considered infinite. Thus, none of these approaches considers allocation from the service provider's perspective. Condor [27] and Load Sharing Facility [28] are traditional resource management approaches based on system-centric resource allocation. These strategies implicitly assume that all job users have equal priorities and do not take into consideration the levels of usage of services by their users. Thus, they fail to fulfill the key requirements of cloud and utility computing, which are characterized by fluctuating resource needs. Fluctuating resource demands refer to changes in the resource usage demands of cloud users: demands may be high during peak computing requirements and low at other times. The algorithm proposed in this paper has been designed to handle such computing needs and takes into account the amount of resource usage and the priority of each user.
Scheduling of tasks and the proper assignment and mapping of resources are an integral part of cloud computing. Monitoring the activities and performance of the cloud is equally important to ensure proper resource provisioning. This monitoring involves two perspectives: the cloud service provider's perspective and the cloud user's perspective. Most studies in the literature are based on the client's perspective, but in this study, we focus on the cloud service provider's perspective. The advantage of this perspective is that it helps in optimum allocation of resources based on dynamically changing user demands and priorities.
Cloud service providers monitor activities such as the allocation of resources and the fulfillment of end users' demands. End users, on the other hand, monitor the quality of service being provided to them, apart from data security and safeguarding data against potential threats. According to Vineetha [29,30], performance monitoring can be classified into two categories: infrastructure performance and application performance. Infrastructure performance involves measuring the performance of cloud resources that are provided as a service to cloud users, which may include the network, storage and servers. This study considers infrastructure-level performance by monitoring the resource usage pattern of infrastructure-level resources. Application performance management involves monitoring of databases and applications that support application program performance.
Resource management and allocation are possible in the cloud with the help of virtualization technology. Virtualization offers its users complete transparency, but this transparency has resulted in further complications in terms of the distribution of resources and flexibility [31,32]. The characteristics of task scheduling and resource allocation in the cloud, as listed by Sun et al. [33], are: a) catering to the distribution of resources on a unified platform, which may involve different types of PCs, workstations and servers; b) globally centralized task scheduling in the cloud; c) independent scheduling of every node in the cloud; d) task scheduling that copes with the scalability of cloud computing; e) support for dynamic scheduling depending on the increase or decrease in demand for resources; and f) scheduling strategies that proceed in sets, involving the scheduling of cloud applications and the scheduling of port resources. The framework proposed in this paper makes use of LAM, a scheduling algorithm for the allocation of resources in the cloud. The proposed algorithm is flexible in nature, catering to dynamically changing user and resource priorities, and also supports scalability and independent scheduling of nodes.
Moreno et al. [16] have performed analysis, modeling and simulation of workload patterns in very large-scale utility clouds. They carried out their study using cloud data centers with approximately 900 users submitting 25 million tasks in one month, modeling the scenario by extending the capabilities of the cloudSim framework [34]. Their work provides a platform for researchers to simulate resource consumption patterns in production environments. They also drew several conclusions about the dependency of workload on user behavior along with tasks; according to them, a higher degree of diversity exists in user patterns than in task patterns.
Tsai et al. [9] have proposed a hyper-heuristic scheduling algorithm for providing scheduling solutions in a cloud-based environment; this algorithm has been implemented on cloudSim and Hadoop. Rodriguez and Buyya [28] have put forward a particle swarm optimization-based algorithm that meets deadline constraints and minimizes the execution cost of workflows, evaluated using cloudSim. We have also carried out our experiments rigorously using cloudSim [34] on the Google cluster data set [35] for performance evaluation. Table 1 shows a comparative study of existing resource allocation approaches in the literature [12,20,22,24,[36][37][38] and the LAM algorithm. LAM assists service providers with decision making based on factors such as user priorities, network latency, and the duration and time of resource use. In contrast, the approaches below assume equal priorities of users and do not take into account the network latency between different entities in the cloud environment.

Table 1. Comparison of existing resource allocation approaches with LAM
Approach | Features | Limitations
Traditional system-centric approaches (e.g., Condor [27], Load Sharing Facility [28]) | System-centric resource allocation | Assume equal priorities of users and static user requirements; fail to serve the requirements of utility and cloud computing
Polymorphic ant colony optimization [22] | Improved resource utilization | Assumes equal priorities of users; does not take user priorities into account
Many-objective virtual machine placement (MaVMP) [40] | Virtual machine placement | Assumes equal priorities of users; does not take into account user priorities or energy-efficient approaches
Semi-Markov decision process [12] | Adaptive, multi-resource allocation for resource- and latency-sensitive mobile applications | Assumes equal priorities of users

2.2. Proposed system model
Figure 1 shows a broad outline of the proposed system model. The actors involved in the system are cloud service providers and cloud users. The usage of cloud resources by a consumer is not fixed and varies depending on the consumer's requirements. The system monitors this variation in usage pattern, and based on this pattern and various other parameters, such as user priorities and the availability of resources, the service providers make decisions regarding the assignment of resources. The key elements of the system include the following:

1) Application
The application component consists of service requests that are submitted by end users for processing. An application can furthermore be defined as an entity to which resources are allocated, be it an operating system process, a complete application or a data warehouse job.

2) Service broker
This component is responsible for coordinating with cloud users, interacting with them and understanding their requirements and needs. Customer SLAs with the service providers are maintained at this component. It also keeps track of whether a customer has certain special privileges or requirements that can be helpful in the future for prioritizing these customers. The priority of the different users submitting requests is calculated using the analytic hierarchy process (AHP) [41]. AHP is one of the most popular approaches to decision making and helps in making decisions by arranging the relevant factors in a hierarchy. AHP is preferred over other decision-making approaches, such as multiple attribute utility theory (MAUT) [21] and outranking [42], because it is based on pairwise comparison of utility and weighing functions, allowing flexibility and the ability to check inconsistencies, and it reduces bias in decision making by providing a powerful consistency evaluation mechanism. Figure 2 shows the AHP hierarchy for cloud users. Here, the different users can be compared pairwise based on factors such as the type of tasks being submitted, SLAs and computational requirements. It is assumed that the temporal demands of the tasks, such as task deadlines, are specified in the SLA requirements and are considered while assigning user priorities. The comparison matrix can be built from judgments made on the Saaty rating scale shown in Table 2 [43]; it can be used to determine the relative importance of each user and thereby assign each a priority value.

Table 2. The Saaty rating scale
Intensity | Definition | Explanation
1 | Equal importance | Two activities contribute equally to the objective
3 | Weak importance of one over another | Experience and judgment slightly favor one activity over another
5 | Essential or strong importance | Experience and judgment strongly favor one activity over another
7 | Demonstrated importance | An activity is strongly favored and its dominance demonstrated in practice
9 | Absolute importance | The evidence favoring one activity over another is of the highest possible order of affirmation
2, 4, 6, 8 | Intermediate values | When compromise is needed between two adjacent judgments
Reciprocals of above nonzero | If activity i has one of the above nonzero numbers assigned to it when compared with activity j, then j has the reciprocal value when compared with i

Consider 'n' cloud users (denoted by C) that are to be compared, (C1, C2, …, Cn), and let aij denote the relative priority of user Ci with respect to Cj. This forms a square reciprocal matrix A = (aij) of order n with the constraints aij = 1/aji for i ≠ j, and aii = 1 for all i. The weights are consistent if they are transitive, that is, aik = aij·ajk for all i, j and k. This step is followed by the calculation of a vector ω of order n such that Aω = λω, where ω is an eigenvector and λ is an eigenvalue. For a consistent matrix, λ = n. The difference between λmax and n indicates the inconsistency of the judgments made on the service provider's side; if λmax = n, the judgments are considered consistent. A consistency index (CI) is then calculated as (λmax − n)/(n − 1), which is assessed against judgments made completely at random. Saaty calculated the consistency indices of large samples of random matrices of increasing order; dividing the CI for a set of judgments by the index for the corresponding random matrix yields the consistency ratio (CR). According to Saaty [44], if CR exceeds 0.1 the judgments are too inconsistent to be reliable, and judgments are perfectly consistent if CR equals 0.
In order to detect inconsistent elements and improve CR when CR is more than 0.1, an induced bias matrix technique [45] can be used; it can identify inconsistent elements not only when CR is greater than 0.1 but also when CR is less than 0.1. Example: let w, x, y and z be four users submitting requests to service providers, with eigenvector (0.058, 0.262, 0.454, 0.226). Thus, user y can be considered to have the highest priority, followed by x and z, while w has the lowest priority. Aω is obtained as (0.240, 1.16, 1.916, 0.928) and λmax is 4.18. The consistency index is (4.18 − 4)/3 = 0.060, and the consistency ratio is 0.060/0.90 = 0.067. Since CR < 0.1, the judgments are consistent.
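As an illustration, the following Python sketch reproduces this consistency check for an arbitrary pairwise comparison matrix. It is a minimal sketch, not part of the original system: the 4×4 matrix below is a hypothetical set of pairwise judgments, and the random index table holds Saaty's published values.

```python
import numpy as np

# Saaty's random consistency indices for matrices of order 1..10.
RANDOM_INDEX = {1: 0.0, 2: 0.0, 3: 0.58, 4: 0.90, 5: 1.12,
                6: 1.24, 7: 1.32, 8: 1.41, 9: 1.45, 10: 1.49}

def ahp_priorities(A):
    """Return the priority vector and consistency ratio of a
    reciprocal pairwise comparison matrix A (order >= 3)."""
    n = A.shape[0]
    eigvals, eigvecs = np.linalg.eig(A)
    k = np.argmax(eigvals.real)              # principal eigenvalue lambda_max
    lam_max = eigvals[k].real
    w = np.abs(eigvecs[:, k].real)
    w = w / w.sum()                          # normalized priority vector omega
    ci = (lam_max - n) / (n - 1)             # consistency index
    cr = ci / RANDOM_INDEX[n]                # consistency ratio
    return w, cr

# Hypothetical pairwise comparisons of four users w, x, y, z.
A = np.array([[1,   1/5, 1/7, 1/3],
              [5,   1,   1/3, 1  ],
              [7,   3,   1,   3  ],
              [3,   1,   1/3, 1  ]], dtype=float)

w, cr = ahp_priorities(A)
print("priorities:", np.round(w, 3))
print("CR:", round(cr, 3), "consistent" if cr < 0.1 else "inconsistent")
```

A CR below 0.1 indicates, per Saaty's threshold, that the service provider's judgments are reliable enough to assign the resulting priorities to the users.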

3) Monitor
This component is responsible for monitoring the usage pattern of resources by cloud users. This usage pattern includes the frequency of machine usage, the number of VMs required, bandwidth usage and scalability requirements. The usage pattern in our system has been predicted using the Google cluster trace [35]. This data set consists of traces of production workloads recorded while running on a Google cluster for about 29 days. It comprises machines connected by a very high-bandwidth cluster network; the total number of machines is about 12,000. The workload is divided into a set of jobs, and each job consists of one or more tasks, each of which is a Linux program with multiple processes. The data set comprises six tables, namely the machine events table, machine attributes table, job events table, task events table, task constraints table and task resource usage table. The tasks in the data set differ based on their resource requirements.
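To give a concrete flavor of how such a trace can be examined (e.g., to produce the task-arrival plot discussed next), the sketch below counts arriving tasks per time window from a task events table. It is illustrative only: the file name and the column layout (timestamp in microseconds, an event type where 0 is assumed to denote a task submission) are assumptions about the trace export rather than guarantees.

```python
import pandas as pd

# Assumed export of the task events table; column names are illustrative.
COLUMNS = ["timestamp_us", "job_id", "task_index", "event_type",
           "cpu_request", "memory_request"]

events = pd.read_csv("task_events.csv", names=COLUMNS)

# Keep only task submissions (event_type == 0 is assumed to mean SUBMIT).
submits = events[events["event_type"] == 0].copy()

# Bucket arrivals into 5-minute windows to see how demand fluctuates.
window_us = 5 * 60 * 1_000_000
submits["window"] = submits["timestamp_us"] // window_us
arrivals = submits.groupby("window").size()

print(arrivals.describe())          # summary of tasks per window
print("peak window:", arrivals.idxmax(), "tasks:", arrivals.max())
```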
To observe how the number of tasks varies with time, we plotted the number of tasks against time, as shown in Figure 3. The plot shows that the number of arriving tasks fluctuates with time: when a large number of tasks arrive, the resource requirement is at its peak, and it decreases as the number of tasks reduces. Based on this observation, it can be concluded that proper allocation of resources is important for a cloud system. Figure 4 further shows the resource usage pattern of jobs in the Google cluster trace with respect to time. From this, it can be deduced that of all the jobs submitted for execution, only half were actually completed. After evaluating all the job tables, we were able to identify the timestamps at which the demand for resources was highest and lowest. It was also observed that resource allocation in the Google trace followed a Zipf-like distribution [46]. This pattern was later used for the provisioning of resources. Such information can be vital for service providers and aids them in making important decisions.
a. Machine model
Each machine Mi at the service provider's side is characterized by its memory (m), bandwidth (b) and CPU (c). Thus, each Mi is composed of (mi, bi, ci). Every service provider contains several machines in the form of virtual machines with varying resource types. These virtual machines can be started as well as stopped depending on the dynamically changing workload.
b. Task model
Tasks arriving at the service provider's level can be defined as a set T = {t1, t2, …}. Each task submitted by a user can be represented by two parameters, nj and ej, i.e., tj = {nj, ej}, where nj and ej are the network latency and execution time respectively. The execution time of each task is calculated from the length of the task, as in (1); this approach to calculating execution time in a virtualized cloud environment has been used in the literature [47]. It is assumed that the tasks arriving at the service provider's side are of varying lengths with different resource usage requirements and durations. It should also be noted that it is not feasible to know the length of each task accurately in advance; however, we can estimate a job's behavior by monitoring its resource access pattern. Let lj denote the duration of each task (task length per size) on a virtual machine Mi, and let cj and mj denote the amounts of CPU and memory used by a particular task. From the network latency and execution time, the total time to execute a task tj is calculated using (2).

etj = nj + ej (2)

c. Problem formulation
Table 3 shows the parameters and notations used in this paper. Each application submitted to the service provider consists of a set of T tasks. R is a set of resources such that Ravailcpu, Ravailmem and RavailBW are the amounts of CPU, memory and bandwidth available at the service provider's end, and Rcpu, Rmem and RBW are the CPU, memory and bandwidth demanded by the consumers, respectively. Let TS be the task success ratio, TE the number of tasks that have been executed and TT the total number of tasks that have been submitted. Then the task success ratio is given by (3).

TS = TE / TT (3)
Let RR be the resource utilization rate. The resource utilization rate at each host node can be calculated as in (4), where ck is the CPU capability of host k and ljk is the length of task j at host k. Therefore, the problem can be stated as follows: find a mapping from T onto a subset of R that optimizes the amount of resources used. Thus, the mathematical model of the stated problem has two objective functions: the first objective is to maximize the task success ratio over all the tasks submitted for execution, and the second is to minimize resource consumption.
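The following Python sketch, a simplified reading of the above model rather than the authors' implementation, encodes the machine and task models together with the total task time of (2) and the task success ratio of (3). The field names are our own, and the execution-time formula is a labeled assumption.

```python
from dataclasses import dataclass

@dataclass
class Machine:
    memory: float     # m_i
    bandwidth: float  # b_i
    cpu: float        # c_i, CPU capability (e.g., MIPS)

@dataclass
class Task:
    length: float     # l_j, task length
    latency: float    # n_j, network latency

    def exec_time(self, m: Machine) -> float:
        # Assumption: execution time is task length divided by CPU
        # capability -- one common reading of "calculated from the
        # length of the task" in the paper's task model.
        return self.length / m.cpu

    def total_time(self, m: Machine) -> float:
        # Equation (2): et_j = n_j + e_j
        return self.latency + self.exec_time(m)

def task_success_ratio(executed: int, submitted: int) -> float:
    # Equation (3): TS = TE / TT
    return executed / submitted

# Toy usage
m = Machine(memory=4.0, bandwidth=100.0, cpu=3000.0)
t = Task(length=1000.0, latency=0.05)
print(t.total_time(m), task_success_ratio(95, 100))
```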

LATENCY-AWARE MAX-MIN ALGORITHM (LAM)
LAM is based on the DMMM algorithm [48]. In the DMMM algorithm, shown in Algorithm 1, let T = {t1, t2, …, tj} be 'j' cloud user tasks that are to be assigned resources R = {r1, r2, …, rm}, where 'm' is the number of resources, and let Xik be a value calculated from the outputs {v1, v2, …, vm} of a decision matrix, where Xik = maximum(v1, v2, …, vm). DMMM then selects the resource with value Xik and assigns this resource to the task that takes the minimum time to execute.
Algorithm 2 shows the pseudo-code of the LAM algorithm. LAM first finds the maximum value associated with each resource by calling the ALGO_FindResourceVal algorithm; it then finds the task that requires the minimum total time and assigns the resource having the maximum value to the task having the minimum execution time. The algorithm iterates until all tasks have been assigned resources. Algorithm 3 gives the pseudo-code of the algorithm for calculating the maximum value associated with each resource. In ALGO_FindResourceVal, the set P = {P1, P2, …, Pn} is a set of priority values associated with each user type, where n is the number of users, and C = {C1, C2, …, Cdc} are the decision criteria, where dc is the number of decision criteria or constraints adopted by the service providers.

Algorithm 2. LAM
Begin
  For all ti ∈ T
    For all rj ∈ R
      compute the total time eti of task ti using (2)
    End For
  End For
  Do while T ≠ Null
    Xik ← ALGO_FindResourceVal /*finding the maximum value associated with each resource*/
    ti ← min(et1, et2, …, etj) /*finding out task having minimum execution time*/
    ti → rj(Xik) /*assigning resource with maximum value to task with minimum execution time*/
    T ← T − {ti}
  End Do While
End

Algorithm 3. ALGO_FindResourceVal
Begin
  For each resource rj ∈ R
    compute the value of rj from the user priorities P and the decision criteria C
  End For
  Return Max_Value /*returning the maximum value associated with a resource*/
End
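As a concrete illustration of this loop, here is a minimal Python sketch of LAM under our own simplifications: the per-resource value is reduced to a single caller-supplied score (standing in for ALGO_FindResourceVal), and total task time follows (2). It is a sketch, not the authors' implementation.

```python
def lam_schedule(tasks, resources, resource_value, total_time):
    """Assign each task to a resource, LAM-style.

    tasks          -- list of task objects
    resources      -- list of resource objects
    resource_value -- callable: resource -> score (stand-in for
                      ALGO_FindResourceVal, e.g. derived from user
                      priorities P and decision criteria C)
    total_time     -- callable: (task, resource) -> et = n + e, as in (2)
    """
    assignment = []
    pending = list(tasks)
    while pending:
        # Resource with the maximum associated value.
        best_r = max(resources, key=resource_value)
        # Task with the minimum total time on that resource.
        best_t = min(pending, key=lambda t: total_time(t, best_r))
        assignment.append((best_t, best_r))
        pending.remove(best_t)
    return assignment
```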

Theorem 1
Let (xa, ya) be the coordinate position of a cloud user and (xi, yi) the coordinates of a data center, where both may be dispersed across different geographical locations. Let {d1, d2, …, dn} be the distances of data center 1, data center 2 and so on from the cloud user, where 'n' is the total number of data centers owned by the cloud service providers, dispersed at varied geographical locations. The distance di, where 0 < i < n+1, is given by (10).

di = √((xa − xi)² + (ya − yi)²) (10)

The overall distance D is then calculated by (11). The latency in the cloud is not a function of distance alone; it also includes other factors, such as the time delay between different network entities. These entities can be hosts, data centres, SaaS providers or end users. Latency is an important parameter in the cloud environment because it has a direct impact on customers' overall satisfaction: an unsatisfied end user is more likely to switch to another cloud provider. The network latency is given by (12), where the network bandwidth (β) is defined as the speed of the network in bits per time unit and tdij is the time delay between two network entities.
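The sketch below illustrates the distance computation of (10) in Python. Since the bodies of (11) and (12) are not reproduced here, the latency estimate is our own hedged reading: transmission over bandwidth β plus the pairwise time delay tdij plus a distance-dependent term; the paper's exact combination may differ.

```python
import math

def distance(user, dc):
    """Equation (10): Euclidean distance between a cloud user at
    (xa, ya) and a data center at (xi, yi)."""
    (xa, ya), (xi, yi) = user, dc
    return math.hypot(xa - xi, ya - yi)

def network_latency(data_bits, beta, td, prop_delay_per_unit, d):
    """Hedged reading of (12): transmission time over bandwidth beta,
    plus the time delay td between the two network entities, plus a
    distance-dependent propagation term. Assumption, not the paper's
    exact formula."""
    return data_bits / beta + td + prop_delay_per_unit * d

user = (2.0, 3.0)
centers = [(0.0, 0.0), (5.0, 1.0), (2.5, 9.0)]
dists = [distance(user, dc) for dc in centers]
nearest = min(range(len(centers)), key=lambda i: dists[i])
print("distances:", [round(d, 2) for d in dists], "nearest:", nearest)
```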

RESULTS AND DISCUSSIONS
To demonstrate the performance efficiency of LAM, we compared it with three benchmark scheduling algorithms, Greedy-R, Greedy-P and FCFS [49], as well as with other resource allocation techniques proposed in the literature, namely a dynamic resource allocation scheme [25] and a preference-based resource allocation scheme for cloud computing systems [18].
a. Greedy response (Greedy-R) scheduling: to maximize the responsiveness of the system, the task with the quickest execution time is assigned first to the most powerful cloud resource.
b. Greedy parallelization (Greedy-P) scheduling: to maximize responsiveness while parallelizing tasks, the task with the quickest execution time is assigned first to the least powerful cloud resource.
c. First come first serve (FCFS) scheduling: tasks are assigned to any available cloud resource as soon as they arrive.
d. Dynamic resource allocation scheme in cloud computing [25]: resources are allocated to users based on the characteristics of jobs, with low-priority jobs prevented from delaying high-priority jobs and resources allocated dynamically for a user job within its deadline.
e. Preference-based resource allocation in cloud computing systems [18]: a demand-based preferential resource allocation scheme that allocates resources based on users' payment capacity.
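For concreteness, here is a minimal Python sketch of the three benchmark policies as we read them; the task and resource scoring callables, and the round-robin reuse of resources, are our own simplifications.

```python
def schedule(tasks, resources, exec_time, power, policy):
    """Toy versions of the three benchmark schedulers; a task occupies
    one resource slot and resources are reused round-robin."""
    if policy == "FCFS":
        order = list(tasks)                      # arrival order
        targets = list(resources)                # any available resource
    elif policy == "Greedy-R":
        order = sorted(tasks, key=exec_time)     # quickest task first
        targets = sorted(resources, key=power, reverse=True)  # strongest first
    elif policy == "Greedy-P":
        order = sorted(tasks, key=exec_time)     # quickest task first
        targets = sorted(resources, key=power)   # weakest first
    else:
        raise ValueError(policy)
    return [(t, targets[i % len(targets)]) for i, t in enumerate(order)]
```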

Experimental setup and metrics
Experimental setup for performance evaluation is described in this section.
Performance analysis tool: Since the targeted system for this study is a generic cloud computing environment, the results need to be evaluated on a large-scale cloud environment. However, repeatedly conducting experiments at such a scale in a real environment in order to compare our algorithm with others would be a very daunting task. Thus, to ensure repeatability and adaptive tuning of the proposed framework, the experiments were conducted using the cloudSim toolkit [34]. We chose cloudSim because, in contrast to similar simulation toolkits such as SimGrid and GangSim, it supports on-demand modelling of virtualized resources [50] and the simulation of dynamic loads.
Workload setup: To draw conclusions from our simulations, the experiments were conducted using a real workload trace, the Google workload trace; this also validates the practical usage of the LAM algorithm. The different VM types are modelled on Amazon EC2 [14] instance types, and the parameters used are presented in Table 4. The resources used in the experiment include virtual machines with different processing, memory and network bandwidth capacities. The tasks include lengths of 400, 1000 and 2000 MIPS. Since the trace logs record around 25 million tasks, it is quite difficult to conduct experiments on such a large number of tasks; therefore, in our experimental setup, 20×10^4 tasks were used as a representative sample.
Environment: The experimental environment consists of about 80 nodes. Each node is modelled to have one CPU core with performance of 3000 MIPS, 4 GB RAM, 100 GB/s network bandwidth and 100 GB storage. Each host can have multiple VMs. There are five different types of virtual machines used.
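A configuration along these lines could be declared as follows. This Python sketch simply records the stated setup (it is not cloudSim code, which is Java-based), and the concrete VM type values are placeholders, since Table 4 is not reproduced here.

```python
# Host and workload parameters as stated in the experimental setup.
HOST = {
    "count": 80,
    "cpu_cores": 1,
    "mips": 3000,        # per-core performance
    "ram_gb": 4,
    "bw": "100 GB/s",    # as stated in the paper
    "storage_gb": 100,
}

TASK_LENGTHS = [400, 1000, 2000]   # task lengths used (MIPS, per the paper)
NUM_TASKS = 20 * 10**4             # representative sample of the trace

# Five VM types modelled on Amazon EC2 instance types (Table 4 in the
# paper); the concrete values below are placeholders, not the paper's.
VM_TYPES = [
    {"name": f"vm-type-{i}", "mips": mips, "ram_gb": ram}
    for i, (mips, ram) in enumerate(
        [(500, 0.5), (1000, 1), (1500, 2), (2000, 4), (2500, 8)], start=1)
]
```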
Metrics used: The performance of LAM has been evaluated using metrics typical of a large-scale cloud system. Since optimal utilization of resources at the data centre is the main objective of the proposed problem, the following metrics, which stem from those used by Zhu et al. [47], were used. a. Task success ratio: the ratio of the number of tasks executed to the total number of tasks submitted, calculated with (3). b. Resource utilization rate: the total amount of resources being utilized by the system at each host node, given by (4). c. Task arrival interval: the difference in arrival time between two consecutive tasks.

Analysis of results
A series of experiments was conducted to find out the impact of the different metrics on the performance of LAM. All the results are summarized in Figure 5 and Figure 6.

Effect of number of tasks
Task success ratio: We analyzed the effect of varying the number of tasks on the task success ratio. The results are shown in Figure 5(a). It can be observed that the task success ratio of all the algorithms does not vary much with an increase in the number of tasks. This uniformity in the success ratio is attributed to the scalable [51] and elastic nature of the cloud environment. The results also show that LAM has a higher task success ratio than the algorithms in the literature. Therefore, LAM is a preferred approach for critical tasks that must be executed successfully with no scope for failure.
Resource consumption: Figure 5(b) shows the effect of the number of tasks on resource consumption. The FCFS algorithm shows the best results in terms of resource consumption, followed by LAM. However, LAM outperforms Greedy-P, the dynamic resource allocation scheme and preference-based resource allocation in terms of both task success ratio and resource consumption. It shows resource consumption similar to Greedy-R but has a higher task success ratio. Even though LAM consumes more resources than FCFS, it is still the preferred technique because its task success ratio is higher.

Effect of task arrival interval
The rate at which tasks arrive at the service provider also has an impact on performance. Therefore, to assess the effect of the task arrival interval, we varied it in the range [0, 20]. Task success ratio: Figure 6 depicts the effect of the task arrival interval on the success ratio and resource consumption. From Figure 6(a), it can be observed that the task success ratio of LAM is approximately 31, 35 and 36 percent higher than that of Greedy-P, Greedy-R and FCFS, respectively. LAM also outperforms other approaches for dynamic cloud environments, namely the dynamic resource allocation scheme and preference-based resource allocation, by 9 and 17 percent, respectively.
Resource consumption: This experiment analyzes the impact of the task arrival interval on resource consumption; Figure 6(b) shows the results. According to it, Greedy-R utilizes the minimum amount of resources, followed by Greedy-P and LAM. Still, LAM performs better than the dynamic resource allocation scheme and preference-based resource allocation, consuming only 54 percent of the resources compared to 60 and 61 percent for the other two, respectively. Thus, consumption is near-optimal in LAM compared to the other algorithms for varying arrival intervals, while its task success ratio remains higher than that of the other algorithms.

Evaluation of scalability of the algorithm
The scalability of LAM can also be inferred from Figure 5(a) and Figure 5(b). It was observed that as the number of tasks increases, the LAM algorithm gives comparable performance in terms of both resource consumption and task success ratio. Thus, it can be established that LAM is scalable and appropriate for executing large-scale applications in IaaS clouds.

Robustness
The robustness of LAM can be inferred from the fact that it consumes fewer resources and maintains a high task success ratio as the arrival interval and the number of tasks change. Therefore, the proposed approach is a viable solution for the allocation of resources in a cloud environment.

CONCLUSION
This work presents a strategy for allocating resources on the service provider's side. The scenario was modeled as a resource allocation problem with the objectives of minimizing resource consumption and maximizing the task success ratio. The proposed approach incorporates basic elements of cloud computing such as pay-per-usage, elasticity and dynamic resource requirements and usage. A resource allocation algorithm called LAM was proposed. It is a novel technique that utilizes resources in an optimal manner and offers several benefits, such as better management of resources, improved quality of service, scalability, robustness, infrastructure-level performance enhancement and the ability to attract more customers, i.e., better QoS while abiding by the SLAs. Furthermore, it shows an improvement in resource consumption when compared with other resource allocation algorithms such as Greedy-R, Greedy-P, FCFS, dynamic resource allocation and preference-based resource allocation. The authors strongly believe that LAM represents a significant step towards enabling service providers to make important decisions regarding the provisioning of resources. With the techniques presented in this paper, service providers can distribute resources in an optimal manner, leading to more profit and better infrastructure performance. The proposed method can be easily incorporated into a real-world system, as the experiments were performed on a real-world production workload, the Google cluster trace, using a system setup similar to a public cloud such as Amazon EC2. As future work, we would like to explore other strategies, such as particle swarm optimization, genetic algorithms and the min-min algorithm, for resource allocation.