Demand-driven Gaussian window optimization for executing preferred population of jobs in cloud clusters

ABSTRACT


INTRODUCTION
Generation of a huge volume of data and the need for high-speed computation at low investment have paved the way for the rapid development of Cloud technology. Computing has been completely transformed compared with the traditional method. This successful model is built on Grid technology, virtualization, and distributed computing. The Cloud provides high-speed processors for computation, and storage, as a service. Although this model is widely used for computation and storage, several issues remain to be addressed in Infrastructure-as-a-Service (IaaS). Because the workload is non-uniform and time-varying, the resources required to sustain it also vary [1]. Amazon, Google, and others invested considerable amounts of money in their data centers because they had to maintain enough servers to sustain their peak workload. The average utilization of servers was only 10% [1]. They then realized that merging different workloads with complementary usage patterns would improve server efficiency and that renting the resources to the public would be a cost-effective economic model [2]. Amazon launched AWS (Amazon Web Services) utility computing, and after the launch several IT industries opted for Cloud computing rather than investing in costly servers [2]. As demand grew, the computing model started encountering severe challenges such as job scheduling and resource allocation [3], specifically when computing real data.
The service provider has to cater to heterogeneous jobs, not just to a cluster of clients whose requests are homogeneous in nature. The challenges relate to flexibility in IaaS. This paper presents a novel approach that uses the Gaussian window to optimize the execution of a preferred population of jobs in cloud clusters. Here the computing model comprises clusters of four different capacities. Figure 1 presents a conceptual model for the application of the Gaussian window.
The paper is structured as follows. Section I presents the evolution of the computing model, its features, and the challenge addressed in this paper. Section II provides a summary of the related literature. Section III provides the details of the conceptual model and the algorithm for allocating resources to requesting jobs; the large number of arriving jobs is classified based on properties such as size, resource utilization period, and computing cost. Performance evaluation of the proposed algorithm is discussed in Section IV. Finally, Section V concludes the paper.
This section briefly covers the background of the proposed system, emphasizing significant research carried out to enhance the quality of service (QoS) in the cloud system. Scheduling and resource allocation in the heterogeneous cloud environment is an open research challenge on which many researchers and academicians are working to enhance the performance of the computing model. From the survey it is observed that scheduling techniques have an impact on computation cost, resource utilization time, energy utilization, and QoS; efficient scheduling also reduces the job rejection ratio. T. R. Gopalakrishnan Nair et al. state that it is essential to identify the trends of different request streams in every category by auto-classification and to organize pre-allocation of resources in a predictive way, which reduces both the number of jobs rejected and the cost per task completion [4]. M. Mezmaz et al. investigated the problem of scheduling precedence-constrained parallel applications on heterogeneous computing systems (Cloud computing) and proposed a new parallel bi-objective hybrid genetic algorithm that takes into account the task completion time and also minimizes energy consumption [5]. Mingsong Chen et al. say that, owing to resource variations, it is a challenge for cloud workflow resource allocation strategies to guarantee a reliable QoS. They also say that it is hard to predict the performance of these strategies under variations because accurate modelling and evaluation methods are lacking [6].
Hsu Mon Kyi et al. say that scheduling and allocating virtual resources and virtual machines are challenges in cloud computing systems. To address this issue, they proposed an algorithm that provides effective and efficient resource allocation. They used a stochastic Markov model to measure the scalability and tractability of infrastructure resources in private clouds. Their contribution focused on enhancing system performance by improving the response time [7]. Wexin Li et al. proposed a joint optimization model that chooses the request allocation policy such that the provider gains high bandwidth utilization at its datacenters and each user experiences a low delay [8]. Pandaba et al. say that cloud infrastructure comprises several datacenters and that the customer needs a slice of the computational power over a scalable network. They say resources are delivered in an elastic way. The challenge they investigated is the wait time experienced by customers. They proposed a modified Round Robin algorithm that reduces the wait time, thereby improving performance [9]. Mubarak et al. studied task scheduling algorithms. They enhanced the Min-Min algorithm to improve the task completion period. The authors report that, in their experimental analysis, the proposed algorithm produced a better makespan and improved resource utilization [10]. Stefano Marrone et al. proposed a model-driven approach for automatic negotiation and resource allocation for the availability of critical cloud services. The authors used a Bayesian network to evaluate the availability of resources for critical services [11].
Although much research has been carried out to enhance resource utilization in the cloud environment, a few areas still need attention to increase the quality of service. Current systems focus less on approaches that enhance two parameters: 1. latency of the computing model and 2. throughput. This work focuses on enhancing system performance through optimal utilization of the available resources. To achieve this, we propose a conceptual model as shown in Figure 1. The model comprises clusters, each of which is a set of machines packed into racks. These clusters are connected by a high-bandwidth cluster network. Clusters are managed by the cluster management system, which allocates jobs to machines. Jobs generally have a set of resource requirements for scheduling or packing the tasks onto a machine, either for storage or for execution. The Gaussian window model is applied to the proposed model to achieve the best utilization of available resources. The Gaussian window is applied to different clusters. The machines are clustered according to their data handling and processing speeds.

Limitations in the existing Resource Allocation Model
Many researchers working on scheduling and resource allocation have proposed new algorithms and computing models, but their work does not place much emphasis on clustering the machines according to computation or storage requirements.

Application and working of the algorithm
The proposed algorithm is applied to a set of sample data shown in Table 1. The table comprises the job ID, the time stamp assigned, the size of the job, the machine ID assigned to each computing machine, and finally the clusters formed, each with an ID assigned. The cluster management system (CMS) classifies the jobs based on the resources they require for processing and then distributes them to the appropriate clusters. Round-robin scheduling is used to execute the jobs in the clusters.
Initially, the CMS classifies the jobs as free priority, production priority, and monitor priority jobs. The free priority jobs use minimum resources for computation, and their computational cost is comparatively low. The production priority jobs have the highest priority; the CMS sees to it that these jobs are not denied the requested resources and that they are not allocated to overloaded machines. This ensures that load balancing is taken care of in the proposed model. The monitor priority jobs look after the free priority jobs to ensure that they obtain resources. Each job has a time stamp ($t_s$), a job ID ($J_i$), and a comparison operator. The comparison operator is either greater than or less than.
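The dispatch step described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the `Job` and `Cluster` types, the rule "send a job to the smallest cluster whose capacity covers its size", and the use of job size as the amount of work consumed per round-robin time slice are all hypothetical choices made for the example.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class Job:
    job_id: str
    timestamp: int
    size: int  # resource requirement, in arbitrary (hypothetical) units


@dataclass
class Cluster:
    cluster_id: str
    capacity: int  # largest job size this cluster accepts (assumption)
    queue: deque = field(default_factory=deque)


def dispatch(jobs, clusters):
    """Place each job in the smallest cluster that can hold it.

    Jobs are handled in time-stamp order. Jobs that fit nowhere are
    returned as rejected.
    """
    ordered = sorted(clusters, key=lambda c: c.capacity)
    rejected = []
    for job in sorted(jobs, key=lambda j: j.timestamp):
        target = next((c for c in ordered if job.size <= c.capacity), None)
        if target is None:
            rejected.append(job)
        else:
            target.queue.append(job)
    return rejected


def round_robin(cluster, quantum):
    """Run the queued jobs in fixed time slices; return ids in completion order."""
    if quantum <= 0:
        raise ValueError("quantum must be positive")
    remaining = deque((job.job_id, job.size) for job in cluster.queue)
    finished = []
    while remaining:
        job_id, left = remaining.popleft()
        left -= quantum
        if left > 0:
            remaining.append((job_id, left))  # not done yet: back of the queue
        else:
            finished.append(job_id)
    return finished


if __name__ == "__main__":
    # Four clusters of different capacities, as in the computing model.
    clusters = [Cluster("C1", 10), Cluster("C2", 100),
                Cluster("C3", 1000), Cluster("C4", 10000)]
    jobs = [Job("J1", 0, 8), Job("J2", 1, 50), Job("J3", 2, 4)]
    print("rejected:", dispatch(jobs, clusters))
    print("C1 completion order:", round_robin(clusters[0], quantum=5))
```

Choosing the smallest adequate cluster keeps larger clusters free for larger jobs, which is one simple way to reduce rejections; the priority classes (free, production, monitor) are left out of the sketch.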

Resources and Units
CPU: number of cores/second
Memory: bytes
Disk space: bytes
Disk time fraction: I/O in seconds/second [12]

2. ALGORITHM IMPLEMENTATION
a. Assumption: with reference to the proposed model in Figure 1
In this paper, we apply the Gaussian window to a set of jobs. The unique property of the algorithm is that the previous history of the job size [8] alone is sufficient to predict the resource requirement for the current job. Knowing the mean $\mu$, the average required resources, and the variance $\sigma^2$ (the squared standard deviation), which indicates the actual resources utilized by the job earlier, enables the CMS to allocate resources for the current job without spending time computing its resource requirement. Hence this approach also increases the throughput of the system. In our research, we use the information on the resource size allocated for the execution of the previous job.
c. Gaussian window for prediction
The Gaussian distribution is a powerful tool applied extensively to problems such as regression and classification. In a Gaussian process, a prior probability distribution is first assumed over the possible functions that describe the data-generating process, and a posterior probability distribution is obtained after the observed values become known. The posterior improves on the prior by incorporating the knowledge gained from the observations.
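To make the role of the mean and the variance concrete, the following Python sketch estimates both from a short history of job sizes and evaluates a Gaussian window of the standard form $a\,e^{-(x-b)^2/(2c^2)}$ at a candidate allocation. It is an illustration under assumptions: the function names, the sample history, and the reading of the window value as a score for how well an allocation matches past usage are hypothetical and are not taken from the paper.

```python
import math
from statistics import mean, pvariance


def gaussian_window(x, a, b, c2):
    """Evaluate a * exp(-(x - b)^2 / (2 * c2)); c2 is the squared width."""
    if c2 <= 0:
        raise ValueError("c2 must be positive")
    return a * math.exp(-((x - b) ** 2) / (2.0 * c2))


def fit_history(sizes):
    """Return (mean, variance) of previously observed job sizes."""
    return mean(sizes), pvariance(sizes)


def allocation_score(allocated, history, a=1.0):
    """Score a candidate allocation against the job-size history.

    The score peaks at `a` when the allocation equals the historical mean
    and falls off as the allocation moves away from it.
    """
    b, c2 = fit_history(history)
    return gaussian_window(allocated, a, b, c2)


if __name__ == "__main__":
    history = [90, 100, 110]  # hypothetical previous job sizes
    for candidate in (80, 100, 120):
        print(candidate, round(allocation_score(candidate, history), 4))
```

Because only the running mean and variance are needed, a dispatcher could update them incrementally as jobs complete instead of recomputing the requirement for every new job.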

$$G[z] = a\,e^{-y}, \qquad \text{where } y = \frac{(x-b)^2}{2c^2}$$

To implement the proposed algorithm and obtain the results, heterogeneous clusters were built together with a communication network using the TCP protocol. The clusters were created using VMware. The Wireshark tool was used to monitor data transmission among the clusters.

RESULT ANALYSIS
Three different cases were considered to demonstrate the application of the proposed algorithm.
a. Case I: Best utilization of available resources by requested jobs
Figure 2 depicts the best utilization of the allocated resources; here the proposed algorithm classifies the jobs according to the resources requested. The area under the curve shows the utilization of the allocated resources.
The application of the Gaussian window has also improved the latency and throughput of the system. Figure 3 and Figure 5 illustrate the latency, and Figure 4 illustrates the throughput. With reference to the latency period in Figure 3, timestamp "0ns" marks the start of execution of a task; the task waits approximately "6ns" to load onto the machine and completes execution in 556ns.
Because the proposed algorithm clusters the jobs by size before scheduling, the latency of the system is comparatively improved. Figure 6 represents the throughput of the proposed system. It is observed that scheduling the jobs to appropriate clusters has reduced job rejections and starvation. The scheduled jobs have fully utilized the allocated resources.
b. Case II: Underutilization of resources when the job size is less than the resource allocated
Figure 7 shows the utilization of resources by jobs allocated to machines. In this case the actual resources required for computation are not estimated prior to scheduling. The area under the curve depicts the utilization of the allocated resources. Although the computation completes, the allocated resources are not fully utilized, and the throughput of the system is reduced by 15-20%. Figure 8 shows the average reduction in throughput of the system.
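The "area under the curve" reading of utilization used in both cases can be expressed as a short calculation. The sketch below is only an illustration: the trapezoidal rule, the function names, and the sample usage values are assumptions introduced here, not measurements from the experiments.

```python
def area_under_curve(times, values):
    """Trapezoidal area under a curve sampled at the given times."""
    if len(times) != len(values) or len(times) < 2:
        raise ValueError("need at least two matching samples")
    area = 0.0
    for i in range(1, len(times)):
        width = times[i] - times[i - 1]
        area += width * (values[i] + values[i - 1]) / 2.0
    return area


def utilization(times, used, allocated):
    """Fraction of the allocated resource actually used over the interval."""
    duration = times[-1] - times[0]
    if duration <= 0 or allocated <= 0:
        raise ValueError("duration and allocation must be positive")
    return area_under_curve(times, used) / (allocated * duration)


if __name__ == "__main__":
    times = [0, 1, 2, 3, 4]          # hypothetical sample times
    fully_used = [4, 4, 4, 4, 4]     # job uses everything it was given
    under_used = [0, 2, 4, 2, 0]     # job needs less than was allocated
    print(utilization(times, fully_used, allocated=4))
    print(utilization(times, under_used, allocated=4))
```

A ratio close to 1 corresponds to the best-utilization case, while a ratio well below 1 corresponds to the underutilization case, where the allocation exceeds what the job needs.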

CONCLUSION
Economical data computing and storage have paved the way for IT industries and researchers to improve the existing system by reframing and redesigning the existing technology. In our research work we proposed a conceptual model of a future cloud cluster. The application of the Gaussian window to this model has enhanced the system performance. From the results it can be observed that the allocated resources are best utilized; the proposed algorithm also prevents resource contention. The simulation results show an improvement in throughput and a reduction in latency. Further, the proposed system can support optimal energy utilization, which will be carried out as an extension.
The conceptual model for the application of the Gaussian window includes two entities, a job dispatcher and clusters. The model estimates the amount of resources required to compute each job so that the available resources are not used inappropriately.

Int J Elec & Comp Eng, Vol. 9, No. 3, June 2019: 1637-1644, ISSN: 2088-8708

Figure 1. Conceptual model of clusters with a cluster management system

Improving these two parameters (latency and throughput) would enable better utilization of resources in the cloud environment.

Demand-driven Gaussian window optimization for executing preferred population of jobs… (Vaidehi M)

With respect to the Gaussian window with exponent $y = (x-b)^2/(2c^2)$, the parameters are:
$a$ = the requirement of the resources
$x$ = the actual allocated resource
$c^2$ = the real resources utilized
$b$ = the average resource required

d. Simulation setup and prediction of results

Figure 9. Deficit of resources to allocated jobs

Table 1. Parameters Related to Jobs Scheduled