
2.  Job Scheduling Performance Issues and Solutions in Spark

   The goal of job scheduling in Spark resource management is to plan the execution of tasks across the
nodes. It aims to maximize resource utilization while minimizing the total execution time. This section
elaborates on the performance issues in Spark job scheduling and the solutions available in the
literature. We classify these issues into three categories, as shown in Table 1, which can be described
as follows:

        2.1 Parameter Configuration

   Parameter configuration refers to the setting of Spark parameter values before executing an application
(Zaharia et al., 2010). Typically, Spark parameters can be configured in two ways: the user can set the
configuration manually, or rely on the default configuration for an easy implementation. One issue found
here is that improperly configured parameters can cause slowdowns or, even worse, failures in Spark
applications. Therefore, given the high significance of this problem, many previous efforts have been made
to determine optimal solutions for parameter configuration.
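The two configuration routes above can be illustrated with a short sketch. The parameter names (e.g. spark.executor.memory) are standard Spark properties, but the values and the helper function are purely illustrative:

```python
# Sketch: assembling a spark-submit invocation. Passing no --conf flags
# falls back to Spark's built-in defaults; passing them applies manual tuning.

def build_submit_command(app, conf=None):
    """Assemble spark-submit arguments; conf=None keeps the default settings."""
    cmd = ["spark-submit"]
    for key, value in (conf or {}).items():
        cmd += ["--conf", f"{key}={value}"]
    cmd.append(app)
    return cmd

# Manual tuning: the user sets values explicitly (values are illustrative).
manual = build_submit_command(
    "job.py",
    conf={
        "spark.executor.memory": "4g",        # heap size per executor
        "spark.executor.cores": "4",          # cores per executor
        "spark.sql.shuffle.partitions": "200",
    },
)

# Default configuration: no --conf flags, Spark applies its defaults.
default = build_submit_command("job.py")
```

The ease of the second route is exactly what makes the first route hard: every manually set property is one more dimension in the search space discussed next.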
   Among these, Petridis et al. (2017) applied manual trial-and-error tuning of Spark configuration
parameters. They conducted a series of experiments over possible combinations of parameters, using
expert knowledge to search for an optimal configuration. The results showed that their manually tuned
method could increase Spark performance by up to a 10× speedup. Gounaris and Torres (2018), on the other
hand, provided an alternative to Petridis et al. (2017) by proposing a systematic methodology for parameter
tuning. This study also involves repeated trial-and-error experiments, but they are guided by a systematic
methodology. Its results reveal that the proposed methodology improves the speed by up to 20% compared to
the default settings. However, both of these studies are clearly time-consuming, as they require
considerable effort in performing repeated experiments to find the best parameter configurations.
Furthermore, they require expert knowledge and researcher experience to determine the parameter values at
the beginning of the tuning phase.
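The trial-and-error search described above amounts to a sweep over parameter combinations, with one full application run per combination. A minimal sketch (the grid and the stand-in runtime function are hypothetical; in practice each evaluation launches a real Spark job, which is why the approach is so costly):

```python
import itertools

# Candidate values per parameter (illustrative; a real sweep would be
# narrowed by expert knowledge, as in Petridis et al. (2017)).
grid = {
    "spark.executor.memory": ["2g", "4g", "8g"],
    "spark.executor.cores": ["2", "4"],
}

def run_and_time(conf):
    """Stand-in for launching the application and measuring its runtime.
    A fixed lookup table plays that role here; in practice each call is
    a complete Spark run."""
    cost = {"2g": 30, "4g": 20, "8g": 25}[conf["spark.executor.memory"]]
    return cost / int(conf["spark.executor.cores"])

best_conf, best_time = None, float("inf")
for values in itertools.product(*grid.values()):
    conf = dict(zip(grid.keys(), values))
    t = run_and_time(conf)
    if t < best_time:
        best_conf, best_time = conf, t
```

Even this toy grid needs 3 × 2 = 6 full application runs; real Spark deployments expose well over a hundred tunable properties, so the combinatorics quickly become prohibitive.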
   A study by Bian et al. (2014) proposed CSMethod, a simulator for Spark in which the whole application
execution environment is simulated. The paper aims to provide a fast and accurate simulator as well as a
reliable approach for testing parameter combinations until an optimal setting is found. However, precisely
simulating the environment is rather challenging due to the vast hardware diversity and software
complexity. Moreover, when the tuned configuration is applied to the actual cluster, the expected results
may not materialize because the implementation environment differs. Perez et al. (2018) developed another
technique, a multi-parameter tuning method called PETS (Parameter Ensemble Table for Spark), using a fuzzy
approach. It utilizes a metric called the bottleneck score together with multiple fuzzy engines and a
parameter ensemble table. Most of the rules and fuzzy classes require knowledge from researchers or
experts. PETS is able to tune 18 parameters simultaneously and outperformed other machine learning
techniques with a speedup of up to 4.78× across six different workloads of the HiBench benchmark. However,
there is a trade-off between performance speedup and convergence speed: achieving a higher speedup resulted
in slower convergence compared to a simpler strategy, due to the high rate of parameter changes at each
step.
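The fuzzy-classification idea behind such approaches can be illustrated generically. This is not PETS' actual rule base; the class names, membership shapes, and adjustment rule below are all hypothetical:

```python
# Generic illustration: fuzzy classification of a bottleneck score in [0, 1]
# (0 = resource idle, 1 = resource saturated), followed by a rule-weighted
# parameter adjustment. All breakpoints are invented for illustration.

def memberships(score):
    """Triangular membership degrees for three hypothetical fuzzy classes."""
    low = max(0.0, 1.0 - score / 0.5)                # peaks at 0, gone by 0.5
    medium = max(0.0, 1.0 - abs(score - 0.5) / 0.5)  # peaks at 0.5
    high = max(0.0, (score - 0.5) / 0.5)             # starts at 0.5, peaks at 1
    return {"low": low, "medium": medium, "high": high}

def adjust(param_value, score, step=1):
    """One fuzzy rule: raise the parameter when the bottleneck is high,
    lower it when the bottleneck is low, weighted by membership degree."""
    m = memberships(score)
    return param_value + step * (m["high"] - m["low"])
```

Because the adjustment is weighted by membership degree rather than applied all-or-nothing, many parameters can be nudged in one iteration; a larger step converges faster toward a speedup but risks the oscillation behind the speedup-versus-convergence trade-off noted above.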
   A more popular approach uses machine learning, building models to predict performance before the
application starts. Bao et al. (2019) proposed an automatic parameter tuner called Autotune. The
researchers implemented testbeds that use a sampling strategy called Latin Hypercube Sampling (LHS) to
generate more samples within a given time constraint to train the model, so that more promising
configurations can be found using the trained prediction model. Autotune demonstrated an average
execution-time improvement of 63.7% when compared to the default parameter configuration. However, when
compared to other tuning methods, the speedup improvement is only 6-24%.
Nguyen et al. (2018) also utilized the LHS sampling strategy. Unlike in Autotune, they

        E- Proceedings of The 5th International Multi-Conference on Artificial Intelligence Technology (MCAIT 2021)   [152]
        Artificial Intelligence in the 4th Industrial Revolution