
3. Results and Discussion

           Students' data were collected and preprocessed as explained in steps one to four. After the data were
        cleaned, correlation analysis using the Spearman correlation coefficient was conducted because it is suitable
        for datasets containing both continuous and discrete variables (Hussain et al., 2018). The analysis produced
        a correlation coefficient (r) for every variable in the study, which represents the strength and direction
        of the monotonic relationship between the tested pair of variables.
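
        The Spearman coefficient described above can be sketched as a Pearson correlation computed on
        ranks, with tied values sharing their average rank. The following is a minimal illustration
        rather than the analysis code used in this study: the function names and the toy values are
        hypothetical, and only the Python standard library is assumed.

```python
from math import sqrt


def average_ranks(values):
    """Rank values from 1 to n; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with the one at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1  # positions are 0-based, ranks are 1-based
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation; undefined (division by zero) if x or y is constant."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)


def spearman(x, y):
    """Spearman's r: Pearson correlation of the average ranks of x and y."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical toy data (not taken from the study's dataset):
# an ordinal CGPA band and a binary employment status.
cgpa_band = [1, 2, 2, 3, 3, 1, 2, 3]
employed = [0, 0, 1, 1, 1, 0, 1, 0]
r = spearman(cgpa_band, employed)
print(f"Spearman r = {r:.3f}")
```

        Because the coefficient works on ranks, it captures any monotonic association, which is why it
        tolerates a mix of continuous and coded discrete variables.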

        Table 2. Correlation analysis and descriptive statistics for all student-related features.


                             Variables                  r         P       Std. deviation   Mean
                             Jantina                    0.030*    0.000   0.469            1.672
                             Perkahwinan                -0.043*   0.000   0.085            1.007
                             Negeri_Lahir               0.076*    0.000   1.398            2.861
                             Kediaman_Penginapan        0.016*    0.000   0.456            1.705
                             Kelas_B40                  -0.011*   0.000   0.763            1.523
                             Sekolah                    -0.019*   0.000   0.941            1.364
                             Universiti                 -0.040*   0.000   0.656            2.210
                             Umur_Daftar                -0.009*   0.000   0.816            2.070
                             Kelayakan                  -0.018*   0.000   0.806            2.047
                             Bidang_Pengajian           -0.052*   0.000   1.538            3.792
                             Tempoh_Pengajian           -0.018*   0.000   0.784            2.521
                             Tajaan                     0.020*    0.000   0.782            2.347
                             Cgpa                       0.043*    0.000   0.707            2.072
                             Keputusan_Li               0.049*    0.000   0.982            2.167
                             Bil_Aktiviti_Keseluruhan   -0.024*   0.000   6.573            2.945
        Note: * Correlation is significant at both the 0.05 and 0.01 levels (2-tailed).

           The statistical results shown in Table 2 suggest that the correlations of jantina, perkahwinan, negeri
        lahir, universiti, bidang pengajian, cgpa and keputusan LI with the employment status are slightly higher in
        magnitude than those of kediaman penginapan, kelas B40, sekolah, umur daftar, kelayakan, tempoh pengajian,
        tajaan and bil aktiviti keseluruhan, although all coefficients are small (|r| ≤ 0.076). Moreover, Table 2
        indicates that all variables are significant with respect to the employment status, with P-values less
        than 0.05.
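
        The grouping above can be reproduced by ordering the coefficients in Table 2 by absolute value.
        The sketch below copies the r values from Table 2; the cut-off of 0.03 is an assumption chosen
        only to separate the two groups named in the text and is not a threshold stated by the study.

```python
# Spearman coefficients with the employment status, copied from Table 2.
table2_r = {
    "Jantina": 0.030,
    "Perkahwinan": -0.043,
    "Negeri_Lahir": 0.076,
    "Kediaman_Penginapan": 0.016,
    "Kelas_B40": -0.011,
    "Sekolah": -0.019,
    "Universiti": -0.040,
    "Umur_Daftar": -0.009,
    "Kelayakan": -0.018,
    "Bidang_Pengajian": -0.052,
    "Tempoh_Pengajian": -0.018,
    "Tajaan": 0.020,
    "Cgpa": 0.043,
    "Keputusan_Li": 0.049,
    "Bil_Aktiviti_Keseluruhan": -0.024,
}

# Assumed cut-off (hypothetical): it only mirrors the two groups in the text.
THRESHOLD = 0.03

# Sort by strength of association, ignoring the sign of the coefficient.
ranked = sorted(table2_r.items(), key=lambda item: abs(item[1]), reverse=True)

stronger = [name for name, r in ranked if abs(r) >= THRESHOLD]
weaker = [name for name, r in ranked if abs(r) < THRESHOLD]

for name, r in ranked:
    group = "higher" if abs(r) >= THRESHOLD else "lower"
    print(f"{name:<26} r = {r:+.3f}  ({group})")
```

        Sorting by absolute value matters here because several of the stronger associations, such as
        bidang pengajian and perkahwinan, are negative.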
           This study concludes that the variables in our dataset are meaningful and can be used in subsequent studies
        to develop a prediction model of student performance. Machine learning algorithms such as random forest,
        decision tree, naïve Bayes and k-nearest neighbour can be applied to our dataset because they are widely used
        and have produced high accuracy in prior research (Casuat & Festijo, 2020).

        Acknowledgements

           This publication was supported by the Universiti Kebangsaan Malaysia (UKM) under the Research
        University Grant (project code: GUP-2019-060).











        E- Proceedings of The 5th International Multi-Conference on Artificial Intelligence Technology (MCAIT 2021)   [53]
        Artificial Intelligence in the 4th Industrial Revolution