Page 32 - AKSES vol3

AKSES          RESEARCH ARTICLE          ADVANCING KNOWLEDGE FOR SUCCESS          FTSM UKM



          LEVERAGING MACHINE LEARNING FOR PROPERTY ANALYTICS

          Nor Samsiah Sani, Azwanis Abdosamad, Loo Yong Li
          norsamsiah@ukm.edu.my

          Real estate is one of the main assets contributing to the development of the Malaysian economy. It affects
          social stability and is used as a reference for current economic conditions. According to the Malaysian property
          market report from the National Property Information Centre (NAPIC) of the Valuation and Property Services
          Department of Malaysia, in the first quarter of 2020 the recovery of the property sector depended on domestic and
          external factors such as political stability, global oil conditions, and developments related to the COVID-19
          pandemic. In particular, NAPIC revealed that Malaysian serviced apartments with suspended status increased by 3.3%
          to 31,661 units, worth RM20.03 billion, in the first quarter of 2020. In the same quarter of the previous year,
          the number of suspended residences amounted to 30,664 units worth RM18.82 billion. NAPIC noted that the Malaysian
          Housing Price Index (MHPI) continues to grow moderately, while the number of hanging apartments increased by 26.5%
          to 21,683 units, worth RM18.64 billion. A recent report by the investment research group at Maybank stated that
          the country's economy is currently weak due to business closures and rising retrenchment rates. The factors
          described above can influence rental prices for apartments and condominiums. In research conducted by computer
          scientists, most data-analysis models predict results using simple predictive methods. As a result, their
          predictions deviate significantly when conditions change over the short term. This observation led us to propose
          this study, which uses data analytics with machine learning for house price prediction and high-rise rental
          prediction in Selangor. Data analytics is concerned with extracting meaning, patterns, and trends from varied and
          large volumes of data. It leverages data from various sources to reveal relevant indications for creating the
          house price and rental prediction models in Selangor. The details of the projects are described below:

          HOUSE RENTAL PREDICTION MODEL

          In this project, we focused on a housing rental prediction scheme that uses predictive modeling to forecast real
          estate rental prices. The dataset consists of condominium and apartment details collected from local housing
          websites. It covers information such as rental prices and multiple apartment and condominium features in 13
          different districts in Selangor, Malaysia. Four different models were applied and compared, namely linear
          regression, lasso regression, random forest regression and XGBoost regression. The results highlighted the
          superiority of XGBoost regression, which recorded the lowest root mean square error (RMSE) among all methods
          tested. This approach can be applied to assess the market value of rental properties, and the forecast results can
          be used as an indicator of various urban phenomena and provide a practical reference for homeowners and tenants.
          A web application prototype was also developed by integrating the best machine learning model from the test as a
          predictive model based on the information entered.

          FIGURE 12.   Housing rental price web application prototype based on the best ML prediction algorithm for house
          rent price.
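
The rental project ranks its candidate models by RMSE, where a lower value means predictions sit closer to the observed rents. As a minimal sketch of that metric and of picking the lowest-error candidate, the Python below uses hypothetical names (`rmse`, `candidate_1`, `candidate_2`) and toy rent values that are assumptions for illustration only; they are not the study's models, data or results.

```python
import math
from typing import Sequence


def rmse(actual: Sequence[float], predicted: Sequence[float]) -> float:
    """Root mean square error: square root of the mean squared residual."""
    if len(actual) != len(predicted) or len(actual) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared) / len(squared))


# Toy monthly rents in RM. Placeholder numbers, NOT data from the study.
actual_rent = [1200.0, 1500.0, 1800.0, 2500.0]

# Hypothetical predictions from two unnamed candidate models.
candidate_predictions = {
    "candidate_1": [1300.0, 1400.0, 1900.0, 2400.0],
    "candidate_2": [1230.0, 1460.0, 1800.0, 2500.0],
}

# Score every candidate, then keep the one with the lowest RMSE.
rmse_by_model = {
    name: rmse(actual_rent, preds)
    for name, preds in candidate_predictions.items()
}
best_model = min(rmse_by_model, key=rmse_by_model.get)
print(best_model, rmse_by_model[best_model])
```

Because RMSE is expressed in the same unit as the target (here RM per month), it can be read directly as a typical size of the prediction error.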

          HOUSE PRICE PREDICTION MODEL


          Considering the worrying year-on-year increase in Malaysian housing prices, we conducted a research study
          focusing on the state of Selangor, which has multiple areas with high population density and high house prices.
          The purpose of this study was to explore and identify essential features influencing housing prices in Selangor.
          The housing dataset was obtained from the National Property Information Centre (NAPIC) and contains 64,982
          records with 23 attributes. It covers the residential sector in Selangor from 2015 through 2020. Three (3)
          machine learning algorithms were tested: Random Forest (RF), Gradient Boosting Decision Tree (GBDT) and
          k-Nearest Neighbours (k-NN). The Mean Squared Error (MSE) of each algorithm was determined and compared to find
          the best algorithm in terms of accuracy.
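
The comparison step described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: `mean_squared_error` and `rank_by_mse` are hypothetical helper names, the keys `model_a`, `model_b` and `model_c` stand in for the three algorithms, and the numbers are toy values rather than NAPIC data or results from the study.

```python
from typing import Dict, List, Tuple


def mean_squared_error(actual: List[float], predicted: List[float]) -> float:
    """Average of the squared differences between actual and predicted values."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)


def rank_by_mse(
    actual: List[float], predictions: Dict[str, List[float]]
) -> List[Tuple[str, float]]:
    """Return (model name, MSE) pairs sorted from lowest to highest error."""
    scores = {
        name: mean_squared_error(actual, preds)
        for name, preds in predictions.items()
    }
    return sorted(scores.items(), key=lambda item: item[1])


# Toy house prices (in RM thousand). Placeholder numbers, NOT study data.
actual_prices = [300.0, 450.0, 520.0, 610.0]

# Hypothetical test-set predictions from three placeholder models.
model_predictions = {
    "model_a": [310.0, 440.0, 515.0, 600.0],
    "model_b": [320.0, 430.0, 540.0, 590.0],
    "model_c": [350.0, 400.0, 560.0, 650.0],
}

ranking = rank_by_mse(actual_prices, model_predictions)
best_name, best_mse = ranking[0]
print(best_name, best_mse)
```

Since MSE squares each residual, a few large misses weigh more heavily than many small ones, which is worth keeping in mind when the price range is wide.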

          The RF algorithm achieved the best prediction performance, with the lowest MSE among the algorithms tested.
          This project is a testament to the ability of ML with suitable algorithms, combined with multi-dimensional data,
          to model housing prices. It also shows that housing price prediction can be pursued and explored further to
          amplify knowledge and information for real-estate decision making at all levels.





          FIGURE 11.   Performance comparison of actual housing price values (blue) and predicted values (red) using the
          RF algorithm.