Page 33 - AKSES vol3

AKSES  ARTIKEL PENYELIDIKAN (RESEARCH ARTICLE)  ADVANCING KNOWLEDGE FOR SUCCESS  FTSM UKM



 LEVERAGING MACHINE LEARNING FOR PROPERTY ANALYTICS

 Nor Samsiah Sani, Azwanis Abdosamad, Loo Yong Li
 norsamsiah@ukm.edu.my

 Real estate is one of the main assets contributing to the development of the Malaysian economy. It affects social
 stability and is used as a reference for the current economic situation. According to the Malaysian property market
 report from the Valuation and Property Services Department of Malaysia (NAPIC), in the first quarter of 2020 the
 recovery of the property sector depended on domestic and external factors such as political stability, global oil
 conditions, and developments related to the Covid-19 pandemic. In particular, NAPIC revealed that Malaysian serviced
 apartments with suspended status increased by 3.3% to 31,661 units, worth RM20.03 billion, in the first quarter of
 2020. In the same quarter of the previous year, the number of suspended residences amounted to 30,664 units worth
 RM18.82 billion. NAPIC noted that the Malaysian Housing Price Index (MHPI) continues to grow moderately, whereby the
 number of hanging apartments increased by 26.5% to 21,683 units, worth RM18.64 billion. A recent report by the
 investment research group from Maybank stated that the country's economy is currently weak due to business closures
 and rising retrenchment rates. The factors described above can influence rental prices for apartments and
 condominiums. In research conducted by computer scientists, data-analysis models mostly predict results using
 relatively simple predictive approaches, which leads to significant deviation in short-term changes. This
 observation led us to propose this study, which uses data analytics with machine learning for house price prediction
 and high-rise rental prediction in Selangor. Data analytics concerns the extraction of meaning, patterns, and trends
 from large and varied volumes of data. It leverages data from various sources to reveal relevant indications for
 creating the house price and rental prediction models in Selangor. The two projects are described below.

 HOUSE RENTAL PREDICTION MODEL

 In this project, we focused on a housing rental prediction scheme that uses predictive modeling to forecast real
 estate rental prices. The dataset consists of condominium and apartment details collected from local housing
 websites. It covers information such as rental prices and multiple apartment and condominium features in 13
 different districts in Selangor, Malaysia. Four different models were applied and compared, namely linear
 regression, lasso regression, random forest regression and XGBoost regression. Results highlighted the superiority
 of XGBoost regression, which recorded the lowest root mean square error (RMSE) among all methods tested. This
 approach can be applied to assess the market value of rental properties, and the forecast results can be used as an
 indicator of various urban phenomena and provide a practical reference for homeowners and tenants. A web application
 prototype was also developed, integrating the best machine learning model from the test to predict rent based on the
 information entered.

 FIGURE 12.   Housing rental price web application prototype based on the best ML prediction algorithm for house
 rent price.
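
 The RMSE-based selection used in the rental project can be sketched as follows. This is a minimal illustration and
 not the authors' code: the function names, the model labels (model_a, model_b) and all numbers are made up for the
 example, and it assumes each candidate model has already produced predictions for the same held-out listings.

```python
import math
from typing import Dict, List, Sequence, Tuple


def rmse(actual: Sequence[float], predicted: Sequence[float]) -> float:
    """Root mean square error between two equal-length sequences."""
    if len(actual) != len(predicted) or len(actual) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    squared_errors = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


def rank_by_rmse(
    actual: Sequence[float], predictions: Dict[str, Sequence[float]]
) -> List[Tuple[str, float]]:
    """Return (model name, RMSE) pairs sorted from lowest to highest RMSE."""
    scores = [(name, rmse(actual, preds)) for name, preds in predictions.items()]
    return sorted(scores, key=lambda item: item[1])


# Made-up monthly rents (RM) and made-up predictions, for illustration only.
actual_rent = [1500.0, 2200.0, 1800.0, 3000.0]
toy_predictions = {
    "model_a": [1400.0, 2300.0, 1700.0, 3100.0],
    "model_b": [1500.0, 2250.0, 1800.0, 2950.0],
}

ranking = rank_by_rmse(actual_rent, toy_predictions)
best_name, best_rmse = ranking[0]
print(f"lowest RMSE: {best_name} ({best_rmse:.2f})")
```

 Because RMSE is expressed in the same unit as the target (here RM per month), the lowest-RMSE model is the one
 whose typical prediction error in rent is smallest on the held-out data.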

 HOUSE PRICE PREDICTION MODEL


 Given the worrying year-on-year rise in Malaysian housing prices, we conducted a research study focusing on the
 state of Selangor, which has several areas with high population density and high house prices. The purpose of this
 study was to explore and identify the essential features influencing housing prices in Selangor. The housing dataset
 was obtained from the National Property Information Centre (NAPIC) and comprises 64,982 records and 23 attributes.
 It covers the residential sector in Selangor from 2015 through 2020. Three machine learning algorithms were tested:
 Random Forest (RF), Gradient Boosting Decision Tree (GBDT) and k-Nearest Neighbours (k-NN). The Mean Squared Error
 (MSE) of each algorithm was computed and compared to find the most accurate algorithm.
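
 To make the evaluation concrete, the sketch below scores one of the three algorithm families, k-NN regression, by
 MSE on a tiny hold-out set. It is an illustration under stated assumptions, not the study's implementation: the two
 features (floor area and bedroom count), the prices, and the helper names knn_predict and mse are hypothetical, and
 the features are left unscaled for brevity.

```python
from typing import List, Sequence, Tuple

Row = Tuple[Sequence[float], float]  # (feature vector, price in RM)


def knn_predict(train: List[Row], query: Sequence[float], k: int = 3) -> float:
    """Predict a price as the mean price of the k nearest training rows."""
    if not 1 <= k <= len(train):
        raise ValueError("k must be between 1 and the number of training rows")
    # Squared Euclidean distance preserves the neighbour ordering.
    by_distance = sorted(
        train,
        key=lambda row: sum((a - b) ** 2 for a, b in zip(row[0], query)),
    )
    nearest = by_distance[:k]
    return sum(price for _, price in nearest) / k


def mse(actual: Sequence[float], predicted: Sequence[float]) -> float:
    """Mean squared error between actual and predicted values."""
    return sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)


# Hypothetical rows: (floor area in sq ft, bedrooms) -> price in RM.
train_rows: List[Row] = [
    ((800.0, 2.0), 300_000.0),
    ((850.0, 2.0), 320_000.0),
    ((1200.0, 3.0), 480_000.0),
    ((1250.0, 3.0), 500_000.0),
]
test_rows: List[Row] = [
    ((820.0, 2.0), 310_000.0),
    ((1230.0, 3.0), 490_000.0),
]

predicted = [knn_predict(train_rows, features, k=2) for features, _ in test_rows]
actual = [price for _, price in test_rows]
print(f"k-NN (k=2) MSE on toy data: {mse(actual, predicted):.1f}")
```

 Running the same mse function over the predictions of each candidate algorithm on the same hold-out rows gives
 directly comparable scores, which is the basis for choosing among RF, GBDT and k-NN.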

 The RF algorithm achieved the best prediction performance, with the lowest MSE of the algorithms tested. This
 project demonstrates that ML with suitable algorithms, combined with multi-dimensional data, can model housing
 prices, and that such modelling merits further exploration to strengthen the knowledge and information available
 for real-estate decision making at all levels.





 FIGURE 11.   Performance comparison of actual housing price values (blue) and predicted values (red) using the RF
 algorithm