
2. Method


           A literature study was conducted to identify and map the results of previous studies on a given theme.
        A good literature study also produces a map of knowledge about a research topic that can guide researchers
        to dig deeper into areas that are not yet mature (Fisch & Block, 2018). The literature in this study was
        collected from Google Scholar using the keyword "Named Entity Recognition". Literature on named-entity
        recognition was then selected according to two criteria: 1) the approach used is a machine learning
        approach only, and 2) the publication year is between 2018 and 2020. This stepwise selection yielded
        seven pieces of literature, which are used as materials for comparison.
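The two selection criteria above can be expressed as a simple filter. The sketch below is only an illustration of that screening step, not the procedure actually used in this study: the `Candidate` record, its field names, and the sample entries are hypothetical placeholders, and the approach label is assumed to have been assigned by hand after reading each paper.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    """One search hit. All field names are illustrative assumptions."""
    title: str
    year: int
    approach: str  # assumed manual label, e.g. "machine learning" or "rule-based"


def meets_criteria(item: Candidate, first_year: int = 2018, last_year: int = 2020) -> bool:
    """Apply the two inclusion criteria stated in the text."""
    uses_ml = item.approach == "machine learning"       # criterion 1
    in_window = first_year <= item.year <= last_year    # criterion 2
    return uses_ml and in_window


def select(candidates: list[Candidate]) -> list[Candidate]:
    """Keep only the candidates that satisfy both criteria."""
    return [c for c in candidates if meets_criteria(c)]


# Placeholder records, not real publications.
hits = [
    Candidate("Placeholder A", 2019, "machine learning"),
    Candidate("Placeholder B", 2017, "machine learning"),
    Candidate("Placeholder C", 2020, "rule-based"),
]
selected = select(hits)
print([c.title for c in selected])  # ['Placeholder A']
```

Keeping the year window as parameters makes it easy to rerun the same screening if the review period is later extended.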

        3. Results and Discussion

           The process of analyzing and extracting information from large amounts of unstructured text or documents
        using Artificial Intelligence algorithms is often referred to as text mining. One part of text mining is
        named-entity recognition, which can be applied in various fields such as economics, health, social affairs,
        politics, or culture. Of the seven pieces of literature analyzed in this study, six apply named-entity
        recognition in the health sector, especially in biomedicine and medicine. In contrast, Wintaka et al. used
        data taken from Twitter to identify entity names, location names, and organization names
        (Wintaka et al., 2019). The literature used in this study is shown in Table 1.
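To make the recognition task concrete, the sketch below shows how entity spans can be read off a token-level labeling. It assumes the common BIO tagging convention (B- begins an entity, I- continues it, O is outside any entity). That convention, the label names `DRUG` and `LOC`, and the example sentence are assumptions chosen for illustration and are not taken from the reviewed literature. The tags are written by hand here; in practice they would be predicted by a model.

```python
def decode_bio(tokens, tags):
    """Collect (entity_text, entity_type) pairs from BIO tags.

    An I- tag that does not continue an open entity of the same type
    is treated as outside any entity and is dropped.
    """
    entities = []
    current_tokens, current_type = [], None
    for token, tag in zip(tokens, tags):
        if tag.startswith("B-"):
            # A new entity starts; close the previous one if it is still open.
            if current_tokens:
                entities.append((" ".join(current_tokens), current_type))
            current_tokens, current_type = [token], tag[2:]
        elif tag.startswith("I-") and current_type == tag[2:]:
            # Continuation of the currently open entity.
            current_tokens.append(token)
        else:
            # "O" or an inconsistent I- tag: close any open entity.
            if current_tokens:
                entities.append((" ".join(current_tokens), current_type))
            current_tokens, current_type = [], None
    if current_tokens:
        entities.append((" ".join(current_tokens), current_type))
    return entities


# Hypothetical sentence with hand-written tags.
tokens = ["Aspirin", "may", "interact", "with", "warfarin", "in", "New", "York", "clinics"]
tags = ["B-DRUG", "O", "O", "O", "B-DRUG", "O", "B-LOC", "I-LOC", "O"]
print(decode_bio(tokens, tags))
# [('Aspirin', 'DRUG'), ('warfarin', 'DRUG'), ('New York', 'LOC')]
```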
           The health sector, especially the pharmaceutical industry, requires research on named-entity recognition,
        particularly for medicine entities. The pharmaceutical industry closely monitors how a particular medicine
        interacts with other medicines in order to protect patients from side effects caused by drug interactions
        (Chukwuocha et al., 2018). The biomedical field also has a very large corpus and requires information
        extraction to reduce the ambiguity that arises when several different entities share the same acronym.
        Furthermore, several biomedical entities use prefixes and suffixes inconsistently (Cho et al., 2020).

        Table 1. Literature Review Data based on Dataset

        No  Ref                             Year  Object / Dataset                        Machine learning
        1   (Chukwuocha et al., 2018)       2018  Medicine names / PubMed dataset         Conditional Random Field (CRF), and
                                                                                          Naive Bayes (NB)
        2   (Phan et al., 2019)             2019  Biomedical texts / BioNLP 2004          Convolutional Neural Network (CNN), and
                                                  Challenge dataset                       Recurrent Neural Network (RNN)
        3   (Casillas et al., 2019)         2019  Medical Online Corpus (GEN-MED),        Bidirectional Long Short-Term Memory
                                                  IXAMed Spanish EHR Corpus (EHR)         (Bi-LSTM), and
                                                                                          Conditional Random Field (CRF)
        4   (Suárez-Paniagua et al., 2019)  2019  eHealth-KD dataset                      Bidirectional Long Short-Term Memory
                                                                                          (Bi-LSTM), and
                                                                                          Conditional Random Field (CRF)
        5   (Wintaka et al., 2019)          2019  600 manually-labeled tweets in Bahasa   Bidirectional Long Short-Term Memory
                                                  Indonesia from Twitter social media     (Bi-LSTM), and
                                                                                          Support Vector Machine (SVM)
        6   (Gligic et al., 2019)           2019  Informatics for Integrating Biology &   Feed-Forward Neural Network (FFN),
                                                  the Bedside – i2b2 dataset (2007-2012)  Recurrent Neural Network (RNN), and
                                                                                          Bidirectional Long Short-Term Memory






        E- Proceedings of The 5th International Multi-Conference on Artificial Intelligence Technology (MCAIT 2021)   [175]
        Artificial Intelligence in the 4th Industrial Revolution