
Table 1 (continued)
7   (Cho et al., 2020)   2020   JNLPBA (Biomedical) dataset and NCBI (National Center for Biotechnology Information) dataset   Convolutional Neural Network (CNN) and Bidirectional Long Short-Term Memory (Bi-LSTM)

   From Table 1, the most commonly used machine learning algorithm is the Conditional Random Field (CRF), while the most commonly used deep learning algorithm is the Bidirectional Long Short-Term Memory (Bi-LSTM). CRF is one of the algorithms known to perform well in building predictive models. As a probabilistic model, CRF can be used for pattern recognition because it considers the order of word labels that form a sentence when identifying entities in a text (Casillas et al., 2019). The LSTM algorithm is a development of the Recurrent Neural Network (RNN) algorithm that adds a memory cell, which acts as a container for information over a long period. The Bi-LSTM algorithm, in turn, has two layers that process the sequence in the forward and backward directions. Bi-LSTM is generally used to handle sequential data and to improve prediction accuracy (Cho et al., 2020).
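   To make the CRF idea concrete, the sketch below trains a toy entity tagger with the third-party sklearn-crfsuite package (an assumption; it is not a toolkit named by the surveyed papers). The hand-written features, the example sentence, and the Drug/Disease labels are purely illustrative, but they show how contextual features let the probabilistic model exploit word order when labelling entities.

import sklearn_crfsuite  # assumption: third-party sklearn-crfsuite package

def token_features(sentence, i):
    """Features for token i, including its neighbours, so the CRF can
    use word order when assigning entity labels."""
    word = sentence[i]
    return {
        "word.lower": word.lower(),
        "word.istitle": word.istitle(),
        "word.isdigit": word.isdigit(),
        "prev.lower": sentence[i - 1].lower() if i > 0 else "<BOS>",
        "next.lower": sentence[i + 1].lower() if i < len(sentence) - 1 else "<EOS>",
    }

# Toy training data in IOB format (illustrative only, not from the paper)
sentences = [["Aspirin", "treats", "headache"]]
labels = [["B-Drug", "O", "B-Disease"]]

X = [[token_features(s, i) for i in range(len(s))] for s in sentences]
crf = sklearn_crfsuite.CRF(algorithm="lbfgs", c1=0.1, c2=0.1, max_iterations=50)
crf.fit(X, labels)
print(crf.predict(X))  # predicted label sequence for each sentence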
   The combination of the Bi-LSTM and CRF approach is shown in Figure 1(a), which depicts two modules that compose a two-stage information extraction system. The input to the first Bi-LSTM layer is word embeddings; the output of the first layer is then combined with word embeddings and sense-disambiguation embeddings in the second layer. Finally, a CRF is used in the last stage to obtain the most appropriate label for each token (Suárez-Paniagua et al., 2019). The concept of Medical Entity Recognition (MER), as shown in Figure 1(b), relates to natural language processing applied to the clinical domain. The combination of Bi-LSTM with CRF serves to adapt the sequential tagger and to make it tolerant of high lexical variability and a limited corpus size (Casillas et al., 2019).

Fig. 1. (a) Two-stage deep learning approach (Suárez-Paniagua et al., 2019); (b) Neural architecture based on Bi-LSTM and CRF (Casillas et al., 2019).
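   The following sketch illustrates the Bi-LSTM + CRF combination discussed above, written in PyTorch with the third-party pytorch-crf package supplying the CRF layer (an assumption; neither the libraries nor the class and parameter names below come from the cited papers). The Bi-LSTM produces per-token emission scores from its forward and backward passes, and the CRF scores whole label sequences so that the predicted tags respect label order.

import torch
import torch.nn as nn
from torchcrf import CRF  # assumption: third-party pytorch-crf package

class BiLSTMCRF(nn.Module):
    """Bi-LSTM emits per-token tag scores; the CRF layer scores whole
    label sequences, enforcing consistent tag order at decoding time."""
    def __init__(self, vocab_size, num_tags, emb_dim=100, hidden_dim=128):
        super().__init__()
        self.embedding = nn.Embedding(vocab_size, emb_dim)
        self.bilstm = nn.LSTM(emb_dim, hidden_dim,
                              batch_first=True, bidirectional=True)
        # 2 * hidden_dim: forward and backward states are concatenated
        self.emit = nn.Linear(2 * hidden_dim, num_tags)
        self.crf = CRF(num_tags, batch_first=True)

    def _emissions(self, token_ids):
        hidden, _ = self.bilstm(self.embedding(token_ids))
        return self.emit(hidden)             # (batch, seq_len, num_tags)

    def loss(self, token_ids, tags, mask):
        # Negative log-likelihood of the gold tag sequences
        return -self.crf(self._emissions(token_ids), tags, mask=mask)

    def decode(self, token_ids, mask):
        # Viterbi decoding: best tag sequence per sentence
        return self.crf.decode(self._emissions(token_ids), mask=mask)

# Illustrative usage with made-up sizes
model = BiLSTMCRF(vocab_size=5000, num_tags=9)
tokens = torch.randint(0, 5000, (2, 12))
tags = torch.randint(0, 9, (2, 12))
mask = torch.ones(2, 12, dtype=torch.bool)
nll = model.loss(tokens, tags, mask)        # training objective
best_paths = model.decode(tokens, mask)     # predicted tag ids per sentence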
   Applying a similar NER model to different natural languages remains a frequent problem, since separately trained word embeddings are still required for each language. As a first step, choosing the right algorithm, followed by choosing the corpus domain and genre, is crucial to the success of the research.