Page 189 - The-5th-MCAIT2021-eProceeding
(Bi-LSTM)
7 | (Cho et al., 2020) | 2020 | JNLPBA (Biomedical) dataset; NCBI (National Center for Biotechnology Information) dataset | Convolutional Neural Network (CNN) and Bidirectional Long Short-Term Memory (Bi-LSTM)
From Table 1, the most commonly used machine learning algorithm is the Conditional Random Field (CRF), while
the most commonly used deep learning algorithm is Bidirectional Long Short-Term Memory (Bi-LSTM). CRF is a
probabilistic model that is well suited to building predictive models over sequences. It can be used for pattern
recognition because it takes into account the order of the word labels that form a sentence when identifying
entities in a text (Casillas et al., 2019). The LSTM algorithm extends the Recurrent Neural Network (RNN)
algorithm with a memory cell that retains information over long periods. The Bi-LSTM algorithm has two layers,
one that processes the sequence forward and one that processes it backward. Bi-LSTM is generally used on
sequential data to improve prediction accuracy (Cho et al., 2020).
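The memory cell and the forward and backward passes described above can be illustrated with a minimal sketch. It assumes scalar inputs and states and uses arbitrary fixed weights; the names `lstm_step`, `run_lstm`, `run_bilstm`, and `W` are hypothetical and do not come from any of the cited works, which use trained, vector-valued models.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def lstm_step(x, h_prev, c_prev, w):
    """One LSTM step with scalar input, hidden state, and memory cell."""
    f = sigmoid(w["wf"] * x + w["uf"] * h_prev + w["bf"])    # forget gate
    i = sigmoid(w["wi"] * x + w["ui"] * h_prev + w["bi"])    # input gate
    o = sigmoid(w["wo"] * x + w["uo"] * h_prev + w["bo"])    # output gate
    g = math.tanh(w["wg"] * x + w["ug"] * h_prev + w["bg"])  # candidate value
    c = f * c_prev + i * g   # memory cell: keeps part of the old content, adds new
    h = o * math.tanh(c)     # hidden state passed on to the next step
    return h, c


def run_lstm(xs, w):
    """Run the cell over a sequence and return the hidden state at each step."""
    h, c = 0.0, 0.0
    states = []
    for x in xs:
        h, c = lstm_step(x, h, c, w)
        states.append(h)
    return states


def run_bilstm(xs, w_fwd, w_bwd):
    """Pair each position's forward state with its backward state."""
    fwd = run_lstm(xs, w_fwd)
    bwd = run_lstm(xs[::-1], w_bwd)[::-1]  # read right-to-left, then realign
    return list(zip(fwd, bwd))


# Hypothetical weights and inputs, chosen only to make the sketch runnable.
W = {name: 0.5 for name in (
    "wf", "uf", "bf", "wi", "ui", "bi", "wo", "uo", "bo", "wg", "ug", "bg")}
features = [1.0, -0.5, 0.25]
states = run_bilstm(features, W, W)
```

Each position ends up with one state summarizing the tokens to its left and one summarizing the tokens to its right, which is the context a sequence tagger draws on.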
The combination of the Bi-LSTM and CRF approaches is shown in Figure 1 (a), which shows the two modules that
compose a two-stage information extraction system. The input to the first Bi-LSTM layer is word embeddings;
the output of the first layer is then combined with word embeddings and sense-disambiguation embeddings in
the second layer. Finally, a CRF is used in the last stage to obtain the most appropriate label for each token
(Suárez-Paniagua et al., 2019). The concept of Medical Entity Recognition (MER), shown in Figure 1 (b), relates
to natural language processing applied to the clinical domain. There, combining Bi-LSTM with CRF serves to
adapt the sequential tagger and to make it tolerant of high lexical variability and a limited corpus size
(Casillas et al., 2019).
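The final CRF stage, which picks the most appropriate label for each token, can be sketched as Viterbi decoding over a linear-chain CRF. This is an illustrative sketch, not the implementation of either cited system: the BIO label set, the example tokens, and all scores are hypothetical assumptions, and in a real Bi-LSTM-CRF the emission scores would come from the Bi-LSTM and the transition scores would be learned.

```python
def viterbi(emissions, transitions, labels):
    """Return the highest-scoring label sequence for a linear-chain CRF.

    emissions[t][y]   : score of label y at token t (e.g. from a Bi-LSTM)
    transitions[p][y] : score of moving from label p to label y
    """
    # best[t][y] is the best score of any label path ending in y at token t.
    best = [dict(emissions[0])]
    back = []
    for t in range(1, len(emissions)):
        scores, pointers = {}, {}
        for y in labels:
            prev = max(labels, key=lambda p: best[-1][p] + transitions[p][y])
            scores[y] = best[-1][prev] + transitions[prev][y] + emissions[t][y]
            pointers[y] = prev
        best.append(scores)
        back.append(pointers)
    # Follow the back-pointers from the best final label.
    path = [max(labels, key=lambda y: best[-1][y])]
    for pointers in reversed(back):
        path.append(pointers[path[-1]])
    return path[::-1]


# Hypothetical BIO labels and scores for the tokens "severe", "chest", "pain".
LABELS = ["O", "B", "I"]
EMISSIONS = [
    {"O": 2.0, "B": 0.5, "I": 0.0},
    {"O": 0.4, "B": 1.0, "I": 1.2},
    {"O": 0.2, "B": 0.1, "I": 1.5},
]
TRANSITIONS = {
    "O": {"O": 0.0, "B": 0.0, "I": -10.0},  # an entity cannot continue after O
    "B": {"O": 0.0, "B": 0.0, "I": 1.0},
    "I": {"O": 0.0, "B": 0.0, "I": 0.5},
}

crf_path = viterbi(EMISSIONS, TRANSITIONS, LABELS)
greedy_path = [max(LABELS, key=lambda y: e[y]) for e in EMISSIONS]
```

With these scores, choosing the best label for each token independently gives O, I, I, which starts an entity with a continuation label. The CRF decoding gives O, B, I because the transition scores penalize moving from O directly to I.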
Fig. 1. (a) Two-stage deep learning approach (Suárez-Paniagua et al., 2019); (b) Neural architecture based on Bi-LSTM and CRF (Casillas et al., 2019).
Applying a similar NER model to different natural languages remains a frequent problem, because separately
trained word embeddings are still needed for each language. Choosing the right algorithm as a first step, and
then choosing the corpus domain and genre, is crucial to the success of the research.
E- Proceedings of The 5th International Multi-Conference on Artificial Intelligence Technology (MCAIT 2021) [176]
Artificial Intelligence in the 4th Industrial Revolution