Page 187 - The-5th-MCAIT2021-eProceeding
Literature Review: Information Extraction Using Named-Entity
Recognition with Machine Learning Approach
R Fenny Syafariani (a*), Rio Yunanto (b)
(a, b) Universitas Komputer Indonesia, Jl. Dipatiukur No. 112-116, Bandung, Indonesia
(a) Email: r.fenny.syafariani@email.unikom.ac.id
Abstract
The purpose of this study is to help researchers identify and map the machine learning algorithms used in previous
studies on named-entity recognition. The research method is a review of the literature on named-entity recognition
with a machine learning approach. The literature was published between 2018 and 2020 and
was collected using Google Scholar. One of the critical research questions to be answered is
which machine learning algorithms have been used in named-entity recognition research. Named-entity
recognition can use three approaches: 1) machine learning, 2) deep learning, and 3) a combination of both. The results
show that the combination of Conditional Random Field (CRF) machine learning and Bidirectional Long Short-Term
Memory (Bi-LSTM) deep learning was used in 4 of the 7 works analyzed.
Keywords: NER; information; extraction; named-entity; review
1. Introduction
Entity extraction is widely known to be one of the important stages of information extraction. As one
of its methods, named-entity recognition can automatically extract the entities in a given text and determine their
categories. This includes extracting the names of objects, persons, or companies (Wibisono & Khodra, 2018). As
an example, from the sentence "Flood and landslide in Nganjuk, 23 people reported missing", the recognition
process yields named entities (often referred to as mentions): "Nganjuk" with the type location, and
"Flood" and "landslide" with the type event. This shows that named-entity recognition can
automatically recognize the entities in a sentence or text and categorize each entity according to the type
it has in the text.
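
The output described above can be pictured as a list of typed mentions. The following is a minimal sketch, not a method from the reviewed literature: the `Mention` class, the `tag_mentions` function, and the three-entry `TOY_LEXICON` are hypothetical names introduced only for illustration, and a plain dictionary lookup stands in for a trained recognizer.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class Mention:
    """One recognized named entity: its text, type, and character span."""
    text: str
    label: str
    start: int
    end: int


# Hypothetical toy lexicon (assumption): a real system would learn
# these decisions from training data instead of looking them up.
TOY_LEXICON = {"flood": "EVENT", "landslide": "EVENT", "nganjuk": "LOCATION"}


def tag_mentions(sentence, lexicon=TOY_LEXICON):
    """Return a Mention for every word of the sentence found in the lexicon."""
    mentions = []
    for match in re.finditer(r"\w+", sentence):
        label = lexicon.get(match.group().lower())
        if label is not None:
            mentions.append(
                Mention(match.group(), label, match.start(), match.end())
            )
    return mentions


sentence = "Flood and landslide in Nganjuk, 23 people reported missing"
mentions = tag_mentions(sentence)
for mention in mentions:
    print(mention.text, mention.label)
```

Each mention keeps its character offsets, so the entity text can always be recovered from the original sentence by slicing.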
One way to perform named-entity recognition is to formulate patterns of particular words or
phrases. For example, the typical word pattern of the phrase “come from…” or “go away from…” would
be followed by location-type entity words. Various combinations of word patterns can be learned by a machine
learning algorithm from training data, which builds the knowledge the algorithm uses. Machine
learning-based named-entity recognition can therefore detect named entities
automatically (Giarsyani, 2020).
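
The word-pattern idea above can be sketched in two small steps. This is an illustrative assumption, not the approach of any cited study: `LOCATION_CUE`, `find_location_candidates`, and `context_features` are hypothetical names, the regular expression covers only the two cue phrases mentioned in the text, and it assumes that location names are capitalized. The second function shows how the same cue could instead be exposed as a feature for a learning algorithm to weigh.

```python
import re

# Cue phrases from the text, followed by one or more capitalized words
# (assumption: location names start with an uppercase letter).
LOCATION_CUE = re.compile(
    r"\b(?:come from|go away from)\s+([A-Z][\w-]*(?:\s+[A-Z][\w-]*)*)"
)


def find_location_candidates(text):
    """Return the capitalized phrases that directly follow a cue phrase."""
    return LOCATION_CUE.findall(text)


def context_features(tokens, i):
    """Describe token i by its own form and the two words before it.

    A learning algorithm can discover from training data that the
    preceding words "come from" often signal a location.
    """
    prev1 = tokens[i - 1].lower() if i >= 1 else "<s>"
    prev2 = tokens[i - 2].lower() if i >= 2 else "<s>"
    return {
        "word.lower": tokens[i].lower(),
        "word.istitle": tokens[i].istitle(),
        "prev_bigram": f"{prev2} {prev1}",
    }


text = "They come from Bandung and go away from West Java every year."
candidates = find_location_candidates(text)
features = context_features("They come from Bandung".split(), 3)
print(candidates)
print(features)
```

A hand-written rule like `LOCATION_CUE` misses any phrasing it was not written for, which is the motivation for letting a model learn the weight of features such as `prev_bigram` from annotated examples.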
We conducted a literature study to determine a suitable machine learning algorithm that could open up
new research opportunities. For this literature study, we formulated research questions to guide the
research process: 1) What objects or datasets have been used in named-entity
recognition research?, 2) What machine learning algorithms have been used in named-entity recognition research?, and
3) What are the results of applying machine learning algorithms in named-entity recognition research?
E-Proceedings of The 5th International Multi-Conference on Artificial Intelligence Technology (MCAIT 2021) [174]
Artificial Intelligence in the 4th Industrial Revolution