2. Proposed Methodology
Our proposed method exploits LOD information to enrich the movie information, which may in turn further
enhance the effectiveness of the clustering process. We selected two attributes that are persistent and
representative of the movie domain in which we performed the evaluation. Further resources can be expanded
iteratively based on these attributes. Movies in DBpedia, for example, provide essential details such as the star
cast and the director. As illustrated in Fig. 1, additional information about an actor who starred in a movie can
be explored through the LOD (e.g., the relation 'dbo:starring' that exists between 'Keanu Reeves' and 'The Matrix').
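As a minimal sketch of how such a relation could be looked up, the following Python snippet builds a SPARQL query for the cast and director of one DBpedia movie resource. The helper name `build_movie_query` and the way the two properties are filtered are illustrative assumptions, not part of the proposed method; the query is only constructed here and is not sent to any endpoint.

```python
DBO = "http://dbpedia.org/ontology/"
DBR = "http://dbpedia.org/resource/"


def build_movie_query(resource_name: str) -> str:
    """Return a SPARQL query listing the dbo:starring and dbo:director
    values of one DBpedia movie resource (hypothetical helper)."""
    movie_uri = f"<{DBR}{resource_name}>"
    return (
        f"PREFIX dbo: <{DBO}>\n"
        "SELECT ?property ?value WHERE {\n"
        f"  {movie_uri} ?property ?value .\n"
        "  FILTER (?property IN (dbo:starring, dbo:director))\n"
        "}"
    )


# Build (but do not execute) the query for 'The Matrix'.
query = build_movie_query("The_Matrix")
print(query)
```

Each row returned by such a query would pair a property with a linked resource, which could then serve as the starting point for further expansion.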
Fig. 1. Movie relation based on DBpedia attribute
The research framework employed in this study is illustrated in Fig. 2. It distinguishes the workflow applied to
the baseline from that applied to the GRS-LOD model and consists of four main components.
The GRS-LOD model is built in the first component, which comprises five major processes. The first two
processes, 'ML1M-DBpedia mapping' and 'DBpedia data extraction', link the two datasets and extract the
DBpedia data. The third process entails data filtering and integration once the data has been enriched with
DBpedia information. The fourth process, 'On-attributes similarity', is a pre-clustering phase that finds
similarities between users based on the investigated attributes. Finally, the user cluster generated from the
attributes is subjected to a rating prediction based on attribute similarity. Note that we implement the rating
prediction for five users on the selected data of each attribute.
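One way to picture the enrichment and 'On-attributes similarity' steps is to project movie ratings onto a DBpedia attribute, so that each user obtains a rating per attribute value (for example, per actor). The sketch below assumes that an attribute rating is the mean of the user's ratings over the movies carrying that value. This averaging rule, the function name `attribute_ratings`, and the toy data are illustrative assumptions and not necessarily the procedure used by the GRS-LOD model.

```python
from collections import defaultdict


def attribute_ratings(ratings, movie_attrs):
    """Project (user, movie, rating) triples onto attribute values.

    ratings     : list of (user, movie, rating) tuples
    movie_attrs : dict mapping movie -> list of attribute values (e.g. actors)
    Returns a dict mapping user -> {attribute value: mean rating}.
    """
    collected = defaultdict(lambda: defaultdict(list))
    for user, movie, rating in ratings:
        # Movies without attribute data contribute nothing (filtering step).
        for value in movie_attrs.get(movie, []):
            collected[user][value].append(rating)
    return {
        user: {value: sum(rs) / len(rs) for value, rs in per_value.items()}
        for user, per_value in collected.items()
    }


# Toy data, invented for illustration only.
ratings = [("u1", "The Matrix", 5), ("u1", "Speed", 3), ("u2", "Speed", 4)]
movie_attrs = {
    "The Matrix": ["Keanu Reeves", "Carrie-Anne Moss"],
    "Speed": ["Keanu Reeves", "Sandra Bullock"],
}
profiles = attribute_ratings(ratings, movie_attrs)
print(profiles["u1"]["Keanu Reeves"])  # 4.0
```

User profiles of this shape can then be compared attribute by attribute in the same way as ordinary movie-rating profiles.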
Fig. 2. Research framework
The GRS-LOD model produces an additional rating dataset based on the DBpedia attributes. The model is
then used to build clusters using the k-Nearest Neighbour (kNN) algorithm. This approach, which corresponds
to the second component, clusters homogeneous users into automatically detected groups. The basic
principle of neighbourhood-based clustering is to find similarities between users: each user's neighbourhood
consists of the other users who are most similar to that user. We assume that two people have comparable
interests and are similar if they rated the same movies similarly. We use cosine similarity in this study.
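The neighbourhood idea can be sketched in a few lines of Python. The example below computes cosine similarity between sparse rating vectors and returns the k most similar users for a target user. Treating unrated movies as zeros, the function names, and the toy ratings are assumptions made for illustration; how neighbourhoods are turned into final groups is not specified here and is left out.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity of two sparse rating vectors (dict: item -> rating).

    Items missing from a vector are treated as zero (an assumption).
    """
    shared = set(a) & set(b)
    dot = sum(a[i] * b[i] for i in shared)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def nearest_neighbours(target, profiles, k):
    """Return the k users most similar to `target`, most similar first."""
    scores = [
        (cosine_similarity(profiles[target], profiles[other]), other)
        for other in profiles
        if other != target
    ]
    scores.sort(key=lambda pair: pair[0], reverse=True)
    return [user for _, user in scores[:k]]


# Toy rating profiles, invented for illustration only.
profiles = {
    "u1": {"m1": 5, "m2": 4},
    "u2": {"m1": 5, "m2": 4, "m3": 1},
    "u3": {"m2": 1, "m3": 5},
}
print(nearest_neighbours("u1", profiles, k=1))  # ['u2']
```

Here `u2` is the nearest neighbour of `u1` because the two users gave identical ratings to the movies they share, whereas `u3` overlaps on only one movie and rated it very differently.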
We apply the Average (AV) (1) and Most Pleasure (MP) (2) aggregation strategies along with the profile
aggregation approach. Each strategy is briefly described in terms of the group preference for an item, the
preference of an individual user for that item, and the group itself.
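Following the usual definitions of these two strategies in the group recommender literature, AV takes the mean of the members' ratings for an item and MP takes the highest rating any member gave it. The short Python sketch below illustrates both under that assumption; the function names and the toy ratings are hypothetical, and the paper's equations (1) and (2) remain the authoritative formulation.

```python
def average_strategy(member_ratings):
    """AV: group preference is the mean of the members' ratings for an item."""
    return sum(member_ratings) / len(member_ratings)


def most_pleasure_strategy(member_ratings):
    """MP: group preference is the highest rating given by any member."""
    return max(member_ratings)


# Ratings of one item by three group members (toy values).
item_ratings = [3, 4, 5]
print(average_strategy(item_ratings))        # 4.0
print(most_pleasure_strategy(item_ratings))  # 5
```

AV balances all members' opinions, while MP favours items that at least one member rates highly, so the two strategies can rank the same items differently for the same group.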
E- Proceedings of The 5th International Multi-Conference on Artificial Intelligence Technology (MCAIT 2021) [198]
Artificial Intelligence in the 4th Industrial Revolution