
Length-Controlled Abstractive Summarization Based on Summary Output Area Using Transfer Learning


Sunusi Yusuf Yahaya a,*, Nazlia Omar b, Lailatul Qadri Zakaria c

a,b,c Faculty of Information Science and Technology, Universiti Kebangsaan Malaysia, 43600 UKM Bangi, Selangor, Malaysia
* Email: yusufsunusi63@gmail.com

        Abstract

Recent state-of-the-art abstractive summarization models based on encoder-decoder architectures generate exactly one summary per source text. Length-controlled summarization is an important aspect of practical applications such as filling summary slots on newspaper or magazine covers. Some studies on length-controllable abstractive summarization use length embeddings in the decoder module to control the summary length, while others use a word-level extractive module in the encoder-decoder model. Although length embeddings can control where decoding stops, they cannot determine which information should be included in the summary within the length constraint. Specifying an exact summary length can be helpful, but it does not suffice when the requirement is to fit the summary into a specific slot or area. In contrast to previous models, this paper proposes a length-controllable abstractive summarization model that incorporates an image-processing phase to determine the area of the summary output slot before generating the abstractive summary. The proposed model uses the T5 transfer-learning model to generate summaries that fit the slot exactly. It generates a summary in three steps. First, it uses OpenCV to determine the area of the given output slot where the summary will be displayed, for example a newspaper cover slot. Second, the area is used to derive the minimum and maximum summary lengths, which are passed to the T5 model to generate an abstractive summary that fits the output slot. Finally, a self-attention mechanism is incorporated into the model to enhance the quality of the generated length-controlled abstractive summary. Experiments on the CNN/Daily Mail dataset show that the proposed model successfully performs length-controlled summarization based on the computed summary output area.

        Keywords: Natural Language Processing; Abstractive Text Summarization; Computer Vision; Summary Length Control.

        1.  Introduction

   In recent years, there has been great demand for the use of data obtained from a variety of sources, including scientific literature, medical reports, and social networks. Text summarization is the process of generating a brief, fluent summary of a longer text document. Constraining summary length, while largely neglected in the past, is an important aspect of abstractive summarization. For example, given the same input document, if the summary is to be displayed on a mobile device or within a fixed advertisement slot on a website, editors may want to produce a much shorter summary. Unfortunately, most existing abstractive summarization models are not trained to respond to summary length constraints (Yizhu et al., 2018).
   Fan et al. (2017), who apply a convolutional sequence-to-sequence model to multi-sentence summarization, encode length ranges as special markers that are predefined and fixed. Unfortunately, this approach cannot generate summaries of arbitrary lengths; it only generates summaries within predefined length ranges and thus meets length constraints only approximately. Miao and Blunsom (2016) extended the seq2seq framework and proposed a generative model to capture latent summary information, but they did not consider the recurrent dependencies in their generative model, which limits its representation ability. Magazine and newspaper editors often require a summary that fits into a specific slot on a cover, yet the state-of-the-art models discussed above do not address output-area-based summarization. For example, given a long newspaper story that must be summarized to fit a portion of the newspaper cover, previous work cannot produce such a summary, since it is all based on a specified number of summary words.
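   The area-to-length step of the proposed pipeline (slot area in, decoder length bounds out) can be illustrated with a minimal sketch. The constants and the helper name below are illustrative assumptions for exposition, not the authors' implementation; in the actual model the slot area would come from an OpenCV measurement (e.g. of a cover slot image) and the returned bounds would be supplied to the T5 decoder as its minimum and maximum generation lengths.

```python
def area_to_length_bounds(slot_area_px, px_per_char=120.0, chars_per_token=5.0):
    """Map a summary slot area (in pixels) to (min_tokens, max_tokens).

    px_per_char and chars_per_token are assumed constants: the average
    pixel footprint of one rendered character at the target font size,
    and the average number of characters per subword token.
    """
    est_chars = slot_area_px / px_per_char      # characters that fit in the slot
    est_tokens = est_chars / chars_per_token    # rough token budget
    # Allow +/-10% slack so the decoder has room to end a sentence cleanly.
    min_tokens = max(1, int(est_tokens * 0.9))
    max_tokens = max(min_tokens, int(est_tokens * 1.1))
    return min_tokens, max_tokens

# Example: a 60,000 px^2 slot yields a budget of roughly 100 tokens.
print(area_to_length_bounds(60000.0))  # -> (90, 110)
```

Under these assumptions, the bound computation is independent of the summarizer, so the same helper could drive any length-controllable decoder, not only T5.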






        E- Proceedings of The 5th International Multi-Conference on Artificial Intelligence Technology (MCAIT 2021)   [162]
        Artificial Intelligence in the 4th Industrial Revolution