DiaASQ: A Benchmark of Conversational Aspect-based Sentiment Quadruple Analysis

1Wuhan University,   2National University of Singapore,
3Singapore Management University  

Abstract

The rapid development of aspect-based sentiment analysis (ABSA) in recent years shows great potential for real-world applications. Current ABSA work, however, is mostly limited to the scenario of a single text piece, leaving the study of dialogue contexts unexplored. To bridge the gap between fine-grained sentiment analysis and conversational opinion mining, we introduce a novel task of conversational aspect-based sentiment quadruple analysis, namely DiaASQ, which aims to detect the target-aspect-opinion-sentiment quadruple in a dialogue. We manually construct a large-scale, high-quality DiaASQ dataset in both Chinese and English. We further develop a neural model to benchmark the task, which performs end-to-end quadruple prediction and incorporates rich dialogue-specific and discourse feature representations for better cross-utterance quadruple extraction. We hope the new benchmark will spur more advancements in the sentiment analysis community.

Introduction

Aspect-based Sentiment Analysis (ABSA) has been a hot topic in natural language processing in recent years. Existing ABSA research primarily deals with single text pieces, such as reviews and comments, yet there is immense, largely untapped potential in applying ABSA to dialogue contexts. For instance, successfully deploying ABSA on social media platforms could yield profound insights into people's sentiments and perspectives as they discuss diverse topics, be it products, services, or political matters. Regrettably, ABSA in dialogue contexts remains insufficiently explored.
Figure 1: Illustration of the DiaASQ task.
To address this gap, and inspired by previous ABSA quadruple research, we propose a new task called Conversational Aspect-based Sentiment Quadruple Analysis (DiaASQ). The task analyzes multi-party dialogues to identify and extract aspect-level sentiment quadruples, each of which captures an opinion expressed about a specific aspect of a particular target, together with its sentiment polarity. For example, in Figure 1, we aim to extract quadruples such as ('Xiaomi 11', 'WiFi module', 'bad design', 'negative'), ('Xiaomi 11', 'battery life', 'not well', 'negative'), and ('Xiaomi 6', 'screen quality', 'very nice', 'positive') from the dialogue.
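To make the output format concrete, below is a minimal Python sketch of the quadruple structure; the class and field names are our illustrative choices, not the dataset's official schema.

from dataclasses import dataclass
from typing import Literal

@dataclass
class SentimentQuadruple:
    target: str    # the entity under discussion, e.g., "Xiaomi 11"
    aspect: str    # the attribute of the target, e.g., "battery life"
    opinion: str   # the opinion expression, e.g., "not well"
    polarity: Literal["positive", "negative", "neutral"]

# The quadruples extracted from the Figure 1 example:
quads = [
    SentimentQuadruple("Xiaomi 11", "WiFi module", "bad design", "negative"),
    SentimentQuadruple("Xiaomi 11", "battery life", "not well", "negative"),
    SentimentQuadruple("Xiaomi 6", "screen quality", "very nice", "positive"),
]

Note that the target, aspect, and opinion terms may appear in different utterances of the dialogue, which is what makes cross-utterance extraction necessary.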


Data Annotation

Figure 2: Data acquisition process.
We construct a new dataset to facilitate the DiaASQ task. The raw corpus is collected from Weibo, the largest Chinese social media platform. We crawl nine million posts and comments from the tweet histories of 100 verified digital bloggers. Each conversation is derived from a root post, and multiple users (i.e., multiple speakers) reply to predecessor posts, so each multi-thread, multi-turn dialogue forms a tree structure, as illustrated in Figure 2. We preprocess the raw dialogues to obtain coherent contexts, as sketched below. First, we filter for topic-related conversations using a manually created keyword dictionary for the mobile phone domain, which includes hundreds of hot words, such as phone brand names and aspect words describing a mobile phone. Then, we normalize problematic language (e.g., abusive language, hate speech) through human examination or by consulting lexicons, and we prune away meaningless reply branches that deviate too far from the main topic. We also limit each dialogue to at most ten utterances for more controllable modeling. After this strict cleaning procedure, we obtain the final 1,000 dialogues. We then employed well-trained annotators to label the quadruples in the dialogues via crowd-sourcing platforms. The detailed statistics of the dataset are shown in Table 1.
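As a rough illustration of the topic filtering and branch pruning described above, here is a simplified Python sketch; the keyword set, tree structure, and function names are our own illustrative assumptions rather than the actual cleaning scripts.

from dataclasses import dataclass, field
from typing import List, Optional

# Tiny stand-in for the manually created keyword dictionary.
PHONE_KEYWORDS = {"xiaomi", "battery", "screen", "camera"}

@dataclass
class Utterance:
    text: str
    replies: List["Utterance"] = field(default_factory=list)

def is_on_topic(utt: Utterance) -> bool:
    # Keep an utterance if it mentions any domain keyword.
    text = utt.text.lower()
    return any(kw in text for kw in PHONE_KEYWORDS)

def prune(utt: Utterance) -> Optional[Utterance]:
    # Drop reply branches that contain no topic-related utterance.
    utt.replies = [r for r in (prune(r) for r in utt.replies) if r is not None]
    return utt if is_on_topic(utt) or utt.replies else None

In practice the real pipeline also normalizes problematic language and caps each dialogue at ten utterances; the sketch only shows the tree pruning step.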


Table 1: Dataset statistics.

Methods

Figure 4: Framework of the proposed model.
We propose an end-to-end framework for DiaASQ extraction. As shown in Figure 4, the extraction process consists of four main steps. First, we encode the original dialogue text with a pretrained language model. Next, we introduce a multi-view interaction mechanism that enhances the interaction among utterance pairs from the same speaker, within the same thread, and with reply relationships, thereby better exploiting the dialogue's structural information. Simultaneously, we incorporate Rotary Position Embedding (RoPE) to enrich the relative positional information between token pairs. Finally, we design a novel grid encoding mechanism that maps the original quadruples into three token-pair relation matrices; by fitting these relation matrices and predicting their labels, we can encode and then decode the DiaASQ quadruples.
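To illustrate the multi-view interaction mechanism, here is a hedged Python sketch of how the three utterance-level views (same speaker, same thread, reply link) can be materialized as boolean masks that gate pairwise interaction; the function name, input format, and masking granularity are our illustrative assumptions, not the released implementation.

import torch

def build_utterance_masks(speakers, threads, reply_to):
    # speakers[i]: speaker id of utterance i; threads[i]: thread id of utterance i;
    # reply_to[i]: index of the utterance that i replies to (-1 for the root).
    n = len(speakers)
    same_speaker = torch.zeros(n, n, dtype=torch.bool)
    same_thread = torch.zeros(n, n, dtype=torch.bool)
    reply = torch.zeros(n, n, dtype=torch.bool)
    for i in range(n):
        for j in range(n):
            same_speaker[i, j] = speakers[i] == speakers[j]
            same_thread[i, j] = threads[i] == threads[j]
            reply[i, j] = (reply_to[i] == j) or (reply_to[j] == i) or (i == j)
    return same_speaker, same_thread, reply

# Example: four utterances, a root (0) answered by two threads.
spk_mask, thr_mask, rep_mask = build_utterance_masks(
    speakers=[0, 1, 2, 1],
    threads=[0, 1, 1, 2],
    reply_to=[-1, 0, 1, 0],
)

Each mask would then restrict attention (or token-pair scoring) to its own view, and the three view-specific representations are combined before the grid-based decoding.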

Experiment

Main results. The performances on the Chinese and English datasets are shown below:


Main experiment results.

Additional analyses can be found in the paper.


Poster

BibTeX

@inproceedings{li2023diaasq,
  title     = {DiaASQ: A Benchmark of Conversational Aspect-based Sentiment Quadruple Analysis},
  author    = {Bobo Li and Hao Fei and Fei Li and Yuhan Wu and Jinsong Zhang and Shengqiong Wu and Jingye Li and Yijiang Liu and Lizi Liao and Tat-Seng Chua and Donghong Ji},
  booktitle = {Findings of the Association for Computational Linguistics: ACL 2023},
  year      = {2023},
}