MAVEN-Arg: Completing the Puzzle of All-in-One Event Understanding Dataset with Event Argument Annotation
arXiv (Cornell University) (2023)
Abstract
Understanding events in texts is a core objective of natural language understanding, which requires detecting event occurrences, extracting event arguments, and analyzing inter-event relationships. However, due to the annotation challenges brought by task complexity, a large-scale dataset covering the full process of event understanding has long been absent. In this paper, we introduce MAVEN-Arg, which augments the MAVEN datasets with event argument annotations, making it the first all-in-one dataset supporting event detection, event argument extraction (EAE), and event relation extraction. As an EAE benchmark, MAVEN-Arg offers three main advantages: (1) a comprehensive schema covering 162 event types and 612 argument roles, all with expert-written definitions and examples; (2) a large data scale, containing 98,591 events and 290,613 arguments obtained with laborious human annotation; (3) exhaustive annotation supporting all task variants of EAE, covering both entity and non-entity event arguments at the document level. Experiments indicate that MAVEN-Arg is quite challenging for both fine-tuned EAE models and proprietary large language models (LLMs). Furthermore, to demonstrate the benefits of an all-in-one dataset, we preliminarily explore a potential application, future event prediction, with LLMs. MAVEN-Arg and code can be obtained from https://github.com/THU-KEG/MAVEN-Argument.
Keywords
Named Entity Recognition