Title2Event

What is Title2Event?

Title2Event is a large-scale open event extraction dataset of human-annotated Chinese news titles. It contains more than 42,000 news titles in 34 topics collected from Chinese web pages. The dataset was collected by researchers at Harbin Institute of Technology and QQ Browser Search.

For more details, please refer to our EMNLP 2022 paper:

(deng-etal-2022-title2event)

Quick Start

Title2Event is distributed under a CC BY-SA 4.0 License. The dataset can be downloaded from the links below:

Baidu Netdisk
Google Drive

For the baseline code, please refer to our GitHub repository.

baseline repo

If you want your results to appear on the official leaderboard, please read the following guideline.

Leaderboard Guideline

Citation

If you use Title2Event in your research, please cite our paper.

@inproceedings{deng-etal-2022-title2event,
    title = "{T}itle2{E}vent: Benchmarking Open Event Extraction with a Large-scale {C}hinese Title Dataset",
    author = "Deng, Haolin  and
      Zhang, Yanan  and
      Zhang, Yangfan  and
      Ying, Wangyang  and
      Yu, Changlong  and
      Gao, Jun  and
      Wang, Wei  and
      Bai, Xiaoling  and
      Yang, Nan  and
      Ma, Jin  and
      Chen, Xiang  and
      Zhou, Tianhua",
    booktitle = "Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing",
    month = dec,
    year = "2022",
    address = "Abu Dhabi, United Arab Emirates",
    publisher = "Association for Computational Linguistics",
    url = "https://aclanthology.org/2022.emnlp-main.437",
    pages = "6511--6524",
    abstract = "Event extraction (EE) is crucial to downstream tasks such as new aggregation and event knowledge graph construction. Most existing EE datasets manually define fixed event types and design specific schema for each of them, failing to cover diverse events emerging from the online text. Moreover, news titles, an important source of event mentions, have not gained enough attention in current EE research. In this paper, we present Title2Event, a large-scale sentence-level dataset benchmarking Open Event Extraction without restricting event types. Title2Event contains more than 42,000 news titles in 34 topics collected from Chinese web pages. To the best of our knowledge, it is currently the largest manually annotated Chinese dataset for open event extraction. We further conduct experiments on Title2Event with different models and show that the characteristics of titles make it challenging for event extraction, addressing the significance of advanced study on this problem. The dataset and baseline codes are available at https://open-event-hub.github.io/title2event.",
}

Leaderboard

Method	Trigger Precision	Trigger Recall	Trigger F1	Argument Precision	Argument Recall	Argument F1	Triplet Precision	Triplet Recall	Triplet F1
EventGLM_gwn	70.4	70.7	70.5	58.5	58.3	58.4	50	50.2	50.2
ST-Seq2SeqMRC	-	-	-	57.9	58.6	58.2	49.8	50.1	49.9
ST-SpanMRC	-	-	-	60.1	54.9	57.4	44.5	44.8	44.7
SeqTag	69.5	69.9	69.7	50.8	51.2	51	41.1	41.3	41.2
Unsuper	21	32	25.4	12	15.5	13.5	4.5	6.8	5.4
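
To help read the table, here is a minimal sketch of how an F1 value relates to its precision and recall columns. It assumes that F1 is the standard harmonic mean of precision and recall and that the scores are percentages. The page does not describe the official scoring script, so the function name `f1_score` is illustrative only and is not part of the baseline repository.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (assumed definition of the F1 columns)."""
    if precision + recall == 0:
        # Avoid division by zero when a system predicts nothing correctly.
        return 0.0
    return 2 * precision * recall / (precision + recall)


# SeqTag, trigger extraction row: precision 69.5, recall 69.9
print(round(f1_score(69.5, 69.9), 1))  # prints 69.7, matching the table
```

Leaderboard entries were presumably computed from unrounded precision and recall, so recomputing F1 from the rounded columns can differ from the listed value in the last digit.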