Question Generation (QG), as a challenging Natural Language Processing task, aims at generating questions based on given answers and context.

Existing QG methods mainly focus on building or training models for specific QG datasets. These works are subject to two major limitations:

(1) They are dedicated to specific QG formats (e.g., answer-extraction or multi-choice QG), so addressing a new QG format requires redesigning the QG model.

(2) They achieve optimal performance only on the dataset they were trained on. As a result, we have to train and maintain separate QG models for different QG datasets, which is resource-intensive and does not generalize.

To solve these problems, we propose a model named Unified-QG based on lifelong learning techniques, which can continually learn QG tasks across different datasets and formats. Specifically, we first build a format-convert encoding to transform different kinds of QG formats into a unified representation.
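As a rough illustration of what a unified representation could look like, the sketch below flattens examples from different QG formats into a single text string that one sequence-to-sequence model could consume. The function name `encode_qg_input`, the field labels, and the separator are assumptions made for this illustration; the abstract does not specify the actual encoding used by Unified-QG.

```python
from typing import Optional, Sequence


def encode_qg_input(
    qg_format: str,
    context: str,
    answer: str,
    options: Optional[Sequence[str]] = None,
) -> str:
    """Flatten one QG example into a single string (hypothetical layout)."""
    # The format tag tells the model which kind of question is expected.
    parts = [f"format: {qg_format}", f"answer: {answer}"]
    # Only multi-choice examples carry candidate options.
    if options:
        parts.append("options: " + " | ".join(options))
    parts.append(f"context: {context}")
    return " ".join(parts)


passage = "Paris is the capital of France."
extractive = encode_qg_input("answer-extraction", passage, "Paris")
boolean = encode_qg_input("boolean", passage, "yes")
multi = encode_qg_input("multi-choice", passage, "Paris", options=["Paris", "Lyon"])
```

Because every format ends up as plain text with the same field order, a new format would only need a new tag and, at most, an extra field rather than a new model architecture.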

Then, a method named STRIDER (SimilariTy RegularIzed Difficult Example Replay) is built to alleviate catastrophic forgetting in continual QG learning. Extensive experiments were conducted on 8 QG datasets across 4 QG formats (answer-extraction, answer-abstraction, multi-choice, and boolean QG) to demonstrate the effectiveness of our approach.
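The replay idea can be pictured with a minimal sketch: after training on one dataset, keep a small buffer of its hardest examples and mix them into training on later datasets. The sketch assumes that difficulty is measured by per-example loss and that the helper `select_replay_examples` exists only for this illustration. It does not show STRIDER's similarity regularization, which the abstract does not define.

```python
import heapq
from typing import List, Tuple


def select_replay_examples(scored: List[Tuple[float, str]], k: int) -> List[str]:
    """Return the k examples with the highest loss (assumed difficulty proxy)."""
    # Each item is (loss, example); a higher loss is treated as more difficult.
    hardest = heapq.nlargest(k, scored, key=lambda pair: pair[0])
    return [example for _, example in hardest]


# Hypothetical per-example losses recorded after training on an earlier dataset.
scored = [(0.2, "easy example"), (1.7, "hard example"), (0.9, "medium example")]
replay_buffer = select_replay_examples(scored, k=2)
```

The selected examples would then be interleaved with batches from the next dataset so that the model keeps seeing the cases it is most likely to forget.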

Experimental results demonstrate that our Unified-QG can effectively and continually adapt to QG tasks when datasets and formats vary. In addition, we verify that a single trained Unified-QG model can improve the performance of 8 Question Answering (QA) systems by generating synthetic QA data.

This session will also be conducted online via Zoom. For the in-person option, please come along to the Liveris Building, 46-442.


Dr Rocky Chen


Mr Wei Yuan

Wei Yuan is a first-year PhD candidate within the Data Science (DAS) group under the supervision of A/Prof. Hongzhi Yin and Dr. Miao Xu. He received his master's degree in software engineering from Nanjing University, China, in 2021. His research interests include machine learning, natural language processing, and recommender systems.


About Data Science Seminar

This seminar series will be run as weekly sessions and is hosted by ITEE Data Science.


46-442 or via Zoom
