CPII system ranks FIRST in BOTH SEEN and UNSEEN Tests in the 2022 DialDoc Workshop organized by The Association for Computational Linguistics (ACL)
EVENT | JUNE | 2022
We are pleased to announce that our Principal Investigator Professor Helen Meng’s system ranked FIRST overall in the 2022 DialDoc Workshop, organized by The Association for Computational Linguistics (ACL). This year's workshop hosted the Shared Task on building open-book goal-oriented dialogue systems. The organizers first ran automatic evaluation on two tests for the many systems that entered the Challenge, and the CPII-NLP system topped both leaderboards. The organizers then selected a subset (100) of the dialogs for human evaluators to score manually, and the top 3 systems were re-ranked based on the human scores. In this re-ranking, the CPII system ranked first on the SEEN leaderboard and second on the UNSEEN leaderboard, and achieved BEST System Overall in the DialDoc@ACL 2022 Challenge!
In the Human-scored Re-ranking of the MultiDoc2Dial Challenge, Prof Helen Meng’s system ranked first overall.
This challenge provides the necessary data to contestants for free. The data include several thousand passages from over 480 US government webpages, such as those of the Department of Motor Vehicles, Veterans Affairs (va.gov), Social Security Administration (ssa.gov), and Student Aid (studentaid.gov). Contestants need to generate natural responses to questions that members of the general public are likely to ask when seeking related information. The 2022 Shared Task, MultiDoc2Dial, is a new task with a dataset on modeling goal-oriented dialogues grounded in multiple documents. The aim is to address more realistic scenarios in which a goal-oriented, information-seeking conversation involves multiple topics and hence is grounded in different documents.
Prof Meng’s team developed a dialog system and competed in the Shared Task. The evaluation comprises a SEEN test and an UNSEEN test, which differ in whether the domain of the test data was covered (by the organizers) in the training data. Out of over 500 submissions from around the world, automatic evaluation shows that Prof Meng’s system ranks FIRST in BOTH the SEEN and UNSEEN Tests.
The ranking results on the LeaderBoard:
LeaderBoard Ranking based on Automatic Evaluation Results on the SharedTask SEEN test
LeaderBoard Ranking based on Automatic Evaluation Results on the SharedTask UNSEEN test
More information can be found on the following websites: