Jesse Dunietz was surprised by what he encountered at the 58th Annual Meeting of the Association for Computational Linguistics. For decades, natural-language processing (NLP), the branch of AI that specializes in building systems that analyze human language, has measured the ability of those systems through benchmark data sets.
Much of today’s reading comprehension research entails carefully tweaking models to eke out a few more percentage points on the latest data sets. “State of the art” has practically become a proper noun: “We beat SOTA on SQuAD by 2.4 points!”
But during this year’s meeting, Dunietz felt something different.
Attendees’ conversations were unusually introspective about the field’s core methods and objectives. Papers in this year’s new “Theme” track asked questions like: Are current methods really enough to achieve the field’s ultimate goals? And what exactly are those goals?
For Dunietz and his colleagues, the field needs a transformation, “not just in system design, but in a less glamorous area: evaluation.”
More details about this over at Technology Review.
(Image Credit: insspirito/Pixabay)