In another paper, called “Right for the Wrong Reasons,” Linzen and his coauthors published evidence that BERT’s high performance on certain GLUE tasks might also be attributed to spurious cues in the training data for those tasks. (The paper included an alternative data set designed specifically to expose the kind of shortcut that Linzen suspected BERT was using on GLUE. The data set’s name: Heuristic Analysis for Natural-Language-Inference Systems, or HANS.)
So is BERT, and all of its benchmark-busting siblings, essentially a sham?
Bowman agrees with Linzen that some of GLUE’s training data is messy, shot through with subtle biases introduced by the people who created it, all of them potentially exploitable by a powerful BERT-based neural network. “There’s no single ‘cheap trick’ that will let it solve everything [in GLUE], but there are lots of shortcuts it can take that will really help,” Bowman said, “and the model can pick up on those shortcuts.” But he doesn’t think BERT’s foundation is built on sand, either.