Monday, September 27, 2021

Inconsistency in Conference Peer Review: Revisiting the 2014 NeurIPS Experiment (Paper Explained)


#neurips #peerreview #nips The peer-review system at Machine Learning conferences has come under much criticism over the last years. One major driver was the infamous 2014 NeurIPS experiment, where a subset of papers were given to two different sets of reviewers. This experiment showed that only about half of all accepted papers were consistently accepted by both committees and demonstrated significant influence of subjectivity. This paper revisits the data from the 2014 experiment and traces the fate of accepted and rejected papers during the 7 years since, and analyzes how well reviewers can assess future impact, among other things. OUTLINE: 0:00 - Intro & Overview 1:20 - Recap: The 2014 NeurIPS Experiment 5:40 - How much of reviewing is subjective? 11:00 - Validation via simulation 15:45 - Can reviewers predict future impact? 23:10 - Discussion & Comments Paper: https://ift.tt/3EFUTgS Code: https://ift.tt/3iaCsYj Abstract: In this paper we revisit the 2014 NeurIPS experiment that examined inconsistency in conference peer review. We determine that 50% of the variation in reviewer quality scores was subjective in origin. Further, with seven years passing since the experiment we find that for accepted papers, there is no correlation between quality scores and impact of the paper as measured as a function of citation count. We trace the fate of rejected papers, recovering where these papers were eventually published. For these papers we find a correlation between quality scores and impact. We conclude that the reviewing process for the 2014 conference was good for identifying poor papers, but poor for identifying good papers. We give some suggestions for improving the reviewing process but also warn against removing the subjective element. Finally, we suggest that the real conclusion of the experiment is that the community should place less onus on the notion of top-tier conference publications when assessing the quality of individual researchers. For NeurIPS 2021, the PCs are repeating the experiment, as well as conducting new ones. Authors: Corinna Cortes, Neil D. Lawrence Links: TabNine Code Completion (Referral): http://bit.ly/tabnine-yannick YouTube: https://www.youtube.com/c/yannickilcher Twitter: https://twitter.com/ykilcher Discord: https://ift.tt/3dJpBrR BitChute: https://ift.tt/38iX6OV Minds: https://ift.tt/37igBpB Parler: https://ift.tt/38tQU7C LinkedIn: https://ift.tt/3qcgOFy BiliBili: https://ift.tt/3mfyjkW If you want to support me, the best thing to do is to share out the content :) If you want to support me financially (completely optional and voluntary, but a lot of people have asked for this): SubscribeStar: https://ift.tt/2DuKOZ3 Patreon: https://ift.tt/390ewRH Bitcoin (BTC): bc1q49lsw3q325tr58ygf8sudx2dqfguclvngvy2cq Ethereum (ETH): 0x7ad3513E3B8f66799f507Aa7874b1B0eBC7F85e2 Litecoin (LTC): LQW2TRyKYetVC8WjFkhpPhtpbDM4Vw7r9m Monero (XMR): 4ACL8AGrEo5hAir8A9CeVrW8pEauWvnp1WnSDZxW7tziCDLhZAGsgzhRQABDnFy8yuM9fWJDviJPHKRjV4FWt19CJZN9D4n

No comments:

Post a Comment