Can Generative AI and ChatGPT Outperform Humans on Cognitive-Demanding Problem-Solving Tasks in Science?
This study aimed to examine an assumption regarding whether generative artificial intelligence (GAI) tools can overcome the cognitive intensity that humans suffer when solving problems. We examine the performance of ChatGPT and GPT-4 on NAEP science assessments and compare their performance to students by cognitive demands of the items. Fifty-four 2019 NAEP science assessment tasks were coded by content experts using a two-dimensional cognitive load framework, including task cognitive complexity and dimensionality.
This study aimed to examine an assumption regarding whether generative artificial intelligence (GAI) tools can overcome the cognitive intensity that humans suffer when solving problems. We examine the performance of ChatGPT and GPT-4 on NAEP science assessments and compare their performance to students by cognitive demands of the items.