Critical evaluation underpins the practices of science. In a three-year classroom-based research project, we developed and tested instructional scaffolds for Earth science content in which students evaluate lines of evidence with respect to alternative explanations of scientific phenomena (climate change, fracking and earthquakes, wetlands and land use, and formation of Earth’s Moon). The present paper documents a quasi-experimental study in which high school Earth science students completed these instructional scaffolds, including an explanation task scored for evaluative levels (erroneous, descriptive, relational, and critical), along with measures of plausibility reappraisal and knowledge. Repeated-measures analyses of variance revealed significant increases in plausibility and knowledge scores for students completing scaffolds that promoted evaluations of the connections between lines of evidence and two alternative explanations, whereas scaffolds that promoted evaluations of the connections between lines of evidence and only one alternative yielded no change in scores. A structural equation model suggests that students’ evaluation may influence post-instructional plausibility and knowledge. The results of this study demonstrate that students’ active evaluation of scientific alternatives and explicit reappraisal of plausibility judgments can support deeper learning of Earth science content.