This project builds upon the prior work by creating problem-solving measures for grades 3-5. The elementary assessments will be connected to the middle-grades assessments and will be available for use by school districts, researchers, and other education professionals seeking to effectively measure children's problem solving. The aims of the project are to (a) create three new mathematical problem-solving assessments and gather validity evidence for their use, (b) link the problem-solving measures (PSMs) with prior problem-solving measures (i.e., PSM6, PSM7, and PSM8), and (c) develop a meaningful reporting system for the PSMs.

# Developing and Evaluating Assessments of Problem Solving (Collaborative Research: Bostic)

Current state standards in mathematics are strategically focused on problem-solving skills in both content standards and practice standards. Content standards describe what math students are expected to learn at each grade level, while practice standards characterize math behaviors that all students should experience (e.g., perseverance while problem solving and reasoning effectively about real-world situations). Problem solving is found at every grade level. If math teachers are expected to engage students in problem solving during everyday instruction, then students' problem-solving performance must be assessed in a manner that produces meaningful, valid, and reliable scores without unduly burdening teachers or students. Unfortunately, most problem-solving assessments are framed by a set of mathematics expectations that differ from state standards. Thus, results from those assessments are disconnected from the mathematics content that students learn in the classroom. Previously, this research team built problem-solving measures for grades 6-8 that address this gap in framing, generate meaningful, valid, and reliable scores, and do not have unintended negative consequences for students. The current project, titled Developing and Evaluating Assessments of Problem Solving (DEAP), builds upon the team's prior work by creating problem-solving measures for grades 3-5. The elementary assessments will be connected to the middle-grades assessments and will be available for use by school districts, researchers, and other education professionals seeking to effectively measure children's problem solving.

Broadly speaking, the aims of DEAP are to (a) create three new mathematical problem-solving assessments and gather validity evidence for their use, (b) link the problem-solving measures (PSMs) with prior problem-solving measures (i.e., PSM6, PSM7, and PSM8), and (c) develop a meaningful reporting system for the PSMs. The research questions are: (a) What are the psychometric properties of the PSM3, PSM4, and PSM5 as they relate to students' problem-solving performance? (b) How does the evidence support vertical equating (linking) of the PSM3, PSM4, PSM5, PSM6, PSM7, and PSM8? (c) How do the PSM3, PSM4, and PSM5 and their related reporting systems impact teachers' instructional decision making when used formatively? Year 1 focuses on item and test development. The study will conduct cognitive interviews and administer tests to a small group of students to explore how items and tests function. Rasch (1-PL) measurement will be employed, as in prior PSM development. Year 2 includes further pilot testing and gathering validity evidence through cognitive interviews and test administration. Year 3 includes a final round of pilot testing and the selection of linking items for vertical equating. Year 4 involves pilot testing the PSM series with linking items and developing a reporting system. DEAP's potential contributions to the field are three-fold. (1) Assessments will be available for use by the public. (2) A set of vertically equated problem-solving measures will allow users to explore students' problem-solving performance as they progress across grade levels, which is currently not possible at the state or national level. (3) This project fills a need in the field, as no set of measures uses vertical equating to assess elementary students' problem-solving performance in a rigorous fashion within the context of state testing.
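
For readers unfamiliar with the measurement approach named above, the following is a minimal illustrative sketch, not the project's actual analysis. It shows the standard Rasch (1-PL) response probability and one simple way that linking items can place two test forms on a common scale. The abstract does not specify a linking method; the mean-mean shift used here, the function names, and the item identifiers are all assumptions made for illustration.

```python
import math


def rasch_probability(ability, difficulty):
    """Probability of a correct response under the Rasch (1-PL) model.

    Both arguments are in logits; the probability depends only on
    their difference.
    """
    return 1.0 / (1.0 + math.exp(-(ability - difficulty)))


def linking_constant(base_difficulties, new_difficulties):
    """Mean-mean linking shift from common (linking) items.

    Assumption: this simple method is illustrative only; the source
    does not say which linking procedure the project uses.
    Each dict maps a hypothetical item id to its difficulty (logits)
    as estimated on that form's own scale.
    """
    common = sorted(set(base_difficulties) & set(new_difficulties))
    if not common:
        raise ValueError("no linking items shared between the two forms")
    diffs = [base_difficulties[i] - new_difficulties[i] for i in common]
    return sum(diffs) / len(diffs)
```

Adding the returned constant to measures estimated on the new form expresses them on the base form's scale, which is the basic idea behind comparing performance across vertically linked tests.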