Numeric relational reasoning and spatial reasoning are critical to success in later mathematics coursework, including Algebra 1, a gatekeeper to success at the post-secondary level, and to success in other STEM domains, such as chemistry, geology, biology, and engineering. Given the importance of these skills for later success, it is imperative that high-quality screening tools be available to identify students at risk for difficulty in these areas. The primary aim of this study is to develop, over the course of four years, mathematics screening assessment tools for Grades K-2 that measure students' abilities in numeric relational reasoning and spatial reasoning. The research team will develop the Measures of Mathematical Reasoning Skills system, which will contain Tests of Numeric Relational Reasoning (T-NRR) and Tests of Spatial Reasoning (T-SR). The measures are intended for use by teachers and school systems to screen students, including students with disabilities, and determine who is at risk for difficulty in early mathematics. The measures will help provide important information about the intensity of support that a given student may need. Three forms per grade level will be developed for both the T-NRR and the T-SR, with accompanying validity and reliability evidence collected. The Discovery Research K-12 program (DRK-12) seeks to significantly enhance the learning and teaching of science, technology, engineering and mathematics (STEM) by preK-12 students and teachers, through research and development of innovative resources, models and tools (RMTs). Projects in the DRK-12 program build on fundamental research in STEM education and prior research and development efforts that provide theoretical and empirical justification for proposed projects.

The development of the T-NRR and T-SR measures will follow an iterative process across five phases: (1) refining the construct; (2) developing test specifications and item models; (3) developing items; (4) field testing the items; and (5) conducting validity studies. The evidence collected and evaluated during each phase will contribute to the overall evaluation of the reliability of the measures and the validity of the interpretations made using the measures. Item models, test specifications, and items will be continuously evaluated and refined based on data from cognitive interviews, field tests, and reviews by mathematics educators, teachers of struggling students, teachers of culturally and linguistically diverse populations, and a Technical Advisory Board. In the final phase of development of the T-NRR and T-SR, the reliability of the results will be estimated, and multiple sources of validity evidence will be collected to examine concurrent and predictive relations with other criterion measures, classification accuracy, and sensitivity to growth. Approximately 4,500 students in Grades K-2 will be involved across the phases of the research, including field tests and cognitive interviews. Data will be analyzed using a two-parameter item response theory (IRT) model to ensure item and test form comparability.
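
As a brief illustration of the two-parameter IRT model mentioned above, the sketch below computes the probability of a correct response in the standard two-parameter logistic (2PL) form, P = 1 / (1 + exp(-a(theta - b))), where theta is student ability, a is item discrimination, and b is item difficulty. The project description does not specify a parameterization or software, so the function name and the item parameter values here are hypothetical and chosen only for illustration.

```python
import math


def p_correct_2pl(theta: float, a: float, b: float) -> float:
    """Probability of a correct response under the standard 2PL model.

    theta: student ability, a: item discrimination, b: item difficulty.
    """
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))


# Hypothetical item parameters (discrimination, difficulty); not project data.
items = {"item_A": (1.2, -0.5), "item_B": (0.8, 0.5)}

if __name__ == "__main__":
    for name, (a, b) in items.items():
        # Evaluate each item at three ability levels.
        probs = [round(p_correct_2pl(theta, a, b), 3) for theta in (-1.0, 0.0, 1.0)]
        print(name, probs)
```

Under this form, two items with similar estimated a and b values behave similarly across the ability range, which is one way item and form comparability might be examined.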