August 13, 2015

Overview

Measurement error and reliability are two important psychometric properties of large-scale assessments. Generalizability theory has often been used to identify sources of error and to estimate score reliability. The complicated nature of sparse-matrix data collection designs in some assessments, however, can make generalizability analyses difficult to conduct. The present study examines potential sources of measurement error associated with large-scale writing assessment scores by modeling multiple measurement components and conducting multistep analyses based on both univariate and multivariate generalizability theory. The study demonstrates how multiple generalizability analyses can be combined to produce approximate estimates of measurement error and reliability under complex measurement conditions, where no single study design can capture and disentangle all measurement facets. (A minimal illustrative sketch of a generalizability analysis follows the Related Items list below.)

Related Items

Assessing the Reliability of GMAT™ Analytical Writing Assessment
Fairness of Automated Essay Scoring of GMAT™ AWA
Use of the GMAT™ Analytical Writing Assessment: Past and Present
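To give a concrete sense of what a generalizability analysis estimates, the sketch below works through a one-facet (persons × raters) G-study in Python. The simulated scores, the fully crossed design, and the variance values are illustrative assumptions only; the report itself deals with a sparse-matrix design and multivariate generalizability theory, which this toy example does not capture.

```python
# A minimal one-facet (persons x raters) G-study sketch.
# The simulated data and the fully crossed design are assumptions for
# illustration; the report's actual design is sparse and multivariate.
import numpy as np

rng = np.random.default_rng(0)
n_p, n_r = 200, 4                      # persons, raters (assumed sizes)

# Simulate scores as person effect + rater effect + residual.
person = rng.normal(0, 1.0, size=(n_p, 1))   # person (true-score) effect
rater = rng.normal(0, 0.3, size=(1, n_r))    # rater severity effect
resid = rng.normal(0, 0.8, size=(n_p, n_r))  # interaction/residual error
scores = person + rater + resid

grand = scores.mean()
p_mean = scores.mean(axis=1, keepdims=True)
r_mean = scores.mean(axis=0, keepdims=True)

# Mean squares for the fully crossed p x r random-effects ANOVA.
ms_p = n_r * ((p_mean - grand) ** 2).sum() / (n_p - 1)
ms_r = n_p * ((r_mean - grand) ** 2).sum() / (n_r - 1)
ms_pr = ((scores - p_mean - r_mean + grand) ** 2).sum() / ((n_p - 1) * (n_r - 1))

# Variance components from the expected mean squares.
var_pr = ms_pr                          # sigma^2(pr,e)
var_p = (ms_p - ms_pr) / n_r            # sigma^2(p)
var_r = (ms_r - ms_pr) / n_p            # sigma^2(r)

# Reliability-like coefficients for an average over n_r raters:
# G (relative decisions) and Phi (absolute decisions).
g_coef = var_p / (var_p + var_pr / n_r)
phi_coef = var_p / (var_p + (var_r + var_pr) / n_r)

print(f"sigma^2(p)={var_p:.3f}  sigma^2(r)={var_r:.3f}  sigma^2(pr,e)={var_pr:.3f}")
print(f"G = {g_coef:.3f}   Phi = {phi_coef:.3f}")
```

The G coefficient treats only the person-by-rater residual as error (rater severity cancels in relative comparisons), while the Phi coefficient also counts rater main effects as error; sparse designs like the one the report describes complicate exactly this decomposition, which is why the study resorts to multiple, multistep analyses.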