Comparing Single-Level And Multi-Level Analyses With Complex Sampling Weights

Abstract

Large-scale assessment (LSA) data are increasingly used to inform education policy and practice worldwide. Because LSAs adopt complex sampling designs with unequal selection probabilities, the use of sampling weights is essential. However, research evidence is scarce, and uncertainty remains about how sampling weights should be used and which approaches are preferable. This study draws on data from the Early Childhood Longitudinal Study–Kindergarten, the Programme for International Student Assessment, and the Trends in International Mathematics and Science Study to compare seven approaches involving weighting, scaling, and modeling in the context of LSA data. Simulation results indicate that multi-level models without sampling weights have the smallest root mean square error (RMSE), whereas single-level models with overall sampling weights have the largest RMSE. Empirical findings suggest that, overall, weighting, scaling, and modeling choices do not substantially affect statistical significance. Both simulation and empirical analyses reveal that three approaches—multi-level models with school sampling weights, and multi-level models with two-level sampling weights using either size scaling or effective scaling—perform similarly. Discussion and practical recommendations are provided.

Department(s)

Psychological Science

Keywords and Phrases

complex sampling weights; large-scale assessment (LSA); Monte Carlo simulation; multi-level models; single-level models

International Standard Serial Number (ISSN)

1934-5739; 1934-5747

Document Type

Article - Journal

Document Version

Citation

File Type

text

Language(s)

English

Rights

© 2025 Taylor and Francis Group; Routledge. All rights reserved.

Publication Date

01 Jan 2025
