Imputing Attendance Data in a Longitudinal Multilevel Panel Data Set

Publication Date: April 9, 2015
Current as of:

Introduction

This report is geared toward a research audience and presents results from a series of analyses aimed at identifying which methods for handling missing data generate the most accurate estimates of child care center attendance. This topic is important because properly linking child care dosage to developmental outcomes requires accurate data on attendance. If a substantial amount of data is missing, however, the accuracy of attendance estimates may be compromised. To address this issue, this report used data from Baby FACES to simulate data on children’s child care center attendance over the course of a year and compared different methods of handling missing attendance data. The results indicate that when data are missing on one variable and at one level only, complete case analysis produces accurate estimates of average weekly attendance, regardless of the amount or type of missingness. When estimating total yearly attendance, complete case analysis is inaccurate, but both mean replacement and multiple imputation produce reasonable estimates. A lesson learned from this exercise is that, when the desired estimates are simple, univariate descriptive statistics, single imputation techniques such as mean replacement can perform as well as more complicated techniques such as multiple imputation.
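
The contrast between the two estimands can be illustrated with a small simulation. The sketch below is not the report's analysis and does not use Baby FACES data. It assumes a hypothetical panel of 200 children observed for 52 weeks, a five-day week, and weekly records deleted completely at random at a 30 percent rate. It also assumes that complete case analysis means using only the observed weekly records, and that mean replacement fills each missing week with the overall observed weekly mean. All function names and parameter values are illustrative, and multiple imputation is not shown.

```python
import random
from statistics import mean


def simulate_attendance(n_children=200, n_weeks=52, seed=0):
    """Simulate days attended per week (0-5) for each child (hypothetical values)."""
    rng = random.Random(seed)
    panel = []
    for _ in range(n_children):
        # Assumed: each child has a personal daily attendance probability.
        p = rng.uniform(0.6, 0.95)
        weeks = [sum(rng.random() < p for _ in range(5)) for _ in range(n_weeks)]
        panel.append(weeks)
    return panel


def delete_at_random(panel, frac_missing=0.3, seed=1):
    """Replace a random share of weekly records with None (missing completely at random)."""
    rng = random.Random(seed)
    return [[None if rng.random() < frac_missing else v for v in weeks]
            for weeks in panel]


def weekly_mean_complete_case(panel):
    """Average weekly attendance using observed records only."""
    return mean(v for weeks in panel for v in weeks if v is not None)


def yearly_total_complete_case(panel):
    """Average yearly total, summing observed weeks only (missing weeks add nothing)."""
    return mean(sum(v for v in weeks if v is not None) for weeks in panel)


def yearly_total_mean_replacement(panel):
    """Average yearly total after filling missing weeks with the observed weekly mean."""
    fill = weekly_mean_complete_case(panel)
    return mean(sum(fill if v is None else v for v in weeks) for weeks in panel)


full = simulate_attendance()
observed = delete_at_random(full)

true_weekly = weekly_mean_complete_case(full)   # no missing values in `full`
true_yearly = yearly_total_complete_case(full)

cc_weekly = weekly_mean_complete_case(observed)
cc_yearly = yearly_total_complete_case(observed)
mr_yearly = yearly_total_mean_replacement(observed)

print(f"weekly mean:  true={true_weekly:.2f}  complete case={cc_weekly:.2f}")
print(f"yearly total: true={true_yearly:.1f}  complete case={cc_yearly:.1f}  "
      f"mean replacement={mr_yearly:.1f}")
```

Under these assumptions, dropping missing weeks leaves the weekly average close to its true value, because the observed weeks are a random sample of all weeks. The yearly total is understated, because every missing week contributes zero to the sum. Filling the missing weeks with the observed mean restores the total. Missingness that depends on attendance itself would require a different setup than the one sketched here.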