Estimating Incremental Validity Under Missing Data

Dustin A. Fife, Jorge L. Mendoza, Christopher M. Berry

Research output: Contribution to journal › Article › peer-review


A common form of missing data is caused by selection on an observed variable (e.g., Z). If the selection variable was measured and is available, the data are regarded as missing at random (MAR). Selection biases correlation, reliability, and effect size estimates when these estimates are computed on listwise deleted (LD) data sets. On the other hand, maximum likelihood (ML) estimates are generally unbiased and outperform LD in most situations, at least when the data are MAR. The exception is when we estimate the partial correlation. In this situation, LD estimates are unbiased when the cause of missingness is partialled out. In other words, there is no advantage of ML estimates over LD estimates in this situation. We demonstrate that under a MAR condition, even ML estimates may become biased, depending on how partial correlations are computed. Finally, we conclude with recommendations about how future researchers might estimate partial correlations even when the cause of missingness is unknown and, perhaps, unknowable.
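The claim that listwise-deleted partial correlations stay unbiased once the cause of missingness is partialled out can be illustrated with a small simulation. The sketch below is not the authors' procedure: it assumes trivariate normal data, an arbitrary illustrative correlation matrix, and a hard cutoff on Z as the selection rule, and the helper name `partial_corr` is hypothetical. It compares the simple correlation and the partial correlation before and after selection.

```python
import numpy as np

def partial_corr(x, y, z):
    """First-order partial correlation of x and y, with z partialled out."""
    r_xy = np.corrcoef(x, y)[0, 1]
    r_xz = np.corrcoef(x, z)[0, 1]
    r_yz = np.corrcoef(y, z)[0, 1]
    return (r_xy - r_xz * r_yz) / np.sqrt((1 - r_xz**2) * (1 - r_yz**2))

rng = np.random.default_rng(0)

# Assumed population correlations among X, Y, Z (illustrative values only).
corr = np.array([[1.0, 0.5, 0.6],
                 [0.5, 1.0, 0.4],
                 [0.6, 0.4, 1.0]])
n = 200_000
x, y, z = rng.multivariate_normal(np.zeros(3), corr, size=n).T

# Direct selection on the observed variable Z: cases with Z <= 0 are
# treated as missing on X and Y, so listwise deletion keeps only Z > 0.
keep = z > 0.0

r_full = np.corrcoef(x, y)[0, 1]              # simple correlation, all cases
r_ld = np.corrcoef(x[keep], y[keep])[0, 1]    # simple correlation, after LD
pr_full = partial_corr(x, y, z)               # partial correlation, all cases
pr_ld = partial_corr(x[keep], y[keep], z[keep])  # partial correlation, after LD

print(f"simple  r_xy:   full={r_full:.3f}  LD={r_ld:.3f}")
print(f"partial r_xy.z: full={pr_full:.3f}  LD={pr_ld:.3f}")
```

Under these assumptions the simple correlation shrinks noticeably after selection, while the partial correlation controlling for Z is nearly identical in the full and listwise-deleted samples. The sketch does not cover the maximum likelihood estimates discussed in the abstract.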

Original language: English (US)
Pages (from-to): 164-177
Number of pages: 14
Journal: Multivariate Behavioral Research
Issue number: 2
State: Published - Mar 4 2017

All Science Journal Classification (ASJC) codes

  • Statistics and Probability
  • Experimental and Cognitive Psychology
  • Arts and Humanities (miscellaneous)

