The Basics of R and RStudio

Part 6: Missing Data

Author

William Okech

Published

November 14, 2022

Introduction

R has two types of missing data, NA and NULL.[1]

NA

R uses NA to represent missing data. An NA occupies a position in a vector just like any other element, so it still counts toward the vector's length. To test each element for missingness, we use is.na(), which returns a logical vector of the same length as its input. Deleting missing data may introduce bias or discard useful information, so it needs to be handled very carefully. Packages such as mi, mice, and Amelia provide tools for dealing with missing data, and subsequent blog posts will look at the use of imputation.
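As a minimal sketch of the behaviour described above, the example below builds a small numeric vector containing one NA and tests it with is.na(). The vector x and its values are made up for illustration; only base R functions are used.

```r
# A numeric vector with one missing value (illustrative data)
x <- c(1, 2, NA, 8, 3)

# NA occupies a position, so the vector still has five elements
length(x)

# is.na() tests each element and returns a logical vector
missing_flags <- is.na(x)
missing_flags   # FALSE FALSE  TRUE FALSE FALSE

# Summing the logical vector counts the missing values
n_missing <- sum(missing_flags)

# Many summaries return NA unless missing values are dropped explicitly
mean(x)                                   # NA
mean_without_na <- mean(x, na.rm = TRUE)  # 3.5
```

Note that na.rm = TRUE simply deletes the missing value before computing the mean, which is exactly the kind of deletion that can bias a result on real data.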

NULL

NULL represents nothingness or the “absence of anything”.[2]

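Unlike NA, NULL does not mark a missing value; it represents the absence of any value at all. A NULL cannot be stored as an element of an atomic vector: when it is combined with other values using c(), it simply disappears.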
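The difference between NULL and NA can be sketched with a short base R example. The object names y, z, and d are hypothetical and chosen only for illustration; is.null() is the base R test for NULL.

```r
# NULL is dropped when combined into an atomic vector
y <- c(1, NULL, 3)
y           # 1 3
length(y)   # 2

# NA, by contrast, keeps its position
z <- c(1, NA, 3)
length(z)   # 3

# is.null() tests a whole object rather than each element
d <- NULL
is.null(d)  # TRUE
is.null(y)  # FALSE
length(d)   # 0
```

Because NULL has length zero, it is tested at the level of the whole object with is.null(), whereas is.na() works element by element.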

Supplementary Reading

  1. An excellent post from the blog “Data Science by Design” on the role of missingness.

Footnotes

  1. Adapted from Lander, J. P. (2014). R for Everyone: Advanced Analytics and Graphics. Addison-Wesley.

  2. Adapted from Lander, J. P. (2014). R for Everyone: Advanced Analytics and Graphics. Addison-Wesley.