Recommended Citation
Cronin RM, Feng X, Sulieman L, Mapes B, Garbett S, Able A, Hale R, Couper MP, Sansbury H, Ahmedani BK, and Chen Q. Importance of missingness in baseline variables: A case study of the All of Us Research Program. PLoS One 2023; 18(5):e0285848.
Document Type
Article
Publication Date
5-18-2023
Publication Title
PLoS One
Abstract
OBJECTIVE: The All of Us Research Program collects data from multiple information sources, including health surveys, to build a national longitudinal research repository that researchers can use to advance precision medicine. Missing survey responses pose challenges to study conclusions. We describe missingness in All of Us baseline surveys.
STUDY DESIGN AND SETTING: We extracted survey responses between May 31, 2017, and September 30, 2020. Missing percentages for groups historically underrepresented in biomedical research were compared to represented groups. Associations of missing percentages with age, health literacy score, and survey completion date were evaluated. We used negative binomial regression to evaluate the association of participant characteristics with the number of missed questions out of the total eligible questions for each participant.
RESULTS: The dataset analyzed contained data for 334,183 participants who submitted at least one baseline survey. Almost all (97.0%) of the participants completed all baseline surveys, and only 541 (0.2%) participants skipped all questions in at least one of the baseline surveys. The median skip rate was 5.0% of the questions, with an interquartile range (IQR) of 2.5% to 7.9%. Historically underrepresented groups were associated with higher missingness (incidence rate ratio (IRR) [95% CI]: 1.26 [1.25, 1.27] for Black/African American compared to White). Missing percentages were similar by survey completion date, participant age, and health literacy score. Skipping specific questions was associated with higher missingness (IRRs [95% CI]: 1.39 [1.38, 1.40] for skipping income, 1.92 [1.89, 1.95] for skipping education, 2.19 [2.09, 2.30] for skipping sexual and gender questions).
CONCLUSION: Surveys in the All of Us Research Program will form an essential component of the data researchers can use to perform their analyses. Missingness was low in All of Us baseline surveys, but group differences exist. Additional statistical methods and careful analysis of surveys could help mitigate challenges to the validity of conclusions.
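The quantities reported in the abstract (per-participant skip percentage, its median and IQR, and an incidence rate ratio between groups) can be illustrated with a minimal sketch. This is not the authors' analysis: the study fit a negative binomial regression, whereas the sketch below computes only a crude, unadjusted rate ratio with a Wald interval that assumes Poisson counts, so it ignores covariates and overdispersion. The records, the group labels "A" and "B", and the function names are hypothetical.

```python
import math
from statistics import median, quantiles

# Hypothetical toy records: (group label, questions skipped, questions eligible).
# These are illustrative values, not All of Us data.
records = [
    ("A", 2, 100),
    ("A", 5, 100),
    ("A", 3, 120),
    ("A", 4, 80),
    ("B", 6, 100),
    ("B", 8, 110),
    ("B", 5, 90),
    ("B", 9, 100),
]


def skip_percentages(rows):
    """Percentage of eligible questions each participant skipped."""
    return [100.0 * skipped / eligible for _, skipped, eligible in rows]


def crude_irr(rows, group, reference):
    """Crude rate ratio of skipped questions per eligible question.

    Returns (irr, lower, upper) with a 95% Wald interval on the log scale,
    assuming Poisson counts. A negative binomial regression would also
    adjust for covariates and allow for overdispersion.
    """
    def totals(label):
        skipped = sum(s for g, s, _ in rows if g == label)
        eligible = sum(e for g, _, e in rows if g == label)
        return skipped, eligible

    s1, e1 = totals(group)
    s0, e0 = totals(reference)
    irr = (s1 / e1) / (s0 / e0)
    se_log = math.sqrt(1.0 / s1 + 1.0 / s0)
    lower = math.exp(math.log(irr) - 1.96 * se_log)
    upper = math.exp(math.log(irr) + 1.96 * se_log)
    return irr, lower, upper


pcts = skip_percentages(records)
q1, q2, q3 = quantiles(pcts, n=4)  # quartile cut points of the skip percentages
irr, lower, upper = crude_irr(records, group="B", reference="A")

print(f"median skip %: {median(pcts):.2f} (IQR {q1:.2f} to {q3:.2f})")
print(f"crude IRR, B vs A: {irr:.2f} [{lower:.2f}, {upper:.2f}]")
```

An IRR above 1 means the group skipped more questions per eligible question than the reference group. Using eligible questions as the denominator plays the role that an exposure or offset term would play in a count regression.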
Medical Subject Headings
Humans; Population Health; Surveys and Questionnaires; Health Surveys; Sexual Behavior
PubMed ID
37200348
Volume
18
Issue
5
First Page
0285848
Last Page
0285848