Experience and progress with developing guidance for the analysis of key topics in observational research
Willi Sauerbrei1,6, Michal Abrahamowicz2, Saskia Le Cessie3, Marianne Huebner4, Ruth Keogh5, James Carpenter5
1Medical Center - University of Freiburg, Germany; 2Department of Epidemiology and Biostatistics, McGill University, Montreal, Canada; 3Department of Biomedical Data Sciences, Leiden University Medical Center, Leiden, The Netherlands; 4Department of Statistics and Probability, Michigan State University, East Lansing, MI 48824, USA; 5Department of Medical Statistics, London School of Hygiene & Tropical Medicine, London, UK; 6for the STRATOS initiative
The STRATOS initiative was launched at ISCB 2013 and the first STRATOS paper summarized the motivation, mission, structure and aims of this international initiative (Sauerbrei, Abrahamowicz, Altman, le Cessie and Carpenter, 2014, http://stratos-initiative.org/). The main aim is to provide accessible, evidence-based guidance for key topics in the design and analysis of observational studies. Guidance is intended for applied statisticians and other data analysts with varying levels of statistical background and experience. The focus is on health sciences research, but the content is also relevant for applications of statistics in other empirical sciences.
In 2013 STRATOS started off with seven topic groups (TGs) focusing on different aspects of study design and analysis methodology (1- Missing data, 2- Selection of variables and functional forms in multivariable analysis, 3- Initial data analysis, 4- Measurement error and misclassification, 5- Study design, 6- Evaluating diagnostic tests and prediction models, 7- Causal inference). For their specific topic, each group provided a brief summary of the state of research, main issues, main aims and planned future research (Sauerbrei et al., 2014). Two further TGs were initiated in 2015 on the topics of Survival analysis (TG8) and High-dimensional data (TG9). Summaries are available on the STRATOS website.
To coordinate the activities of the initiative, and to help improve standards of both methodological and applied research, we started several cross-cutting panels that work on issues common to all TGs, including simulation, visualization and, most recently, open science.
In this talk we will provide a short introduction illustrating the necessity of guidance for analysis of observational studies and outline experience and progress of the STRATOS initiative.
Initial data analysis plans are part of research projects
Marianne Huebner1,4, Carsten Oliver Schmidt2, Lara Lusa3
1Michigan State University, United States of America; 2University Medicine of Greifswald, Germany; 3University of Ljubljana, Slovenia; 4for TG3
Initial data analysis (IDA) is a systematic approach that aims for transparency and integrity by providing researchers with an analysis-ready data set and reliable information about its properties. It consists of metadata setup, data cleaning, data screening, initial data reporting, refining or updating the statistical analysis plan, and documenting and reporting IDA. Researchers have flexibility to make decisions throughout a research study, but irreproducibility results when these decisions are handled in an ad hoc manner. IDA is not routinely taught to data analysts, and thus it is often conducted without a clear plan and is not well documented. We discuss a checklist for developing a priori IDA plans and illustrate this with examples.
Level 1 guidance on conducting and reporting sensitivity analyses for missing data
Katherine Lee1, Rheanna Mainzer1, James Robert Carpenter2,3,4
1Murdoch Children's Research Institute, Melbourne, Australia; 2London School of Hygiene & Tropical Medicine, United Kingdom; 3MRC CTU at UCL, London UK; 4for TG1
Missing data are common in observational studies. When estimating a target parameter in the presence of missing data, the researcher (either implicitly or explicitly) makes assumptions about the unknown missingness mechanism.
An important but often overlooked step of the analysis is examining the robustness of estimates and inferences to alternative plausible assumptions about the missingness mechanism, and in particular conducting analyses that allow the missingness to depend on the missing values themselves, sometimes referred to as a “missing not at random” or a “delta-adjusted” analysis.
We previously outlined a framework for handling and reporting the analysis of incomplete data in observational studies, in which we encourage researchers to think systematically about missing data and to report transparently the potential effect on the study results [1]. In this talk, we extend this framework to the planning, conduct and reporting of sensitivity analyses which incorporate external information about how the missing values differ from those observed.
We illustrate the process using a case study from the Avon Longitudinal Study of Parents and Children, providing practical guidance that can be tailored to the problem at hand. We hope this guidance will make such sensitivity analyses more accessible to researchers, increasing their use in practice and increasing confidence in research findings from incomplete data.
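As a rough illustration of the delta-adjusted idea mentioned above, the following Python sketch imputes missing values of a single variable from a normal distribution fitted to the observed values and then shifts the imputed values by a fixed amount delta. It is a minimal sketch under strong simplifying assumptions and not the procedure presented in the talk: the function name `delta_adjusted_means`, the toy data and the grid of delta values are all hypothetical, no covariates are used, and the imputation step does not propagate parameter uncertainty as a proper multiple imputation would.

```python
import random
import statistics


def delta_adjusted_means(observed, n_missing, deltas, n_imputations=20, seed=0):
    """Mean of a partly missing variable under several delta shifts.

    Hypothetical helper for illustration only. Missing values are drawn from
    a normal distribution fitted to the observed values, then shifted by
    delta to represent missing values that differ systematically from the
    observed ones. delta = 0 corresponds to no shift.
    """
    mu = statistics.mean(observed)
    sd = statistics.stdev(observed)
    results = {}
    for delta in deltas:
        # Reseed for each delta so that every delta uses the same base draws
        # and the estimates differ only through the shift.
        rng = random.Random(seed)
        estimates = []
        for _ in range(n_imputations):
            imputed = [rng.gauss(mu, sd) + delta for _ in range(n_missing)]
            estimates.append(statistics.mean(observed + imputed))
        # Average the point estimates across imputed data sets.
        results[delta] = statistics.mean(estimates)
    return results


# Toy data (made up): 8 observed values and 2 missing values.
observed_values = [4.8, 5.1, 5.5, 6.0, 4.9, 5.3, 5.7, 5.2]
sensitivity = delta_adjusted_means(observed_values, n_missing=2,
                                   deltas=[-1.0, 0.0, 1.0])
for delta, estimate in sensitivity.items():
    print(f"delta = {delta:+.1f}: estimated mean = {estimate:.3f}")
```

Reporting the estimate across a range of delta values, rather than at a single value, shows how far the missing values would need to differ from the observed ones before the conclusion changes. In practice the range of delta would be informed by external information, as described above.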
Reference:
1. Lee, K. J., Tilling, K. M., Cornish, R. P., Little, R. J. A., Bell, M. L., Goetghebeur, E., Hogan, J. W. and Carpenter, J. R. (2021) Framework for the treatment and reporting of missing data in observational studies: The Treatment And Reporting of Missing data in Observational Studies framework. Journal of Clinical Epidemiology, 134, 79-88.