Conference Agenda

Overview and details of the sessions of this conference. Please select a date or location to show only the sessions held on that day or at that location. Please select a single session for a detailed view (with abstracts and downloads, if available).

 
 
Session Overview
Session
S32: Young Statisticians 2
Time:
Tuesday, 05/Sept/2023:
2:00pm - 3:40pm

Session Chair: Stefanie Peschel
Session Chair: Andrea Berghold
Location: Lecture Room U1.101 (hybrid)


Presentations
2:00pm - 2:20pm

Enhancing replicability of exploratory variable selections based on clinical trial data

Manuela Rebecca Zimmermann, Mark Baillie, Matthias Kormaksson, David Ohlssen, Konstantinos Sechidis

Novartis, Switzerland

Clinical trials are at the core of clinical development and a major driver of costs in the pharmaceutical industry. However, their primary purpose is to answer only narrowly defined scientific questions. While detailed prespecification of such questions is required for regulatory purposes, it seems wasteful not to make more use of clinical trial data, given the enormous efforts required to obtain such data sets and their generally high validity. Indeed, there is great interest in re-using clinical trial data to support scientific discovery, e.g. to identify prognostic measures of disease or biomarkers that predict treatment efficacy. The issue with the vast majority of such exploratory analyses, however, is that they can be overly optimistic about positive findings. Most exploratory analyses do not account for multiplicity and thus contribute to the current replicability crisis. In the context of clinical development, this lack of control for false discoveries (type-I errors) results in increased patient burden, unnecessary research efforts, and avoidable costs.

The academic literature offers a robust and versatile framework for exploratory variable selection under strict type-I error control: the knockoff framework. This framework can handle high-dimensional data in a model-agnostic manner, which makes it a prime candidate for exploratory analyses in clinical development settings. However, its operating characteristics in practically relevant settings are largely unknown and generally depend on the myriad choices that can be made when applying the framework. We raise awareness of practical considerations, exemplify common pitfalls, and demonstrate the practical performance of the knockoff framework in real case studies of Phase III trials. In addition to transferring the methodology from a more academic setting to drug development practice, we further develop it to increase computational efficiency in practical settings. As such, our work enables quantitative scientists to transform (clinical) data into knowledge more quickly and, crucially, in a replicable manner.
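For readers unfamiliar with the mechanics, the following is a minimal Python sketch of the standard model-X knockoff filter (Candès et al., 2018) on simulated Gaussian data with known covariance: equicorrelated knockoffs, a lasso coefficient-difference statistic, and the knockoff+ threshold. All data and parameter choices are hypothetical; this is an illustration of the general framework, not the authors' pipeline or case studies.

```python
import numpy as np
from sklearn.linear_model import LassoCV

rng = np.random.default_rng(0)

# Simulated Gaussian features with AR(1) correlation (all values hypothetical)
n, p, q = 500, 60, 0.1                       # samples, variables, target FDR
Sigma = 0.4 ** np.abs(np.subtract.outer(np.arange(p), np.arange(p)))
X = rng.multivariate_normal(np.zeros(p), Sigma, size=n)
beta = np.zeros(p)
beta[:10] = 1.0                              # first 10 variables carry signal
y = X @ beta + rng.standard_normal(n)

# Equicorrelated model-X knockoffs for known Sigma:
# Xk | X ~ N(X - X Sigma^{-1} diag(s), 2 diag(s) - diag(s) Sigma^{-1} diag(s))
s = np.full(p, min(1.0, 2 * np.linalg.eigvalsh(Sigma).min()) * 0.99)
Sinv_ds = np.linalg.solve(Sigma, np.diag(s))  # Sigma^{-1} diag(s)
cov_k = 2 * np.diag(s) - np.diag(s) @ Sinv_ds  # 0.99 above keeps this PSD
Xk = X - X @ Sinv_ds + rng.multivariate_normal(np.zeros(p), cov_k, size=n)

# Lasso coefficient-difference statistics: W_j > 0 favors the real variable
coef = LassoCV(cv=5).fit(np.hstack([X, Xk]), y).coef_
W = np.abs(coef[:p]) - np.abs(coef[p:])

# Knockoff+ threshold: smallest t whose estimated FDR is at most q
candidates = np.sort(np.abs(W[W != 0]))
tau = next((t for t in candidates
            if (1 + np.sum(W <= -t)) / max(1, np.sum(W >= t)) <= q), np.inf)
print("selected variables:", np.flatnonzero(W >= tau))
```

The "myriad choices" mentioned above enter exactly here: the knockoff sampler, the feature statistic, and the threshold variant can each be swapped out, and they jointly drive the operating characteristics in practice.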



2:20pm - 2:40pm

A flexible framework for interpretable and individualized reporting of model results

Hannah Kümpel, Sabine Hoffmann

LMU Munich, Germany

The results of statistical models are frequently intended to inform evidence-based decision-making. However, commonly reported effect size measures such as odds ratios lack interpretability and are often misunderstood. Like statistical significance, they are therefore not well suited to deriving practical decision rules. This applies in particular to situations where a medical practitioner tries to determine the best course of action by balancing costs, benefits, and uncertainties for patients based on individual characteristics. For example, the odds ratios from a logistic regression model cannot be used to answer questions such as 'By how much does receiving treatment change the expected probability of heart failure for women between fifty and sixty years of age?'.

To facilitate evidence-based decision-making from statistical analyses for medical practitioners, we propose a framework for individualizing model output post-inference. Specifically, we generalize the concepts of marginal effects and adjusted predictions in order to define point estimates and uncertainty regions both for the average expected target variable given specific patient characteristics and for the average expected absolute change resulting from changes in these characteristics. Along with these quantities, we propose corresponding visualization techniques that may be reported alongside classical effect size measures to improve the interpretability of study results.
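To make the question quoted above concrete, here is a minimal sketch of subgroup-averaged adjusted predictions in Python (a standard g-computation step, not the authors' full framework): fit a logistic model, then average predicted probabilities over the subgroup of interest with the treatment indicator set to each value. All variable names and effect sizes are invented.

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(1)

# Hypothetical cohort: heart-failure outcome, treatment, age, sex
n = 2000
df = pd.DataFrame({
    "age": rng.uniform(30, 80, n),
    "female": rng.integers(0, 2, n),
    "treated": rng.integers(0, 2, n),
})
logit = -4 + 0.05 * df.age - 0.8 * df.treated + 0.3 * df.female
df["heart_failure"] = rng.binomial(1, 1 / (1 + np.exp(-logit)))

model = smf.logit("heart_failure ~ treated + age + female", data=df).fit(disp=0)

# Adjusted predictions for the subgroup: women aged 50-60
sub = df[(df.female == 1) & (df.age.between(50, 60))]
p1 = model.predict(sub.assign(treated=1)).mean()  # everyone treated
p0 = model.predict(sub.assign(treated=0)).mean()  # no one treated
print(f"expected probability: {p0:.3f} -> {p1:.3f} "
      f"(absolute change {p1 - p0:+.3f})")
```

Uncertainty regions for these averages could, for instance, be obtained by bootstrapping the model fit, as sketched below.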

Furthermore, we present a method that goes beyond visualizing the average expected change: it accounts for both estimation and sampling uncertainty by visually comparing the expected distributions of the target variable that result from changes in patient characteristics.
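Continuing the hypothetical sketch above, one way to show both sources of uncertainty at once (an illustrative choice, not necessarily the authors' technique) is to refit the model on bootstrap resamples and overlay the resulting distributions of predicted probabilities under both treatment values:

```python
import matplotlib.pyplot as plt

# Spread across bootstrap refits reflects estimation uncertainty;
# spread across patients in the subgroup reflects sampling variability.
boot_p1, boot_p0 = [], []
for _ in range(200):
    bs = df.sample(len(df), replace=True)
    m = smf.logit("heart_failure ~ treated + age + female", data=bs).fit(disp=0)
    boot_p1.append(m.predict(sub.assign(treated=1)))
    boot_p0.append(m.predict(sub.assign(treated=0)))

plt.hist(np.concatenate(boot_p1), bins=40, density=True, alpha=0.5, label="treated")
plt.hist(np.concatenate(boot_p0), bins=40, density=True, alpha=0.5, label="untreated")
plt.xlabel("predicted probability of heart failure (women aged 50-60)")
plt.legend()
plt.show()
```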

A notable benefit of the proposed framework is that each defined quantity can be specified according to the exact research question at hand, rather than the reporting having to be adjusted to match the correct interpretation of a given effect size measure. This is achieved by using probability measures to average over predictor values and by distinguishing between three axiomatic assumptions regarding the dependence structure of these predictors.
We illustrate the proposed methodology by applying it to selected case studies in biomedical research, highlighting its practical relevance.



2:40pm - 3:00pm

Confirmatory studies in methodological statistical research: concept and illustration

F. Julian D. Lange1,2, Anne-Laure Boulesteix1,2

1Institute for Medical Information Processing, Biometry, and Epidemiology, LMU Munich, Germany; 2Munich Center for Machine Learning (MCML), Munich, Germany

Hypothesis-generating, exploratory research and hypothesis-testing, confirmatory research are both essential to progress in science. However, failing to separate the two types of research can lead to non-replicable results when exploratory findings are misperceived or intentionally presented as confirmatory. To conduct strictly confirmatory analyses transparently, the practice of publicly registering research plans before the data analysis, known as pre-registration, has become increasingly popular. For a number of applied research fields and study types, templates are available to aid researchers in specifying sufficiently detailed plans. In the context of methodological statistical research, however, the distinction between exploratory and confirmatory studies has received little attention in the scientific literature so far. Consequently, no guidance is available regarding the pre-registration of methodological research in particular.

To address this gap, this work proposes an approach for a strictly confirmatory real-data study in this field and provides a corresponding pre-registration document checklist for comprehensively planning such a study. The suggested approach is illustrated with a large-scale benchmark experiment, and its results roughly confirm the findings of an existing simulation study by van der Ploeg et al. (2014). Specifically, the illustration indicates that untuned random forests (a) require more events per variable (EPV) than logistic regression to realize their predictive performance potential and (b) are prone to overfitting even when trained with a large number of EPV. It also demonstrates how pre-registration can prevent selective reporting and over-optimistic conclusions, suggesting that adoption of the proposed approach could lead to more credible methodological statistical research.

References

van der Ploeg, T., Austin, P. C., and Steyerberg, E. W. (2014). Modern modelling techniques are data hungry: a simulation study for predicting dichotomous endpoints. BMC Medical Research Methodology, 14:137.
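For intuition about the EPV comparison described above, here is a small self-contained Python sketch pitting an untuned random forest against logistic regression at different EPV levels on simulated data. It is an illustration of the type of comparison, not the pre-registered benchmark experiment itself; all simulation settings are hypothetical.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score

rng = np.random.default_rng(2)
p, prevalence = 10, 0.5                      # predictors, event rate

def sample(n):
    X = rng.standard_normal((n, p))
    logit = X @ np.linspace(1.0, 0.1, p)     # decreasing true effects
    y = rng.binomial(1, 1 / (1 + np.exp(-logit)))
    return X, y

X_test, y_test = sample(20000)               # large independent test set
for epv in (3, 10, 50):                      # events per variable
    n_train = int(epv * p / prevalence)      # events approx. epv * p
    X, y = sample(n_train)
    for name, clf in (("logistic", LogisticRegression(max_iter=1000)),
                      ("untuned RF", RandomForestClassifier(random_state=0))):
        clf.fit(X, y)
        auc = roc_auc_score(y_test, clf.predict_proba(X_test)[:, 1])
        print(f"EPV={epv:3d}  {name:10s}  test AUC={auc:.3f}")
```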



3:00pm - 3:20pm

Analysis of the Effect of Hyperparameters on Variable Selection in Random Forests

Lea L. Kronziel, Césaire J. K. Fouodo, Inke R. König, Silke Szymczak

Institut für Medizinische Biometrie und Statistik, Universität zu Lübeck, Universitätsklinikum Schleswig-Holstein, Campus Lübeck, Lübeck, Germany

Background: Random forests (RF) are a well-known and efficient method for predicting a class or a regression value. RFs are particularly useful for high-dimensional datasets and thus also for analyses of genetic data. In addition, RFs make it possible to estimate the importance of each variable. Variable selection algorithms such as Vita and Boruta use these importance scores to statistically test each variable for its relevance to prediction. By selecting variables that are important for predicting a disease, insight can be gained into the genetic background as well as the biological effects of the disease.

Motivation: Several studies have shown that the hyperparameters of RFs have an impact on prediction accuracy. Moreover, the calculated variable importances can be influenced by the choice of hyperparameters. However, it is unclear whether this applies analogously to variable selection procedures. Therefore, the aim of this study was to investigate the influence of different hyperparameters on variable selection through comprehensive simulation studies. Based on the results, recommendations will be given on how to select hyperparameters under specific conditions.

Methods: A large number of simulations were performed within three simulation studies. The focus was on gene expression data, since these are usually high-dimensional and constitute a realistic application area. Initially, we focused on the correlated structure of these data, which were simulated in combination with a continuous target variable. For the simulations with binary target variables, real expression data were used to mimic the structure of the data. The hyperparameters considered were the number of decision trees, the number of split candidates (also known as mtry), and the minimal node size. For variable selection, the Vita and Boruta methods were used, since they have been recommended as those with the highest power. Various performance measures, such as the false discovery rate (FDR) and sensitivity, were used to evaluate the impact of the hyperparameters on Boruta and Vita.
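As a rough, hypothetical stand-in for such a simulation (using scikit-learn's permutation importance with a naive selection cutoff rather than Vita or Boruta), the following Python sketch shows how sensitivity and FDR of a selection rule can be tracked across hyperparameter settings; all dimensions, effect sizes, and the cutoff are illustrative.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.inspection import permutation_importance

rng = np.random.default_rng(3)
n, p, n_signal = 200, 200, 20                # more variables than samples

X = rng.standard_normal((n, p))
beta = np.zeros(p)
beta[:n_signal] = 0.5                        # first 20 variables are relevant
y = X @ beta + rng.standard_normal(n)

for n_trees in (500, 5000):
    for mtry in (int(np.sqrt(p)), p // 3):   # common default vs. larger mtry
        rf = RandomForestRegressor(n_estimators=n_trees, max_features=mtry,
                                   min_samples_leaf=5, n_jobs=-1,
                                   random_state=0).fit(X, y)
        imp = permutation_importance(rf, X, y, n_repeats=5, random_state=0,
                                     n_jobs=-1).importances_mean
        sel = np.flatnonzero(imp > np.quantile(imp, 0.90))  # naive cutoff
        tp = np.sum(sel < n_signal)
        fdr = (len(sel) - tp) / max(1, len(sel))
        print(f"trees={n_trees:5d} mtry={mtry:3d}  "
              f"sensitivity={tp / n_signal:.2f}  FDR={fdr:.2f}")
```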

Results: As the number of trees increases, not only the sensitivity but also the FDR increases. Moreover, it turned out that the optimal number of split candidates strongly depends on the proportion of associated variables as well as on their effect sizes. Furthermore, the minimal node size does not show any relevant effect on variable selection.

Conclusion: The study demonstrates that the usual default values may lead to suboptimal results when the focus is not on prediction performance but on variable selection. As with classification and regression, it is recommended to increase the number of trees substantially to achieve high sensitivity. Strategies still need to be developed for selecting optimal values of the number of split candidates for a specific data set.



3:20pm - 3:40pm

The best of two worlds? A systematic comparison of time-to-event model implementations between R and Python

Lukas Klein, Gunter Grieser, Antje Jahn

University of Applied Sciences Darmstadt, Germany

The German organ transplant registry (TxReg) has recently become available, offering a unique opportunity to study the post-transplant survival of organ recipients in a distinct patient population and healthcare system. In recent years, increasing interest in applying machine-learning (ML) methods to predict post-transplant survival has been observed. However, the issue of censoring in survival prediction tasks is often neglected in machine learning applications. Instead, the task is reduced to classification, a simplification that is only rarely seen in regression modelling. A potential reason for this discrepancy might be a lack of implementations and accompanying documentation for survival analysis in Python, the typically applied machine learning ecosystem; in the field of biostatistics, R is more prominent. While R survival analysis packages have a decade-long history, the often-used PySurvival was discontinued in 2019, and only recently has work begun on scikit-survival to integrate survival analysis ML methods into the scikit-learn ecosystem.

To guide researchers in selecting and applying ML software for survival predictions, we systematically compare the respective R and Python implementations. We start by comparing the availability of different ML implementations for survival analysis between R and Python. Our inspection also includes a comparison of tools for model inspection and for investigating prediction performance with respect to discrimination, accuracy, and calibration. Finally, we compare computational speed on large data. All comparisons are performed and illustrated on TxReg data, analyzing the post-transplant survival of kidney recipients.

Our findings show that the Python scikit-survival toolkit provides an excellent central interface for survival analysis modelling. In particular, the integration of deep learning methods for censored data is an advantage. However, R still provides better tools for model inspection, for assessing prediction performance (for example, by calibration curves), and for scoring frameworks like TRIPOD. In some cases, such as the ranger package for random survival forests, Python and R share the same backend implementation. In our application, ML methods achieved an IPC-weighted C-index of up to 0.72 with a gradient-boosted model.

Given the increasing demand for ML and deep learning approaches and the deployment of models in commercial settings, the need for survival analysis methodology implemented in Python is expected to grow. Overall, our talk highlights the strengths and limitations of both R and Python for survival analysis and provides guidance for researchers choosing the appropriate toolkit for their analysis needs.
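Since the TxReg data are not publicly available, here is a minimal sketch, on simulated data, of the kind of Python pipeline compared in the talk: a gradient-boosted survival model from scikit-survival evaluated with an IPC-weighted C-index. Data generation, the truncation time, and all settings are illustrative assumptions.

```python
import numpy as np
from sksurv.ensemble import GradientBoostingSurvivalAnalysis
from sksurv.metrics import concordance_index_ipcw
from sksurv.util import Surv

rng = np.random.default_rng(4)

# Simulated stand-in for the (non-public) registry data
n, p = 1000, 5
X = rng.standard_normal((n, p))
true_time = rng.exponential(np.exp(-X[:, 0] - 0.5 * X[:, 1]))
cens_time = rng.exponential(1.5, n)
time = np.minimum(true_time, cens_time)
event = true_time <= cens_time               # False = censored
y = Surv.from_arrays(event=event, time=time)

train, test = np.arange(700), np.arange(700, n)
model = GradientBoostingSurvivalAnalysis(random_state=0)
model.fit(X[train], y[train])

risk = model.predict(X[test])                # higher score = higher risk
# Truncate at the 75th percentile of training follow-up so the censoring
# distribution is well estimated (a pragmatic, illustrative choice)
tau = np.percentile(time[train], 75)
cindex = concordance_index_ipcw(y[train], y[test], risk, tau=tau)[0]
print(f"IPC-weighted C-index: {cindex:.3f}")
```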



 