Conference Agenda

Session Overview
Session
S31: Interpretable machine learning in biostatistics: Methods, applications and perspectives
Time:
Tuesday, 05/Sept/2023:
2:00pm - 3:40pm

Session Chair: Marvin N. Wright
Session Chair: Matthias Schmid
Location: Lecture Room U1.141 (hybrid)


Session Abstract

80 minutes of presentations followed by 20 minutes of discussion


Presentations
2:00pm - 2:20pm

Interaction difference test for prediction models

Thomas Welchowski

University Hospital Bonn, Germany

Machine learning research focuses on the improvement of prediction performance. During the past decade, major advances were made in the field of deep learning with imaging, audio or video data and in ensemble models like bagging or boosting with matrix-type data. These so-called black-box models flexibly adapt to the given data and involve fewer assumptions about the data-generating process than standard methods like linear regression or single decision trees. However, due to their increased complexity, black-box models are more difficult to interpret. To address this issue, techniques for interpretable machine learning have been developed; yet there is still a lack of methods to reliably identify interaction effects between predictors under uncertainty.

In this work we present a model-agnostic asymptotic hypothesis test for the identification of interaction effects in black-box machine learning models. The null hypothesis assumes that a given set of covariates does not contribute to interaction effects in the prediction model. The test statistic is based on the difference of variances of partial dependence functions with respect to the original black-box predictions and the (more restrictive) predictions under the null hypothesis. Properties of the proposed test statistic (in particular its power) were explored in simulations of linear and nonlinear models.
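To make the idea of such a test statistic concrete, the sketch below contrasts the variance of the joint partial dependence of two covariates with the variance of their additive (no-interaction) partial dependence, loosely in the spirit of variance-based interaction measures. It is a conceptual illustration under an assumed toy model, not the asymptotic test from the talk; the function f, the grid sizes, and all variable names are assumptions.

```r
## Conceptual sketch: variance of the joint partial dependence vs. variance of
## an additive (no-interaction) reconstruction. Illustrative only; the actual
## test uses a different construction with an asymptotic null distribution.
set.seed(1)
n  <- 500
x1 <- runif(n); x2 <- runif(n); x3 <- runif(n)
f  <- function(x1, x2, x3) x1 + x2 + 2 * x1 * x2 + x3   # stand-in "black box"

pd_12 <- function(a, b) mean(f(a, b, x3))   # joint partial dependence of (x1, x2)
pd_1  <- function(a) mean(f(a, x2, x3))     # marginal partial dependence of x1
pd_2  <- function(b) mean(f(x1, b, x3))     # marginal partial dependence of x2

grid1 <- quantile(x1, probs = seq(0.05, 0.95, length.out = 20))
grid2 <- quantile(x2, probs = seq(0.05, 0.95, length.out = 20))
g <- expand.grid(a = grid1, b = grid2)

pd_joint    <- mapply(pd_12, g$a, g$b)                  # joint PD surface
pd_additive <- sapply(g$a, pd_1) + sapply(g$b, pd_2)    # no-interaction surface

# Difference of variances: values clearly above zero indicate an interaction
var(pd_joint) - var(pd_additive)
```

In the proposed test, the second variance is instead computed from partial dependence functions of the restricted predictions under the null hypothesis, and the asymptotic null distribution avoids resampling or re-fitting the black-box model.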

The proposed hypothesis test can be applied to any black-box prediction model under suitable consistency assumptions, and the null hypothesis of the test can be flexibly specified/modified according to the research question of interest. Furthermore, the test is computationally fast to apply as the null distribution does not require resampling and/or re-fitting black-box prediction models.



2:20pm - 2:40pm

Multi-Objective Counterfactual Explanations

Susanne Dandl1,2, Andreas Hofheinz1, Martin Binder1,2, Bernd Bischl1,2, Giuseppe Casalicchio1,2

1LMU Munich; 2Munich Center for Machine Learning (MCML)

In recent years, various methods have been proposed to make complex prediction models explainable. One method that explains the prediction for a point of interest in the form of "what if" statements is counterfactual explanations, or counterfactuals for short. In the medical context, counterfactuals allow for statements such as "If you do not have diabetes and have a BMI of 25 instead of 30, your model-predicted risk of chronic heart disease would drop from 75% to 40%".

To be a counterfactual, a generated point must have (some of) the following properties: (1) The point's prediction should be equal to the desired prediction, (2) it should be close to the point of interest, (3) only a few feature changes should be proposed, and (4) the point should adhere to the data manifold.

Based on these desired properties, an optimization problem can be formulated to generate counterfactuals. In our work, we argue that this optimization task is inherently multi-objective: some of the properties conflict with each other (e.g., reaching the desired prediction may require more feature changes), while all properties are equally important. Therefore, rather than just one counterfactual, a whole set of equally good counterfactuals exists for a point of interest.
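To make the four properties concrete, a candidate point can be scored on four objective values roughly corresponding to (1)–(4). The sketch below is a conceptual illustration with assumed inputs (model_predict, X_train, x_interest, and candidate are placeholders for numeric data), not the objective functions of the MOC implementation.

```r
## Conceptual sketch of the four counterfactual objectives (not the MOC code).
## 'model_predict', 'X_train', 'x_interest' and 'candidate' are assumed inputs:
## a prediction function, a numeric training matrix, and two numeric vectors.
objectives <- function(candidate, x_interest, X_train, model_predict,
                       desired = 1) {
  ranges <- apply(X_train, 2, function(z) diff(range(z)))      # for scaling

  o1 <- abs(model_predict(candidate) - desired)                # (1) gap to desired prediction
  o2 <- mean(abs(candidate - x_interest) / ranges)             # (2) Gower-type distance to x_interest
  o3 <- sum(candidate != x_interest)                           # (3) number of changed features
  gower_to_train <- apply(X_train, 1, function(z)
    mean(abs(candidate - z) / ranges))
  o4 <- min(gower_to_train)                                    # (4) crude data-manifold proxy
  c(pred_gap = o1, distance = o2, n_changed = o3, plausibility = o4)
}
```

A multi-objective search, such as the evolutionary search used by MOC, then returns the candidates that are Pareto-optimal with respect to these values instead of collapsing them into a single score.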

With this in mind, we developed the multi-objective counterfactuals (MOC) method, which is model-agnostic and works for all types of features. On a dataset of indicators for chronic heart disease (Centers for Disease Control and Prevention), we demonstrate how the method provides interesting insights into the underlying predictive model. To this end, we use the MOC implementation in the counterfactuals R package available on CRAN.
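For orientation, a minimal usage sketch of the counterfactuals package might look as follows. The data set and model are placeholders (not the CDC heart-disease data from the talk), and the class and argument names reflect my reading of the package documentation, so they may need adjustment for the installed version.

```r
## Minimal usage sketch of the counterfactuals package (placeholder data/model).
## Class, method, and argument names follow my reading of the package
## documentation and may differ; this is not the analysis from the talk.
library(counterfactuals)
library(iml)
library(randomForest)

rf <- randomForest(Species ~ ., data = iris)
predictor <- iml::Predictor$new(rf, data = iris[, -5], y = iris$Species,
                                type = "prob")

moc <- MOCClassif$new(predictor, n_generations = 30L)
cfactuals <- moc$find_counterfactuals(
  x_interest    = iris[1, -5],
  desired_class = "versicolor",
  desired_prob  = c(0.5, 1)
)
cfactuals$data   # set of generated counterfactual candidates
```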

Since a large set of returned counterfactuals can be overwhelming for users, we also address how numerous counterfactuals can be visualized, what options users have to select individual counterfactuals, and how multiple similar counterfactuals can be combined.



2:40pm - 3:00pm

Interpreting Neural Networks: A Biostatistical Perspective

Niklas Koenen1,2, Marvin N. Wright1,2,3

1Leibniz Institute for Prevention Research and Epidemiology - BIPS; 2University of Bremen; 3University of Copenhagen

Throughout the past decade, neural networks have attracted a tremendous surge of attention and infiltrated almost all conceivable domains of science, medicine, and public life. However, for sensitive applications – besides predictive performance – an understanding of the black-box decision process is essential to assess its reliability, gain insights, or extract knowledge from the data. Due to the complexity of deep neural networks, well-established statistical methods and model-agnostic approaches are challenging and often too computationally intensive to apply.

Driven by this lack of neural-network-specific interpretability methods, many techniques have been proposed to fill this gap. In this context, so-called feature attribution methods have been developed to reveal variable-wise insights and effects captured in the black-box model. Even though these approaches primarily focused on image data and their explanatory quality was frequently assessed by visual impressions, many of these techniques generalize to tabular data, e.g., as used in biometric areas. However, the critical question for this generalization is whether, and in which situations, feature attribution methods provide reliable explanations and trustworthy data insights. From a biostatistical perspective, we review these methods, challenge them in a simulation study with a known data-generating process, and discuss potential pitfalls that may arise in real-world applications. Equipped with the resulting theoretical guidelines, we examine how these methods perform on actual biomedical data and whether they reflect results already found by more established approaches.
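The validation idea behind such a simulation study can be sketched independently of any particular attribution method: simulate data from a known data-generating process, fit a model, compute an attribution, and check it against the known ground truth. In the sketch below the fitted model, the toy data-generating process, and the finite-difference "gradient × input" attribution are stand-in assumptions, not the neural-network-specific methods examined in the talk.

```r
## Conceptual sketch: checking a feature attribution against a known DGP.
## The finite-difference 'gradient x input' is a stand-in attribution, and
## lm() is a stand-in model; swap in any black-box model and attribution.
set.seed(2)
n <- 1000
X <- data.frame(x1 = rnorm(n), x2 = rnorm(n), x3 = rnorm(n))
y <- 2 * X$x1 - 1 * X$x2 + 0 * X$x3 + rnorm(n, sd = 0.1)   # known DGP: x3 is pure noise

fit <- lm(y ~ ., data = X)
f   <- function(newdata) predict(fit, newdata = newdata)

grad_x_input <- function(x_row, eps = 1e-4) {
  sapply(names(x_row), function(j) {
    x_up <- x_row; x_up[[j]] <- x_up[[j]] + eps
    ((f(x_up) - f(x_row)) / eps) * x_row[[j]]   # finite-difference gradient times input
  })
}

attr_mat <- t(sapply(seq_len(50), function(i) grad_x_input(X[i, ])))
colMeans(abs(attr_mat))   # x3 should receive (near-)zero attribution
```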



3:00pm - 3:20pm

Explainability of machine learning models for survival analysis: current state and challenges

Mateusz Krzyziński1, Przemysław Biecek1,2

1MI2.AI, Warsaw University of Technology, Poland; 2MI2.AI, University of Warsaw, Poland

The prognostic capabilities of machine learning models for survival analysis match or even surpass those of classical statistical learning approaches like Cox proportional hazards models. However, the widespread use of ML models is hindered by their high complexity and lack of interpretability. They are considered black boxes, i.e., it is not possible to directly know what influences their predictions internally. Especially in biostatistics, where time-to-event analysis constitutes a fundamental task, there is a demand for techniques that enable the analysis and explanation of machine learning survival models – so-called explainable artificial intelligence (XAI) or interpretable machine learning (IML) methods.

One of the first such techniques was the SurvLIME method [Kovalev et al., 2020], which aims to approximate a complex model with a surrogate model – a well-established Cox model whose coefficients are interpretable and constitute an explanation. It was later extended with refinements that follow the same intuition but use a different optimization setting. Another notable approach is to use time-dependent explanation methods for survival models like SurvSHAP(t) [Krzyziński et al., 2023], which decomposes the model's prediction into the effects of individual covariates. Such techniques scrutinize the models' behavior across varying time horizons by analyzing survival or cumulative hazard functions. This time aspect is particularly significant since many ML models do not assume proportional hazards. Thus, analyzing such explanations allows for checking how flexible models reason when this assumption is violated. It also helps uncover uncommon effects of covariates, such as an effect that changes from positive to negative over time. In this talk, we will describe both methods and demonstrate how the SurvSHAP(t) results can be aggregated to draw conclusions about a model of interest in a global context. This can be achieved by applying functional data analysis concepts and statistical tests to assess the significance of covariate effects in machine learning models.
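The aggregation mentioned at the end can be illustrated with a small sketch: given time-dependent contribution curves stored as an observations × covariates × time array, a simple global importance integrates the absolute contributions over time and averages across observations. The array phi, the time grid, the covariate names, and the trapezoidal rule are illustrative assumptions, not the functional-data-analysis procedure or the statistical tests from the talk.

```r
## Conceptual sketch: from time-dependent contribution curves to a global
## covariate importance. 'phi' (n x p x length(times)) is assumed to hold
## SurvSHAP(t)-style values already computed for a survival model.
aggregate_importance <- function(phi, times) {
  dt <- diff(times)
  trapz <- function(curve)                     # trapezoidal integral over time
    sum(dt * (head(curve, -1) + tail(curve, -1)) / 2)

  # integrate |phi_j(t)| per observation and covariate, then average over observations
  per_obs <- apply(abs(phi), c(1, 2), trapz)
  colMeans(per_obs)                            # one global importance per covariate
}

## Toy usage with random placeholder values
times <- seq(0, 10, length.out = 50)
phi   <- array(rnorm(20 * 3 * 50), dim = c(20, 3, 50),
               dimnames = list(NULL, c("age", "sex", "treatment"), NULL))
aggregate_importance(phi, times)
```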

However, while initial survival XAI methods have shown promising results, several challenges remain to be overcome. One such challenge is the computational complexity of analyzing data at multiple time points, which can be both difficult and time-consuming. Furthermore, current methods often rely on analyzing simple right-censored data without accounting for competing risks. To advance the field, it is also essential to establish a stronger conceptual connection between explanations and the underlying biological processes that influence the event of interest. While IML methods can provide explanations for model predictions, they do not directly explain the biological mechanisms that drive the data-generating process. We believe that this talk will spark discussion about addressing these challenges and gaps in understanding, helping to improve the effectiveness of XAI methods for survival analysis and ultimately enabling more accurate predictions in this crucial field.



 