Conference Agenda

Session Overview
Session
S20: Statistical Modeling III
Time: Monday, 04/Sept/2023, 4:10pm - 5:50pm

Session Chair: Tomasz Burzykowski
Session Chair: Tobias Mütze
Location: Seminar Room U1.197 hybrid


Presentations
4:10pm - 4:30pm

Optimal Subsampling Design for Polynomial Regression

Torsten Reuter

Otto von Guericke University Magdeburg, Germany

Data reduction is a major challenge as technological progress has led to a massive increase in data collection, to the point where traditional statistical methods fail or computing power cannot keep up. Subsampling reduces data size by selecting a subset from the original data. We study D-optimal subsampling designs for polynomial regression, where the goal is to select a given percentage of the full data that maximizes the determinant of the information matrix. We derive D-optimal subsampling designs under several standard distributional assumptions on the covariates, focusing in particular on the resulting shapes of the subsampling designs. For example, for quadratic regression the D-optimal subsampling design typically has a support of three disjoint intervals. We examine the share of mass that the optimal subsampling design places on the outer intervals compared to the inner one, which changes drastically with the distribution of the covariate, particularly for heavy-tailed distributions. In addition, we examine the efficiency of uniform random subsampling to illustrate the advantage of the optimal subsampling designs. The resulting subsampling designs provide simple rules on whether to accept or reject a data point and therefore allow for an easy algorithmic implementation. We propose a generalization of the Information-Based Optimal Subdata Selection method (IBOSS) to quadratic regression which does not require prior knowledge of the distribution of the covariate and which performs remarkably well compared to the optimal subsampling design. We present an extensive simulation study showing the advantages of our methods over the IBOSS method, among others, and discuss their computing times.
Furthermore, we discuss how the results extend to other optimality criteria such as A- and E-optimality from Kiefer's Φ_q-class of optimality criteria, IMSE-optimality for predicting the mean response, or optimality criteria based on subsets or linear functionals of parameters.
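The accept/reject idea described above can be illustrated with a small sketch. This is not the authors' algorithm: the function names, the rule of keeping both tails plus a band around the median, and the `center_share` parameter are assumptions made purely for illustration. The sketch compares the determinant of the normalised information matrix for quadratic regression under such a three-interval selection and under uniform random subsampling.

```python
import numpy as np

def information_determinant(x):
    """Determinant of the normalised information matrix for quadratic regression."""
    F = np.column_stack([np.ones_like(x), x, x**2])  # regression functions (1, x, x^2)
    return np.linalg.det(F.T @ F / len(x))

def three_interval_subsample(x, keep_fraction, center_share):
    """Keep the lower tail, the upper tail and a band around the median.

    Hypothetical rule for illustration: `center_share` is the share of the
    kept points taken from the central band; the rest is split between tails.
    """
    n_keep = int(round(keep_fraction * len(x)))
    n_center = int(round(center_share * n_keep))
    n_tail = (n_keep - n_center) // 2
    order = np.argsort(x)
    lower = order[:n_tail]
    upper = order[len(x) - n_tail:]
    mid_start = (len(x) - n_center) // 2
    center = order[mid_start:mid_start + n_center]
    return x[np.concatenate([lower, center, upper])]

# Example with a standard normal covariate (an assumed distribution).
rng = np.random.default_rng(0)
x = rng.standard_normal(100_000)
selected = three_interval_subsample(x, keep_fraction=0.1, center_share=1 / 3)
uniform = rng.choice(x, size=len(selected), replace=False)
print("three-interval rule:", information_determinant(selected))
print("uniform subsampling:", information_determinant(uniform))
```

The value of `center_share` is a free choice here; the abstract indicates that the optimal split between inner and outer intervals depends on the covariate distribution.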



4:30pm - 4:50pm

Optimal design for identifying alert concentrations

Kirsten Schorning¹, Kathrin Möllenhoff²

¹TU Dortmund University, Dortmund, Germany; ²Heinrich Heine University, Düsseldorf, Germany

The determination of alert concentrations, where a pre-specified threshold of the response variable is exceeded, is an important goal of concentration-response studies. Recently, several model-based testing procedures have been developed that allow alerts to be identified at concentrations that were not measured during the study. These model-based approaches are based on fits of nonlinear concentration-response curves, and their quality therefore strongly depends on the set of concentrations at which observations were taken.

In this talk, we address the optimal design problem for the identification of alert concentrations in order to improve the power of these model-based testing procedures. Accordingly, an optimal design minimizes the maximum variance of the estimator of the potential alert concentration. Optimal design theory (equivalence theorem, efficiency bounds) is developed for this design problem, and the results are illustrated in several examples identifying the alert concentration under different assumed dose-response relationships. In particular, a simulation study demonstrates that using the optimal design results in more powerful tests for identifying alerts than using other commonly used “non-optimal” designs.



4:50pm - 5:10pm

Bias through endogenous time-varying covariates in the analysis of cohort stepped-wedge trials: a simulation study

Jale Basten¹, Daniel Claus¹, Katja Ickstadt², Nina Timmesfeld¹

¹Department of Medical Informatics, Biometry and Epidemiology, Ruhr-University Bochum, Germany; ²Faculty of Statistics, TU Dortmund University, Germany

One of the major advantages of stepped-wedge cluster-randomised trials (SW-CRTs) over cluster-randomised trials in parallel design is that all participating clusters (e.g. practices) receive the intervention, because they all cross over unidirectionally from the control to the intervention condition, which is conducive to recruitment rates [1].

Depending on the intervention, two different approaches can be chosen for SW-CRTs: On the one hand, cross-sectional data can be collected, so that different patients are observed in each step; on the other hand, a cohort of patients can be observed across several steps (cohort data) [2].

Due to the staggered design of SW-CRTs, observations collected under the control condition are, on average, from an earlier calendar time than observations collected under the intervention condition. Thus, the stepped-wedge design is susceptible to time effects, such as secular trends, e.g. public policies or seasonal fluctuations. In a cohort design, the correlation between measurements within a participant depends on the timing of the observations. This raises the possibility that responses may vary over time due to secular trends (external time effects), changes in cohort characteristics (internal time effects), and changes in treatment. Therefore, a model that allows for time effects is essential [3].

In a longitudinal study, fixed effects can be exogenous or endogenous. Examples of exogenous covariates include baseline variables (age, gender, etc.), functions of time, and time-varying variables that are not affected by prior treatment or prior outcome. In contrast, endogenous covariates are affected by prior treatment or prior outcome: for example, the frailty of a participant affects mental health, but a prior intervention (e.g. a care program) and prior mental health condition may also affect the frailty of a participant [4].

To analyse the intervention effect in SW-CRTs, we use linear mixed-effects (LME) models with two random effects used to account for clustering (within-cluster correlation) and multiple measurements on participants (within-individual correlation). We will compare model specifications with different fixed effects to investigate which model specification yields unbiased intervention effect estimates in spite of external and internal time effects.

For exogenous time-varying confounders, we have already demonstrated in an extensive Monte Carlo simulation that LME models with fixed categorical time effects, in addition to the fixed effect of the intervention and two random effects accounting for the within-cluster and within-individual correlation, seem to produce unbiased estimates of the intervention effect in SW-CRTs with closed and open cohort data. This held even if time-varying exogenous confounders or their functional influence on the outcome were unmeasured or unknown, and if secular trends occurred [5].
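One common way to write a model of this kind is sketched below. The notation is assumed here for illustration and is not taken from the abstract: i indexes clusters, j time periods (steps) and k participants, X_{ij} indicates whether cluster i is under the intervention in period j, β_j are the fixed categorical time effects and θ is the intervention effect.

```latex
Y_{ijk} = \mu + \beta_j + \theta X_{ij} + \alpha_i + \gamma_{ik} + \varepsilon_{ijk},
\qquad
\alpha_i \sim N(0, \tau^2), \quad
\gamma_{ik} \sim N(0, \sigma_\gamma^2), \quad
\varepsilon_{ijk} \sim N(0, \sigma^2)
```

Here α_i is the cluster-level random effect (within-cluster correlation) and γ_{ik} the participant-level random effect (within-individual correlation); the exact specification used by the authors may differ.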

In this talk we will present our results of a simulation study extended to endogenous time-varying covariates that influence participants’ responses in cohort SW-CRTs. We seek to find a model approach with the best performance in terms of bias for different realistic data scenarios. Both closed and open cohort data will be considered and the results will be compared.



5:10pm - 5:30pm

Estimating the conditional distribution in functional regression problems

Thomas Kuenzer¹, Siegfried Hörmann², Gregory Rice³

¹Medical University of Graz, Austria; ²Graz University of Technology, Austria; ³University of Waterloo, Canada

We consider the problem of consistently estimating the conditional distribution of a functional data object given functional covariates, assuming that the response and the predictor are related by a functional regression model. A nonparametric method is proposed that is based on the empirical distribution of the estimated model residuals. In the case of functional linear regression, consistent estimation of the conditional distribution can be achieved. This makes it possible to describe interesting path properties of the response in a simple way. The usefulness of the method is demonstrated using both simulated and real data.
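The residual-based idea can be sketched for curves observed on a common grid. The following is a rough illustration under stated assumptions, not the authors' estimator: the regression operator is fitted by ridge-regularised least squares (a stand-in choice), and all function names and the `ridge` parameter are hypothetical. The estimated conditional distribution given a new predictor curve is represented by the prediction shifted by each estimated residual curve, from which a path property such as the probability of exceeding a level can be read off.

```python
import numpy as np

def fit_linear_operator(X, Y, ridge=1e-3):
    """Ridge-regularised least squares for Y ≈ y_mean + (X - x_mean) @ B.

    X has shape (n, p) and Y has shape (n, q): n curves on fixed grids.
    """
    x_mean, y_mean = X.mean(axis=0), Y.mean(axis=0)
    Xc, Yc = X - x_mean, Y - y_mean
    gram = Xc.T @ Xc + ridge * len(X) * np.eye(X.shape[1])
    B = np.linalg.solve(gram, Xc.T @ Yc)
    return x_mean, y_mean, B

def conditional_sample(X, Y, x_new, ridge=1e-3):
    """Curves representing the estimated conditional distribution of Y given x_new."""
    x_mean, y_mean, B = fit_linear_operator(X, Y, ridge)
    residuals = Y - (y_mean + (X - x_mean) @ B)   # estimated model residuals
    prediction = y_mean + (x_new - x_mean) @ B    # fitted response curve at x_new
    return prediction + residuals                  # one curve per residual, shape (n, q)

def exceedance_probability(curves, level):
    """Estimated probability that the response path exceeds `level` somewhere."""
    return float(np.mean(curves.max(axis=1) > level))

# Simulated example (all values are illustrative assumptions).
rng = np.random.default_rng(0)
X = rng.standard_normal((300, 25))
B_true = rng.standard_normal((25, 20)) / 25
Y = X @ B_true + 0.2 * rng.standard_normal((300, 20))
curves = conditional_sample(X, Y, x_new=X[0])
print(exceedance_probability(curves, level=0.5))
```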



5:30pm - 5:50pm

Using item response theory for testing assumptions underlying clinical scores

Daniel Schulze, Ulrike Grittner

Charité - Universitätsmedizin Berlin, Germany

Scores are frequently used in medicine to aggregate patient features or to sum up questionnaires. Group comparisons on such scores are just as common. Relatively little attention is paid to the assumptions behind simple unweighted aggregations across several characteristics. Item response theory (IRT) is a framework for understanding four implicit assumptions that are made in scores: 1) the aggregated features (or items) reflect a single underlying cause, 2) the features contribute equally to the score, 3) the features are measured without any error, and 4) the measurement properties are exactly the same in the two groups whose comparison is of interest to the researcher. Violations of these assumptions can result in severe bias in an effect of interest, e.g. in group mean comparisons. Depending on which assumption is violated, bias can diminish or exaggerate an effect. We thus advocate testing these assumptions by means of IRT modeling. We will discuss the concept of latent variables and IRT in its application to medical research. We introduce the most common models and discuss model parameters, model testing, and measurement invariance. IRT concepts are introduced with the help of real data from the Danish alcohol and drug consumption survey and are accompanied by the necessary programming in R. We will discuss pitfalls and limitations.
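As a small illustration of assumption 2, the sketch below uses the two-parameter logistic (2PL) model, a widely used IRT model for binary items; the abstract does not say which models the talk covers, and the item parameters here are hypothetical. The talk itself uses R, so this Python sketch is only meant to show the idea. When discriminations differ between items, the items respond with different strength to the same change in the latent variable, whereas an unweighted sum score treats them alike.

```python
import numpy as np

def two_pl_probability(theta, discrimination, difficulty):
    """Probability of endorsing an item under a two-parameter logistic (2PL) model."""
    return 1.0 / (1.0 + np.exp(-discrimination * (theta - difficulty)))

# Hypothetical item parameters, chosen only for illustration.
discrimination = np.array([0.5, 1.0, 2.5])
difficulty = np.array([0.0, 0.0, 0.0])

for theta in (-1.0, 0.0, 1.0):
    probs = two_pl_probability(theta, discrimination, difficulty)
    print("theta:", theta, "item probabilities:", probs.round(2),
          "expected sum score:", round(float(probs.sum()), 2))
```

In this parametrisation, equal contribution of the items roughly corresponds to all discriminations being equal, which is the kind of restriction that can be tested by comparing nested IRT models.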



 
Conference: CEN 2023