11:00am - 11:20am
Bayesian Uncertainty Quantification in Deep Generative Models for Synthesis of Tabular Medical Data
Patric Tippmann, Kiana Farhadyar, Daniela Zöller
Institute of Medical Biometry and Statistics, Faculty of Medicine and Medical Center – University of Freiburg, Germany
Medical research and patient care rely on the collection, analysis, and reuse of medical data. However, data access is often limited (e.g., due to data protection constraints). Synthetic data generation is a promising solution, enabling researchers to produce medical data that preserves the statistical properties of the original data while ensuring privacy. Interpreting results obtained from synthetic data, however, requires accounting for uncertainty in the generation process, especially when deep generative models are used. We aim to address this problem by employing Bayesian inference techniques for uncertainty quantification.
Specifically, we focus on Variational Autoencoders (VAEs) to generate tabular medical data, since they provide a probabilistic framework by explicitly modeling the probability distribution of the data while simultaneously providing a low-dimensional latent space for additional investigations. However, the overconfidence of deep neural networks (DNNs) on anomalous or out-of-distribution (OOD) data is an unsolved problem, which can lead to unreliable model predictions or inflated probability estimates in the downstream analysis of the synthetic data. Reliable methods for quantifying uncertainty are necessary to address this issue, yet previous work has mostly done so in the context of supervised discriminative models. Very recent literature also explores Bayesian methods for uncertainty estimation in VAEs, albeit in the context of image data. Tabular medical data pose greater challenges, as they often contain missing or anomalous values, which can introduce bias or lead to inaccurate models if not handled properly. Moreover, such datasets exhibit heterogeneity in the data distributions (e.g., complex and multi-modal) and data types (e.g., discrete and continuous variables) that requires specialized treatment.
In our approach, we address these challenges with a suitably tailored VAE framework and compare two Bayesian inference methods for quantifying epistemic (knowledge-related) model uncertainty, based on model averaging and Markov chain Monte Carlo techniques. We review and apply robust metrics to evaluate the quality of the generated data, showing that our approach preserves important properties of the original data. Based on both real and simulated data, we demonstrate how the proposed approach improves the faithfulness of downstream task performance, such as classification and regression, by providing more accurate and reliable synthetic data.
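To illustrate the model-averaging idea, the following minimal sketch (one plausible realization under our assumptions, not the authors' implementation; all class and function names are illustrative) uses an ensemble of independently trained VAEs and takes the spread of their decoders over shared latent draws as an estimate of epistemic uncertainty:

```python
# Minimal sketch: epistemic uncertainty via model averaging over an ensemble
# of independently trained VAEs. Names and hyperparameters are hypothetical.
import torch
import torch.nn as nn

class TabularVAE(nn.Module):
    def __init__(self, n_features: int, latent_dim: int = 8):
        super().__init__()
        self.encoder = nn.Sequential(nn.Linear(n_features, 64), nn.ReLU())
        self.mu = nn.Linear(64, latent_dim)
        self.logvar = nn.Linear(64, latent_dim)
        self.decoder = nn.Sequential(
            nn.Linear(latent_dim, 64), nn.ReLU(), nn.Linear(64, n_features)
        )

    def forward(self, x):
        h = self.encoder(x)
        mu, logvar = self.mu(h), self.logvar(h)
        z = mu + torch.randn_like(mu) * torch.exp(0.5 * logvar)  # reparameterization trick
        return self.decoder(z), mu, logvar

@torch.no_grad()
def synthesize_with_uncertainty(ensemble, n_samples: int, latent_dim: int = 8):
    """Decode the same latent draws with every ensemble member; the spread
    across members estimates epistemic (model) uncertainty."""
    z = torch.randn(n_samples, latent_dim)
    draws = torch.stack([m.decoder(z) for m in ensemble])  # shape (K, n, p)
    return draws.mean(dim=0), draws.std(dim=0)  # synthetic data, epistemic std
```

In practice, each member would be trained separately on the tabular data, and the per-feature standard deviation across members could accompany each synthetic record as an uncertainty estimate.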
In summary, our work covers the use of Bayesian inference for uncertainty quantification in a new area, namely synthesizing tabular medical data using VAEs. We highlight the challenges and considerations that arise with tabular synthetic data generation. Using real and simulated medical datasets, we show how our approach and framework can increase model usefulness. Our work contributes towards developing more reliable deep generative models for medical applications.
11:20am - 11:40am
Combining Boosting with Neural Networks for Structuring Latent Representations of Single-Cell RNA-Sequencing Data
Niklas Brunn, Maren Hackenberg, Harald Binder
Institute of Medical Biometry and Statistics, Faculty of Medicine and Medical Center, University of Freiburg, Stefan-Meier-Straße 26, 79106, Freiburg, Germany
Dimension reduction is an important step in the analysis of single-cell RNA-sequencing (scRNA-seq) data for identifying underlying patterns. The corresponding low-dimensional representation should have properties such as disentangled dimensions, i.e., different dimensions correspond to distinct underlying factors of variation, and interpretability, e.g., by identifying a small set of characteristic genes for each dimension. To structure the representation of scRNA-seq data accordingly, we propose to combine feature selection based on componentwise boosting with neural networks for dimension reduction. More precisely, we use an autoencoder architecture that is implicitly regularized by componentwise boosting when minimizing the reconstruction loss. Here, componentwise boosting captures a small number of explanatory features in each dimension, hence yielding an interpretable representation. To derive pseudo-targets for the boosting approach, we use constrained versions of the negative gradients of the reconstruction loss w.r.t. the different components of the current representation. Specifically, a constraint ensures that, for a given dimension, only features complementary to the information already encoded in the other dimensions are selected, thus resulting in disentangled dimensions. We use differentiable programming to differentiate through the boosting step in the joint optimization of the boosting component and the neural networks. For illustration, we apply our approach to scRNA-seq data from cortical neurons of mice. The results show that we can identify a small subset of genes for each dimension that characterizes distinct cell types. We furthermore illustrate how our approach can be extended to incorporate temporal development patterns, such as cellular differentiation programs.
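A minimal sketch of a single componentwise boosting step, under the assumption of a linear encoder and a least-squares base learner (the names, step size nu, and the handling of the complementarity constraint are simplified illustrations, not the authors' exact implementation):

```python
# Minimal sketch of one componentwise boosting step with a least-squares
# base learner; names and constraint handling are simplified assumptions.
import numpy as np

def componentwise_boosting_step(X, u, beta, nu=0.1, excluded=frozenset()):
    """X: (n, p) input features; u: (n,) pseudo-target, e.g. the negative
    gradient of the reconstruction loss w.r.t. one latent dimension;
    beta: (p,) current coefficients of that dimension. Features in
    `excluded` (already used by other dimensions) are skipped, which
    mimics the complementarity constraint for disentanglement."""
    best_j, best_sse, best_coef = None, np.inf, 0.0
    for j in range(X.shape[1]):
        if j in excluded:
            continue
        xj = X[:, j]
        denom = xj @ xj
        if denom == 0.0:
            continue
        coef = (xj @ u) / denom             # univariate least-squares fit
        sse = ((u - coef * xj) ** 2).sum()  # residual sum of squares
        if sse < best_sse:
            best_j, best_sse, best_coef = j, sse, coef
    if best_j is not None:
        beta = beta.copy()
        beta[best_j] += nu * best_coef      # weak update with step size nu
    return beta, best_j
```

Repeating such steps selects one feature per iteration, so each latent dimension ends up driven by a small, interpretable set of genes.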
11:40am - 12:00pm
Comparison of One-Stage and Two-Stage Detectors in the Task of Pollen Grain Recognition
Elżbieta Kubera, Agnieszka Kubik-Komar, Krystyna Piotrowska-Weryszko, Agata Konarska
University of Life Sciences in Lublin, Poland
Pollen monitoring carried out using volumetric traps is a complex and time-consuming task. The final goal of our research is to create a system for automatically recognizing and counting pollen grains of individual taxa from microscopic images.
In this work, we compare three types of object detectors in terms of recognition correctness for Alnus, Betula, Corylus, and Carpinus pollen grains: two one-stage detectors, YOLOv5 (in two versions) and RetinaNet, and a two-stage detector, Faster R-CNN. Our dataset contains microscopic photos of reference material. Each image shows pollen grains of only one taxon, which allowed us to avoid errors during dataset annotation. Each detector model was built three times, so 12 models were obtained (4 detectors x 3 repetitions). Each detector was trained for 500 epochs by fine-tuning pre-trained models on our dataset.
We used the PyTorch library to build the Faster R-CNN and RetinaNet models instead of the Detectron2 framework used in our previous studies. This change enabled us to choose the best model during training based on its evaluation on the validation set. Therefore, the final models were those rated best on the validation set.
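As an illustration of this setup, a minimal sketch (assuming torchvision's detection API; the function name and class count are ours, not the authors' exact code) of how a pre-trained Faster R-CNN can be adapted for fine-tuning on the four pollen taxa:

```python
# Minimal sketch (assumed setup): adapting a pre-trained Faster R-CNN
# from torchvision for fine-tuning on the four pollen taxa.
import torchvision
from torchvision.models.detection.faster_rcnn import FastRCNNPredictor

def build_pollen_detector(num_classes: int = 5):  # 4 taxa + background
    model = torchvision.models.detection.fasterrcnn_resnet50_fpn(weights="DEFAULT")
    in_features = model.roi_heads.box_predictor.cls_score.in_features
    # Replace the box classification head to predict the pollen taxa.
    model.roi_heads.box_predictor = FastRCNNPredictor(in_features, num_classes)
    return model
```

During training, the model state achieving the best validation score would then be checkpointed and used as the final model, as described above.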
When recognizing pollen grains for counting purposes, precision is crucial. In contrast, the standard metric of detection quality, mAP (mean Average Precision, which also takes location errors into account), is less important in this case. Therefore, we used classification quality measures to evaluate each model, with particular emphasis on precision. Accordingly, we consider two types of YOLOv5 detectors, which differ in the model's fitness measure used at both the training and evaluation stages. In addition to the standard fitness metric based on mAP, we propose a fitness measure that weights only precision and recall, at 70% and 30%, respectively.
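For reference, YOLOv5's fitness function combines precision, recall, mAP@0.5, and mAP@0.5:0.95 with fixed weights, by default emphasizing the mAP terms. A minimal sketch of the modified fitness, with the weight values reflecting our reading of the 70%/30% proposal:

```python
# Sketch of the modified fitness: YOLOv5 scores models on the vector
# [precision, recall, mAP@0.5, mAP@0.5:0.95]; here precision and recall
# are weighted 0.7 and 0.3 and the mAP terms are ignored (our reading
# of the 70%/30% proposal, not the authors' exact code).
import numpy as np

def fitness_precision_recall(metrics: np.ndarray) -> float:
    """metrics: array [precision, recall, mAP@0.5, mAP@0.5:0.95]."""
    weights = np.array([0.7, 0.3, 0.0, 0.0])  # emphasize precision
    return float((metrics * weights).sum())
```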
The same test set consisting of reference pollen pictures was used to compare the quality of each detector.
Values of the classification measures were compared using a nonparametric rank-based alternative to ANOVA with repeated measures. We used the F1-LD-F2 design with taxon as the whole-plot factor (4 levels) and with detector (4 levels) and repetition (3 levels) as the first and second sub-plot factors, respectively. In addition, multiple comparisons with the Holm-Bonferroni adjustment were applied.
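The F1-LD-F2 rank-based analysis is available, for example, via the f1.ld.f2 function of R's nparLD package and is not reproduced here. A minimal sketch of the Holm-Bonferroni step alone, using statsmodels with placeholder p-values:

```python
# Sketch of the Holm-Bonferroni adjustment for the pairwise comparisons;
# the p-values below are placeholders, not results from the study.
from statsmodels.stats.multitest import multipletests

p_values = [0.012, 0.034, 0.002, 0.210, 0.048, 0.001]  # illustrative only
reject, p_adjusted, _, _ = multipletests(p_values, alpha=0.05, method="holm")
print(p_adjusted, reject)  # adjusted p-values and rejection decisions
```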
The obtained results indicated that the YOLO detectors are preferable to RetinaNet with respect to all classification measures, and to Faster R-CNN with respect to recall and F-score. There were no differences in the distributions of these measures between the default and modified YOLO models. Additionally, none of the studied detectors showed a preference for any specific taxon in terms of classification precision. However, we found that the recall distributions across the investigated taxa differed significantly between the final detectors. RetinaNet and Faster R-CNN omitted partially visible pollen grains located near the image border more often than the YOLO detectors, which probably explains the difference in recall results.