8:00am - 8:20am Comparing the performance of regularized maximum likelihood and maximum pseudolikelihood estimation methods for ERGMs
Alexander James Gordon Murray-Watters, Carter Butts
University of California, Irvine, United States of America
Although alternatives exist, the primary methods for parameter estimation for exponential family random graph models (ERGMs) are maximum pseudolikelihood (MPLE) and approximate (Markov chain Monte Carlo) maximum likelihood inference (MCMCMLE). Both approaches have been quite successful, with MPLE typically used in modern settings as an effective initializer for MCMCMLE. Challenges, however, remain. As is well known, the MLE does not exist for data sets whose observed statistics are sufficiently "extreme" relative to the set of possible statistics (the convex hull problem), requiring inelegant workarounds that lead to serviceable but non-optimal simulation behavior. Other challenges arise in high-dimensional models, for which optimization becomes difficult and accidental collinearity or near-collinearity of predictors can become a risk; this occurs most notably for models with sociality, expansiveness, or popularity terms, individual-level fixed effects that can be difficult to estimate in practice. Among the solutions proposed to the above challenges is regularized inference, where the likelihood and/or pseudolikelihood is penalized (typically by the sum of squared (L2) or absolute values (L1) of model parameters) during inference. Prior work has also suggested the potential for regularization to improve the performance of the MPLE, which often has good first-order properties but which is sometimes unstable and/or poorly calibrated. Here, we examine the behavior of regularized MPLE and MCMCMLE estimators for ERGM parameters, introducing a "pseudo-cross-validation" strategy for calibration of the regularization parameter. We compare regularized and non-regularized estimators both in conventional, low-dimensional settings, and in cases with individual-level fixed effects. Based on these observations, we suggest practical guidance and useful directions for ERGM inference in challenging circumstances.
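For readers less familiar with the objects named in this abstract, the penalized objectives can be written out in standard textbook form. This is a sketch only: the notation (g for the vector of sufficient statistics, Delta_ij for change statistics, lambda for the regularization parameter, R for the penalty) is assumed here for illustration and is not taken from the authors' presentation.

```latex
% ERGM with sufficient statistics g(y) and normalizing constant \kappa(\theta)
\Pr_\theta(Y = y) = \frac{\exp\{\theta^\top g(y)\}}{\kappa(\theta)}

% Log-pseudolikelihood: a sum of conditional log-probabilities over dyads (i,j),
% where \Delta_{ij}(y) is the change in g(y) when y_{ij} is toggled from 0 to 1
\log \mathrm{PL}(\theta) = \sum_{(i,j)} \log \Pr_\theta\bigl(Y_{ij} = y_{ij} \mid Y_{-ij} = y_{-ij}\bigr),
\qquad
\operatorname{logit} \Pr_\theta\bigl(Y_{ij} = 1 \mid Y_{-ij} = y_{-ij}\bigr) = \theta^\top \Delta_{ij}(y)

% Regularized estimator: \ell is either the log-likelihood or the log-pseudolikelihood
\hat{\theta}_\lambda = \arg\max_\theta \bigl\{ \ell(\theta) - \lambda R(\theta) \bigr\},
\qquad
R(\theta) = \lVert \theta \rVert_2^2 \ \text{(L2)} \quad \text{or} \quad R(\theta) = \lVert \theta \rVert_1 \ \text{(L1)}
```

Because the pseudolikelihood has the form of a logistic regression of tie indicators on change statistics, the regularized MPLE in this notation corresponds to a penalized logistic regression, with lambda being the quantity a cross-validation-style procedure would calibrate.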
8:20am - 8:40am Distinguishing Notions of Centrality in Directed Networks
Gordana Marmulla, Ulrik Brandes
ETH Zurich, Switzerland
An ever-growing number of centrality indices is proposed, but more often than not they are constructed ad-hoc. Consequently, the interpretation and comparison of centrality indices is generally based on intuition built from their definition. For undirected graphs, the preservation of the neighborhood-inclusion preorder has been identified as the core axiom shared by centrality rankings. This has recently been extended to directed graphs by defining three vertex preorders based on directed neighborhood inclusion. The preorders formalize the rather conceptual notions of radial, medial and hierarchical centralities and can be used as criteria to discriminate between them. In this presentation, we differentiate common centrality indices according to which of the criteria they do or do not preserve. The findings illustrate effects that are highly relevant when centrality indices are applied in practice. These include, for example, implications of the choice of a specific index, the concrete functional form of an index, the symmetrization of networks, and potential issues when substituting seemingly related indices.
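As a small illustration of the undirected criterion mentioned in this abstract, the sketch below checks neighborhood inclusion in its usual form: u is dominated by v when the neighborhood of u is contained in the closed neighborhood of v. It covers only the undirected case; the three directed preorders discussed in the talk are not reproduced here, and the function names and the toy star graph are hypothetical.

```python
from typing import Dict, Hashable, Set

# Undirected graph stored as an adjacency map: node -> set of neighbours.
Graph = Dict[Hashable, Set[Hashable]]


def is_dominated(adj: Graph, u: Hashable, v: Hashable) -> bool:
    """Return True if N(u) is a subset of N[v], the closed neighbourhood of v."""
    return adj[u] <= (adj[v] | {v})


def preserves_preorder(adj: Graph, score: Dict[Hashable, float]) -> bool:
    """Return True if score[u] <= score[v] whenever u is dominated by v."""
    return all(
        score[u] <= score[v]
        for u in adj
        for v in adj
        if is_dominated(adj, u, v)
    )


# Toy example (hypothetical data): a star with centre "c" and three leaves.
star: Graph = {"c": {"a", "b", "d"}, "a": {"c"}, "b": {"c"}, "d": {"c"}}

# Degree centrality as a candidate index to check against the preorder.
degree = {node: len(neighbours) for node, neighbours in star.items()}
```

With these definitions, `preserves_preorder(star, degree)` tests whether degree respects the preorder on the toy graph; an index that ranks a leaf above the centre would fail the same check.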
8:40am - 9:00am Expert Surveys: Optimizing Snowball Elicitation
Dimitris Christopoulos (1,2), Alex Jose (1), Marta Campi (3)
(1) Heriot Watt University, United Kingdom; (2) MU University, Vienna; (3) Institute Pasteur, Paris
Eliciting expert opinion often relies on sampling small populations among those who possess specialized, experiential, or privileged knowledge. Traditional recruitment methods for expert opinion or judgment are susceptible to significant selection and response biases. This has the potential to compromise data validity and reliability. This paper presents a quantitative framework for optimizing snowball sampling to minimize such biases in expert recruitment. Through systematic simulation experiments across different network structures (Erdős-Rényi, Power Law, Scale-Free, and Small World), we evaluate critical methodological parameters including the number of seeds, number of waves, and sampling termination criteria. Our analysis determines a means to capture a statistically significant fraction of the underlying expert population while maintaining representativeness. Our results demonstrate that sampling efficiency varies significantly with network topology. We validate our findings using Kolmogorov-Smirnov tests on the distribution of expertise. We conclude by offering practical guidelines for researchers employing snowball sampling in selecting experts. This methodology can extend beyond expert surveys to other hidden or partially hidden populations, including stakeholder analysis, identifying elite members, and recruiting members of peripheral social networks, offering a robust framework for a sampling design in partially hidden populations.
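The kind of simulation experiment described in this abstract can be sketched in a few lines. The code below is an assumption-laden illustration, not the authors' framework: it uses an Erdős-Rényi graph only, assumes every sampled node refers all of its neighbours, and stops after a fixed number of waves or when no new nodes are reached. All function names and parameter values are hypothetical.

```python
import random
from typing import Dict, Iterable, Set

Graph = Dict[int, Set[int]]


def erdos_renyi(n: int, p: float, rng: random.Random) -> Graph:
    """Generate an undirected G(n, p) graph as an adjacency map."""
    adj: Graph = {i: set() for i in range(n)}
    for i in range(n):
        for j in range(i + 1, n):
            if rng.random() < p:
                adj[i].add(j)
                adj[j].add(i)
    return adj


def snowball_sample(adj: Graph, seeds: Iterable[int], waves: int) -> Set[int]:
    """Return the nodes reached from the seeds within the given number of waves.

    Assumes full referral: each sampled node names all of its neighbours.
    """
    sampled = set(seeds)
    frontier = set(seeds)
    for _ in range(waves):
        # Next wave: neighbours of the current frontier not yet sampled.
        frontier = {nbr for node in frontier for nbr in adj[node]} - sampled
        if not frontier:  # nothing new to recruit, so stop early
            break
        sampled |= frontier
    return sampled


# Illustrative run: coverage of the population as the number of waves grows.
rng = random.Random(42)
graph = erdos_renyi(200, 0.02, rng)
seeds = rng.sample(sorted(graph), 5)
for waves in range(4):
    coverage = len(snowball_sample(graph, seeds, waves)) / len(graph)
    print(f"waves={waves}: coverage={coverage:.2f}")
```

Varying the number of seeds, the number of waves, and the graph generator in a loop like this is one simple way to compare coverage across network structures; partial referral or response bias would need an extra sampling step inside `snowball_sample`.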
9:00am - 9:20am Latent Variable Models for Clustering Network and Nodal Behavioural Data
Isabella Gollini, Alberto Caimo
University College Dublin, Ireland
We propose a latent variable model for the joint analysis of network structure and nodal information. We extend latent space and stochastic block models by introducing a framework that captures both latent group memberships and continuous nodal positions in a latent space. Each node is assigned to a latent group via a multinomial process, influencing both its network connections and its observed behaviour. The network structure is modelled using a combination of within- and between-group connectivity parameters, alongside a distance-based latent space representation. Simultaneously, nodal information is governed by group-specific and individual-level parameters, allowing for flexible clustering. The model naturally accommodates missing data by leveraging the latent structure to impute unobserved values. Crucially, the ability to jointly model nodal attributes and network structure makes this approach particularly well-suited for scenarios where data are missing or only partially observed. Depending on the nature of the nodal data, the model can be applied to attribute information or relational interactions, offering a unified approach to latent structure exploration in complex systems. To enable scalability, we develop a fast inferential approach based on variational inference.