Conference Agenda

Session

OS-2: Advanced Mathematical and Statistical Network Methodology

Time:

Thursday, 26/June/2025:

8:00am - 9:40am

Session Chair: Martin Everett

Location: Room 112

Session Topics:

Advanced Mathematical and Statistical Network Methodology

Presentations

8:00am - 8:20am

Comparing the performance of regularized maximum likelihood and maximum pseudolikelihood estimation methods for ERGMs

Alexander James Gordon Murray-Watters, Carter Butts

University of California, Irvine, United States of America

Although alternatives exist, the primary methods for parameter estimation for exponential family random graph models (ERGMs) are maximum pseudolikelihood (MPLE) and approximate (Markov chain Monte Carlo) maximum likelihood inference (MCMCMLE). Both approaches have been quite successful, with MPLE typically used in modern settings as an effective initializer for MCMCMLE. Challenges, however, remain. As is well known, the MLE does not exist for data sets whose observed statistics are sufficiently "extreme" relative to the set of possible statistics (the convex hull problem), requiring inelegant workarounds that lead to serviceable but non-optimal simulation behavior. Other challenges arise in high-dimensional models, for which optimization becomes difficult and accidental collinearity or near-collinearity of predictors can become a risk; this occurs most notably for models with sociality, expansiveness, or popularity terms, which are individual-level fixed effects that can be difficult to estimate in practice. Among the solutions proposed to the above challenges is regularized inference, where the likelihood and/or pseudolikelihood is penalized (typically by the sum of squared values (L2) or absolute values (L1) of model parameters) during inference. Prior work has also suggested the potential for regularization to improve the performance of the MPLE, which often has good first-order properties but which is sometimes unstable and/or poorly calibrated. Here, we examine the behavior of regularized MPLE and MCMCMLE estimators for ERGM parameters, introducing a "pseudo-cross validation" strategy for calibration of the regularization parameter. We compare regularized and non-regularized estimators both in conventional, low-dimensional settings, and in cases with individual-level fixed effects. Based on these observations, we suggest practical guidance and useful directions for ERGM inference in challenging circumstances.
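
As a reading aid, the penalized objectives described in this abstract can be written out in standard ERGM notation. The symbols are assumptions and do not come from the abstract: y is the observed binary graph, g(y) the vector of sufficient statistics, κ(θ) the normalizing constant, Δ_ij(y) the change statistics for dyad (i, j), and λ ≥ 0 the regularization parameter. The authors' exact formulation may differ.

```latex
\begin{aligned}
\ell(\theta) &= \theta^\top g(y) - \log \kappa(\theta)
  && \text{(log-likelihood)} \\
\ell_{\mathrm{PL}}(\theta) &= \sum_{(i,j)} \Bigl[ y_{ij}\,\theta^\top \Delta_{ij}(y)
  - \log\bigl(1 + e^{\theta^\top \Delta_{ij}(y)}\bigr) \Bigr]
  && \text{(log-pseudolikelihood)} \\
\hat\theta^{L2}_{\lambda} &= \arg\max_{\theta}
  \Bigl\{ \ell_{\bullet}(\theta) - \lambda \sum_{k} \theta_k^{2} \Bigr\},
  \qquad
\hat\theta^{L1}_{\lambda} = \arg\max_{\theta}
  \Bigl\{ \ell_{\bullet}(\theta) - \lambda \sum_{k} \lvert\theta_k\rvert \Bigr\}
\end{aligned}
```

Here ℓ• stands for either ℓ or ℓ_PL. Setting λ = 0 recovers the unpenalized MLE or MPLE; for λ > 0 the L2 penalty makes the objective strictly concave with a finite maximizer, even when the observed statistics lie on the boundary of the convex hull.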

8:20am - 8:40am

Distinguishing Notions of Centrality in Directed Networks

Gordana Marmulla, Ulrik Brandes

ETH Zurich, Switzerland

An ever-growing number of centrality indices is being proposed, but more often than not they are constructed ad hoc. Consequently, the interpretation and comparison of centrality indices is generally based on intuition built from their definitions. For undirected graphs, the preservation of the neighborhood-inclusion preorder has been identified as the core axiom shared by centrality rankings. This has recently been extended to directed graphs by defining three vertex preorders based on directed neighborhood inclusion. The preorders formalize the rather conceptual notions of radial, medial and hierarchical centralities and can be used as criteria to discriminate between them. In this presentation, we differentiate common centrality indices according to which of the criteria they do or do not preserve. The findings illustrate effects that are highly relevant when centrality indices are applied in practice. These include, for example, implications of the choice of a specific index, the concrete functional form of an index, the symmetrization of networks, and potential issues when substituting seemingly related indices.
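
The undirected criterion mentioned above can be illustrated with a minimal sketch. It assumes the usual definition of neighborhood inclusion, in which vertex u is dominated by v when N(u) is a subset of the closed neighborhood N[v]. The function names and the toy graph are hypothetical, and the three directed preorders from the talk are not reproduced here.

```python
from itertools import permutations


def neighborhood_inclusion(adj):
    """Return all ordered pairs (u, v), u != v, with N(u) a subset of N[v].

    adj maps each vertex to the set of its neighbors (undirected graph).
    """
    pairs = set()
    for u, v in permutations(adj, 2):
        closed_v = adj[v] | {v}  # closed neighborhood N[v]
        if adj[u] <= closed_v:
            pairs.add((u, v))
    return pairs


def preserves(scores, pairs):
    """True if the index never ranks a dominated vertex above its dominator."""
    return all(scores[u] <= scores[v] for u, v in pairs)


# Hypothetical toy graph: a pendant vertex "a" attached to a triangle b-c-d.
adj = {
    "a": {"b"},
    "b": {"a", "c", "d"},
    "c": {"b", "d"},
    "d": {"b", "c"},
}

dominance = neighborhood_inclusion(adj)
degree = {v: len(neighbors) for v, neighbors in adj.items()}
print(preserves(degree, dominance))  # degree respects the preorder: True
```

An index that fails `preserves` on some graph violates the criterion, which is one way to discriminate between candidate indices.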

8:40am - 9:00am

Expert Surveys: Optimizing Snowball Elicitation

Dimitris Christopoulos¹,², Alex Jose¹, Marta Campi³

¹Heriot Watt University, United Kingdom; ²MU University, Vienna; ³Institute Pasteur, Paris

Eliciting expert opinion often relies on sampling small populations among those who possess specialized, experiential, or privileged knowledge. Traditional recruitment methods for expert opinion or judgment are susceptible to significant selection and response biases, which can compromise data validity and reliability. This paper presents a quantitative framework for optimizing snowball sampling to minimize such biases in expert recruitment. Through systematic simulation experiments across different network structures (Erdős-Rényi, Power Law, Scale-Free, and Small World), we evaluate critical methodological parameters including the number of seeds, number of waves, and sampling termination criteria. Our analysis determines a means to capture a statistically significant fraction of the underlying expert population while maintaining representativeness. Our results demonstrate that sampling efficiency varies significantly with network topology. We validate our findings using Kolmogorov-Smirnov tests on the distribution of expertise. We conclude by offering practical guidelines for researchers employing snowball sampling in selecting experts. This methodology can extend beyond expert surveys to other hidden or partially hidden populations, with applications including stakeholder analysis, identifying elite members, and recruiting members of peripheral social networks, offering a robust framework for sampling design in such populations.
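
To make the simulated parameters concrete, the following is a minimal sketch of a snowball sample on an Erdős-Rényi graph, varying the number of waves for a fixed set of seeds. It is not the authors' simulation design: the graph size, tie probability, the rule that every contact of a sampled node is recruited, and all function names are assumptions made for illustration.

```python
import random


def erdos_renyi(n, p, rng):
    """Undirected G(n, p) random graph as a dict of neighbor sets."""
    adj = {i: set() for i in range(n)}
    for i in range(n):
        for j in range(i + 1, n):
            if rng.random() < p:
                adj[i].add(j)
                adj[j].add(i)
    return adj


def snowball_sample(adj, n_seeds, n_waves, rng):
    """Draw random seeds, then add all unsampled neighbors wave by wave.

    Assumes full referral: every contact of a sampled node is recruited.
    Stops early when a wave finds no new nodes.
    """
    seeds = rng.sample(sorted(adj), n_seeds)
    sampled = set(seeds)
    frontier = set(seeds)
    for _ in range(n_waves):
        next_wave = set()
        for node in frontier:
            next_wave |= adj[node]
        next_wave -= sampled
        if not next_wave:
            break
        sampled |= next_wave
        frontier = next_wave
    return sampled


def coverage(adj, sampled):
    """Fraction of the population reached by the sample."""
    return len(sampled) / len(adj)


graph = erdos_renyi(200, 0.03, random.Random(0))
for waves in range(4):
    # Reusing the same RNG seed keeps the seed nodes fixed across runs.
    sample = snowball_sample(graph, n_seeds=5, n_waves=waves, rng=random.Random(1))
    print(waves, round(coverage(graph, sample), 3))
```

Swapping `erdos_renyi` for another generator would allow the same comparison across topologies.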

9:00am - 9:20am

Latent Variable Models for Clustering Network and Nodal Behavioural Data

Isabella Gollini, Alberto Caimo

University College Dublin, Ireland

We propose a latent variable model for the joint analysis of network structure and nodal information. We extend latent space and stochastic block models by introducing a framework that captures both latent group memberships and continuous nodal positions in a latent space. Each node is assigned to a latent group via a multinomial process, influencing both its network connections and its observed behaviour. The network structure is modelled using a combination of within- and between-group connectivity parameters, alongside a distance-based latent space representation. Simultaneously, nodal information is governed by group-specific and individual-level parameters, allowing for flexible clustering. The model naturally accommodates missing data by leveraging the latent structure to impute unobserved values. Crucially, the ability to jointly model nodal attributes and network structure makes this approach particularly well-suited for scenarios where data are missing or only partially observed. Depending on the nature of the nodal data, the model can be applied to attribute information or relational interactions, offering a unified approach to latent structure exploration in complex systems. To enable scalability, we develop a fast inferential approach based on variational inference.

9:20am - 9:40am

Testing in Restricted Multigraphs: Balance Correlation

Pavel Krivitsky¹, David Dekker², David Krackhardt³, Patrick Doreian⁴

¹University of New South Wales; ²Heriot-Watt University, United Kingdom; ³Carnegie Mellon University; ⁴University of Pittsburgh

Understanding structural balance in signed graphs is a central challenge in network science, with applications in social networks, international relations, and organizational structures. One emerging approach to quantifying balance behavior is through balance correlation, a measure that captures the extent to which triadic relations follow balance theory principles. However, existing statistical tests for balance correlation rely on the expected degree distribution, which imposes strong assumptions about the underlying probability distributions. These assumptions can lead to inefficiencies in generating random graphs and, consequently, a loss of statistical power.
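
For readers new to balance theory, the classical triadic criterion behind the measure can be sketched as follows: a closed triad is balanced when the product of its three tie signs is positive. This is not the balance correlation measure or either test discussed in the abstract, only the underlying triad rule; the function name and toy data are hypothetical.

```python
from itertools import combinations


def triangle_balance(signs):
    """Count balanced and unbalanced closed triads in a signed graph.

    signs maps frozenset({u, v}) to +1 or -1 for each undirected tie.
    A triad is balanced when the product of its three signs is positive.
    """
    nodes = sorted({v for edge in signs for v in edge})
    balanced = unbalanced = 0
    for a, b, c in combinations(nodes, 3):
        edges = [frozenset(pair) for pair in ((a, b), (b, c), (a, c))]
        if all(edge in signs for edge in edges):
            product = signs[edges[0]] * signs[edges[1]] * signs[edges[2]]
            if product > 0:
                balanced += 1
            else:
                unbalanced += 1
    return balanced, unbalanced


# Hypothetical signed graph on four nodes with all six ties present.
signs = {
    frozenset({1, 2}): +1,
    frozenset({2, 3}): +1,
    frozenset({1, 3}): +1,
    frozenset({1, 4}): -1,
    frozenset({2, 4}): -1,
    frozenset({3, 4}): +1,
}

print(triangle_balance(signs))  # (2, 2)
```

A significance test then asks whether the observed level of balance exceeds what a chosen null model of random signed graphs would produce.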

Our study introduces a new Fixed Degree Test for assessing balance correlations in signed graphs and multigraphs. Unlike the expected degree distribution test, which generates random networks under more restrictive conditions, our approach preserves the observed degree marginals while allowing for more flexible network structures. Through extensive simulations, we demonstrate that the Fixed Degree Test improves the power of balance correlation significance tests, ensuring more reliable detection of balance-driven behavior in real-world networks.

Our results indicate that the expected degree test, while widely used, may over-constrain network structures, leading to misleading conclusions about balance prevalence. In contrast, the Fixed Degree Test provides a more accurate baseline, making it particularly useful for studying balance in networks with heterogeneous tie distributions. Beyond balance correlation, we will also explore how this test generalizes to other network statistics, offering a versatile framework for analyzing signed and multigraph structures.

By refining the statistical toolkit for signed network analysis, our work contributes to a more robust and flexible approach to studying balance theory in complex networks. We invite discussion on its applications across disciplines and its potential integration into broader statistical models for network dynamics.