Conference Agenda

Session

Advanced methods in demographic research

Time:

Wednesday, 04/June/2025: 4:30pm - 6:00pm

Session Chair: Giancarlo Ragozini

Location: Aula 12

60 seats

Presentations

Applying Transformers to Predict Life Course Sequences

Linda Vecgaile¹, Emilio Zagheni¹, Luiz Felipe Vecchietti², Alessandro Spata³

¹Max Planck Institute for Demographic Research, GERMANY; ²Institute for Basic Science; ³Independent Researcher

Building on life course theory, which highlights the cumulative effects of life events, this study aims to improve predictive modeling of life course sequences. We explore whether past sequences of life events (ages 18-55) can predict future sequences (ages 56-60), such as transitions into employment, unemployment, or retirement. Using the Transformer encoder-decoder framework, known for its strength in analyzing sequential data, we develop a model that treats life events over time like words in a sentence, capturing patterns and temporal dependencies. The model is tested on German Pension Insurance data, which includes 11 social employment states and basic demographic information. Our analysis shows that the model achieves 85.5% accuracy, performing well for individuals with stable life paths, who are the majority. It also predicts future states that deviate from recent patterns, reflecting its ability to account for earlier life experiences. As we refine the model, we expect it to provide insights into the predictability of life trajectories and help identify early-life patterns that may lead to precarious employment later in life, a critical stage in the life course.

A Bayesian Model to Estimate Male and Female Fertility Patterns at a Subnational Level

Riccardo Omenti¹, Monica Alexander², Nicola Barban¹

¹Università Alma Mater Studiorum di Bologna, ITALY; ²University of Toronto, CANADA

Accurate subnational fertility estimates are crucial for shaping policy decisions across diverse sectors, including education, health care, and social welfare. However, these estimates are difficult to obtain in small populations, in which data on births classified by maternal and paternal ages may be lacking or inadequate. In this paper, we describe a Bayesian model tailored to estimate period total fertility rates (TFRs) for both men and women at a subnational level. Building on previous work by Schmertmann and Hauer (2019), the model uses population counts from age-sex pyramids and represents age-specific mortality and fertility patterns, accounting for uncertainty and allowing for spatial and temporal dependencies. Tests with simulated data that mimic Australian regions, as well as with real data from US counties, demonstrate its ability to generate reasonable TFR estimates. The model shows promise for analyzing male and female fertility patterns across various subregions and time periods.

Propensity score matching for cross-classified data structures. An application to the estimation of the effect of parenting style on the educational performance of children of immigrants

Bruno Arpino¹, Daniela Bellani²

¹Università degli Studi di Padova, ITALY; ²Università Cattolica del Sacro Cuore, ITALY

Cross-classified data structures appear in contexts where units are grouped along multiple, non-nested dimensions. For instance, the study of parenting styles’ effects on immigrant children’s educational performance involves grouping by both country of origin and country of destination, creating a cross-classified structure. Additionally, specific immigrant communities within destination countries can influence parenting styles and educational outcomes, making it essential to account for these community effects.

To address these complexities, we propose an adapted propensity score matching method for cross-classified data. Our approach prioritizes matching treated and control units within communities; if not possible, it searches within destination or origin countries, with a customizable preference order. Treatment estimates are obtained by comparing outcomes across groups directly on the matched data or through a multilevel regression model, with matching improving confounder balance and reducing sensitivity to parametric assumptions.

We evaluate this method’s effectiveness in balancing confounders and reducing bias using simulations and apply it to real data from the Programme for International Student Assessment (PISA), demonstrating its utility for analyzing educational outcomes among immigrant children across diverse national and community contexts.
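The prioritized matching described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes propensity scores are already estimated, treats a "community" as an origin-by-destination pair, uses nearest-neighbor matching with replacement, and applies a caliper. The names `Unit`, `LEVELS`, `match_cross_classified`, and the default caliper of 0.1 are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Unit:
    uid: str
    treated: bool
    pscore: float      # estimated propensity score (assumed precomputed)
    origin: str        # country of origin
    destination: str   # country of destination


# Grouping keys per level. Assumption: a community is an origin-by-destination cell.
LEVELS = {
    "community": lambda u: (u.origin, u.destination),
    "destination": lambda u: u.destination,
    "origin": lambda u: u.origin,
}


def match_cross_classified(units, order=("community", "destination", "origin"),
                           caliper=0.1):
    """Match each treated unit to its nearest control, trying levels in `order`.

    Returns {treated_uid: (control_uid, level_used)}. Treated units with no
    control inside the caliper at any level are left unmatched.
    Matching is done with replacement (a control may be reused).
    """
    controls = [u for u in units if not u.treated]
    matches = {}
    for t in (u for u in units if u.treated):
        for level in order:
            key = LEVELS[level]
            # Controls sharing the same group at this level, within the caliper.
            pool = [c for c in controls
                    if key(c) == key(t) and abs(c.pscore - t.pscore) <= caliper]
            if pool:
                best = min(pool, key=lambda c: abs(c.pscore - t.pscore))
                matches[t.uid] = (best.uid, level)
                break  # stop at the first level that yields a match
    return matches
```

Passing `order=("community", "origin", "destination")` would reverse the fallback preference, which corresponds to the customizable preference order mentioned in the abstract.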

Geospatial Patterns of Population Change: A GWPC Analysis

Antonella Congedi¹,², Federico Benassi³, Maria Carella², Sandra De Iaco¹,⁴,⁵, Anna Paterno²

¹University of Salento, ITALY; ²University of Bari Aldo Moro, ITALY; ³Università degli Studi di Napoli Federico II, ITALY; ⁴National Centre for HPC, Big Data and Quantum Computing, ITALY; ⁵National Biodiversity Future Center, ITALY

The evaluation of local heterogeneity in population change across Italian municipalities must consider the effects of different demographic components (natural and migratory), while also incorporating the dynamics of both Italian and foreign populations. In this context, it is crucial to account for the spatial distribution of these variables. To this end, a geographically weighted principal components analysis (GWPCA), leveraging geostatistical tools, is proposed. This technique is applied to a set of demographic rates calculated for the period 2011–2019. A comparative analysis with spatial blind source separation modeling will also be discussed.
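As a rough illustration of the general idea behind geographically weighted PCA, the sketch below computes a separate, kernel-weighted PCA at each location. It is a generic textbook-style version, not the geostatistical variant proposed in the abstract: the Gaussian kernel, the fixed `bandwidth`, and the function name `gwpca_first_component` are all assumptions made for the example.

```python
import numpy as np


def gwpca_first_component(coords, X, bandwidth):
    """Local first principal component at every location.

    coords: (n, 2) array of location coordinates.
    X: (n, p) array of variables (e.g. demographic rates).
    bandwidth: Gaussian kernel bandwidth, in the same units as coords.
    Returns (loadings, explained): loadings is (n, p), explained is (n,)
    with the share of local variance captured by the first component.
    """
    coords = np.asarray(coords, dtype=float)
    X = np.asarray(X, dtype=float)
    n, p = X.shape
    loadings = np.empty((n, p))
    explained = np.empty(n)
    for i in range(n):
        # Gaussian weights that decay with distance from location i.
        d2 = np.sum((coords - coords[i]) ** 2, axis=1)
        w = np.exp(-0.5 * d2 / bandwidth ** 2)
        w /= w.sum()
        # Locally weighted mean and covariance matrix.
        mean = w @ X
        Xc = X - mean
        cov = (Xc * w[:, None]).T @ Xc
        # eigh returns eigenvalues in ascending order; take the largest.
        eigvals, eigvecs = np.linalg.eigh(cov)
        loadings[i] = eigvecs[:, -1]
        explained[i] = eigvals[-1] / eigvals.sum()
    return loadings, explained
```

Mapping `explained` or the dominant loading per location is one common way to visualize how the structure of the variables changes across space.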