Conference Agenda

Session Overview
Session
OS-38: Network Analysis for Textual Data in Social Media
Time: Friday, 27/June/2025, 8:00am - 9:40am

Location: Room E

Session Topics:
Network Analysis for Textual Data in Social Media

Presentations
8:00am - 8:20am

Enhancing Sentiment Analysis Using Formal Linguistic Tools

Mario Monteleone1,2

1Dipartimento di Scienze Politiche e della Comunicazione, Università degli Studi di Salerno, Italy; 2X23 Science In Society, Bergamo (Italy)

Text production by Generative Artificial Intelligence (GAI) is crucial to research fields such as Data Science (DS) and Network Textual Data Analysis (NTDA); the main purpose of GAI is to simulate human language production by exploiting both Machine Learning (ML) and Large Language Models (LLMs).

However, as pre-trained probabilistic models, LLMs are biased when built on imperfectly balanced data, with respect to retrieval sources, taxonomy, ontology interconnections, and linguistic inference. This matters greatly for DS and NTDA, as such bias can contribute to the spread of fake news, conspiracy theories, counterproductive narratives, and online hate speech on social media. Equally relevant is that GAI lacks a formal model of reality (Pearl and Mackenzie, 2018), which leaves GAI without ethics, as it cannot identify and correct its own inaccuracies. As a result, LLMs and GAI suffer from effectiveness and reliability issues, showing a tendency to produce incorrect and discriminatory information as well as hallucinations.

The emerging field of Neuro-Symbolic Artificial Intelligence (NSAI) tries to cope with these issues by building elementary ontologies that integrate human symbolic reasoning principles with ML and Artificial Neural Networks (ANNs). Here we will demonstrate that better results come from also integrating formalized morphosyntactic and semantic information, such as that relating to Italian negation grammar. Therefore, to tackle online hate speech, we propose a method of Sentiment Analysis (SA) that uses the NooJ software (Silberztein 2016) to build formal ontologies and syntactic grammars within graphs representing finite-state automata/transducers. While the ontologies will conceptualize sets of words having contiguous contextualized meanings, the syntactic grammars will parse texts using formalized Italian morphosyntax and semantics.



8:20am - 8:40am

Considerations and Challenges in Dealing with Online Italian Content Related to Social Issues: Constructing Datasets of Online Opinions for Human Annotation.

Alex Cucco1, Emiliano del Gobbo2, Lara Fontanella1, Sara Fontanella3, Luigi Ippoliti1

1University G. d'Annunzio Chieti-Pescara; 2University of Foggia; 3Imperial College London

Ensuring a diverse representation of opinions, sentiments, and topics in social discourse is essential when curating data for machine learning and statistical models, particularly in contexts requiring explainability. Online comments offer a rich source of public opinion; however, they often exhibit an imbalanced distribution of perspectives, amplifying specific viewpoints while underrepresenting others. Such biases can lead to unfair models that reinforce stereotypes and reduce the reliability of analytical outcomes.

To address this challenge, we propose an approach that captures a wide spectrum of sentiments and topics, facilitating the creation of a balanced dataset for human annotation and further analysis. Our approach focuses on targeted sampling strategies that leverage network analysis and node sampling techniques to ensure comprehensive topic and sentiment representation.
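The abstract does not specify the sampling scheme, so the following is only one simple possibility: a minimal sketch of equal-allocation sampling across network communities, so that small communities are not drowned out by large ones. The function name `balanced_sample`, the `community_of` mapping, and the toy node names are hypothetical and chosen for illustration.

```python
import random
from collections import defaultdict

def balanced_sample(community_of, per_group, seed=0):
    """Draw up to `per_group` nodes from every community.

    community_of: dict mapping node id -> community label (assumed to come
    from a prior community-detection step, which is not shown here).
    """
    rng = random.Random(seed)  # fixed seed for reproducibility
    groups = defaultdict(list)
    for node, label in sorted(community_of.items()):
        groups[label].append(node)

    sample = []
    for label in sorted(groups):
        members = groups[label]
        k = min(per_group, len(members))  # small groups contribute all they have
        sample.extend(rng.sample(members, k))
    return sample

# Toy data: a large community "A" and a small community "B" (hypothetical).
toy = {f"a{i}": "A" for i in range(10)}
toy.update({"b0": "B", "b1": "B"})
print(balanced_sample(toy, per_group=2))
```

With proportional sampling the small community would contribute few or no comments; equal allocation is the simplest way to keep minority viewpoints in the annotation set.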

We illustrate the effectiveness of this method through a simulated case study and an application analyzing online discourse on migration, leveraging social media data. We introduce a refined sampling technique aimed at improving coverage across different viewpoints. By adopting this approach, we seek to support the development of fair and transparent models capable of accurately interpreting complex social debates.



8:40am - 9:00am

Enhancing Sentiment Analysis Using Formal Linguistic Tools

Mario Monteleone

Dipartimento di Scienze Politiche e della Comunicazione, Università degli Studi di Salerno, Italy




9:00am - 9:20am

Exploring Semantic Networks to Assess Latent Attitudes Toward Migrants

Alex Cucco1, Lara Fontanella1, Giuseppe Giordano2, Michelangelo Misuraca2, Annalina Sarra1

1University "G.d'Annunzio" of Chieti-Pescara, Italy; 2University of Salerno

The growing influence of social media platforms has provided an unprecedented opportunity to assess public sentiment and attitudes toward various social issues, including migration. While traditional methods, such as questionnaires, are commonly used to retrieve latent traits about attitudes, the increasing volume of free text on social media presents a dynamic, alternative data source. Questionnaires that include both open-ended responses and scales such as the Semantic Differential and the Bogardus Social Distance Scale allow for the measurement of respondents’ attitudes and their similarity-based semantic expressions in free text regarding migration. These latent traits, estimated through models like the Graded Response Model (GRM) within Item Response Theory (IRT), offer valuable insights into public perceptions. However, social media comments provide an additional layer of data, capturing spontaneous expressions and shifting sentiments in real time.
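For readers unfamiliar with the Graded Response Model, the sketch below shows how category probabilities follow from a latent trait in Samejima's standard formulation: the probability of responding in category k or higher is a logistic function of the trait, and category probabilities are differences of adjacent cumulative probabilities. The parameter values are invented for illustration and are not taken from the study.

```python
import math

def grm_category_probs(theta, a, thresholds):
    """Category probabilities for one ordinal item under the GRM.

    theta: latent trait of the respondent
    a: item discrimination
    thresholds: increasing list of category thresholds b_1 < ... < b_{K-1}
    Returns a list of K probabilities that sum to 1.
    """
    # Cumulative probabilities P(X >= k), padded with 1 (k = 0) and 0 (k = K).
    cum = [1.0]
    cum += [1.0 / (1.0 + math.exp(-a * (theta - b))) for b in thresholds]
    cum += [0.0]
    return [cum[k] - cum[k + 1] for k in range(len(cum) - 1)]

# Hypothetical four-category item and a respondent at theta = 0.
print(grm_category_probs(theta=0.0, a=1.0, thresholds=[-1.0, 0.0, 1.0]))
```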

This study aims to explore the connection between latent attitudes derived from questionnaire responses and the language used in social media posts. By evaluating the semantic networks within public comments, the study investigates whether the latent traits of social media users can be inferred from their online discourse. This approach leverages publicly available data to assess migration-related attitudes, an area traditionally reliant on structured surveys. Specifically, the study examines the potential of textual network analysis for this purpose and evaluates a semi-supervised approach to improve the assessment of online latent traits.



9:20am - 9:40am

How to Trigger Public Figures’ Engagement on Social Media

Shahar Lavian1, Gilad Ravid1, Alon Bartal2

1Industrial Engineering and Management Department, Ben Gurion University of the Negev, Israel; 2The School of Business Administration, Bar-Ilan University

Public figures such as celebrities, politicians, and influencers who post online attract numerous replies from users but respond only to a select few. The factors influencing public figures’ selective engagement are largely unknown. We analyzed a dynamic network of public figure interactions with specific users who replied to posts of a public figure. These networks are sparse since most users' replies to the original posts of a public figure remain unaddressed by the public figure. Given a user who replied to an original post of a public figure, our goal is to predict whether the public figure will engage with that reply. To define this population, we employed a filtering methodology using ranking lists from reputable sources, such as Forbes and TIME, alongside an American filter. This approach ensures that the selected public figures hold significant influence and visibility, making their engagement behavior on social media particularly relevant for the study.

We analyzed 250,000 user replies to posts originated by public figures on X, collected between 2022 and 2024. Each user reply is labeled as ‘engaged by the public figure’ (1) or not (0), allowing a systematic examination of engagement patterns. To explore potential homophily in digital discourse, we construct a multi-dimensional user similarity graph incorporating linguistic features, emotion intensity, and temporal engagement patterns. We apply k-nearest neighbors (k=50) to link users who communicate in similar ways, filtering edges based on cosine similarity (<0.3). Our network analysis reveals a high assortativity coefficient (0.6799), suggesting strong homophily. Users with similar emotional tone, linguistic complexity, and response timing tend to receive similar levels of engagement from public figures.
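The graph construction described above could be sketched as follows. This is a minimal pure-Python illustration, not the authors' pipeline: the toy feature vectors, the function names `knn_graph` and `attribute_assortativity`, and the reading of the threshold (edges with cosine similarity below `min_sim` are dropped) are all assumptions. The assortativity measure is Newman's coefficient for a categorical attribute, applied here to the binary engagement label.

```python
import math
from collections import defaultdict

def cosine(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu and nv else 0.0

def knn_graph(features, k, min_sim):
    """Undirected edges linking each user to its k most similar users,
    keeping only pairs with cosine similarity >= min_sim (assumed reading)."""
    edges = set()
    n = len(features)
    for i in range(n):
        sims = [(cosine(features[i], features[j]), j) for j in range(n) if j != i]
        sims.sort(key=lambda t: (-t[0], t[1]))  # most similar first
        for sim, j in sims[:k]:
            if sim >= min_sim:
                edges.add((min(i, j), max(i, j)))
    return edges

def attribute_assortativity(edges, labels):
    """Newman's assortativity coefficient for a categorical node attribute."""
    if not edges:
        raise ValueError("graph has no edges")
    mix = defaultdict(float)
    for i, j in edges:
        # Count every undirected edge in both directions.
        mix[(labels[i], labels[j])] += 1
        mix[(labels[j], labels[i])] += 1
    total = sum(mix.values())
    cats = sorted(set(labels))
    observed = sum(mix[(c, c)] for c in cats) / total
    share = {c: sum(mix[(c, d)] for d in cats) / total for c in cats}
    expected = sum(s * s for s in share.values())
    if expected == 1:
        return float("nan")  # undefined when all edge ends share one label
    return (observed - expected) / (1 - expected)

# Toy data: six users in two stylistic clusters, with engagement labels.
features = [[1, 0.1], [0.9, 0.2], [1, 0], [0.1, 1], [0.2, 0.9], [0, 1]]
engaged = [1, 1, 1, 0, 0, 0]
edges = knn_graph(features, k=2, min_sim=0.3)
print(len(edges), attribute_assortativity(edges, engaged))
```

A coefficient near 1 means edges mostly connect users with the same engagement label; a value near 0 means the label is unrelated to who is linked to whom.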

To predict whether a public figure will respond to a user reply, we trained three classifiers, incorporating network-based attributes, emotion-based attributes, and the time between a post and a reply. We trained XGBoost, Random Forest, and a Hybrid Siamese Convolutional Network (HSCNN). XGBoost outperformed the other models with an ROC-AUC score of 0.96. The most important predictive factors include the time interval between an original post and a reply, the intensity of anger expressed in the reply (dominant anger levels), and the complexity of language used (lexical diversity). We find that public figure engagement is shaped by systematic patterns in user communication styles and response behaviors.
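As a reminder of what an ROC-AUC score measures, the sketch below computes it directly from its rank interpretation: the probability that a randomly chosen engaged reply receives a higher score than a randomly chosen unengaged one, with ties counted as one half. The labels and scores are invented for illustration, and the quadratic pairwise loop is only suitable for small examples.

```python
def roc_auc(y_true, scores):
    """ROC-AUC as the fraction of (positive, negative) pairs ranked correctly."""
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5  # ties count as half a win
    return wins / (len(pos) * len(neg))

# Hypothetical labels (1 = engaged) and classifier scores.
print(roc_auc([1, 0, 1, 0], [0.9, 0.8, 0.4, 0.1]))
```

Because the measure depends only on ranking, it does not reward a classifier for simply predicting the majority class, which is relevant when most replies go unanswered.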

By integrating social network analysis with predictive modeling, this research advances our understanding of the selective engagement of public figures in online discourse. Future work should explore temporal evolution in engagement homophily and examine cross-platform variations in reply behaviors.



9:40am - 10:00am

Invisible ties: Shared Content Exposure on Twitter Among Survey Participants

Paulo Matos Serôdio

University of Essex, United Kingdom

How independent are our online content exposures? Using data from Understanding Society’s Innovation Panel Twitter Study (2007–2023), we reconstruct shared exposure networks of survey participants based on their engagement with tweets, accounts, and topics. This approach enables us to assess the extent to which two randomly selected individuals from a nationally representative sample are connected online, even when offline links are remote. Our preliminary analysis reveals that, on average, 32% of the Twitter accounts respondents engage with are shared with other survey participants. We further explore how shared exposure varies by the type of platform behaviour (e.g., retweets and replies), while controlling for engagement metrics to mitigate biases from viral content. In addition to account-based networks, we construct content-based networks by leveraging transformer models and word embeddings to derive latent topics from retweeted content. Each individual’s topic profile is created by aggregating the topic distributions of the tweets they engage with, allowing us to cluster users based on the mix of content they consume. We then examine whether these content clusters are independent of, or driven by, socioeconomic gradients such as age, occupation, and income. Our findings challenge traditional assumptions of respondent independence in survey research and offer novel insights into how digital environments both reflect and transcend offline social structures. Implications for social network research and digital survey methodologies are discussed.
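An account-based shared exposure network of the kind described above can be illustrated with a small sketch. It assumes each participant is represented by the set of accounts they engaged with; the participant ids, account handles, and function names are hypothetical, and the study's actual construction (including its controls for viral content) is not reproduced here.

```python
from itertools import combinations

def shared_exposure_edges(engagements):
    """Map each pair of participants to the accounts both engaged with."""
    edges = {}
    for u, v in combinations(sorted(engagements), 2):
        common = engagements[u] & engagements[v]
        if common:
            edges[(u, v)] = common
    return edges

def shared_fraction(engagements):
    """Per participant: share of their accounts also engaged with by
    at least one other participant."""
    result = {}
    for u, accounts in engagements.items():
        others = set().union(*(a for v, a in engagements.items() if v != u))
        result[u] = len(accounts & others) / len(accounts) if accounts else 0.0
    return result

# Toy data (hypothetical participants and accounts).
engagements = {
    "p1": {"@a", "@b", "@c"},
    "p2": {"@b", "@d"},
    "p3": {"@e"},
}
print(shared_exposure_edges(engagements))
print(shared_fraction(engagements))
```

Averaging the per-participant fractions gives a summary figure comparable in spirit to the shared-account percentage quoted in the abstract.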



10:00am - 10:20am

Online Incivility: An Exploration of Brexit 2016 Discussions on Twitter

Cristina Chueca Del Cerro1, Kyriaki Nanou1, Moritz Osnabrügge1, Julio Amador Diaz Lopez2

1Durham University, United Kingdom; 2Independent researcher

Social media platforms have become a frequent forum for uncivil discourse. We conceptualise incivility as language containing ill-mannered expressions, insults, swear words, or disrespectful attacks against a person or group. This language undermines democratic discourse and contributes to the fragmentation of public conversation. This paper explores the use of uncivil language on Twitter during the Brexit referendum campaign (6 January - 23 June 2016). This was a period of renegotiating the UK-EU relationship that greatly divided the UK public. We analyse the temporal patterns of uncivil language using 23M Tweets, tracking how the tone of public conversation shifted as the referendum date approached. To classify Tweets, we fine-tune the BERTweet model on 30,000 Tweets labelled by professional annotators from Appen. The analyses revealed significant regional differences in the prevalence of incivility, which fluctuated as key political events and statements shaped public sentiment. To further understand the dissemination of uncivil language, we visualise retweet networks, mapping how public officials became the target of incivility over time. Our findings demonstrate that while incivility was pervasive throughout the campaign, its intensity varied by region and was strongly influenced by interactions with political elites.



10:20am - 10:40am

Investigating the Structure of Racist and Xenophobic Discourse: A Causal Inference Approach

Anthony Cossari1, Paolo Carmelo Cozzucoli1, Michelangelo Misuraca2

1University of Calabria, Italy; 2University of Salerno, Italy

The alarming rise of hostile rhetoric targeting both (im)migrants and marginalised groups has permeated online discussions, particularly across social networking platforms. These digital arenas, notably popular sites like Facebook and X, have unfortunately become breeding grounds for the dissemination of prejudiced views and intolerance, often fueled by misinformation and divisive narratives. This research critically examines the prevalence of intolerance and xenophobic discourses within the Italian social media landscape. By systematically collecting freely posted comments, we employ a community detection procedure on a term-by-term matrix to uncover the primary issues that emerge from these online debates. Furthermore, we use a Bayesian Belief Network (BBN) to elucidate, from a probabilistic perspective, the intricate relationships between these issues and other relevant covariates, including the discourse's emotional valence and propagation dynamics. This comprehensive integration not only facilitates a causal inference approach but also unveils the key drivers and amplifiers of hateful and racist discourse, thereby underscoring the urgent need for informed intervention strategies against digital hate.
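The first step above, grouping terms through a term-by-term matrix, can be illustrated with a small sketch. The abstract does not name the community detection algorithm, so connected components of the thresholded co-occurrence graph are used here as a deliberately crude stand-in; a modularity-based method would normally replace that step. The toy documents, the threshold, and the function names are assumptions.

```python
from collections import Counter
from itertools import combinations

def cooccurrence(docs):
    """Term-by-term co-occurrence counts (sparse, upper triangle only).

    docs: list of token lists; a pair is counted once per document.
    """
    counts = Counter()
    for tokens in docs:
        for a, b in combinations(sorted(set(tokens)), 2):
            counts[(a, b)] += 1
    return counts

def term_groups(counts, min_count):
    """Connected components of the graph whose edges are term pairs
    co-occurring in at least `min_count` documents."""
    adj = {}
    for (a, b), c in counts.items():
        if c >= min_count:
            adj.setdefault(a, set()).add(b)
            adj.setdefault(b, set()).add(a)
    seen, groups = set(), []
    for start in sorted(adj):
        if start in seen:
            continue
        stack, group = [start], set()
        while stack:  # depth-first traversal of one component
            term = stack.pop()
            if term in seen:
                continue
            seen.add(term)
            group.add(term)
            stack.extend(adj[term] - seen)
        groups.append(group)
    return groups

# Toy tokenised comments (hypothetical).
docs = [
    ["border", "control", "policy"],
    ["border", "policy"],
    ["job", "wage"],
    ["job", "wage", "market"],
    ["policy", "market"],
]
print(term_groups(cooccurrence(docs), min_count=2))
```

Each resulting group of terms can then be read as a candidate issue and carried forward as a variable in later modelling.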

This work is part of the research project PRIN-2022 PNRR “Identification and Critical Analysis of Online Racism and Xenophobia against (Im)migrants and Roma people” (Project Code: P2022APKJL), funded by the European Union – Next Generation EU.



10:40am - 11:00am

When Deep Learning Meets Social Network: A Hybrid Approach to Manage Online Incivility

Jyun-Cheng Wang1, Kai-Yi Chu1, Halim Budi Santoso2

1Institute of Service Science, National Tsing Hua University, Taiwan; 2Information System Department, Universitas Kristen Duta Wacana, Indonesia

Introduction

Online incivility disrupts digital interactions, negatively affecting users' mental well-being and engagement. Current detection methods rely on natural language processing (NLP) and keyword-based filtering, often producing high false positive rates. This study integrates Graph Neural Networks (GNN) with Social Network Analysis (SNA) and NLP to enhance incivility detection while minimizing impacts on everyday discourse.

Literature Review

Prior research highlights the detrimental effects of incivility, including increased polarization, emotional distress, and disengagement. Traditional NLP-based classifiers primarily focus on textual content but fail to consider relational context and user interactions, leading to misclassification of sarcasm, nuanced discussions, and indirect hostility. Although Graph Neural Networks (GNNs) and Social Network Analysis (SNA) have been used separately in social computing, there is a lack of studies integrating these techniques to enhance context-aware incivility detection. Our research addresses this gap by combining text-based, network-based, and deep learning-based approaches to improve accuracy.

Research Methodology

We collected 3,210 comments from Reddit’s "worldnews" subreddit and labeled them. Our GNN model incorporated NLP-derived sentiment scores as edge features and SNA metrics (e.g., centrality, in-degree) as node features. The trained model was evaluated using F1-score, precision, and recall.
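To make the input layout concrete, the sketch below assembles node and edge features from a list of reply interactions: in-degree, out-degree, and in-degree centrality per user as node features, and a sentiment score per reply as the edge feature. The tuple format, the user names, and the function name `build_graph_inputs` are hypothetical; the model itself and the authors' actual feature set are not reproduced.

```python
from collections import Counter

def build_graph_inputs(replies):
    """replies: list of (source_user, target_user, sentiment_score) tuples.

    Returns (node_features, edge_index, edge_attr) where
    node_features[user] = [in_degree, out_degree, in_degree_centrality].
    """
    in_deg = Counter(target for _, target, _ in replies)
    out_deg = Counter(source for source, _, _ in replies)
    users = sorted(set(in_deg) | set(out_deg))
    n = len(users)

    node_features = {}
    for u in users:
        # In-degree centrality: in-degree normalised by the n - 1 possible sources.
        centrality = in_deg[u] / (n - 1) if n > 1 else 0.0
        node_features[u] = [in_deg[u], out_deg[u], centrality]

    edge_index = [(source, target) for source, target, _ in replies]
    edge_attr = [[score] for _, _, score in replies]
    return node_features, edge_index, edge_attr

# Toy reply data with made-up sentiment scores in [-1, 1].
replies = [("ann", "bob", -0.8), ("cy", "bob", 0.1), ("bob", "ann", 0.4)]
nodes, edge_index, edge_attr = build_graph_inputs(replies)
print(nodes)
print(edge_index, edge_attr)
```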

Findings

Our GNN model achieved an F1-score of 78.42%, outperforming traditional NLP models. Integrating network metrics significantly improved incivility detection accuracy, reducing false positives while maintaining high recall.
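For reference, the evaluation metrics mentioned above relate to each other as in this small sketch. The labels are toy data (1 = uncivil, 0 = civil) and are not the study's results.

```python
def precision_recall_f1(y_true, y_pred):
    """Precision, recall, and F1 for the positive class (label 1)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    # F1 is the harmonic mean of precision and recall.
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1

print(precision_recall_f1([1, 1, 1, 0, 0, 0], [1, 1, 0, 1, 0, 0]))
```

Fewer false positives raise precision, so a model can improve its F1-score without flagging more comments, which matches the stated aim of limiting the impact on everyday discourse.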

Discussion and Contributions

This study advances computational social science methodology by integrating deep learning and network structure approaches to moderate online incivility in social media. Our approach builds on existing methods by combining them into a hybrid method for managing online incivility. Practically, it can improve content regulation strategies on social media.



 
Conference: INSNA Sunbelt 2025