Conference Program

An overview of all sessions of this event.
 
Session Overview
Session
ad_appCoSoSci: Ad hoc group - Applied computational social sciences
Time:
Monday, 23.08.2021:
17:00 - 19:00

Session chair: Dimitri Prandner, Johannes Kepler Universität Linz
Session chair: Tobias Wolbring, FAU Erlangen-Nürnberg
Location: online
The link to the online session is available on Eventbrite after registering for the congress.

Presentations

Social inequality at transition to higher education: What can we learn from machine learning?

Michael Grüttner, Frauke Peter, Sandra Buchholz

Deutsches Zentrum für Hochschul- und Wissenschaftsforschung (DZHW), Germany

Classical sociological theories of social reproduction in educational decisions suggest modelling primary and secondary effects of social origin (Boudon 1974; Breen & Goldthorpe 1997). As previous research has shown, these models are fruitful but cannot fully explain socially unequal educational choices. We aim to improve the understanding of transition outcomes by incorporating insights from the economics of education (Cunha & Heckman 2007) and by using machine learning techniques. Computational social science is promising for analysing innovative questions that could hardly be studied validly so far because of methodological restrictions. We think its potential should also be used for repeated and in-depth investigation of classical questions in the social sciences. Therefore, we apply machine learning techniques to examine the transition from the acquisition of the higher education entrance qualification to university studies. We draw on data from the DZHW Panel Study of School Leavers with a Higher Education Entrance Qualification, which includes measures for modelling primary and secondary effects as well as non-cognitive skills. The Least Absolute Shrinkage and Selection Operator (LASSO) allows efficient modelling of complex interaction effects and can reveal complementary links. As a result, we show to what extent the consideration of non-cognitive skills can expand our knowledge about primary and secondary effects in educational decisions.

Keywords: educational decisions, higher education, machine learning, social inequality

References

Boudon, R. (1974). Education, Opportunity, and Social Inequality. Changing Prospects in Western Society. New York/ London/ Sydney/ Toronto: John Wiley & Sons.

Breen, R., & Goldthorpe, J. H. (1997). Explaining Educational Differentials. Towards a Formal Rational Action Theory. Rationality and Society, 9(3), 275‐305.

Cunha, F., & Heckman, J. J. (2007). The Technology of Skill Formation. American Economic Review, 97(2), 31‐47.
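
To illustrate how a LASSO can select interaction effects, as described in the abstract above, here is a minimal sketch in plain Python. It is not the authors' implementation: the data are synthetic, the helper names (`soft_threshold`, `add_interactions`, `lasso_coordinate_descent`) and the penalty `alpha=0.1` are assumptions made for illustration, and the solver is a basic coordinate descent without an intercept.

```python
import random


def soft_threshold(value, threshold):
    """Shrink value toward zero by threshold; values inside the band become 0."""
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0


def add_interactions(rows):
    """Append all pairwise products x_j * x_k (j < k) to each row."""
    expanded = []
    for row in rows:
        products = [row[j] * row[k]
                    for j in range(len(row))
                    for k in range(j + 1, len(row))]
        expanded.append(list(row) + products)
    return expanded


def lasso_coordinate_descent(X, y, alpha, n_iter=200):
    """Minimise (1/2n) * ||y - X b||^2 + alpha * ||b||_1 (no intercept)."""
    n, p = len(X), len(X[0])
    beta = [0.0] * p
    residual = list(y)  # equals y - X @ beta while beta is all zeros
    for _ in range(n_iter):
        for j in range(p):
            col = [X[i][j] for i in range(n)]
            z = sum(c * c for c in col) / n
            if z == 0.0:
                continue
            # Covariance of column j with the residual that excludes its own effect.
            rho = sum(col[i] * (residual[i] + col[i] * beta[j]) for i in range(n)) / n
            new_beta = soft_threshold(rho, alpha) / z
            delta = new_beta - beta[j]
            if delta != 0.0:
                for i in range(n):
                    residual[i] -= col[i] * delta
            beta[j] = new_beta
    return beta


# Synthetic data (an assumption, not the DZHW panel): the outcome depends on
# x0 and on the interaction x0 * x1; x2 is irrelevant.
rng = random.Random(0)
rows = [[rng.gauss(0, 1) for _ in range(3)] for _ in range(200)]
y = [2.0 * r[0] + 1.5 * r[0] * r[1] + rng.gauss(0, 0.1) for r in rows]

X = add_interactions(rows)  # columns: x0, x1, x2, x0*x1, x0*x2, x1*x2
beta = lasso_coordinate_descent(X, y, alpha=0.1)
print([round(b, 2) for b in beta])
```

In this toy setup the penalty keeps the main effect of `x0` and the `x0*x1` interaction while shrinking the remaining coefficients toward zero, which is the property that makes the LASSO attractive when many candidate interactions are included.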



The distributive justice of fairness metrics in automated decision-making: A critical evaluation

Matthias Kuppler1, Ruben Bach1, Christoph Kern1, Frauke Kreuter2

1University of Mannheim, Germany; 2Department of Statistics, LMU Munich, Germany

The advent of powerful prediction algorithms led to increased automation of high-stakes policy decisions regarding the allocation of scarce public resources. Examples include the allocation of support to jobseekers based on predicted unemployment risk and the allocation of police forces to neighbourhoods based on predicted risk of burglary. Automation bears the risk of perpetuating unwanted discrimination against vulnerable and historically disadvantaged groups.

Research on algorithmic discrimination predominantly originates from computer science, where a plethora of fairness metrics have been developed to detect and correct discriminatory prediction algorithms.

Drawing on robust sociological and philosophical discourse on distributive justice, we identify the limitations and problematic implications of prominent fairness metrics. The metrics implement formal equality of opportunity (FEO). Roughly speaking, FEO demands that individuals who only differ on a set of sensitive attributes (e.g., sex, ethnicity, disability) should receive equal decisions. We show that FEO-metrics only apply when resource allocations are based on deservingness, which is often the case for decisions in the for-profit sector (e.g., hiring and lending). FEO-metrics fail when resource allocations should reflect concerns about egalitarianism, sufficiency, and priority – concerns prevalent in the public sector. In these situations, unequal decisions are morally required to correct unfair disadvantages individuals face due to sensitive attributes.

The findings suggest that existing fairness metrics are of limited use in the public sector. To address this limitation, we discuss methods to incorporate egalitarianism, sufficiency, and priority into automated decision-making, thereby broadening the field of Fair ML from a sociological perspective.
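
The FEO requirement described above (individuals who differ only on sensitive attributes should receive equal decisions) can be sketched as a simple check. This is a deliberately simplified, individual-level reading and not one of the statistical fairness metrics the authors evaluate; the function name `feo_violations`, the attribute names, and the records are hypothetical.

```python
from collections import defaultdict


def feo_violations(records, sensitive_keys, decision_key="decision"):
    """Return groups of records that match on every non-sensitive attribute
    but did not all receive the same decision."""
    groups = defaultdict(list)
    for rec in records:
        # Profile = all attributes except the sensitive ones and the decision.
        profile = tuple(sorted(
            (key, value) for key, value in rec.items()
            if key not in sensitive_keys and key != decision_key
        ))
        groups[profile].append(rec)
    return [
        members for members in groups.values()
        if len({m[decision_key] for m in members}) > 1
    ]


# Hypothetical jobseeker records, for illustration only.
records = [
    {"sex": "f", "qualification": "high", "decision": "support"},
    {"sex": "m", "qualification": "high", "decision": "no_support"},
    {"sex": "f", "qualification": "low", "decision": "support"},
    {"sex": "m", "qualification": "low", "decision": "support"},
]

violations = feo_violations(records, sensitive_keys={"sex"})
print(len(violations))
```

Under the abstract's argument, a check of this kind is only appropriate when allocation is meant to track deservingness. An allocation that deliberately treats the two highly qualified jobseekers differently in order to offset a disadvantage would be flagged here even though it may be morally required.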



Insights and challenges in combining survey and Twitter data: An investigation of societal polarization in the COVID-19 debate in the German-speaking region

Beate Klösch1, Markus Reiter-Haas2, Markus Hadler1, Elisabeth Lex2

1Institut für Soziologie, Universität Graz; 2Institute of Interactive Systems and Data Science, Technische Universität Graz

Big data offers sociology an extension of classical social science methods. Because the research approach is new, shared data sources are lacking, and ethical hurdles exist, the combination of these different data types has so far received little attention in the literature (cf. Al Baghal et al. 2020). Our interdisciplinary research project addresses these challenges by combining sociological survey data with Twitter data. In summer 2020, we conducted an online survey on various socio-political topics in the DACH region, representative of internet users, in which we also collected private Twitter usernames and consent to the scientific processing of publicly shared tweets. The results show that Twitter use in the German-speaking region is low and that the majority of respondents do not consent to the analysis of their Twitter data. In total, 2560 people took part in the survey, of which only 79 Twitter accounts remained for analysis. If the focus is additionally on specific topics, the number of Twitter accounts available for analysis shrinks further, since not all users tweet about the topic in question. Furthermore, the different data types cannot be compared directly, because opinions on Twitter are expressed as unstructured text rather than structured responses. We meet this challenge with approximation methods (e.g., sentiment analysis), qualitative content analyses, and statistical indicators, since other common methods such as network analyses are not directly applicable here. Finally, questions arise about various biases and the limited generalizability of the results.
Using the polarization of public opinion on COVID-19 measures as an example, we present and discuss the advantages and difficulties of this interdisciplinary combination of data and methods.

Reference

Al Baghal, T., Sloan, L., Jessop, C., Williams, M. L., & Burnap, P. (2020). Linking Twitter and Survey Data: The Impact of Survey Mode and Demographics on Consent Rates Across Three UK Studies. Social Science Computer Review, 38(5), 517‐532.



Mining Everyday Texts for Sociologically Relevant Phenomena – Evidence from Online Market Exchanges

Wojtek Przepiorka, Ana Macanovic

Utrecht University, Netherlands

The emergence of big data introduced amounts of information that elude conventional sociological text analysis approaches. Addressing these new developments, sociologists have started to utilize capabilities of text mining in exploring narratives and discourses from texts in public spheres. However, less has been done to explore the potential of text mining methods in tracing the complexities of the world underlying communication of individuals in less formal, everyday settings. We use textual data from illegal online markets to explore motives and norms behind individuals’ writing of feedback on their market transactions. This feedback is essential in sustaining market reputation systems. To do so, we build on work exploring individual psychological states, opinions, meanings, values and motives from text. We evaluate the performance of three families of text mining approaches – dictionary, supervised and unsupervised machine learning methods – in replicating the work of trained human coders coding for complex phenomena of motives and norms in short texts. We introduce several methods that approach the success of human coders and address some challenges practitioners face in adapting methods from natural language processing and computer science to social scientific applications. Our research not only contributes to sociological understanding of the functioning of markets, but also showcases that text mining methods can help sociologists utilize streams of everyday, informal textual data written by “ordinary” individuals to draw conclusions on sociologically relevant phenomena.
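
As a rough illustration of the simplest of the three families named above, the sketch below applies a dictionary coder to short feedback texts and compares its output with human codes. Everything in it is assumed for illustration: the category labels, the word lists, the example texts, and the human codes are hypothetical, and plain agreement is used instead of whichever evaluation measures the authors report.

```python
import re

# Hypothetical dictionary: category label -> indicative terms.
MOTIVE_DICTIONARY = {
    "reciprocity": {"thanks", "thank", "appreciate", "grateful"},
    "warning": {"scam", "avoid", "beware", "fake"},
}


def code_text(text, dictionary):
    """Return the sorted list of categories whose terms occur in the text."""
    tokens = set(re.findall(r"[a-z']+", text.lower()))
    return sorted(label for label, terms in dictionary.items() if tokens & terms)


def agreement(predicted, human):
    """Share of texts where the automated codes equal the human codes."""
    matches = sum(1 for p, h in zip(predicted, human) if p == h)
    return matches / len(human)


# Hypothetical feedback texts and human codes.
feedback = [
    "Thanks, fast shipping, really appreciate it",
    "Total scam, avoid this seller",
    "Great seller, will buy again",
]
human_codes = [["reciprocity"], ["warning"], ["reciprocity"]]

predicted_codes = [code_text(text, MOTIVE_DICTIONARY) for text in feedback]
print(predicted_codes, agreement(predicted_codes, human_codes))
```

The third text shows the typical weakness of dictionary methods: a human coder may read a motive into wording that contains none of the listed terms, which is one reason to compare them against supervised and unsupervised alternatives.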



Spatio-Temporal Heterogeneous Information Networks in the study of transnational issue publics on social media

Wolf J. Schünemann1, Alexander Brand1, Tim König1, John Ziegler2

1Stiftung Universität Hildesheim, Germany; 2Universität Heidelberg, Germany

Sociological research has traditionally shown great interest in the transnationalisation of mediatized public spheres. Large-scale social media data opens new trajectories for transnationalisation research by providing access to multinational communication environments. However, this type of data poses new methodological challenges, as user-generated data is often non-standardized or incomplete. This is especially true for geolocated data, resulting in a need for new techniques to extract the relevant information and extrapolate geo-spatial information from heterogeneous data types. Furthermore, the procedural character of transnationalisation requires a longitudinal approach that can identify relevant events and processes over time. To address these problems, we propose Spatio-Temporal Heterogeneous Information Networks (HINs) as a novel research approach. By linking various entities (such as URLs, hashtags, topics, and named entities) in a time-varying network and applying multiple techniques to geolocate the different entities, we utilize those networks as a solution for the longitudinal observation of transnational issue publics on Twitter. To operationalize transnationalisation, we measure user references to entities located in a foreign country. On the one hand, this enables us to apply established clustering algorithms for heterogeneous networks and, on the other hand, to trace topic conjunctures via time series analysis methods. For our analysis, we use the GESIS TweetsCOV19 corpus with more than 3 million rehydrated tweets, which we annotated via the Geonames, Nominatim and ipinfo APIs to geolocate the different types of entities.
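
The operationalization mentioned above, counting user references to entities located in a foreign country, can be sketched as a per-user share. This is a minimal sketch under assumptions: the function name `foreign_reference_share`, the input layout (pairs of user and entity country), and the toy data are hypothetical, and the geolocation step itself is taken as already done.

```python
from collections import defaultdict


def foreign_reference_share(references, user_country):
    """Per user, the share of geolocated references that point to an entity
    outside the user's own country."""
    counts = defaultdict(lambda: [0, 0])  # user -> [foreign, total]
    for user, entity_country in references:
        home = user_country.get(user)
        if home is None or entity_country is None:
            continue  # skip references where either side lacks a location
        counts[user][1] += 1
        if entity_country != home:
            counts[user][0] += 1
    return {user: foreign / total for user, (foreign, total) in counts.items()}


# Hypothetical data: (user, country of the referenced entity).
references = [
    ("u1", "DE"), ("u1", "FR"), ("u1", "FR"),
    ("u2", "AT"), ("u2", None),
    ("u3", "IT"),
]
user_country = {"u1": "DE", "u2": "AT"}  # u3 could not be geolocated

shares = foreign_reference_share(references, user_country)
print(shares)
```

Computing the same share per time window instead of over the whole corpus would give the kind of series that time series methods could then be applied to.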



 
Event: DGS ÖGS Soziologiekongress 2021