WIKIHISTORIES 2024: WIKIPEDIA AND/AS DATA

ONLINE EDITION AVAILABLE UPON REQUEST

Cover image: Invasion Day Melbourne 2021, Matt Hrkac, Creative Commons Attribution 2.0

What is Wikipedia’s relationship to data? What should Wikipedia’s relationship to data be?

The 2024 wikihistories symposium is co-located with ICA Gold Coast and brought to you by the wikihistories project at the University of Technology Sydney in partnership with the Centre for Media Transition, the ARC Centre of Excellence in Automated Decision-Making and Society (ADM+S) and Wikimedia Australia.

Wikipedia has always been a critical source of data for computer science projects, offering data scientists a massive store of open data. Researchers and developers use Wikipedia to work on natural language processing (NLP) tasks and applications, model user interactions with content and other users, deliver factual statements to users in automated question-answering tasks, and find nearby features as represented by Wikipedia articles (Iliadis, 2022; Iliadis & Ford, 2023).

These practitioners use Wikipedia as a store of facts, assuming that it expresses an established consensus as a result of its policies and processes. Yet Wikipedia’s natural language could contain meanings that resist translation into data and whose classifications might be open to interpretation and critique (Ford & Iliadis, 2023). For example, articles about complex topics such as Jerusalem do not easily align with standard ways of representing entities like cities. Jerusalem’s infobox reflects Wikipedia’s power to make important decisions about how we understand facts and the meanings that are associated with them (Ford & Graham, 2016). This power is intensified when entire Wikipedia articles are translated into structured, datafied knowledge bases of machine-readable statements – by the Wikidata project, for example, which started in 2012 as a project of the Wikimedia Foundation (Ford, 2020).

How researchers measure Wikipedia’s sociocultural biases also depends on the datafication of Wikipedia’s content and how such processes may be questioned rather than taken for granted. Measuring the extent to which Wikipedia represents Australians, for example, could simply be achieved by counting articles that are categorised in the “Australians” data category, and yet this category itself is not an objective representation of Australianness but rather the result of particular practices that resist stable referents (Falk et al., 2023).

As Wikipedia’s content is increasingly used to power virtual assistants such as Amazon Alexa and more recently large language model applications like ChatGPT and Google’s Bard, Wikipedia participates in the global information ecosystem in ways that go well beyond its role as a web-based encyclopaedia (McDowell & Vetter, 2023). Thus, it is important to understand Wikipedia’s relationship to data, not as a given, but as something to be critically investigated.

This symposium will bring together social scientists, humanists, critical technologists, and others to investigate Wikipedia’s connection to data and the importance of this relationship for the global information ecosystem and the production of knowledge. It will be organised as a day-long, face-to-face event prior to the annual International Communication Association (ICA) conference on the Gold Coast in Australia.

To participate, please complete the web form and include a 250-300 word abstract for your presentation that addresses the symposium themes.

Lead curator and contact: Heather Ford