2019:Research/Wikipedia graph mining dynamic structure of collective memory
This is an accepted submission for the Research space at Wikimania 2019.
Abstract
Wikipedia is the biggest encyclopedia ever created and the fifth most visited website in the world. Tens of millions of people surf it every day, seeking answers to various questions. Collective user activity on its pages leaves publicly available footprints of human behavior, making Wikipedia an excellent source for analysis of collective behavior.
If we think of web pages as neurons, Web networks resemble the brain: their interconnections have a complicated structure, and their nodes produce time series of activations (visits to web pages, analogous to spike trains of neurons). Here, we focus on the memory properties of the Wikipedia Web network and show in what way it is similar to human memory. We demonstrate that the clusters we extracted from the Wikipedia Web network using our algorithm comprise pages related to certain collective memories of humankind. A careful look at the structure of these clusters shows that they have an associative nature. This insight led us to another question, on which we focus in this talk: are these structures similar to artificial models of human memory?
In this talk, we present a new method to analyze and retrieve collective memories, that is, the way social groups remember and recall the past. We use the Hopfield network model as an artificial memory abstraction to build a macroscopic collective memory model. To reveal memory patterns, we analyze the dynamics of visitor activity on Wikipedia together with its Web network structure. Each pattern in the Hopfield network is a cluster of Wikipedia pages sharing a common topic and describing an event that triggered human curiosity during a finite period of time.
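For readers unfamiliar with the Hopfield model, the associative recall it provides can be illustrated with a minimal sketch. This is the textbook binary Hopfield network with Hebbian learning, not the authors' implementation; the function names (`train_hopfield`, `recall`), the eight-page toy network, and the two "event" patterns are all hypothetical. Each entry stands for a page that is either active (+1) or inactive (-1) during an event.

```python
def train_hopfield(patterns):
    """Build a weight matrix with the Hebbian rule (zero diagonal).

    Each pattern is a list of +1/-1 values, one per page (hypothetical encoding).
    """
    n = len(patterns[0])
    weights = [[0.0] * n for _ in range(n)]
    for p in patterns:
        for i in range(n):
            for j in range(n):
                if i != j:
                    weights[i][j] += p[i] * p[j] / n
    return weights


def recall(weights, cue, max_sweeps=10):
    """Asynchronously update units until the state stops changing."""
    state = list(cue)
    n = len(state)
    for _ in range(max_sweeps):
        changed = False
        for i in range(n):
            field = sum(weights[i][j] * state[j] for j in range(n))
            new_value = 1 if field >= 0 else -1
            if new_value != state[i]:
                state[i] = new_value
                changed = True
        if not changed:
            break
    return state


# Two toy "memories": which of eight hypothetical pages are active in each event.
event_a = [1, 1, 1, 1, -1, -1, -1, -1]
event_b = [1, -1, 1, -1, 1, -1, 1, -1]
weights = train_hopfield([event_a, event_b])

# A partial cue: event_a with one page's activity flipped.
noisy_cue = [1, 1, 1, -1, -1, -1, -1, -1]
restored = recall(weights, noisy_cue)
```

In this toy setting, `restored` equals `event_a`: the network completes the stored pattern from an incomplete cue, which is the associative behavior the abstract refers to.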
Authors
Volodymyr Miz, Benjamin Ricaud, Kirell Benzi, Nicolas Aspert, Pierre Vandergheynst (EPFL)
Relevance to Wikimedia Community
Large-scale collective behavior research based on Wikipedia viewership statistics.
Session type
22-min presentation.