A monthly overview of recent academic research about Wikipedia and other Wikimedia projects, also published as the Wikimedia Research Newsletter.
In his first book, Wikipedia, Work, and Capitalism. A Realm of Freedom?,[1] Arwid Lund, lecturer in the program of Information Studies (ALM: Archives, Libraries and Museums) at Uppsala Universitet, Sweden, investigates the ideologies that he believes are shared by participants in peer-production projects like Wikipedia. The author typologizes the ways that Wikipedians understand their activities, including “playing v. gaming” and “working v. labouring,” (113-115) to explore his hypothesis that “there is a link between how Wikipedians look upon their activities and how they look upon capitalism.” (117) Lund characterizes peer-production projects by their shared resistance to information capitalism (things like copyright and pay-walled publishing), which they see as limiting creativity and innovation. His thesis is provocative: he claims that the anti-corporatist ideologies intrinsic to peer production and to Wikipedia are unrealistic because capitalism always finds a way to monetize free content. Overall, the book touches on many issues not usually discussed within the Wikipedia community, and it might be a useful entry point for those who want to consider the social impacts of the project.
Lund uses a combination of social critique and qualitative interviews conducted in 2012 to provide supporting evidence for his thesis. One recurrent theme is that Wikipedia is part of a larger trend in gamification, a term from human–computer interaction (HCI) design for the process of using features associated with "play" to motivate interaction and engagement with an interface. One example he gives is that editors report that they find Wikipedia's competitive and confrontational elements to be game-like. (143-144) He also claims that Wikipedians' descriptions of their balance between work and play change as they take on more levels of responsibility and professionalism in the community, such as adminship. Still, it’s highly questionable whether the eight interviews, which mainly focus on the Swedish Wikipedia, are a sufficient sample to support generalizing his claims.
The culture of Wikipedia valorizes altruism in its embrace of volunteering for the project to produce information for the greater good. Lund argues that Wikipedians' belief in the altruistic aspect of the project makes it easy for them to depoliticize their work and to ignore how Wikipedia participates in the corporate information economy. To him, Wikipedia is symptomatic of the devaluation of digital work, whereas in past generations, making an encyclopedia might have been a source of income and employment opportunities for contributors.
So, he argues, contributors believe that peer production represents a space of increased autonomy, democracy, and creativity in the production of ideas. But in his view, attempts at a “counter-economy,” “hacker communism,” or “gift economies” (239, 303) are prone to manipulation, because we can’t create utopian bubbles within capitalism that aren’t subject to its influence. Still, peer production projects operate as if the creation of value outside of the capitalist system were possible. Lund argues that Wikipedia cannot avoid competition with proprietary companies, which see Wikipedia as a threat and have an interest in harvesting its content for their own benefit. (218) Yet the claim would be stronger with more examples. The reader is left wondering who these corporate interests are, and what exactly they derive from Wikipedia. Having this information would help us understand where Lund is coming from.
Although the word “work” in the title might suggest that Lund focuses on wage labour, the author’s aims are broader, and he uses the word to connote a variety of social, value-producing activities. (20) In particular, he means the production of “use-value,” the Marxist term for the productive social activity of creating things which are deemed useful and thus of value to be bought and sold in the market (even if producers don’t consider their work to be commodities). He draws from Marxist thinkers and semioticians, among them V.N. Volosinov, Terry Eagleton, and Louis Althusser, to unpack different approaches to describing why Wikipedians might feel like they are playing when they are really working. (107-108) Marxists call such assumptions “false consciousness,” but the concept is difficult because it requires us to analyze manifest and latent (discursive and non-discursive) awareness. It would have been useful for Lund to look at how the fields of anthropology or psychology talk about ideology, since both fields have extensively researched the topic. More stringent ethnographic or qualitative methods might also have made his argument more convincing. But, based on the references he provides, it seems that the book's target audience may be media theorists and social scientists, people who are already familiar with Marxist political economy.
Lund makes a compelling case that capitalism instrumentalizes freely-produced knowledge for its own monetary gains. Meanwhile, he says, Wikipedia's design and its heavily ideological agenda make it difficult for the community to address the issue. The book is an interesting contribution to ongoing conversations about how Wikipedia and projects motivated by copyleft principles can be defined from a social perspective.
A discussion paper titled "Economic Downturn and Volunteering: Do Economic Crises Affect Content Generation on Wikipedia?"[2] investigates how "drastically increased unemployment" affects contribution to and readership of Wikipedia. To study this question statistically, the authors (three economists from the Centre for European Economic Research (ZEW) in Mannheim, Germany) regarded the Great Recession that began in 2008 as an "exogeneous shock" that affected unemployment rates in different European countries differently and at different times. They relate these rates to five metrics for the language version of Wikipedia that corresponds to each country:
For each of these, the Wikimedia Foundation publishes monthly numbers. Since the researchers did not have access to country-level breakdowns of this data (which is not published for every country/language combination due to privacy reasons, except for some monthly or quarterly overviews which the authors may have overlooked, but only start in 2009 anyway), "to study the relationship of country level unemployment on an entire Wikipedia, we need to focus on countries which have an (ideally) unique language". This excluded some of the European countries that were most heavily affected by the 2008 crisis, e.g. the UK, Spain or Portugal, but still left them with 22 different language versions of Wikipedia to study.
An additional analysis focuses on district-level (Kreise) employment data from Germany and the German Wikipedia, respectively. None of the five metrics are available with that geographical resolution, so the authors resorted to the geolocation data for the (public) IP addresses of anonymous edits (which for several large German ISPs is usually more precise than in many other countries).
In both parts of the analysis, the economic data is related to the Wikipedia participation metrics using a relatively simple statistical approach (difference in differences), whose robustness is however vetted using various means. Still, since in some cases the comparison only included 9 months before and after the start of the crisis (instead of an entire year or several years), this leaves open the question of seasonality (e.g. it is well-known that Wikipedia pageviews are generally down in the summer, possibly due to factors like vacationing that might differ depending on the economic situation).
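To make the difference-in-differences idea concrete, here is a minimal sketch in Python. It is not the authors' actual model: the function name `diff_in_diff` and all of the numbers are hypothetical, chosen only to illustrate how the change in a more heavily affected group is compared against the change in a less affected group over the same period.

```python
from statistics import mean


def diff_in_diff(treated_before, treated_after, control_before, control_after):
    """Difference-in-differences estimate from four lists of observations.

    The estimate is the change in the treated group's mean minus the
    change in the control group's mean over the same period.
    """
    treated_change = mean(treated_after) - mean(treated_before)
    control_change = mean(control_after) - mean(control_before)
    return treated_change - control_change


# Hypothetical monthly counts of active editors (invented for illustration).
hard_hit_before = [100, 102, 98]   # country with a sharp rise in unemployment
hard_hit_after = [110, 112, 108]
mild_before = [200, 198, 202]      # country with little change in unemployment
mild_after = [204, 202, 206]

estimate = diff_in_diff(hard_hit_before, hard_hit_after, mild_before, mild_after)
print(estimate)
```

In this toy data the hard-hit group rises by 10 and the comparison group by 4, so the estimate attributes the remaining 6 to the shock. The seasonality concern applies directly: if the "before" and "after" windows cover different seasons and the two groups have different seasonal patterns, that difference ends up in the estimate.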
Summarizing their results, the authors write:
While leaving open the precise mechanism of these effects, the researchers speculate that "it seems that new editors begin to acquire new capabilities and devote their time to producing public goods. While we observe overall content growth, we could not find robust evidence for an increase in the number of new articles per day [...]. This suggests that the increased participation is focused on adding to the existing knowledge, rather than providing new topics or pages. Doing so requires less experience than creating new articles, which may be interpreted as a sign of learning by the new contributors."
The paper also includes an informative literature review summarizing interesting research results on unemployment, leisure time and volunteering in general. (For example, that "conditional on having Internet access, poorer people spend more time online than wealthy people as they have a lower opportunity cost of time." Also some gender-specific results that, combined with Wikipedia's well-known gender gap, might have suggested a negative effect of rising unemployment on editing activity: "Among men, working more hours is even positively correlated with participation in volunteering" and on the other hand "unemployment has a negative effect on men’s volunteering, which is not the case for women.")
It has long been observed how Wikipedia relies on the leisure time of educated people, in particular by Clay Shirky, who coined the term "cognitive surplus" for it, the title of his 2010 book. The present study provides important insights into a particular aspect of this (although the authors caution that economic crises do not uniformly increase spare time, e.g. "employed people may face larger pressure in their paid job", reducing their available time for editing Wikipedia). The paper might have benefited from including a look at the available demographic data about the life situations of Wikipedia editors (e.g. in the 2012 Wikipedia Editor survey, 60% of respondents were working full-time or part-time, and 39% were school or university students, with some overlap).
While human-created knowledge bases (KBs) such as Wikidata usually provide high-quality data (precision), it is generally hard to understand their completeness. A conference paper titled "Assessing the Completeness of Entities in Knowledge Bases"[3] proposes to assess the relative completeness of entities in knowledge bases by comparing the extent of their information with that of other similar entities. It outlines building blocks of this approach, and presents a prototypical implementation, which is available on Wikidata as Recoin (https://www.wikidata.org/wiki/User:Ls1g/Recoin).
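The idea of relative completeness can be sketched as follows. This is an illustrative assumption about how such a comparison might work, not Recoin's actual implementation: the function `missing_property_ranking`, the choice of peers, and the toy property sets are all hypothetical.

```python
from collections import Counter


def missing_property_ranking(entity_props, peer_props, top_n=3):
    """Rank properties the entity lacks by how common they are among peers.

    entity_props: set of property names present on the entity.
    peer_props: list of sets, one per similar entity.
    Returns up to top_n (property, share_of_peers) pairs, most common first.
    """
    counts = Counter(p for props in peer_props for p in set(props))
    n = len(peer_props)
    missing = [(p, c / n) for p, c in counts.items() if p not in entity_props]
    # Sort by descending share, then by name for a stable order.
    missing.sort(key=lambda item: (-item[1], item[0]))
    return missing[:top_n]


# Toy data (invented): properties on four similar entities and on one sparse entity.
peers = [
    {"date of birth", "occupation", "place of birth"},
    {"date of birth", "occupation"},
    {"date of birth", "award received"},
    {"date of birth", "occupation", "place of birth"},
]
entity = {"date of birth"}

ranking = missing_property_ranking(entity, peers)
print(ranking)
```

An entity that lacks properties most of its peers have would be judged relatively incomplete, and the ranked list doubles as a suggestion of what to add first.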
Information extraction (IE) from text has largely focused on relations between individual entities, such as who has won which award. However, some facts are never fully mentioned, and no IE method has perfect recall. Thus, it is beneficial to also tap contents about the cardinalities of these relations, for example, how many awards someone has won. This paper[4] introduces this novel problem of extracting cardinalities and discusses the specific challenges that set it apart from standard IE. It presents a distant supervision method using conditional random fields. A preliminary evaluation that compares information extracted from Wikipedia with that available on Wikidata shows a precision between 3% and 55%, depending on the difficulty of relations.
See the research events page on Meta-wiki for upcoming conferences and events, including submission deadlines.
Other recent publications that could not be covered in time for this issue include the items listed below. Contributions are always welcome for reviewing or summarizing newly published research.
Discuss this story
Nicely done. --Piotr Konieczny aka Prokonsul Piotrus| reply here 07:37, 23 June 2017 (UTC)
Even less prevalent examples of gamification on Wikipedia would be CitationHunt and The WikiData game; two suggested gamification elements would be an auto-congratulatory feature and edit counts of subject-area editors / WikiProject leaderboards.