A monthly overview of recent academic research about Wikipedia and other Wikimedia projects, also published as the Wikimedia Research Newsletter.
Figure: Example award (author's translation, from the paper)
In a large-scale randomized experiment on the German Wikipedia,[1] new editors who were presented with a barnstar-like award on their user talk page were 20% more likely to remain active during the following month. This statistically significant increase in the number of users who came back to contribute persisted for a full year (four quarters). The effect also appeared when only article (mainspace) edits were considered.
The "Edelweiss mit Stern" (Edelweiss with Star) was awarded in a monthly process. All users who had just made their first article edit and at least one other edit, with at least five days between their first and last edit, were considered initially eligible for the award. This was followed by a semi-automated screening process, "developed in consultation with experienced community members", to remove, for example, blocked users, corporate accounts and "advertisers". Apart from this, the award (at its lowest level) was not based on an assessment of the quality of the user's contributions. Its "description does not contain any explicit performance criteria for getting the award, other than that the editors have made their first contributions to the German language Wikipedia in the previous month; it is mentioned that there were more than 4,000 newcomers as potential candidates in a given month." The award was handed out by the author, using a role account, to around 150 users per month. She notes that
The screening process seems to have been reasonably effective in weeding out bad-faith contributors, with only 2% of the awarded users and 3% of the control group having been blocked after more than two years.
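The eligibility rule and screening step summarized above can be illustrated with a minimal sketch. It is not the author's actual procedure: the `Newcomer` fields, the single `flagged` marker standing in for the semi-automated screening, and the seeded shuffle used to split users into awarded and control groups are all assumptions made for illustration. The check that the first edit fell in the previous month is omitted for brevity.

```python
import random
from dataclasses import dataclass
from datetime import date, timedelta


@dataclass
class Newcomer:
    # Hypothetical fields; the paper's real data schema is not described here.
    name: str
    first_edit: date
    last_edit: date
    article_edits: int
    total_edits: int
    flagged: bool = False  # stands in for blocked users, corporate accounts, "advertisers"


def is_eligible(user: Newcomer) -> bool:
    """Apply the eligibility rule as summarized in the text, then the screening flag."""
    has_article_edit = user.article_edits >= 1
    has_other_edit = user.total_edits >= 2  # the first article edit plus at least one more
    spread_out = user.last_edit - user.first_edit >= timedelta(days=5)
    return has_article_edit and has_other_edit and spread_out and not user.flagged


def assign_groups(users, n_awards, seed=0):
    """Randomly split eligible users into an awarded group and a control group.

    The shuffle-based split is an assumption; the text only says the
    experiment was randomized and that a control group existed.
    """
    eligible = [u for u in users if is_eligible(u)]
    rng = random.Random(seed)
    rng.shuffle(eligible)
    return eligible[:n_awards], eligible[n_awards:]
```

With roughly 150 awards per month, a call such as `assign_groups(newcomers, n_awards=150)` would return the awarded users and the remaining eligible users as the comparison group.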
The paper also emphasizes that close coordination with the editor community, and the attachment to a thematic portal (Portal Switzerland, similar to a WikiProject on the English Wikipedia) were important to the award's success:
In contrast, a team of Carnegie Mellon researchers recently withdrew a similar research project proposal on the English Wikipedia due to community opposition. See previous coverage from The Signpost.
See also our earlier coverage of related research: "A Preliminary Study on the Effects of Barnstars on Wikipedia Editing" and "Recognition may sustain user participation".
See the research events page on Meta-wiki for upcoming conferences and events, including submission deadlines, and the page of the monthly Wikimedia Research Showcase for videos and slides of past presentations.
Other recent publications that could not be covered in time for this issue include the items listed below. Contributions, whether reviewing or summarizing newly published research, are always welcome.
"... we draw on the theory of social representation to build an analytical tool, WikiGen ["Wikipedia Genealogy Generator", available at http://wikigen.org/ ], and develop a methodology for examining the evolution of collective knowledge on Wikipedia. We demonstrate the usefulness of the tool and methodology by applying it to an illustrative case study, the Wikipedia article on cloud computing." (from the abstract[2])
"The surprisingly high reliability of Wikipedia has often been seen as a beneficial effect of the aggregation of diverse contributors, or as an instance of the wisdom of crowds phenomenon; additional factors such as elite contributors, Wikipedia's policy or its administration have also been mentioned. We adjudicate between such explanations by modelling and simulating the evolution of a Wikipedia entry. The main threat to Wikipedia's reliability, namely the presence of epistemically disruptive agents such as disinformers and trolls, turns out to be offset only by a combination of factors: Wikipedia's administration and the possibility to instantly revert entries, both of which are insufficient when considered in isolation." (from the abstract[3])
"We combine motivational data from two surveys of Wikipedia newcomers with data of two periods of editing activity. We find that persistence in editing is related to fun, while the amount of editing is not: individuals who persist in editing are characterized by higher fun motives early on (when compared to dropouts), though their motives are not related to the number of edits made. Moreover, we found that newcomers' experience of fun was reinforced by their amount of activity over time: editors who were initially motivated by fun entered a virtuous cycle, whereas those who initially had low fun motives entered a vicious cycle." (from the abstract[4])
See also earlier coverage of a related paper by some of the same authors: "Emergent Role Behaviours in Wikipedia – The 'How' and 'Why'".
"... we study the applicability of a leading technology as deep learning to the problem of vandalism detection. The first set is obtained by expanding a list of vandal terms taking advantage of the existing semantic-similarity relations in word embeddings and deep neural networks. Deep learning techniques are applied to the second set of features [...]. The last set uses graph-based ranking algorithms to generate a list of vandal terms from a vandalism corpus extracted from Wikipedia. These three sets of new features are evaluated separately as well as together to study their complementarity, improving the results in the state of the art." (from the abstract[5])
"...we model whether a new user will become an established member of the community based on their initial activity. ... we are primarily interested in determining positive and negative impacts to new user retention." (From the abstract[6])
"The Wikiscanner tool, which traced the origin of edits on Wikipedia, stirred media scandals throughout the world. Relying on a 'trace ethnography' method, following the discussion on Wikipedia articles, this article deals with the Japanese edition reaction to the scandals. I argue that this reaction represents a unique form of online publicity that facilitates anonymous normative discussion. In addition [...], the article contends that Wikipedia enables a rare model of anonymous public debate which bridges earlier Japanese conceptions of anonymity and publicity." (from the abstract[7])
"... the problem of automatically assessing the quality of Wikipedia articles is considered. In particular, the focus is on the analysis of hand-crafted features that can be employed by supervised machine learning techniques to perform the classification of Wikipedia articles on qualitative bases. [... This approach] produced encouraging results with respect to the considered features." (from the abstract[8])
"we explore challenges in compiling a pedagogic resource like a textbook on a given topic from relevant Wikipedia articles, and present an approach towards assisting humans in this task. We present an algorithm that attempts to suggest the textbook structure from Wikipedia based on a set of seed concepts (chapters) provided by the user. We also conceptualize a decision support system where users can interact with the proposed structure and the corresponding Wikipedia content to improve its pedagogic value. The proposed algorithm is implemented and evaluated against the outline of online textbooks on five different subjects. We also propose a measure to quantify the pedagogic value of the suggested textbook structure." (from the abstract[9])
"... we propose relational event models to analyze dynamic network effects explaining the allocation of contributor attention to Wikipedia articles about migration-related topics. Among others, we test for the presence of a rich-get-richer effect in which articles edited by many users are likely to receive even more contributions in the future and uncover which users start working on less popular articles. We further analyze local clustering effects in which pairs of users tend to repeatedly collaborate on the same articles ..." (from the abstract[10])