The Signpost

Recent research

"Ignore all rules" in deletions; anonymity and groupthink; how readers react when shown talk pages

A monthly overview of recent academic research about Wikipedia and other Wikimedia projects, also published as the Wikimedia Research Newsletter.

Wikipedia's "Ignore all rules" policy (IAR) is a double-edged sword in deletion arguments

A beetle larva ignoring the rules while negotiating deletion with the frog.[mediasource 1]

A paper presented at last month's CSCW Conference, titled "Keeping eyes on the prize: officially sanctioned rule breaking in mass collaboration systems",[1] observes that "Mass collaboration systems are often characterized as unstructured organizations lacking rule and order", yet Wikipedia has a well-developed body of policies to support it as an organization. Rule breaking in bureaucracies is a slippery slope that can quickly lead to potentially dangerous exceptions, so Wikipedia has a mechanism called "Ignore all rules" (WP:IAR) for officially sanctioned rule breaking. The researchers examined IAR's impact in deletion discussions (Articles for deletion, AfD). Their results show that the IAR policy meaningfully influences deliberation outcomes and, rather than wreaking havoc, provides a positive, functional governance mechanism.

This paper is another welcome addition to the growing literature on AfD, examining the effectiveness of rule breaking using WP:IAR within these discussions. It starts with an in-depth examination of rule breaking within collaborative environments, then postulates six hypotheses:

  1. Invoking WP:IAR in support of a vote correlates with an increased likelihood that the vote will be on the winning side.
  2. This effect is expected to increase with the number of policies cited in the deletion proposal (since they may be contradicting each other).
  3. Invoking IAR to override the deletion proposal’s policy citation tends to reduce the proposal’s likelihood of success.
  4. When IAR is used together with another policy domain (e.g. Content/Conduct/Legal) as the proposal’s rationale, it will negate the proposal’s success.
  5. Increased dissonance between policies arising in the discussion will increase the chance that the IAR argument will be successful.
  6. IAR will increase in effectiveness as the policies invoked increase in complexity.

To test these, the researchers scoured AfD discussions from April 2006 to October 2008, collecting those where WP:IAR had been invoked. These were then supplemented with a control group drawn at random from non-IAR AfD discussions from the same period. The resulting dataset contained 555 AfD discussions. These were coded for outcome (keep/delete), IAR usage in keep or delete votes, policy match and category match. Each hypothesis and the control were fitted to a linear regression model. The results were as follows:

H1 was supported only in cases where IAR is used in a keep vote; as a delete argument, it showed no significant impact. H2, H3 and H4 look for conditions in which IAR's impact on the ultimate decision would be strengthened. H2 was supported only marginally; H3 was not supported; H4 was not supported, and the results actually indicated that when a keep voter invokes IAR together with another policy, this only increases the chance of a delete outcome! H5 and H6 consider whether IAR fares better when pitted against increasingly contradictory or complicated policies, and both of these are supported. Overall, the authors conclude that IAR plays a significant role in Wikipedia's policies, and recommend its use to other communities. They point out that IAR is also an indicator of where policy is weak in addressing the community's needs.
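As a rough illustration of the coding-and-regression setup described above, the following sketch fits a one-predictor least-squares model to a handful of made-up AfD records. The records, the field names (`iar_in_keep_vote`, `kept`) and the helper `fit_simple_ols` are hypothetical and are not taken from the paper, whose actual models involve more variables.

```python
from statistics import mean

# Hypothetical coded AfD discussions, invented purely for illustration.
# iar_in_keep_vote = 1 if a keep voter invoked WP:IAR; kept = 1 if the outcome was keep.
discussions = [
    {"iar_in_keep_vote": 1, "kept": 1},
    {"iar_in_keep_vote": 1, "kept": 1},
    {"iar_in_keep_vote": 1, "kept": 0},
    {"iar_in_keep_vote": 1, "kept": 1},
    {"iar_in_keep_vote": 0, "kept": 0},
    {"iar_in_keep_vote": 0, "kept": 1},
    {"iar_in_keep_vote": 0, "kept": 0},
    {"iar_in_keep_vote": 0, "kept": 0},
]


def fit_simple_ols(xs, ys):
    """Return (intercept, slope) of the least-squares line y = a + b * x."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return y_bar - slope * x_bar, slope


xs = [d["iar_in_keep_vote"] for d in discussions]
ys = [d["kept"] for d in discussions]
intercept, slope = fit_simple_ols(xs, ys)

# With a single 0/1 predictor, the intercept is the keep rate of the control
# group and the slope is the difference in keep rate for the IAR group.
print(f"control keep rate: {intercept:.2f}, change with IAR: {slope:+.2f}")
```

With a binary predictor the fitted slope is simply the difference between the two groups' keep rates, which is why a control group of non-IAR discussions is needed for the comparison.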

Activity of content translators on Wikipedia examined

The Japanese phrase “itterashai”, uttered by a budgerigar.[mediasource 2]
This image of the flatworm Pseudorhabdosynochus morrhua currently has descriptions in 29 languages, i.e. 10% of Wikipedia languages.

Another CSCW paper, titled "'Could someone please translate this?': activity analysis of Wikipedia article translation by non-experts",[2] analyzes the work of a volunteer translator of Wikipedia articles. It goes into great detail, breaking down the big translation task into many sub-activities, such as looking up complicated words in the source language, choosing the right translation, using editing software, etc. It presents all the activities according to the activity theory methodology. Though there are other papers that deal with translation of Wikipedia content, it is the first paper to examine the actual volunteer translator's activity.

Interestingly, this paper notes the importance of the Simple English Wikipedia several times, as a tool that may help people translate the content, with the assumption that the language of the main English Wikipedia may frequently be complex and challenging (this assumption is based on another paper, which compared the English and Simple English Wikipedias). It relies on the Simple English Wikipedia a bit too much, though; for example, it cites its main page as a source for some statistics, which would better be obtained directly from stats.wikimedia.org, Wikimedia's main statistics site.

It has some shortcomings, which should be addressed in future works on the subject:

Despite these shortcomings, this paper is valuable for several reasons:

Finally, the article promises further research and suggestions about building tools for translator support, which would be very interesting to read.

Comparison of collaborative editing in OpenStreetMap and Wikipedia

A preprint titled "Has OpenStreetMap a role in Digital Earth Applications?"[3] studies OpenStreetMap (OSM), the wiki-based collaboratively editable map, as a predominant example of Volunteered Geographical Information projects. The paper addresses two main research questions: 1) how successful is the OSM project in providing spatial data, and to what extent can it be compared to Wikipedia in this sense; and 2) what are the main characteristics of OSM stemming from its crowd-sourced nature? The paper gives a comprehensive overview of OSM's workflow, reviews the main characteristics of its collaborative mapping process, and tries to compare these characteristics with those of Wikipedia. In contrast with Wikipedia, the administrative structure of OSM is not well defined or well known within its community of editors. However, both platforms show the same Zipfian characteristics among their editors: a few editors are responsible for large numbers of contributions, and many editors have only a few contributions. Although the criteria are quite different on the two platforms, the paper finds that the relative share of OSM Featured Objects is evidently larger than that of Wikipedia's Featured Articles. In the conclusion, the authors express that they "believe that OSM will continue its growth for the foreseeable future". However, the route to this conclusion is not very well described in the manuscript.
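To make the "Zipfian characteristics" concrete, here is a minimal sketch of two common checks on per-editor contribution counts: the share of contributions made by the most active editors, and the slope of the rank-frequency line on a log-log scale. The function names and the synthetic counts are assumptions for illustration; they are not data or methods from the preprint.

```python
import math


def top_share(edit_counts, fraction):
    """Share of all contributions made by the most active `fraction` of editors."""
    ranked = sorted(edit_counts, reverse=True)
    k = max(1, round(len(ranked) * fraction))
    return sum(ranked[:k]) / sum(ranked)


def rank_frequency_slope(edit_counts):
    """Least-squares slope of log(count) against log(rank).

    A slope near -1 is what an idealised Zipf distribution would give.
    All counts must be positive.
    """
    ranked = sorted(edit_counts, reverse=True)
    xs = [math.log(rank) for rank in range(1, len(ranked) + 1)]
    ys = [math.log(count) for count in ranked]
    x_bar = sum(xs) / len(xs)
    y_bar = sum(ys) / len(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    return sxy / sxx


# Synthetic, idealised counts (count = 1000 / rank) for 100 hypothetical editors.
counts = [1000 / rank for rank in range(1, 101)]

share = top_share(counts, 0.10)
slope = rank_frequency_slope(counts)
print(f"top 10% of editors: {share:.1%} of contributions; log-log slope: {slope:.2f}")
```

Real contribution data would only approximate a straight line on the log-log plot, so the slope is better read as a descriptive summary than as proof of a particular distribution.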

Wikipedia's coverage of breaking news stories is still a fertile field of research

The Chelyabinsk meteor on February 15 did not just leave its traces in the Ural region: Wikipedia entries on the event had been started in 29 languages by the end of that day. Today, there are 44.

In "MJ no more: Using Concurrent Wikipedia Edit Spikes with Social Network Plausibility Checks for Breaking News Detection"[4] by Thomas Steiner, Seth van Hooland and Ed Summers, the controversial (per WP:Recentism and WP:RS) field of breaking news articles is investigated. Motivated by the overloading of Wikipedia when the news of Michael Jackson's death broke, Steiner created an open source exploratory tool called the Wikipedia Live Monitor. The tool allowed his team to examine clusters of related activity, based on edit spikes within a five-minute window, across multiple streams: Wikipedia's recent changes, Twitter feeds, Google+ and Facebook. The main research question posed is: are edit spikes in Wikipedia, clustered with related social network activity, useful indicators for identifying breaking news events, and with what delay? By considering activity along multiple streams, the researchers are able to cross-check the plausibility of information being disseminated by many less reliable sources.
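The idea of an edit spike within a five-minute window can be sketched as a sliding-window counter over a time-ordered stream of edits. This is only an illustration, not the Wikipedia Live Monitor's actual algorithm: the thresholds `MIN_EDITS` and `MIN_EDITORS`, the event format and the sample events are all assumptions.

```python
from collections import defaultdict, deque

WINDOW_SECONDS = 5 * 60  # the five-minute window mentioned in the text
MIN_EDITS = 3            # assumed threshold, not taken from the paper
MIN_EDITORS = 2          # assumed threshold, not taken from the paper


def detect_spikes(events, window=WINDOW_SECONDS,
                  min_edits=MIN_EDITS, min_editors=MIN_EDITORS):
    """Return (timestamp, article) pairs at which an article is in a spike.

    `events` is an iterable of (timestamp_seconds, article, editor) tuples,
    sorted by timestamp, e.g. as read from a recent-changes feed.
    """
    recent = defaultdict(deque)  # article -> deque of (timestamp, editor)
    spikes = []
    for ts, article, editor in events:
        queue = recent[article]
        queue.append((ts, editor))
        # Drop edits that have fallen out of the sliding window.
        while queue and ts - queue[0][0] > window:
            queue.popleft()
        distinct_editors = {name for _, name in queue}
        if len(queue) >= min_edits and len(distinct_editors) >= min_editors:
            spikes.append((ts, article))
    return spikes


# Hypothetical event stream for illustration.
events = [
    (0, "Article A", "u1"),
    (60, "Article A", "u2"),
    (120, "Article B", "u3"),
    (200, "Article A", "u3"),   # third edit to Article A within five minutes
    (900, "Article B", "u3"),
    (1000, "Article A", "u1"),  # earlier edits to Article A have expired
]
spikes = detect_spikes(events)
print(spikes)
```

Requiring several distinct editors is one simple way to avoid flagging a single person making many small edits; cross-checking against social network activity, as the paper proposes, would be a separate step on top of this.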

Their approach is based on prior work by S. Petrović, M. Osborne, and V. Lavrenko in "Streaming First Story Detection with Application to Twitter", who used the document vector space model from classic information retrieval to cluster Twitter feeds. In this case, however, the researchers cluster multiple streams, which can potentially hold far more information when a story breaks and can therefore reveal it very quickly. While they were able to locate breaking news, more work may be needed to optimize the timing parameters of the algorithm. Further research is planned into automating the classification of edits, which could reform future use of non-reliable sources.
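For readers unfamiliar with the vector space model, the sketch below shows first story detection in its simplest form: each document becomes a bag-of-words vector, and a document counts as a "first story" if no earlier document is sufficiently similar by cosine similarity. The brute-force comparison, the `threshold` value and the sample texts are assumptions for illustration, not the cited authors' implementation.

```python
import math
from collections import Counter


def vectorize(text):
    """Bag-of-words term-frequency vector (lowercased, whitespace-tokenised)."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse term-frequency vectors."""
    dot = sum(a[term] * b[term] for term in a if term in b)
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0


def first_stories(documents, threshold=0.5):
    """Return indices of documents whose nearest earlier document is below `threshold`."""
    seen, firsts = [], []
    for index, text in enumerate(documents):
        vector = vectorize(text)
        nearest = max((cosine(vector, old) for old in seen), default=0.0)
        if nearest < threshold:
            firsts.append(index)
        seen.append(vector)
    return firsts


# Made-up short texts standing in for a stream of posts or edit summaries.
documents = [
    "meteor explodes over chelyabinsk russia",
    "huge meteor explodes over russia",
    "budgerigar learns japanese phrase",
    "meteor over chelyabinsk injures hundreds",
]
new_story_indices = first_stories(documents)
print(new_story_indices)
```

Comparing every new document with all earlier ones does not scale to a live stream, so a practical system would need some form of approximate nearest-neighbour search.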

A WikiSym 2012 paper titled "Staying in the Loop: Structure and Dynamics of Wikipedia’s Breaking News Collaborations"[5] looked at the trajectory of article construction, which captures the collaboration structure embedded in the creation of breaking news stories. The authors show that these stories, fueled by mass media and social networks, tend to create a social melting pot around the editing of these events. A social network analysis of the relations between editors of breaking news stories located editors in diverse social roles, such as creators, early contributors, the highly centralized activity coordinators (admins), and the marginal vandals and their tireless opponents, the spam-fighting bots and recent changes patrollers. Another result is that most articles (those which are not breaking news stories) lack the dense creation trajectories found in breaking news stories.

Exposing talk page discussions leads to drop in perceived article quality

As once observed by Ward Cunningham, one important feature by which Wikipedians improved his invention, the wiki, was to introduce "a talk page or a discussion page behind every page, so you don't actually have to see the discussion and it makes a much more finished product". Yet surfacing this deliberation could engender trust in the process if the deliberation process appears fair, well-reasoned, and thorough. Alternatively, it could encourage doubts about content quality, especially if the process appears messy or biased. In a CSCW '13 paper titled "Your process is showing: controversy management and perceived quality in wikipedia",[6] the researchers report on an experiment in which they found that exposing discussions generally led to a drop in the perceived quality of the related article, especially if the discussion revealed conflict.

Motivated by how university students learn to assess the reliability of controversial articles, such as those on Supreme Court decisions or on individuals like Pope Pius XII and Yasser Arafat, the researchers considered how beneficial it would be to reveal the process of article creation. In wikis, the discussions used to produce the articles are hidden from view on talk pages and other coordination spaces. The expectation was that deliberations which appear fair, well-reasoned, and thorough should engender trust in the reader, and that a process which appears biased or chaotic should diminish confidence in the article's quality. The paper outlines the issues involved in assessing the credibility of online information sources. It first considers prior work on article quality but reframes the issues based on an idea presented in the recent best seller Thinking, Fast and Slow by economics Nobel laureate Daniel Kahneman. The research questions posed are:

These questions are then interpreted using Kahneman's System 1 (fast, associative thinking) and System 2 (slow, deliberative thinking). The questions were investigated in an experiment run on Amazon's Mechanical Turk, a crowdsourcing platform allowing micropayments. Beginning with 3500 controversial articles, the researchers selected featured articles and discarded newsworthy items, leaving only 50 articles. Elite Turkers were then shown ten brief vignettes illustrating talk page discussion about a selected controversy, meant to display one of ten forms of editor coordination or conflict activities. They then had to answer a questionnaire and complete two reading comprehension tasks. The researchers found that exposing Wikipedia readers to such discussions with any type of conflict generally led to a drop in the perceived quality of the related article. They point out that the magnitude of the reader's negative perception depends on the type of the editors’ interaction. Finally, they note that while participants may have suffered a confidence crisis with respect to specific articles, at the same time they gained respect for Wikipedia in general. A final conclusion is that while the experiment, especially the comprehension task, was designed to engage readers in System 1 thinking, watching the discussions may well have triggered a System 2 critical response.

In brief

References

  1. ^ Keeping eyes on the prize: officially sanctioned rule breaking in mass collaboration systems. https://dl.acm.org/citation.cfm?doid=2441776.2441898 Closed access icon
  2. ^ "Could someone please translate this?": activity analysis of Wikipedia article translation by non-experts https://dl.acm.org/citation.cfm?id=2441883 Closed access icon
  3. ^ Peter Mooney and Padraig Corcoran: Has OpenStreetMap a role in Digital Earth Applications? http://www.cs.nuim.ie/~pmooney/websitePapers/V3_IJDE_2012_MooneyCorcoran-CORRECTED_1.pdf Open access icon
  4. ^ Thomas Steiner, Seth van Hooland and Ed Summers: Using Concurrent Wikipedia Edit Spikes with Social Network Plausibility Checks for Breaking News Detection http://www.lsi.upc.edu/~tsteiner/papers/2013/mj-no-more-using-concurrent-wikipedia-edit-spikes-with-social-network-plausibility-checks-for-breaking-news-detection-ramss2013.pdf Open access icon
  5. ^ Brian Keegan, Darren Gergle and Noshir Contractor: Staying in the Loop: Structure and Dynamics of Wikipedia’s Breaking News Collaborations http://wikisym.org/ws2012/bin/download/Main/Program/p4wikisym2012.pdf [dead link]
  6. ^ http://dx.doi.org/10.1145/2441776.2441896 Your process is showing: controversy management and perceived quality in wikipedia Closed access icon
  7. ^ R. Stuart Geiger, Aaron Halfaker: Using Edit Sessions to Measure Participation in Wikipedia http://www-users.cs.umn.edu/~halfak/publications/Using_Edit_Sessions_to_Measure_Participation_in_Wikipedia/geiger13using-preprint.pdf Open access icon
  8. ^ http://www.uld-conference.org/paper.php?p=328&l=en Between Wictionary [sic] and a Thesaurus : Some Dilemmata of a Sign Language Dictionary Closed access icon
  9. ^ Jaehun Joo, Ismatilla Normatov: Determinants of collective intelligence quality: comparison between Wiki and Q&A services in English and Korean users. Service Business, February 2013 PDF Closed access icon
  10. ^ Garry R. Thomas, Lawson Eng, Jacob F. de Wolff, and Samir C. Grover: An Evaluation of Wikipedia as a Resource for Patient Education in Nephrology. Seminars in Dialysis—Vol 26, No 2 (March–April) 2013 pp. 159–163. DOI: 10.1111/sdi.1 http://onlinelibrary.wiley.com/doi/10.1111/sdi.12059/abstract Closed access icon
  11. ^ Al Khatib, Khalid; Hinrich Schutze; Cathleen Kantner (December 2012). "Automatic Detection of Point of View Differences in Wikipedia" (PDF). Proceedings of COLING 2012. Retrieved 25 March 2013.
  12. ^ Townsend, Stephen; Gary Osmond; Murray G. Philips (2013). "Wicked Wikipedia? Communities of Practice, the Production of Knowledge and Australian Sport History". The International Journal of the History of Sport. 30 (5): 545. doi:10.1080/09523367.2013.767239. S2CID 145732434. Closed access icon
  13. ^ Scott A. McGreal: The Misunderstood Personality Profile of Wikipedia Members. Contrary to prior claims, Wikipedians are hardly "grumpy and close-minded" March 11, 2013 http://www.psychologytoday.com/blog/unique-everybody-else/201303/the-misunderstood-personality-profile-wikipedia-members
  14. ^ Yair Amichai–Hamburger, Naama Lamdan, Rinat Madiel, and Tsahi Hayat. CyberPsychology & Behavior. December 2008, 11(6): 679-681 http://dx.doi.org/10.1089/cpb.2007.0225
  15. ^ Michail Tsikerdekis: The effects of perceived anonymity and anonymity states on conformity and groupthink in online communities: A Wikipedia study. Journal of the American Society for Information Science and Technology DOI:10.1002/asi.22795 Closed access icon Preprint online at http://tsikerdekis.wuwcorp.com/10.1002-asi.22795 Open access icon
  16. ^ http://www.10yetis.co.uk/global-journalist-research.html Closed access icon
  17. ^ Michael Szajewski: Using Wikipedia to Enhance the Visibility of Digitized Archival Assets. D-Lib Magazine http://www.dlib.org/dlib/march13/szajewski/03szajewski.html Open access icon
  18. ^ Europeana Impressions: Pinterest, Facebook & Wikipedia http://pro.europeana.eu/web/guest/pro-blog/-/blogs/1600355 Open access icon
  19. ^ Timothy Allan Brunet: Accommodating the Wikipedia Project in Higher Education: A University of Windsor Case Study. http://scholar.uwindsor.ca/etd/504 Open access icon
  20. ^ Cheryl Lillian Moy: Investigation of Disassembling Polymers and Molecular Dynamics Simulations in Molecular Gelation, and Implementation of a Class-Project Centered on Editing Wikipedia http://deepblue.lib.umich.edu/handle/2027.42/96104 Open access icon
  21. ^ Erik Zachte: Monthly edits on Wikimedia wikis still on the rise. March 9, 2013 Open access icon
  22. ^ http://www.zerogeography.net/2013/03/who-edits-wikipedia-map-of-edits-to.html Open access icon
  23. ^ Graham, M. 2013. Geographies of Information in Africa: Wikipedia and User-Generated Content. In R-Link: Rwanda’s Official ICT Magazine. Kigali: Rwanda ICT Chamber 40-41. PDF Open access icon
  24. ^ http://www.zerogeography.net/2013/03/what-percentage-of-edits-to-english.html Open access icon
  25. ^ Bauer, Travis L.; Rich Colbaugh; Kristin Glass; David Schnizlein (January 2013). "Use of Transfer Entropy to Infer Relationships from Behavior" (PDF). Retrieved 25 March 2013. Open access icon
  26. ^ Sameer Singh, Amarnag Subramanya, Fernando Pereira, and Andrew McCallum. Wikilinks: A Large-scale Cross-Document Coreference Corpus Labeled via Links to Wikipedia. Technical Report, Department of Computer Science, University of Massachusetts, Amherst. UMASS-CS-2012-015, October, 2012 https://web.cs.umass.edu/publication/docs/2012/UM-CS-2012-015.pdf Open access icon
Image sources
  1. ^ Wizen, G.; Gasith, A. (2011). Rands, Sean A (ed.). "An Unprecedented Role Reversal: Ground Beetle Larvae (Coleoptera: Carabidae) Lure Amphibians and Prey upon Them". PLOS ONE. 6 (9): e25161. Bibcode:2011PLoSO...625161W. doi:10.1371/journal.pone.0025161. PMC 3177849. PMID 21957480. Open access icon
  2. ^ Eda-Fujiwara, H.; Imagawa, T.; Matsushita, M.; Matsuda, Y.; Takeuchi, H. A.; Satoh, R.; Watanabe, A.; Zandbergen, M. A.; et al. (2012). Hausberger, Martine (ed.). "Localized Brain Activation Related to the Strength of Auditory Learning in a Parrot". PLOS ONE. 7 (6): e38803. Bibcode:2012PLoSO...738803E. doi:10.1371/journal.pone.0038803. PMC 3372503. PMID 22701714. Open access icon
  3. ^ Chi, L.; Saarela, U.; Railo, A.; Prunskaite-Hyyryläinen, R.; Skovorodkin, I.; Anthony, S.; Katsu, K.; Liu, Y.; et al. (2011). Samakovlis, Christos (ed.). "A Secreted BMP Antagonist, Cer1, Fine Tunes the Spatial Organization of the Ureteric Bud Tree during Mouse Kidney Development". PLOS ONE. 6 (11): e27676. Bibcode:2011PLoSO...627676C. doi:10.1371/journal.pone.0027676. PMC 3219680. PMID 22114682. Open access icon
  4. ^ Schwitzer, G.; Mudur, G.; Henry, D.; Wilson, A.; Goozner, M.; Simbra, M.; Sweet, M.; Baverstock, K. A. (2005). "What Are the Roles and Responsibilities of the Media in Disseminating Health Information?". PLOS Medicine. 2 (7): e215. doi:10.1371/journal.pmed.0020215. PMC 1181881. PMID 16033311. Open access icon

















Wikipedia:Wikipedia Signpost/2013-03-25/Recent_research