The Signpost

Recent research

IP editors, inclusiveness and empathy, cyclones, and world heritage


A monthly overview of recent academic research about Wikipedia and other Wikimedia projects, also published as the Wikimedia Research Newsletter.

"The Hidden Costs of Requiring Accounts: Quasi-Experimental Evidence From Peer Production"

Wikipedia is unusual among user-generated content websites in that it allows contributions – even changes to widely read existing content – without creating an account, in exchange for contributors agreeing to have their IP address published alongside their edit. This has been a core feature since the project's beginnings, but has come to be questioned from several angles in recent years. In "The Hidden Costs of Requiring Accounts: Quasi-Experimental Evidence From Peer Production"[1], well-known Wikipedia researchers Benjamin Mako Hill and Aaron Shaw offer some insight into what might happen if IP editing were to be disabled on Wikimedia wikis. As they summarize in the abstract:

"We conduct an empirical test using longitudinal data from 136 natural experiments where would-be contributors to wikis were suddenly required to log in to contribute. Requiring accounts leads to a small increase in account creation, but reduces both high- and low-quality contributions from registered and unregistered participants. Although the change deters a large portion of low-quality participation, the vast majority of deterred contributions are of higher quality."

These 136 wikis are drawn from a larger dataset from Wikia (today known as Fandom, a commercial wiki hosting service co-founded by Wikipedia founder Jimmy Wales), consisting of wikis where local administrators had requested that IP editing be disabled.

According to the paper's literature review (of general research on peer production),

"One body of prior studies argue that stable identifiers work as catalysts of cooperation by facilitating accountability, group identification, boundaries, and commitment that lead to higher quality participation. Others contend that requiring accounts imposes costly obstacles that deter participation. A related set of studies suggests that unregistered contributions may introduce diverse perspectives and stimulate activity among experienced community members ..."

Based on this, the authors derive four hypotheses for the effects of requiring account registration:

  • "an increase in the number of newly registered accounts (H1)"
  • "a reduction in the number of subsequently removed contributions (H2)" (operationalized as reverted edits)
  • "a reduction in high quality contributions (H3)" (operationalized via non-reverted edits and via content persistence)
  • "a reduction in contributions from participants who contributed with accounts prior to the requirement (H4)"

In their analysis, the authors find "substantial discontinuous shifts in each of our dependent variables when wikis began requiring accounts", supporting all four hypotheses.

The authors first presented these results at Wikimania 2015 and published them in peer-reviewed form last year. Just a few months afterwards, a large Wikimedia wiki enacted such a change for the first time (see e.g. the Signpost's coverage: Portuguese Wikipedia bans IP editing). A recent evaluation of its impact by the Wikimedia Foundation does not quite match the results from the Wikia wikis, finding "no significant negative impact in the analysis conducted thus far" (see also the more detailed summary in this issue's "News and notes" section: "Portuguese IP ban on track"). The statistical techniques used in that analysis appear to be somewhat simpler: for example, it considers only year-over-year changes in monthly counts of active editors, whereas Hill and Shaw's paper analyzes weekly data using a regression discontinuity design. On the other hand, the WMF analysis also considered additional metrics designed to measure "Impacts on administration", namely the number of user blocks, page protections, and Checkuser checks, finding (somewhat unsurprisingly) "a significant reduction in administration actions".
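
To illustrate what a regression discontinuity design estimates, here is a minimal Python sketch. It is not the authors' model: the function name `rd_jump`, the linear trends on both sides of the cutoff, and the synthetic weekly edit counts are all assumptions made for illustration. The idea is to fit a trend before and after the week in which accounts became required, and to read off the size of the jump at that week.

```python
import numpy as np


def rd_jump(weeks, values, cutoff=0.0):
    """Estimate the jump in `values` at `cutoff`, allowing a separate
    linear trend on each side (a simple sharp regression discontinuity)."""
    t = np.asarray(weeks, dtype=float) - cutoff
    y = np.asarray(values, dtype=float)
    after = (t >= 0).astype(float)
    # Columns: intercept, pre-cutoff slope, jump at cutoff, change in slope.
    X = np.column_stack([np.ones_like(t), t, after, after * t])
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef[2]


# Synthetic weekly edit counts: made-up numbers, not data from the paper.
rng = np.random.default_rng(0)
weeks = np.arange(-26, 26)          # week 0 = accounts become required
true_jump = -30.0                   # hypothetical drop in weekly edits
trend = 200 + 0.5 * weeks + true_jump * (weeks >= 0)
edits = trend + rng.normal(0, 2, weeks.size)

estimate = rd_jump(weeks, edits)
print(f"estimated jump at cutoff: {estimate:.1f}")
```

Because the underlying trend is modeled explicitly, a gradual rise or decline in activity is not mistaken for an effect of the policy change, which is one way this approach differs from comparing monthly totals year over year.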

There are other reasons to question how well the Wikia results would generalize to Wikipedias or other Wikimedia wikis. For example, in recent years the Wikimedia Foundation has invested significantly in technical measures to improve the experience of newcomers. IP editors do not benefit from these improvements, because "newcomers" here means newly created accounts. Requiring registration might therefore have the added benefit that every contributor to a wiki is reached by such newcomer-supporting measures as they start to edit.


Other recent publications

Other recent publications that could not be covered in time for this issue include the items listed below. Contributions, whether reviewing or summarizing newly published research, are always welcome.

"New maps for an inclusive Wikipedia: decolonial scholarship and strategies to counter systemic bias"

From the abstract:[2]

"Since early in the development of the project, Wikipedia editors have been concerned with overcoming “systemic biases” in coverage of the world’s knowledge, especially those rooted in forms of social marginalization. [...] However, many Wikipedia editors and observers have argued that the systemic biases of Wikipedia are inherent to current global distribution of knowledge production, and can only be overcome by changing the encyclopedia’s standards of inclusion. This article reframes this debate by comparing the project of “countering systemic bias” on Wikipedia with the effort within Western/Northern academia to decolonize and diversify scholarship. Since this project began at least fifty years ago, it has led to abundant peer-reviewed scholarship, all of which qualifies as “reliable sources” for Wikipedia articles. [...] The article proposes that critical scholarship, historical maps, and maps in contemporary scholarship can all contribute to addressing Wikipedia’s systemic biases."


"Empathy plasticity: Decolonizing and reorganizing Wikipedia and other online spaces to address racial equity"

From the abstract:[3]

"... Focussing on a social psychological concept of “empathy plasticity”, the critical developmental window in which humans begin to formulate attitudes and dispositions on phenomena around them, this commentary explores how a re-organization of online spaces, such as Wikipedia and social media, can be used to more holistically present knowledge on topics in the realm of race, ethnicity, class, and gender. Such a technocratic shift is one means of content democratization and dialectical gatekeeping to ensure our contemporary epistemology acquires and maintains methodological, ideological and ethical merit."


"Wikiproject Tropical Cyclones: The most successful crowd-sourced knowledge project with near real-time coverage of extreme weather phenomena"

From the paper:[4]

"Wikiproject Tropical Cyclones delivers the highest quality content of all Wikiprojects on English Wikipedia. We also find that both readership and editing of the articles on tropical cyclones are highly correlated with cyclones’ occurrences, which indicates that Wikipedia is a go-to source for many people [...] For the purpose of identifying the most successful Wikiprojects, we have ranked all projects based on the total number of FA and GA articles that are within their scope [...] Since projects vary in size, and the sheer number of articles can impact the relative number of quality articles, we normalized the totals by the square root of the number of articles that a project curates."

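The square-root normalization described in the quote can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `normalized_score` is hypothetical, and the inputs are the FA-only figures quoted by a reader in the discussion below (1,334 FAs out of 200,359 articles for Military history, 165 FAs out of roughly 3,200 for Tropical cyclones). The paper ranks projects by FA and GA articles combined, and GA counts are not given here, so the ordering produced below is not the paper's result.

```python
import math


def normalized_score(quality_articles, total_articles):
    """Quality-article count divided by the square root of project size."""
    return quality_articles / math.sqrt(total_articles)


# (FA count, total articles) as quoted in the reader discussion; FA only,
# whereas the paper counts FA and GA together.
projects = {
    "Military history": (1334, 200359),
    "Tropical cyclones": (165, 3200),
}

scores = {name: normalized_score(q, n) for name, (q, n) in projects.items()}

for name, score in sorted(scores.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name}: {score:.2f}")
```

Dividing by the square root rather than by the raw article count penalizes large projects less than a simple percentage would, which is why a very large project can still score close to a small, highly polished one.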

"World Heritage sites on Wikipedia: Cultural heritage activism in a context of constrained agency"

From the abstract:[5]

"... we investigate the patterns of production, consumption, and spatial and temporal distributions of Wikipedia pages for World Heritage cultural sites. We find that Wikipedia provides a distinctive context for investigating how people experience and relate to the past in the present. The agency of participants is highly constrained, but distinctive, behind-the-scenes expressions of cultural heritage activism are evident. Concerns about state-like actors, violence and destruction, deal-making, etc. in the World Heritage inscription process are present, but rare on Wikipedia’s World Heritage pages. Instead, hyper-local and process issues dominate controversies on Wikipedia."

References

  1. Hill, Benjamin Mako; Shaw, Aaron (2020-05-26). "The Hidden Costs of Requiring Accounts: Quasi-Experimental Evidence From Peer Production". Communication Research. doi:10.1177/0093650220910345. Closed access; author's copy available.
  2. Bjork-James, Carwil (2021-01-06). "New maps for an inclusive Wikipedia: decolonial scholarship and strategies to counter systemic bias". New Review of Hypermedia and Multimedia. 0 (0): 1–22. doi:10.1080/13614568.2020.1865463. ISSN 1361-4568. Closed access; author's copy (accepted manuscript version) and blog post: https://woborders.blog/published-elsewhere/new-maps/
  3. Ezell, Jerel M. (2021-01-05). "Empathy plasticity: Decolonizing and reorganizing Wikipedia and other online spaces to address racial equity". Ethnic and Racial Studies. 0 (0): 1–13. doi:10.1080/01419870.2020.1851383. ISSN 0141-9870. Closed access.
  4. Jemielniak, Dariusz; Rychwalska, Agnieszka; Talaga, Szymon; Ziembowicz, Karolina (2021-07-21). "Wikiproject Tropical Cyclones: The most successful crowd-sourced knowledge project with near real-time coverage of extreme weather phenomena". Weather and Climate Extremes: 100354. doi:10.1016/j.wace.2021.100354. ISSN 2212-0947.
  5. Marwick, Ben; Smith, Prema (2021-01-01). "World Heritage sites on Wikipedia: Cultural heritage activism in a context of constrained agency". Big Data & Society. 8 (1): 20539517211017304. doi:10.1177/20539517211017304. ISSN 2053-9517.



Discuss this story

  • Just wanted to say - thanks, TB, for keeping this column running for years and years. --Piotr Konieczny aka Prokonsul Piotrus| reply here 04:12, 30 August 2021 (UTC)
  • Have read only the excerpt quoted here, but Jemielniak et al. (2021)'s finding that Wikiproject Tropical Cyclones delivers the highest quality content of all Wikiprojects on English Wikipedia surprises me. I expected MilHist. Learn something every day. Rotideypoc41352 (talk · contribs) 18:26, 30 August 2021 (UTC)
    • Paper is paywalled so can't say for sure, but it says it's using the square root of the total articles? A quick check shows MILHIST has 1,334 FAs out of 200,359 articles total. Tropical cyclones has 165 FAs but out of a mere ~3200 articles. That seems like it should still favor MILHIST, but I guess Tropical Cyclones' huge number of GAs helped it win out, which seems a bit off to me. I'm curious if the writers included A-class MILHIST articles that were also GAs. Also, there's only so many GANs that can really be processed, so that would hit larger projects more than smaller projects. (Regardless, both of those projects are doing good work.) SnowFire (talk) 23:30, 30 August 2021 (UTC)
      We ran into a similar problem in our 2015 paper about the misalignment between quality and viewership of articles (covered in the April 29, 2015 Signpost's research section: Popularity does not breed quality (and vice versa)). In that case, we were trying to understand topics of FAs with relatively low viewership, and used relative risk to measure it. The approach has a couple of parameters that can be tuned: the minimum number of articles in a project, which removes small projects that often focus on a very specific topic; and the minimum number of articles in the dataset, which controls how general the listed topics are (smaller values lead to more specific topics, IIRC).
      Both MILHIST and Tropical Cyclones were in our paper (Table 9, page 7). The difference in relative risk was large: MILHIST ranked #7 with an RR of 5.3, whereas Tropical Cyclones ranked #2 with an RR of 99.3. I'm unsure what the number of articles in the projects was at that point in time; we don't mention it and instead focus on how MILHIST also has some articles with high viewership but relatively low quality. One was NATO, now a GA, which made me happy to see! The other was Vietnam War, which is still a C-class article (and it makes sense that it still is; we had good discussions on the talk page of the Signpost back in 2015 about some of the challenges of writing those types of articles). Cheers, Nettrom (talk) 08:01, 1 September 2021 (UTC)
    • Disturbingly, Wikipedia:WikiProject Tropical cyclones is the subject of a very large copyright investigation opened in May 2021. "For the most part, this project remains rife with direct copy and pastes, unattributed PD copying, unattributed copying within Wikipedia, possible but unconfirmed translation vio, possible but unconfirmed cross-wiki translation vio, and possible but unconfirmed paywall vio, mostly to newspapers.com we believe." NebY (talk) 20:51, 1 September 2021 (UTC)
      Yes, I'm always concerned when researchers use Wikipedia's FA label to judge the quality of the content here. The whole Clean Wehrmacht controversy with the military history articles a while back showed how problematic accepting FA status as an indicator of actual quality can be. AugusteBlanqui (talk) 19:14, 2 September 2021 (UTC)
      • An article is an artistic creation, much as a book or movie is, but GAFA does not operate entirely like book reviews and movie awards. It works more by checklist, and the article is judged by how well it meets each of the requirements. So, some subjects like warships and hurricanes are born a dozen or three per year. They have very precise notability requirements and usually very precise official sources, and well-established traditions of article structure and checklists of what goes in the article and what does not. Checking off the boxes for the checklists of GAFA is easier for those whose own checklists are clear than for articles about a war or a musical genre or a technical development, where the boundaries and criteria for aspects to be included are seldom so clear. This of course does not make GAFA unimportant, but like any tool its powers and limitations have to be understood. Jim.henderson (talk) 15:02, 4 September 2021 (UTC)
        Courtesy link: Wikipedia:Wikipedia Signpost/2018-04-26/Op-ed § World War II Myth-making and Wikipedia Rotideypoc41352 (talk · contribs) 18:14, 7 September 2021 (UTC)

















Wikipedia:Wikipedia Signpost/2021-08-29/Recent_research