The Signpost

Special report

The plight of the new page patrollers

This week the Wikimedia Foundation reported the results of its New Page Patrol Survey, part of a project to increase understanding of editors who work on new page patrol (NPP). NPP is the first line of defense against articles that do not belong on Wikipedia, be they unsuitable pages, copyright violations, or attack pages.

New page patrolling has long been a prominent problem area for the Wikipedian community. While patrollers complain of being overworked, other editors have raised issues with the patrollers themselves, characterizing them as mostly young and inexperienced volunteers who lack clue, are ignorant of deletion rules, and often mis-tag new articles. The NPP Survey, first suggested by Kudpung, aimed to collect information on current patrollers on which to base improvements to the NPP process. The results: most new page patrollers are over 18 years of age, most have at least an undergraduate degree, and most are very much clued into what they are doing. The report, while providing a better view of who the average page patroller is and what the average page patroller does, substantially refutes some stereotypes.

History of the process

Patrolled edits is a software feature that went live on Wikipedia in November 2007, after a proposal to re-enable anonymous page creation (see previous Signpost story). The feature draws on Special:NewPages, which lists newly created articles on Wikipedia, although it can be extended to any other page category by using a drop-down option. The interface lists recently created pages in descending order (the most recent first), allowing users to easily browse recently created pages. New pages are kept on the list for 30 days, after which they disappear from view.

A de facto new pages patrol group has existed since at least March 2004. Its members used the Special:NewPages page directly, a page that has been part of Wikipedia's infrastructure since the introduction of the Phase II software in January 2002; before then, the New topics page was used for this purpose. Problematic new articles were a large part of the votes for deletion process, the unified deletion process that covered all namespaces in earlier days of Wikipedia. (Today, there are separate discussions depending on the type of page to be deleted.)

The introduction of patrolled edits provided a better process for patrolling new pages. Unpatrolled new pages are highlighted in yellow, and patrollers can choose to browse just those unpatrolled new articles. Once an article has been viewed, any autoconfirmed editor can mark it as patrolled by clicking the "[Mark this page as patrolled]" link at the bottom right corner of the page. An article can be marked as patrolled if the patroller thinks it is ready for mainspace, or if they have added maintenance tags or nominated it for deletion. Articles created by administrators were originally marked as patrolled automatically by the system; this was unbundled in June 2009 (or, rather, a system of whitelisting was formalised with the creation of the autopatrolled user right), so that now any prolific creator of valid articles can have this privilege. This keeps most of the articles created by established editors (familiar with inclusion criteria) out of the pool of those needing to be patrolled.

Because Special:NewPages only holds onto articles for 30 days, unpatrolled articles that survive that long drop off the list, making them exceedingly difficult to find later. The length of the queue of unpatrolled articles has oscillated for years, but there have been occasions at least as far back as 2009 when there were too few patrollers to review all new articles. Because of the 30-day cutoff, there is constant pressure to keep up with the list, in the face of what many patrollers see as a shortage of reviewers. Coupled with the pressure to "get it right the first time" (because once an article is marked as reviewed, other patrollers no longer see it), the process has been characterized as stressful by some patrollers. The front of the queue can also be stressful if an individual patroller tries to keep pace with the flow of new articles.
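The effect of the 30-day cutoff can be illustrated with a minimal sketch. This is a toy model rather than MediaWiki's actual implementation: the `NewPage` class, both function names, and the exact handling of the 30-day boundary are assumptions made for illustration only.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta

# Retention window described in the article; the exact boundary
# behaviour (<= 30 days) is an assumption of this sketch.
RETENTION = timedelta(days=30)


@dataclass
class NewPage:
    """Hypothetical record for a newly created page."""
    title: str
    created: datetime
    patrolled: bool = False


def unpatrolled_queue(pages, now):
    """Unpatrolled pages still inside the window, newest first."""
    visible = [
        p for p in pages
        if not p.patrolled and now - p.created <= RETENTION
    ]
    return sorted(visible, key=lambda p: p.created, reverse=True)


def dropped_unpatrolled(pages, now):
    """Unpatrolled pages that have aged out of the list."""
    return [
        p for p in pages
        if not p.patrolled and now - p.created > RETENTION
    ]
```

In this model, a page that is never patrolled simply moves from the first list to the second once it is more than 30 days old, which is why the back of the queue exerts constant pressure on patrollers.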

The patrollers themselves have come under fire; critics contend that they "deal with new users inappropriately, scaring them off, that they have an unacceptably high error rate when tagging pages for deletion, and that they are more interested in using New Page Patrol as a route to gaining higher userrights [like adminship] than in actually patrolling productively and helping improve new articles" (NPP survey). Many critics consider the underlying issue to be that patrollers are overwhelmingly young and inexperienced editors, much more likely to act immaturely or without regard for new editors or new articles. Others believe that it is the stress of the process that drives the perceived errors among its users.

In response to perceived problems with new page patrolling, on April 3, 2011, a proposal was put forward to reduce the stress in the system by limiting article creation to autoconfirmed users. Snottywong summarized the reasons for this with an analysis that showed that only 17.5% of autoconfirmed users' articles were deleted, compared with 72.5% of new editors' contributions. A second proposal soon after established a "clear consensus for a six-month trial, followed by a one-month period of discussion to determine the trial's effects"; the specifics were finalized at Wikipedia:Autoconfirmed article creation trial.



Articles created by autoconfirmed users: deleted versus kept.
Articles created by non-autoconfirmed users: deleted versus kept.
Note: The large jump in April was caused by bot activity.


The Zoom interface.

Yet when the technical solution was submitted for implementation on Bugzilla, it was shot down by the Wikimedia Foundation, being marked as a "RESOLVED WONTFIX". WMF deputy director Erik Möller said that "creating a restriction of this type is a strong a [sic] statement of exclusion, not inclusion, and that it will confuse and deter good faith editors."

Instead, the WMF proposed that there be improvements for the interface that managed the new page backlog, with one goal being to better welcome constructive editors into the community. The refusal to accept community consensus was extremely controversial; many editors considered it damaging to the relationship between the Foundation and Wikipedia's editors. The Foundation focused on a "New Page Triage", an initiative that Möller hoped would "reduce the work involved in patrolling new pages by simplifying and smoothing out the process ... a system that is self-explanatory for newer editors; someone who hasn't done New Page Patrol before can look at it and gain an instinctive understanding of what they're expected to do and how they should work."

The upshot of these developments was the new Zoom interface, a redesign of Special:NewPages. It has a dynamic construction that allows users to mark a page with any of several specific templates, both for maintenance and deletion, before saving their changes and continuing on to the next article. One other change that is currently under consideration by developers is the addition of a patroller user-right to control who can access the interface.

It was quickly found that there was no single way that patrollers did their work, and that they were using a variety of third-party software tools. This lack of understanding of page patrollers led Kudpung to create a survey to find out more about who the average page patroller was and what they did.

Survey results

[W]e can confirm that the common stereotype of patrollers as young, poorly educated and ignorant is almost entirely without basis. The vast majority of patrollers are over 18 and have undergraduate degrees or above – in some cases, actually exceeding the average for editors overall ... They are largely familiar with relevant policies, and greatly exceed the expectations set by the stereotype. Indeed, the only major difference between patrollers and any other editor is that patrollers choose to patrol.

New Page Patrol survey

Editors were identified as patrollers and surveyed based on three separate sources: 2,504 from a script by Snottywong, the 1,300 editors with a {{User wikipedia/NP Patrol}} template on their userpage, and the 133 with a {{User Newpages with Twinkle}} template. Of the editors asked to fill in the survey, 1,255 did so, but after removing surveys with incomplete answers, errors, and obviously false data (e.g. 10-year-olds from Africa with doctorates), and making adjustments because a number of editors had been mistakenly asked to participate, the total included in the survey was reduced to 309 participants. To supplement the survey, an analysis was done on the top quartile of editors by number of patrol actions. A summary of the results follows.

Demographics


Demographically, new page patrollers were indeed found to be overwhelmingly North American and European, with these regions accounting for 85 percent of those surveyed. Only 8 percent identified as female, consistent with the cross-wiki average found in the April 2011 Editor Survey.

More than 60 percent of patrollers have been editing since 2006 or earlier, the antithesis of the stereotype of inexperienced editors. But this creates a new concern: the absence of new editors, accentuating an emerging wikigeneration gulf and the recently highlighted new editor retention collapse, and strongly supporting the need for a more usable, intuitive interface.

Between 79 and 82 percent of respondents were over the age of 18; more than 90 percent had completed secondary schooling, and 63 percent had an undergraduate degree or postgraduate qualifications. In short, new page patrollers are not much different from the rest of those who edit Wikipedia.

Demographics: new page patrollers by...
Gender: 89% male.
WikiAge: 60 percent editing since 2006
Decade of birth: 79 to 82 percent are over 18


Editing activity

Patrolling was found to follow a pronounced long-tail distribution, with 89 percent of the work done by 25 percent of the patrollers. The report states that "we clearly need to make involving more users, and involving patrollers to a greater degree, a priority". 64 percent of new page patrollers spend between 1 and 3 hours a day reading and editing Wikipedia. 46 percent of patrollers have made 10,000 edits or more; this is a marked difference from the Editor Survey 2011, in which only 20 percent of editors had reached this count, and is more evidence against the belief that many patroller problems come from inexperience. In terms of user rights, more than half of patrollers were rollbackers and reviewers, and more than 40 percent had autopatrol rights.
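To make the long-tail figure concrete, the share of work done by the most active quarter of patrollers can be computed from per-editor patrol counts. The following is a minimal sketch, not the survey's own method; the function name `top_share` and the sample counts are invented for illustration and are not survey data.

```python
def top_share(patrol_counts, fraction=0.25):
    """Share of all patrol actions performed by the most active
    `fraction` of patrollers (at least one patroller is counted)."""
    counts = sorted(patrol_counts, reverse=True)
    total = sum(counts)
    if total == 0:
        return 0.0
    k = max(1, round(len(counts) * fraction))
    return sum(counts[:k]) / total


# Invented counts for eight hypothetical patrollers, not survey data.
sample_counts = [500, 300, 60, 20, 10, 5, 3, 2]
share = top_share(sample_counts)  # top 2 of 8 editors: 800 of 900 actions
```

With these invented counts, the two most active editors account for 800 of 900 actions, or roughly 89 percent, which mirrors the kind of concentration the survey describes.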

A question on non-patrolling activities found that patrollers did other things: 97 percent were active in anti-vandalism efforts in some form, and 95 percent were adding content to Wikipedia, by creating and editing new articles. As expected, Articles for Deletion, speedy deletion, and similar venues were tied in with the process, and many editors who patrolled new pages also participated in discussions there.

An analysis of tool usage found that a large percentage of page patrollers were aware of and used semi-automated tools like AutoWikiBrowser and especially Twinkle in their work.

Editing activity: new page patrollers...
Editors who are most active at Special:NewPages, showing a long tail type distribution
Editors by edit count, showing double the usual rate for users over 10,000 edits
Editors by user-right, with high numbers who are rollbackers, autopatrolled, and reviewers.


Patrolling

The vast majority of patrollers originally learned about New Page Patrol passively, for instance from a userbox on somebody's userpage that advertised New Page Patrol, or through seeing a new page at Special:RecentChanges and navigating to Special:NewPages from there. Most patrollers also give positive reasons for their motivation: they want to "keep vandalism and bad-faith pages out of Wikipedia" (83 percent) and "watch over the quality of new articles" (80 percent). 35 percent of patrollers were motivated because it "provides experience that may be valuable further down the road".

Activity is unevenly distributed, with almost 40% of patrollers spending 1 hour or less on New Page Patrol per week. How long it took to patrol an article was more evenly distributed, splitting half-and-half at the one-minute line. 28 percent of patrollers found that "trying to decide what should be deleted/if something should be deleted" was the most stressful part of the job, with the rest listing a slew of other reasons. An overwhelming majority have read the various relevant deletion guidelines, and very nearly 100 percent have read the speedy deletion guidelines. About 45 percent of editors reviewed from the front of the Special:NewPages buffer (the newest articles), just under 30% from the back, and 15% chose "Other."

New page patrolling habits
Hours per week spent patrolling.
Policy awareness—an overwhelming "yes."
Where new page patrollers patrol from


Improvements

By 53 percent to 45 percent, a majority of patrollers disagreed with the implementation of a patroller user-right. If the right were to be instituted, the largest group of respondents felt that it should be granted automatically at some point, with slightly less support for distribution through Requests for permission. Based on the results, the survey concludes that "it seems clear that some variation on 'X edits and Y months as an editor' is likely to be the most acceptable criteria, but ... any attempt to get firm consensus on this point, whether made by the community or by the Foundation, is likely to be drawn-out and gruelling." Finally, 60 percent of new page patrollers wanted to see technical changes implemented, while 20 percent wanted to see cultural and policy-based decisions. The remaining 20 percent did not comment, or felt that the current system basically works.

The future

What now? The results of the NPP Survey clearly refute many of the views held of patrollers by their critics; nonetheless, it is often the patrollers themselves who are clamoring most loudly for changes.

According to the survey's conclusion, "the next step for the Foundation is to use this data to continue developing the Zoom interface. We have already identified a representative sample of patrollers, and contacted them for detailed interviews and to provide 'screencasts' of their patrolling work. With these, developers can examine the process of patrolling, get more details on precisely how patrollers do their work, and try to identify unnecessarily difficult areas that can be simplified to make patrolling easier."

The survey broke new ground by giving everyone a clearer view of a subset of Wikipedian editors; perhaps, in the future, similar surveys will provide detailed information about other specialized groups of editors.

The New Page Patrol survey was conducted by the Wikimedia Foundation's Oliver Keyes, with contributions from staffers Howie Fung and Dario Taraborelli and from Wikipedians Kudpung and Tom Morris. The raw data, sans gender and contact information (per policy), will be made available to anyone willing to sign a non-disclosure agreement (requests can be sent to okeyes@wikimedia.org).

Have a keen interest in or strong feelings on new page patrolling or another issue of relevance to the English Wikipedia community? The Signpost is recruiting reporters and soliciting opinion essay submissions; those interested should apply to wikipediasignpost@gmail.com, leave a note in the newsroom, or contact an editor directly.

Wikipedia:Wikipedia Signpost/2012-02-20/Special_report