The Signpost


Op-ed

Wikipedia needs more administrators

RfA was founded on 14 June 2003 by Camembert, and the first promotion via the system, that of Quercusrobur, occurred on the same day. Before the invention of RfA, admin promotions took place through mailing lists. The first discussion on WT:RFA, a humorous one, was started by Tim Starling five days later, on 19 June. Discussions similar to the ones we have today began soon after, the first apparently concerning election standards. The first serious complaint about the process appears to have been made by Greenmountainboy on 8 January 2004, in a thread called "Attacked by everybody", in which he stated that RfA had turned into a place where everyone attacked each other. Most disagreed with the assertion.

As long ago as 2006, Aaron Schulz had recognised in an essay the same issues that have been perennially discussed for nearly a decade. The first serious RfA reform project, known as WP:RFA2011, was launched in 2011. It was created by Kudpung in his userspace on 25 March, and upon encouragement by others he subsequently moved it to Wikipedia space. The project accumulated a task force of over forty established editors, including senior Wikimedia Foundation staff. The launch of this project followed a comment made by Jimmy Wales that March, in which he stated that RfA was a "horrible and broken process". This comment was in response to the retirement of My76Strat (now John Cline) due to his failed RfA. Large amounts of data were compiled, but unfortunately no proposals were put forth as a result of the project.

Following RFA2011, the next serious reform project occurred in 2013. It consisted of a series of three RfCs, starting in late January and ending in early April. All proposals which survived to Round 3 failed. To my knowledge, there have been no large-scale reform projects since.

We need more admins

Why?

Wikipedia currently has about 1,330 users with the sysop user right. At a glance, this seems like a large number, and some might therefore say that we have more than enough admins. What's with all this fuss over the years about needing more? It is important to realize that the raw admin count is a deceptive number. Using the AdminStats tool, I determined that, assuming an activity standard of at least 30 admin actions in 2 months (adapted from this standard, except that I changed it to admin actions, which are more relevant than simple edits), only about 250 of our admins are active! This means that of our 1,330 admins, only about one-fifth (roughly 19%) actively contribute to administrative work. To look at this another way, more than 80% of users who have the sysop bit are (semi-)inactive as admins. It may occur to some that we can fix this problem by getting inactive admins to return to activity. However, many users become inactive for reasons beyond our control, such as loss of interest or inability to continue editing.

Now, some might even feel that 250 is sufficient, but the size of this website must be considered. For a small wiki, 250 admins would be more than enough. However, Wikipedia has almost five million articles, dozens of vandals to block every day, numerous noticeboards to monitor, and administrative backlogs that are always growing. According to Alexa, we are the seventh most popular website in the world, even surpassing Twitter, which ranks ninth. We have a relatively tiny group of a couple of hundred admins to handle all this. Many of these active admins have performed hundreds, or sometimes even thousands, of admin actions within the past two months. Yet the backlogs still exist. What must this mean? It can only mean that we don't have enough admins. By depending upon a relatively small group of admins to perform hundreds or thousands of actions in a short time, we first of all put too much burden upon these individuals. Secondly, the retirement of even a few of these admins, especially those who perform many thousands of actions within a short period of time, would cause a noticeable increase in work for the others. This is a WP:VOLUNTEER service. It is more fair to all users to distribute the workload more evenly.

Stats

Between January 1 and October 3, 2015, 47 RfAs were closed. A mere 15 of these (about 32%) were successful, and 32 were unsuccessful. This means that, on average, RfA has been responsible for only 1.7 promotions per month. Such a low number was unheard of a few years ago. In fact, months with no promotions at all are becoming more common. The first month with no promotions in recent years was September 2012, and that was the first in over a decade. However, just over the past year, 3 out of 12 months (25%) have been without any promotions. The problem is simply becoming worse. If you look at WereSpielChequers' chart, you will see a total of four empty months under the "2014" and "2015" columns.

However, we have another method of getting "new" admins: when ones who have previously resigned request a resysopping. Since the beginning of the year, 10 users have requested resysopping at WP:BN for adminship they had lost before the start of 2015, not counting the three who regained their adminship via RFA. So, by adding this number to the number of admins sysopped via RfA (10 + 15), we get 25.

But there are two other questions to be asked: (1) How many admins have we lost? (2) How many (re)sysopped users are actually active admins? To answer the first question, about 65 users have been desysopped this year, for varying reasons. As for the second, it turns out that although 25 users have been sysopped, only about 20 meet the activity standard of 30 actions over the past 2 months. Therefore, we are losing admins about three times faster than we are really gaining them. (After all, we really haven't gained an admin if they contribute very little.)
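For readers who want to check the arithmetic in this section, the figures quoted above can be recomputed in a few lines of Python. This is only a rough sketch: the variable names are my own, the counts are the approximate ones given in the text, and the nine-month span (January through September) is an assumption about how the monthly average was taken.

```python
# Figures quoted in the text for January 1 to October 3, 2015.
closed_rfas = 47
successful_rfas = 15
resysopped_at_bn = 10      # resysopped on request at WP:BN
desysopped = 65            # "about 65" in the text
active_new_admins = 20     # "about 20" meet the 30-actions-in-2-months standard

# Assumption: the monthly average covers the nine full months Jan-Sep.
months_elapsed = 9

success_rate = successful_rfas / closed_rfas
promotions_per_month = successful_rfas / months_elapsed
total_sysopped = successful_rfas + resysopped_at_bn
loss_to_gain_ratio = desysopped / active_new_admins

print(f"RfA success rate:     {success_rate:.0%}")
print(f"Promotions per month: {promotions_per_month:.1f}")
print(f"Total (re)sysopped:   {total_sysopped}")
print(f"Lost per active gain: {loss_to_gain_ratio:.2f}")
```

With these inputs the loss-to-gain ratio comes out slightly above three, which is consistent with the "three times faster" estimate.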

A few years ago, this was not a problem at all. For instance, a record 408 admins were promoted in 2007. Even before that, the promotion of a few hundred admins per year was the norm. The chart at the top of this article, based upon WereSpielChequers' data that I previously mentioned, shows the number of RfA promotions per year since 2002. The number of promotions decreased sharply in 2008 and has been declining almost continuously ever since. There has not been a single year since 2007 in which there was a considerable increase in promotions. The last year in which there was an increase was 2013, and even that was only by 6. The difference mostly seemed to be in February and March, which was when the reform RfCs were occurring. It therefore looks as if it may have merely been a brief surge inspired by the reform efforts.

What happened?

Why has this decrease happened? In my opinion, two of the most likely reasons are: (1) Higher standards; (2) Hostile/stressful environment. It could also be a combination of these two.

I will start with the first possibility. Current (Oct. 3, 2015) data from User:Everymorning/RFA study shows that the median successful 2015 RfA candidate has eight years of experience and 41,000 edits. The average for 2015 candidates is 7.2 years of experience and 36,500 edits. (Note that I excluded Ser Amantio di Nicolao's RfA, since he had over one million edits and therefore would have a disproportionate impact upon the average.) Although the details may fluctuate slightly, using these statistics we can broadly conclude that the typical 2015 RfA candidate has around six to nine years of experience and 30,000–50,000 edits. If this really is the standard, it is much too high. However, simple statistics such as these might not be of much worth. After all, we have no way of knowing whether or not the numbers I gave above are reflective of the actual standards. They might be, or they might not. Perhaps it's just a coincidence that users with such high statistics choose to run. The only way to find out what the experience standards are is to get a less experienced user to run.

However, simple tenure and edit count stats are far from being the only things measured at RfA. Some users who have even more edits and experience than the range I mentioned above have failed. Performance, such as scope of participation, accuracy rates, and behavior, is considered as well. And of course, these things should be considered to a certain extent. However, when these things are scrutinized to an unreasonably high degree, the standards will become higher, and when the standards become higher, fewer candidates will pass. For instance, it is relatively common to oppose a candidate because their "hit rate" at AfD isn't good enough (how does a "hit rate" affect their ability to judge consensus?), or they haven't made (number) of edits to a particular administrative page (even if they don't want to work there, or have said they will proceed very cautiously). Some users have quite stringent requirements concerning content. This has been a rather major theme as of late, so I will discuss it in some detail.

It has been becoming more apparent that lack of substantial content work will actually cause an RfA to fail. For instance, a certain user recently said, "The purpose of admins should be to keep the riff-raff away from the content creators." Although he is partially correct, this isn't entirely true. The purpose of admins is to keep order throughout the site. If this means blocking a content creator who is in some way causing disorder, that is also part of an admin's job. All good-faith editors have a beneficial function. Gnomes and copy editors fix errors and formatting issues that a content creator might not notice, while users dedicated to anti-vandalism (including admins) do indeed keep the riff-raff away from the content creators' articles by reverting and blocking vandals who harm articles they have written. In the early days of Wikipedia, it is true, content creation was more important than anything else. However, as the website has grown in size and popularity, the importance of maintaining it has increased as well. Without admins, uncivil users would be unrestricted and could do or say whatever they wanted, vandals could just continue vandalizing articles no matter how many times they were reverted, etc. Without anti-vandals, the content creators would have to be online 24/7 to monitor all their articles. In short, Wikipedia would plunge into ruin. Now, before I'm misunderstood, I fully support content creation, but what I am opposed to is the notion that other user groups are unimportant. I fully appreciate and in fact admire the tireless content work of some users.

My ultimate point with the paragraphs above is that high standards do not do anything to fix our obvious admin shortage problem. If we are to gain more admins, we must not be so restrictive as to who becomes one.

If the !voters' opinion cannot be changed, one way to neutralize overly stringent criteria is to lower the percentage bar for passing. This is a solution I very strongly advocate. I know this has been proposed and rejected several times before, but it's high time that we start again with fresh and open minds to seriously debate and consider it. Remember, RfA is currently in a condition drier than it has been in almost all the history of Wikipedia. We must face the facts: currently, our bar is unlike that of virtually any other group. In practice, it seems to be somewhere around 75%, since most RfAs which get more support than that tend to pass. 70–75% (and rarely, 75–79%) sometimes results in a 'crat chat (a decision by Bureaucrats), but 'crat chats are in fact quite rare. In any case, an RfA usually doesn't pass if it concludes in the low 70s. The United States Congress passes laws by simple majority (50%+1), and even the 67% requirement to override a veto by the President of the United States is lower than this bar. Of course, electing an admin for an online encyclopedia is nowhere near as important as making binding laws for one of the most powerful nations in existence. As another example, very few users in the ArbCom elections get 75%+ support. Had that been the standard for last year's election, only two candidates would have passed. Furthermore, the position of arbitrator holds many more responsibilities, some of which can impact the project in a manner far greater than any individual admin ever could. Arbitrators also gain automatic access to the checkuser and oversight tools, which can have serious privacy implications.

Even if the contrasts above are inaccurate for some reason or another, there is one final issue, which is arguably the most important. Oppose !votes currently carry about three times more weight than support !votes. For instance, for every six opposers, at least eighteen supporters are required to cancel them out. Why should opposers have so much power? We should assume that the candidate is running in good faith; therefore, why give so much weight to the negative side? It makes more sense for every !vote to be given equal consideration, which would mean a 50%+1 bar for passing. Or, to preserve the discretionary range, maybe the bar could be 60% with 50%+1–59% being the discretionary range. In any case, the point here is that in comparison to virtually every body outside us, our bar is very high, and in the interest of truly giving more equal weight to both opinions, our system should not give three times as much power to a single dissenting opinion.
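The "three times more weight" figure follows directly from the passing bar. If an RfA passes when supports make up at least a fraction p of all supports and opposes, then each oppose must be offset by p / (1 - p) supports. The short Python sketch below works this out for the bars discussed above. The function name is my own, and it assumes that the percentage is calculated from supports and opposes only and that 50%+1 can be approximated as 50%.

```python
from fractions import Fraction


def supports_per_oppose(pass_bar: Fraction) -> Fraction:
    """Supports needed to offset one oppose at a given passing bar.

    Assumes the result is supports / (supports + opposes), so that
    S / (S + O) >= p rearranges to S >= p / (1 - p) * O.
    """
    if not 0 < pass_bar < 1:
        raise ValueError("pass_bar must be strictly between 0 and 1")
    return pass_bar / (1 - pass_bar)


# Bars mentioned in the text (50%+1 is approximated as 50%).
for label, bar in [
    ("current de facto bar (75%)", Fraction(3, 4)),
    ("veto override (two-thirds)", Fraction(2, 3)),
    ("proposed bar (60%)", Fraction(3, 5)),
    ("simple majority (50%)", Fraction(1, 2)),
]:
    ratio = supports_per_oppose(bar)
    print(f"{label}: {float(ratio):.2f} supports per oppose")
```

At a 75% bar the ratio is exactly three, which matches the example of eighteen supporters for six opposers; at 60% it falls to one and a half, and at a simple majority each !vote carries equal weight.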

But some people object that we cannot be more lenient in passing candidates at RfA, because if they misuse the tools or are abusive, it is virtually impossible to remove them. This is simply false. There are multiple venues by which admins can be held accountable. If they are being uncivil, they can be blocked like any other user. If they are bothering a particular user incessantly (e.g., WP:HOUNDING them), they can be interaction banned like any other user. If they are generally abusing their tools, they can be taken to ArbCom. ArbCom almost never completely dismisses a good admin abuse case. They can choose to deal with it quickly by motion, or they might choose a complete case in more unclear situations. Of course, realize that ArbCom doesn't have to desysop every admin brought before them, so the dismissal of some cases cannot be used as an example of "failure". They may, for instance, decide that the incident was isolated and not part of a general abusive pattern. We all make isolated mistakes. Now, I would prefer that the wider community have the ability to desysop admins, but since no one can ever fully agree on a satisfactory method, we'll have to use ArbCom for now. ArbCom may sometimes take a considerable time to authorize the desired result, but it is generally effective at holding continually troublesome admins accountable. Whenever evidence is requested from those who assert that there is no effective method by which admins can be desysopped, there never seems to be a clear answer. If the assertion were really true and worthy of consideration, its proponents should be able and willing to present real, solid evidence that ArbCom is chronically ineffective at dealing with patterns of abuse.

On to the second point: it is possible that potential candidates might be discouraged from running because of what they perceive to be a hostile and/or stressful environment at RfA. Some recent RfAs, such as those of Montanabw, Wbm1058 and Liz, were the subject of much contention and accompanied by very lengthy talk pages. Wbm's, in particular, was one of the most intense in a long time. Virtually all recent candidates have also been asked dozens of questions within literally a day or two. This environment might very well be a factor in our admin shortage.

How do we fix the problem?

Fixing our admin election system would be a three-step process. First of all, we must discuss, and reach a consensus upon, what the major problems are. Next, we determine how to fix the problems. These two steps, of course, might require a long time and several discussions per issue. But, if this discourages you, read the last paragraph of this section. I personally see three main solutions for reforming our admin election process: (1) Have the voters see that their standards must be changed; (2) Lower the passing bar, as I suggested above; (3) Completely change the process. Then, we implement the solutions. The current method is very disorganized (e.g., "maybe this is it ... well, maybe not/perhaps it is ... [discussion eventually dies]"). If anything is to be done, it must be in an orderly manner.

Secondly, what is done must be for the long-term. Last year (around this time, in fact), there was a surge of nominations following some discussion of revolutionizing the process. Short-term surges do nothing to fix the long-term issue. We always get into a vicious cycle: Discuss changes → More nominations → People say, "It really does work after all!" → Number of nominations dies down again → Cycle repeats. No, it is not working. The current condition of our admin election process is resulting in its long-term failure. We must not be deceived when brief rises in the number of nominations and passes come about.

Remember that the problem will simply grow worse if we give up easily; we must continue until we find a solution. Otherwise, if the problem eventually forces us to take action within a relatively short period of time, we might not have time to undertake an organized, reasoned RfA reform process.

Wikipedia:Wikipedia Signpost/2015-09-30/Op-ed