The Signpost
Single-page Edition
WP:POST/1
19 June 2013

Op-ed
Two responses to the 'Tragedy of Wikipedia's Commons'
Traffic report
Most popular Wikipedia articles of the last week
In the media
South African learners want Wikipedia; Editing of Israel topics
WikiProject report
The Volunteer State: WikiProject Tennessee
News and notes
Swedish Wikipedia's millionth article leads to protests; WMF elections—where are all the voters?
Featured content
Cheaper by the dozen
Discussion report
Citations, non-free content, and a MediaWiki meeting
Technology report
May engineering report published
Arbitration report
The Farmbrough amendment request—automation and arbitration enforcement
 


2013-06-19

Most popular Wikipedia articles of the last week

The season finale of Game of Thrones ensured that the epic high fantasy series would dominate the top 10 again last week; however, it was joined by the perennially popular children's author Maurice Sendak, whose 85th birthday was celebrated with a Google Doodle, and by the number one movie of the week, Man of Steel. Politics rarely impacts the top 10, but the controversy over the PRISM surveillance program proved too potent to miss.

Please see here for the top 25 articles of the week, plus analysis.

For the week of 8 to 15 June, the ten most popular articles on Wikipedia, as determined from the report of the 5,000 most trafficked pages,* were:

Rank Article Views Notes
1 Maurice Sendak 1,717,368 A Google Doodle celebrating what would have been the children's author's 85th birthday drew more than 1.7 million views to his Wikipedia page.
2 Man of Steel (film) 1,117,658 The second attempt to rework the Superman mythos for modern cinema (after Bryan Singer's Superman Returns), this film earned $125.1 million over its first weekend, setting a record for the month of June.
3 Game of Thrones 1,000,649 The season finale of this popular TV show drew 5.39 million viewers, its highest rating ever.
4 State of Decay (video game) 715,148 Much-anticipated zombie apocalypse video game.
5 Game of Thrones (season 3) 600,721 See #3 above.
6 List of Game of Thrones episodes 590,697 See #3 and #5 above.
7 Facebook 580,390 A perennially popular article.
8 Edward Snowden 576,664 The PRISM program whistleblower became the major discussion point in the news this week.
9 PlayStation 4 519,716 Sony unveiled their addition to the already controversial eighth generation of video game consoles, to positive reception.
10 The Last of Us 500,214 Another much-anticipated post-apocalypse video game which was released on June 13.

Notes:


2013-06-19

South African learners want Wikipedia; Editing of Israel topics

South African learners lobby for data-free Wikipedia access

Memeburn.com published an article on the yearning of students in South Africa for free knowledge through Wikipedia Zero. Students from Sinenjogo High School have written letters to four major mobile phone companies requesting access to Wikipedia Zero, but the response entailed "little enthusiasm". According to the article, only 21% of South African schools have libraries and access to computers is very limited:


Managing editor of WorldWideWorx.com Arthur Goldstuck agrees. He said that giving kids free access to Wikipedia would go a long way to solving some of South Africa’s education problems.

When asked about the students' specific request, as well as the future of open educational resources on Wikipedia, Kul Wadhwa, Head of Mobile and Business Development for the Wikimedia Foundation (which encompasses Wikipedia Zero), called the students inspirational, saying "We were truly inspired by this grass roots movement, and we hope that this will open up a larger dialogue about the need to make open educational resources available to everyone in a way that can be delivered to them. This is really what Wikipedia Zero is about."

An IOL SciTech article discussed the visit of WMF storyteller Victor Grigas to the high school, where he filmed a documentary about the students' efforts; the film will be available later this year. Grigas was quoted in the article as saying "the learners are so sharp and determined to better themselves. The teachers were amazing too. You can’t spend a day there and not feel inspired." Grigas also posted to the Wikimedia-l mailing list on June 19 asking for collaborators on this project.

Partisan editing of Israel topics

Israeli newspaper Haaretz reported on the recent indefinite block of Soosim (talk · contribs), described as "Arnie Draiman, a social-media employee of NGO Monitor". The story, also carried by France 24, says Draiman edited English Wikipedia articles on the Israeli–Palestinian conflict "in an allegedly biased manner".


Draiman had been active in Wikipedia for several years, but had increased his participation in 2010 after taking a position at NGO Monitor, on whose website he is listed as the member of the Communications Department responsible for online communications. At 91 edits, he was the most frequent editor of the Wikipedia article on NGO Monitor, which he began editing in May 2010.


Wikipedia administrator Jan Nasonov told Haaretz that biased editing of organisations like NGO Monitor is "unfortunately not all that uncommon on Wikipedia", pointing out that it is difficult to prove. Neither NGO Monitor nor Draiman provided a comment to Haaretz, though Draiman, who had revealed his name to another user on Wikipedia five years ago, before his employment with NGO Monitor, disputed the sockpuppet and meatpuppet allegations against him on Wikipedia and stated that his edits were in compliance with Wikipedia rules.

In brief

  • Google quietly kills quick view for Wikipedia results in mobile search: An article in Techcrunch.com noted that a "quick view" feature which loaded a Wikipedia page in a matter of milliseconds has quietly disappeared without direct, succinct explanation from Google.
  • Creative Wikipedia edit shows us the winner of the next-gen console wars: PC & Tech Authority reported on Tuesday that the article List of burn centers in the United States was vandalized, saying that an editor added "Sony Entertainment acted as the burn center for Microsoft employees following E3 2013", in a slam against Microsoft.
  • How are museums collaborating with Wikipedia?: A group of articles in the journal Museum Practice gives an overview of collaborative projects for museums that wish to work with Wikipedia, including GLAM initiatives. The suite includes a piece on the challenges and successes of Wikipedia-museum collaborations, guides to hosting edit-a-thons, having a Wikipedian-in-Residence, and digitization, as well as case studies, an overview of QRPedia, and inspiration for smaller museums that wish to work with Wikipedia.
  • Wikipedia’s "Human" Entry Is Charmingly Alien: The Motherboard blog published a short piece exploring the article human, noting that it seems to have been written by "either extraterrestrials or our reptilian, shape-shifting overlords. Or both."


2013-06-19

May engineering report: Flow enters consultation phase and other headlines

May engineering report published

In May:
  • 124 unique committers contributed patchsets of code to MediaWiki (stable)
  • The total number of unresolved commits stood at 960 (up 145 from April)
  • About 87 shell requests were processed (up 38)
  • Wikimedia Labs now hosts 165 projects (stable) and has 1382 registered users (up 158).

—Adapted from Engineering metrics, Wikimedia blog

The WMF's engineering report for May was published recently on the Wikimedia blog and on the MediaWiki wiki ("friendly" summary version), giving an overview of all Foundation-sponsored technical operations in that month (as well as brief coverage of progress on Wikimedia Deutschland's Wikidata project and Wikimedia CH's Kiwix offline reader project, which, the report noted, recently released its first version for Android). Although the ten headline items will be the major focus of this "Technology report", the WMF-led publication also contains a myriad of updates about smaller initiatives which interested users should peruse at their leisure.

As has been the trend in recent months, the choice of headlines mirrors the use of blogposts on the Wikimedia Techblog. Among the teams to blog the most, the Foundation's Language Engineering team wrote of their efforts to attract an intern, deploy the UniversalLanguageSelector, and make it easier to internationalise an external MediaWiki installation. Another busy team was that focussed on the Foundation's "Wikipedia Zero" project, aimed at giving free access to Wikipedia in developing nations via portable devices. The team reported that during May they had "[worked to launch] Wikipedia Zero in Pakistan, refactored its legacy codebase, migrated configuration from monolithic wiki articles to per-carrier JSON configuration blobs, generated utility scripts, patched legacy hyperlink redirect and content rendering bugs, and supported partner on-boarding" against the backdrop of widening adoption. Finally, the Foundation's soon-to-be-flagship project to improve talk pages, Flow, entered its community consultation phase during April.

Highlights from last month's report, which the Signpost did not report extensively at the time, included details on an area that the Foundation has recently begun to hire in – multimedia engineering – with the commitment to ensure that "contributing an image or video to an article while you’re editing does not require leaving the “edit mode”"; as this month's report notes, however, the Foundation is still having to fix bugs in its media handling backend, as well as its core TimedMediaHandler video player, which appear to be more likely targets for development in the interim. A second highlight featured another cornerstone project, Wikidata, in the wake of news that Russian technology firm Yandex is to donate €150,000 to support its development. Entitled "The Wikidata Revolution", the blog post details the march of Wikidata's second (infobox) phase, while the Wikidata team has more recently announced progress integrating new datatypes, including date-time and geocoordinate displays.

Though neither monthly report commented greatly on any disappointments the Foundation has had over the past two months, it is clear that many of the perennial concerns – project delays, variable community resistance, and code review – remain ever-present worries. Commenting on the last of these, the report noted that WMF Technical Contributor Coordinator Quim Gil has been "preparing a proposal to get automated community metrics" with the potential to help the Foundation better understand the health of the volunteer community given the spiraling number of unreviewed (but still open) commits.

In brief

Not all fixes may have gone live to WMF sites at the time of writing; some may not be scheduled to go live for several weeks.

  • Many (if not most) recently developed features have compatibility issues with older versions of Internet Explorer, an analysis showed this week (wikitech-l mailing list). The flagship VisualEditor project, for instance, intentionally does not support IE6, 7 and 8 (combined usage: 7.5% of Wikimedia pageviews) and unintentionally excludes a further 6.76% by virtue of their using IE9.
  • The Google Summer of Code coding period is officially underway (also wikitech-l).
  • The latest version of MediaWiki (1.22wmf7) was added to test wikis and MediaWiki.org on June 13. It will be enabled on non-Wikipedia sites on June 17, and on all Wikipedias on June 20.
  • A report on mobile upload errors was published, and software changes to reduce their number have been promised.
  • The Narayam (non-Latin script input) and WebFonts extensions were successfully replaced with the Universal Language Selector across all applicable Wikimedia wikis on June 11. The Universal Language Selector will now be rolled out to wikis which did not have either predecessor extension, including the Catalan (ca), Cebuano (ceb), Persian (fa), Finnish (fi), Norwegian Bokmål (no), Portuguese (pt), Ukrainian (uk), Vietnamese (vi), Waray-Waray (war) and Chinese (zh) Wikipedias on June 18 (wikitech-ambassadors mailing list). In related news, two new webfonts (UnifrakturMaguntia and Linux Libertine) will shortly be added to wikis that use Universal Language Selector to further help avoid the presence of unrecognised and/or unsupported Unicode characters (which would otherwise appear as a string of ���s).
  • A patrolling link will now be visible for un-patrolled pages, even if users do not originate their page request from Special:NewPages or Special:RecentChanges (bug # 49123).



2013-06-19

Swedish Wikipedia's millionth article leads to protests; WMF elections—where are all the voters?

Swedish Wikipedia reaches one million articles with a bot

With Erysichton elaborata, the Swedish Wikipedia passed the one-million-article Rubicon this week, following closely on the heels of the Spanish Wikipedia last month. While this is a mostly symbolic achievement, serving as a convenient benchmark with which to gain publicity and attention in an increasingly statistical world, the particular method by which the Swedish site has passed the mark has garnered significant attention and controversy.

The Swedish Wikipedia, alongside the Dutch and some much smaller Wikipedias, is one of the few to allow bots (semi-automated or automated programs) to mass-create articles. This method has allowed it to leap from about 968,000 articles in May to about 1,044,000 now, about 454,000 of them bot-created. It is now the fifth-largest Wikipedia, up from ninth just one month ago, and the same method has pushed the Dutch Wikipedia past the German, which had long held the title of second-largest Wikipedia. By comparison, the Polish Wikipedia, which had a similar total to the Swedish in May, is now at 973,000 articles.

The Dutch and Swedish totals come despite their far smaller userbases—for example, the Germans have an active userbase that is five times the size of the Dutch and eight times the size of the Swedish. By the same metric, the Polish are twice the size of the Swedish.

The bot-created articles themselves are basic enough: they are about four sentences long, with an infobox and sources from a common database. Each article is tagged with {{Robotskapad}}, a template that notes its origins. Erysichton elaborata, as it stood before it received attention for the milestone it represents, provides an excellent example.

The Signpost contacted the bot operator, Lsj, for his thoughts. He told us that the inspiration for bot-created articles came from the Dutch Wikipedia and a suggestion made on the Swedish equivalent of the Village Pump in early 2012. While a "handful" of editors were "adamantly opposed", the great majority were in favor. Several smaller trials, including on birds and sponges, were conducted before the large-scale project that led to the millionth article.

He told us that bot-created articles can offer significant benefits to Wikimedia communities: "human minds should not be wasted on mind-numbing tasks that a machine can do equally well. Let the machines do the grunt work, and let humans do what requires real intelligence." Bots are also better and far faster at repetitive tasks than humans, who can inadvertently introduce errors. Any bot errors, which in an ironic twist are typically kindled by human mistakes, can usually be fixed by a second bot run, similar to what Lsjbot will be doing to add images to the biological articles it has created.

The very concept of bot-created articles, though, has garnered significant opposition in the Wikimedia community as a whole, particularly from German Wikipedians. The prominent editor Achim Raschka authored a piece in the German-language news outlet Kurier. He lamented the Swedish Wikipedia's "bitter" milestone, which puts a spotlight on an article that has little more than "their existence and taxonomic pigeonholing" and omits key information like where the species lives or what it does. Raschka told the Signpost that these stub articles impart little useful information to readers—he asks, "who could be helped with [these] fragment[s] of data?" He also pointed at an entry Denis Diderot wrote for the Encyclopédie, titled "Aguaxima":


... the bot is always right, uses a neutral language, forms complete sentences, provides verifiable facts and makes no trouble, unlike us human authors. It knows ... correct formatting, rarely [vandalizes], addresses no other authors offensively, sought no barrier tests, never complains and is easily turned off without resistance. There are no bots with gender bias and of course no problems with the author leaving the site. If in any topic people are missing, there is no problem, as the programming of a few new bots by specially trained bots, perhaps with steward rights, proceeds rapidly. They are absolutely reliable even with a vote. ... We simply need to take note: Bots are better Wikipedians, our days are gone. We have only consumption, sex and drugs. But this does not have to be bad, right?

Schlesinger, "Die Zukunft heißt Botpedia," 16 June 2013.

A separate Kurier article by Schlesinger, which hyperbolically compared the bot-created articles to the famous novel Brave New World and claimed that bots can and will replace human editors, is a non sequitur. While bots can create article shells and—as can be seen on the Swedish Wikipedia—even short stubs, they can never be programmed to mass-create detailed articles capable of becoming featured or even good articles.

There was also extensive discussion on the Wikimedia-l mailing list and a Wikipedia blog post.

Lsj was unaware of the wider German-language attacks on bot-created articles, but after examining them, found that they were based mainly on deeply held principles, making it difficult or impossible to offer an effective counter-argument.

In reply to Hubertl's sarcastic mailing list post, Lsj commented that the statistics, including view counts, editor numbers, and participation, contradict Hubertl's argument.

Still, a major problem could come from human error. Lsj acknowledges that errors in source materials could creep into articles, but counters that a second bot run would fix the problem. The obvious rhetorical reply is simple: what if an error crops up only every so often and is not fixable by bots? What if these errors are not caught until a significant number of articles have been created? A small base of active users may not be able to deal with the required cleanup.

Despite the risks, carefully planned bot-created articles could hold significant benefits for the Wikimedia movement. As Lsj told the Signpost:


While German-language Wikipedians lament the loss in quality in these programmatic articles, especially when compared to their stringent biology project guidelines, a short article may be better than none at all. This advantage is particularly apparent in smaller languages, whose Foundation projects have few editors and limited sources of information on the Internet, but far less so for wikis with larger userbases and article counts. It remains to be seen if more wikis will choose to bolster their content in this way.

This article was updated with comments from Achim Raschka.

Low voter numbers in WMF elections

Voter turnout by day, showing the onset and the effects of emailed reminder notifications halfway through the election period.

With little more than a day before voting closes for the WMF elections for three community seats on the ten-member Board of Trustees, fewer than 1700 Wikimedians out of a purported 90,000 active editors have turned out to vote—about one in every 50. This compares with a vote of almost 3500 in the last elections for these two-year seats, in June 2011.

Voter proportions by language
Arabic is spoken in 27 nation states by nearly half a billion speakers; but where are the voters?

The disappointing rate of participation is despite a lengthy pre-election period and almost two weeks of voting, with banners on all WMF sites and reminder emails sent out. The graph shows the day-by-day vote until the time of publication. The typical spurt of interest followed by a rapid fall-off in numbers occurred twice: once at the open of voting on 8 June, and once a week later on 15 June, corresponding to the distribution of email notifications.

Risker, a member of the volunteer election committee, commented: "It is lower than I would have expected ... It may be that the active community of 2013 is not as interested in the 'meta' aspects of the Wikimedia movement as in the past, as we have mostly followed the same processes as existed over the past several elections. Or it could be something entirely different. It's generally much harder to figure out why people don't do things than why they do them."

Of the 1659 votes cast at the time of writing, 592 (35.7%) are from English-language sites, 221 (13.3%) German, 157 (9.5%) Italian, 153 (9.2%) French, 82 (4.9%) Spanish, 55 (3.3%) Commons, 48 (2.9%) Polish, 41 (2.5%) Chinese, and 310 (18.7%) from all other languages.
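The percentages quoted above can be re-derived from the raw counts. The following is only a minimal sketch for readers who want to check the arithmetic: the counts are those given in this article, while the variable and function names (votes, share, editors_per_voter) are illustrative assumptions, and the 90,000 figure is the "purported" active-editor count mentioned earlier.

```python
# Vote counts per language group, as quoted in the paragraph above.
votes = {
    "English": 592,
    "German": 221,
    "Italian": 157,
    "French": 153,
    "Spanish": 82,
    "Commons": 55,
    "Polish": 48,
    "Chinese": 41,
    "Other": 310,
}

# Purported number of active editors, as stated earlier in the article.
ACTIVE_EDITORS = 90_000

total = sum(votes.values())


def share(count, total):
    """Return count as a percentage of total, rounded to one decimal place."""
    return round(100 * count / total, 1)


shares = {group: share(n, total) for group, n in votes.items()}

# How many active editors there are for each voter so far.
editors_per_voter = ACTIVE_EDITORS / total

for group, pct in shares.items():
    print(f"{group:8s} {votes[group]:5d}  {pct:4.1f}%")
print(f"Total    {total:5d}")
print(f"Roughly one voter per {editors_per_voter:.0f} active editors")
```

On these figures the nine groups sum to the 1659 votes reported, and the turnout works out to roughly one voter per 54 active editors, consistent with the "about one in every 50" given above.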

Other languages on the radar are Japanese (27 voters) and Indonesian (12)—both welcome signs of the beginnings of a closer engagement with the worldwide movement—and Hebrew (10), Finnish (9), Danish (7), and Norwegian (7).

A notable disappointment is Hindi, with one voter out of some 200 million native speakers and a significant number of second-language speakers—the fourth-most-spoken language in the world—and an active and growing offline movement in the subcontinent.

Arabic, counting all dialects, has well over 400 million speakers, including 300 million native speakers, but managed to garner only four voters; this is despite a marked shift from the English and French Wikipedias to the Arabic Wikipedia in Arabic-speaking countries, and a successful start to a WMF education program in Egyptian universities.

Editors can vote until UTC 23:59 Saturday 22 June, by clicking on this link to the SecurePoll interface. Instructions on voting and information about candidates is at Meta. The close of voting corresponds to Saturday afternoon to evening in the Americas, before sunrise on Sunday morning in the Subcontinent, and early to late Sunday morning in East Asia and Australia/New Zealand.

In brief

  • Wales portrait with an odd backstory: A portrait of Jimmy Wales that was painted with a person's penis was the subject of a Commons deletion discussion, alongside a video of how the image was created. The portrait was uploaded by and possibly requested by Russavia, who was unblocked just months ago via an arbitration appeal. (The Signpost has carefully worded this due to Russavia's persistent refusal to give a definitive positive or negative answer when asked in multiple locations if he inspired the image's creation.) The discussion is leaning towards keeping both. Russavia was indefinitely blocked from the English Wikipedia last week.
  • New hires: The Wikimedia Foundation has brought four temporary community liaisons on board, including users Elitre, WhatamIdoing, The Interior, and Keegan. They have also hired a new director of analytics, Toby Negrin.
  • Privacy policy: The Foundation is asking for community input in formulating a new privacy policy on its projects. The move comes after the recent PRISM scandal in the United States, which drew a Foundation response.
  • Happy birthday: The Foundation is now ten years old.
  • South Africans want free access to Wikipedia: A Facebook campaign to allow free access on cellphones in South Africa so students can do their homework has inspired a WMF blog post. Related coverage is in this week's "In the media". The students state:



2013-06-19

Two responses to "The Tragedy of Wikipedia's Commons"

Following last week's op-ed by Gigs ("The Tragedy of Wikipedia's Commons"), the Signpost is carrying two contrary opinions from MichaelMaggs, a bureaucrat on Wikimedia Commons, and Mattbuck, a British Commons administrator.

MichaelMaggs

The true tragedy

The title of last week's piece, "The Tragedy of Wikipedia's Commons", was perhaps rather more ironic than its author intended. One of the truly great tragedies of medieval England was not so much the tragedy of the commons in its original sense but the forcible enclosure by powerful outside interests of the historic common land that had for centuries been available as a free resource for all. If there is any tragedy here, it is in the author's wish to use Wikipedia to take over Wikimedia Commons and to do very much the same thing online.

Background and remit

Commons always has had and always will have a far broader free-content remit than that of supporting the narrow focus of an encyclopaedia. Commons provides media files in support not just of the English Wikipedia but of all the WMF projects, including Wikisource, Wikibooks, Wikivoyage and many more. These sister projects of Wikipedia often have a need to use media on Commons that could never be used on the Wikipedias as they are not - in Wikipedia's narrow sense - "encyclopaedic". Some of Commons' detractors like to give the impression that its collections are nothing more than a dumping ground for random non-educational content. Nothing could be further from the truth, and the energy expended by those who would criticise from the outside (but who are strangely reluctant to engage on wiki) bears little relation to the extremely small proportion of images that could in any way be considered contentious.

Commons' policies are of necessity different from, and more wide-ranging than, those of any of the individual projects. We hold many images that will never be useful to the English Wikipedia, and that is not only OK, but should be welcomed as Commons' contribution to the overall mission of the Wikimedia Foundation, "to empower and engage people around the world to collect and develop educational content under a free license or in the public domain, and to disseminate it effectively and globally". Note that the overall mission of the WMF is not "to write an encyclopaedia", but rather to develop and disseminate educational content. Supporting the English Wikipedia is one way, but by no means the only way, in which we do that, and the idea that Commons should be forcibly subjugated to the policies of a specialist encyclopaedia project would do immeasurable harm to the mission which I had hoped we were all working to support.

Contrary to the suggestion that the Commons policy on scope of 2008 was an "unchallenged action by a tiny group of people", it was in fact largely an exercise in documenting for the first time the unwritten long-established practices of the community. The policy attracted very little controversy (despite it being very widely advertised, on Wikipedia and elsewhere) largely because the vast majority of it was uncontentious. Indeed, the fact that it has retained very wide community support since then indicates that we didn't do too bad a job.

With its specialised emphasis on media curation and the niceties of copyright law, Commons will never be as popular a place for editors to hang out as some of the bigger encyclopaedias. It requires not only a particular set of interests, but also, at least for admins, some level of specialist knowledge that not everyone has or is interested in acquiring. Those outside the local community who only see the external carping may not realise that we have thousands of very committed editors who work tirelessly in the background curating and categorising content and bringing to the attention of the admins non-educational content that has no place in our collections.

Commons has never (as was claimed last week) been merely a repository that supports its sister WMF projects. Right from the start it had a remit to make content freely available to external re-users. As early as 2006 there was a formal proposal (since implemented as InstantCommons) to integrate into MediaWiki a mechanism specifically designed to support users on non-WMF projects. Perhaps the real worry of last week's author was that Commons currently holds too many non-encyclopedic images of a sexual nature. But even assuming that is true, a proposal to revoke one of the fundamental free content aims of Commons hardly seems proportionate. Instead, let's have a proper discussion on what Commons' scope should be. Times change, as do priorities, and what made sense five years ago may now perhaps need to be revisited.

Over the last few months especially there has been a lot of discussion within Commons as well as outside about issues concerning the small proportion of our holdings that relate to sexual imagery and to privacy/the rights of the subject. Both have complex moral and legal dimensions, and neither has yet been fully resolved. I've set out the main strands of argument below, as objectively as I can, for those who may not be familiar with them. Of course, these summaries are by no means the whole story, and many of the discussions are far more subtle than I have space for here, so please bear with me if you are familiar with this and feel I have mis-characterised or omitted any important point that may be close to your own heart. I deliberately make no comment on the validity of any of these arguments.

Sexual imagery

Some argue that pornographic images (as defined in some way) are never appropriate for any of the Wikimedia projects and are simply not educational.

Others argue that we should keep most images, almost whatever the subject matter, as we need to show the whole range of human experience if we are to call ourselves a comprehensive educational resource. Anything else would be censorship.

Yet others suggest that not all the sexual images held by Commons are "educational", properly defined. Some are photographs that have been taken for non-educational purposes, for example personal gratification/entertainment, and/or have been uploaded for the same purpose or by users who wish to push an extreme view that equates any limits at all with unacceptable "censorship".

Finally, some hold that Commons has too many images in certain marginally-educational areas that, taken overall, create an oppressive or threatening environment (e.g. for women) which may be harming the project as a whole.

Privacy and the rights of the subject

One strand of argument is that we should do more to respect the rights of individuals who are identifiable in a photograph, and recognise that, even where the image may be legal, it can be highly damaging to the individual. Even when an outsider might naively think the image unremarkable, it may still be considered threatening, harassing or oppressive by its subject.

Another strand is that allowing the subject of a photograph a say on whether it should stay on Commons or not opens the door to all sorts of censorship. Proponents argue it's essential that we are able to collect all types of educational image, including those that may offend the subject.

Review

If there is indeed a problem with the boundaries of Commons' scope - perceived or otherwise - we should tackle it head-on with open community discussion. Commons should be, and I believe is, receptive to the views of everyone within the Wikimedia community in reviewing its curatorial policies. But the way to get things changed is to engage rather than to criticise from afar.

A comprehensive review of Commons' scope is just starting now, and you need never say again that your voice cannot be heard. Please talk.

Please visit Commons' Review of Scope pages now, and make your views known for the sake of all the Wiki communities.

Conclusion

Commons has proved to be a phenomenal success in the years since its introduction, and we should be proud of what has been achieved. We should keep it, improve it, and celebrate it.



Mattbuck

Last week, the Signpost published a rather scathing op-ed about Wikimedia Commons, the Wikimedia project which seeks to be a resource of free, educational media. Perhaps you feel it presented a valid argument, perhaps not; that's for you to make up your mind on. I would like to take this chance to offer a defence of Commons.

As you probably know, Wikimedia Commons acts as a central repository for images. Once an image is on Commons, any project can use it, exactly the same way they can use their own images. It's an incredibly valuable tool for the Wikimedia project as a whole, as it prevents duplication and provides a central place to search. You want an image of something for your Wikipedia article? Commons probably has a category for it. And that is the same whether you're editing in English, German, Arabic or even Tagalog.

I first joined Commons back in October 2007, when I was working on an eclectic mix of the Ffestiniog Railway and McFly. About six months later I became a Flickrreviewr, checking uploads from Flickr that for some reason couldn't be checked by a bot, and a month or so after that I became an admin, primarily so I could deal with all the copyright violations I came across with the Flickr work. In the five years since, my interest in admin duties has waxed and waned, and I have had little side-projects, but Commons swiftly became my home-wiki. My watchlist has some 60,000 pages on it, of which 10,000 are my own photos.

Commons has its problems, I cannot deny that. The number of people who believe that because they found a photo on Google it can be uploaded to Commons is simply staggering. The search engine is designed for pages not images (a limitation of the software). The community can be a bit fractured, it can be very difficult to get people blocked for being terminally incapable of working with others (even when their name comes back to the admin noticeboards week after week after week), and we have remarkably little in the way of actual policy. Indeed our main guiding principles boil down to two pages: Commons:Licensing and Commons:Project Scope. The former tells us what files we're allowed, the latter which we want. Scope is the real issue of the moment, and in a nutshell it says that Commons collects educational media. Which brings the question, "what is educational?"

A similar problem has existed on Wikipedia for years - what is notable? There are even factions - deletionists, who think articles must prove their notability, and inclusionists, who think that there's no harm in letting potentially non-notable articles stay. And so it is on Commons - those who adhere to a strict definition of educational, and those who accept a somewhat looser guide.

And this dispute would be fine, if it were argued on Commons and in the abstract. But that is not what happens. The major rift happened a few years ago, when, apparently due to a disparaging Fox News article about the amount of "porn" on Wikipedia, Jimbo Wales, co-founder of Wikimedia, came onto Commons and started deleting sexuality images. That didn't really go over well with the Commons community, of which Jimbo has never been a part, especially when it was found he was deleting images which were in use on multiple projects. To cut a long story short, the deleted images were restored and Jimbo lost admin rights at Commons, as did several admins who had joined him in his purge. Many of the images Jimbo deleted were in fact subsequently deleted again, following deletion requests to allow for Community input. But the deed had been done, and for a large proportion of the Commons community, it appeared that Jimbo was not to be trusted to have the best interests of the project at heart.

The issue stewed for a few years, and reemerged with a vengeance last year. Again, it has been fought almost entirely over what some describe, disparagingly, as "porn". As I mentioned earlier, the Commons search engine is not really designed for images, and so it tends to give unexpected results. One of those results was the search "toothbrush" returning a picture of a woman using an electric toothbrush for self-pleasure as one of the top results. This was entirely a legitimate result - it was a picture of a toothbrush, and it was titled as such. And while the so-called "principle of least astonishment" can easily be applied to categories - Commons has a whole proliferation of "nude or semi-nude people with X" categories on the grounds that nudity should not appear in the parent category "X" - it doesn't really work for a search algorithm, not if you want to continue with correct categorisation. Until the Wikimedia Foundation develops some form of search content filter (which itself brings up issues of what exactly should be filtered - should images of Muhammed be filtered out? What about Nazi images due to German law?) all that can really be done is to either delete the image or rename it to try and reduce the chances of an innocuous search returning it. I personally favour keeping the images, and this has led me to be named as part of a "porn cabal" by people, most of whom rarely if ever edit on Commons, who favour deleting the images.

But the issue, for me, is that these issues so rarely get brought up on Commons. Instead of using the deletion request system to highlight potentially problematic images (which is after all what the process is for), the detractors would rather just soapbox on Wikipedia - usually on Jimbo's talk page - about how awful Commons is, and how this latest penis photo proves once and for all that I (or some other member of the "porn cabal") am the worst admin in the history of forever and deserve to be shot out of a cannon into a pit of ravenous crocodiles. What people don't seem to understand is that in large part, I do agree. Commons has problems. We do have too many low quality penis pictures - so many that we even have a policy on it - and so I have a bot which searches new uploads for nudity categories and creates a gallery so I can see any problematic ones, and thus nominate them for deletion. This somehow seems to make me an even worse admin in many people's eyes. We should indeed have better checks to ensure that people in sexual pictures consented to having their pictures uploaded, and I would like to see a proper policy on this. I'd like to see the community as a whole have a reasoned discussion on the matter, for a policy to be drafted, amended, voted on and finally adopted. But that is very difficult when you feel you are under attack all the time, where your attackers are not willing to actually work with you to create a better project.

Wikimedia projects are based around collaboration and discussion within the community. I would urge those of you who feel that Commons is "broken" to come to Commons and offer constructive advice. Attacking long-term Commons users will get you nowhere, nor will pasting links on other projects, or on Jimbo's talk page. If you truly want to make Commons a better place, and are not in fact just looking for any reason to tear it down, then come to Commons. Come to the village pump - tell us what is wrong, and how you feel we could do better. Use the systems we have in place for project discussions to discuss the project. Sitting back and sniping from afar does nothing for your cause, and it only embitters the Commons community.

Come and talk to us.

2013-06-19

The Farmbrough amendment request—automation and arbitration enforcement

Richard Farmbrough

Editor's note: the "Arbitration report" invited Richard Farmbrough to comment on his recent request to the arbitration committee. In an effort to represent all sides of the issue, we also asked arbitrators T. Canens and Carcharoth if they would take the time to answer some questions about the case, since they both commented on the initial request. Carcharoth declined, but T. Canens agreed to talk to us from his own perspective.

Richard Farmbrough was set to have his day in court, but as events transpired, this was not to be so. On 25 March 2013, an accusation was made against Farmbrough at Arbitration Enforcement (AE), claiming that he violated the terms of an automated edit restriction. Within hours, Farmbrough had filed his own request with the arbitration committee, citing the newly filed AE request and claiming that the motion was being used "in an absurd way" in the filing of enforcement requests: "I have not made any edits that a sane person would consider automation."

The AE arm of the arbitration committee blocked Farmbrough for one year, after receiving a go-ahead from arbitrator T. Canens and without waiting for input from either Farmbrough or the community. The committee, noting that Farmbrough was blocked, then declined to consider Farmbrough's request.

Meet Richard Farmbrough

Richard Farmbrough is something of an icon in the Wikipedia saga. In 2007, Smith Magazine interviewed him as one of the most prolific editors on Wikipedia. In 2011, he was cited by R. Stuart Geiger in "The Lives of Bots" as the creator of the {{nobots}} opt-out template and an advocate of the "bots are better behaved than people" philosophy of bot development. Farmbrough is also credited with coining the word "botophobia", to make the point that bot policy needs to be as responsive to public perceptions as to technical considerations. Farmbrough described himself to the Signpost as "a reader and sometime editor and administrator of the English Wikipedia ... [I've] contributed to and started many articles, worked on policy, edited templates, created and organised categories, participated in discussions, helped new users, run database extraction, created file lists and reports for Wikipedians, done anti-vandal work, and was a host at Tea-house. I also wrote and ran bots."

Genesis

SmackBot: the earliest incarnation of Farmbrough's first bot, Helpful Pixie Bot
Farmbrough's first bot was SmackBot, later renamed Helpful Pixie Bot "to be more welcoming". Helpful Pixie Bot worked mainly on article space, using mostly the AWB (AutoWikiBrowser) program for general clean-up, dating maintenance tags, checking and formatting ISBN numbers, and other tasks that are listed on its user page; it also ran tasks requested by individual editors or projects. Femto Bot was created later, and did more "meta" tasks, such as archiving and maintaining page lists for WikiProjects.

All of the bots' tasks were approved by BAG, the Bot Approvals Group, "although in the less restrictive environment of 2007 a more liberal approach was taken to 'obviously' good extensions of existing tasks than was later the case." Before being submitted to BAG's testing regime, bot tasks underwent a significant amount of manual testing. In one typical case, Farmbrough manually checked and saved more than 3000 edits over the course of six or seven weeks.

None of Farmbrough's bots are currently running. Some of the code and data from his bots is used in other bots, such as AnomieBot and AWB-based bots. AnomieBot has taken over some of Helpful Pixie Bot's dating tasks, but the other general fixes are not being performed.

Dwarves vs gnomes?

So what went wrong? "In September 2010 I made some changes to the general clean-up, there was some opposition and I agreed to revert the changes ... However, an avalanche had been unleashed, and the matter was escalated to ANI. Subsequently I removed all custom general fixes, and rewrote the entire bot in perl, since AWB at that time could not meet the exacting standards that were being demanded. ... One would think that having agreed to do everything asked, and even gone beyond it, the matter would have rested there; but a series of ANI and ARB filings ensued, some rejected out of hand, others gaining traction until by mid-2012 it had become impossible to edit."

As one observer put it, "What we are seeing here is 'The War of the Dwarves and the Gnomes'. Dwarves are editors who work mainly on content, and typically put a lot of thought into each edit; gnomes are editors who work mainly on form, and tend to make large numbers of edits doing things like changing a - to a –. Richard is a Supergnome, and the comparatively small fraction of errors generated by his huge volume of automated edits ended up costing the dwarves who maintain articles an enormous amount of time. Eventually, after repeated failed attempts to rein him in, the outraged dwarves banded together to ban him."

An automation restriction

The outcome of the 2012 Rich Farmbrough arbitration case, along with its subsequent motions, was not at all in his favor. It contained the wording of the automation restriction that has become so controversial: "Rich Farmbrough is indefinitely prohibited from using any automation whatsoever on Wikipedia. For the purposes of this remedy, any edits that reasonably appear to be automated shall be assumed to be so." A later "amendment by motion" stated "Rich Farmbrough is directed ... to make only completely manual edits (i.e. by selecting the [EDIT] button and typing changes into the editing window)".

Is typing four tildes "automation"?
What, exactly, are "automated edits"?
So did Farmbrough break his automation ban? And what exactly are "automated edits"? Opinion was divided over whether automation had been used. Some said there was no compelling reason to believe the edits were likely automated. Others speculated that the edits might have been done with the "search and replace" function in the edit window toolbar, and therefore not prohibited under the restriction. Still others said the edits could be completely manual. (Farmbrough told the Signpost that it was "a manual error incidentally" that gave rise to the AE posting.)

The Arbitration Enforcement administrator, however, stated that "it appears very improbable that this sort of repetitive change was made without some sort of automation, if only the copy/paste or search/replace functions (which are forbidden under the terms of the decision, which prohibits 'any automation whatsoever')", and defined "find and replace" as automation because "it produces the effect of many keystrokes with one or few keystrokes". If "search and replace" is automation, replied the commenters, then so is "copy and paste" or signing posts with four tildes. Farmbrough pointed out that caps-lock also fits the definition of producing the effect of many keystrokes with one keystroke.

Defining automation

What interpretation of "automated edits" is reasonable? We asked Farmbrough if some automated edits are potentially damaging and others not:


Chilling effect on bot operators?

It has been suggested that this will have a chilling effect on other bot operators, that they will be afraid of making mistakes and getting banned. Says one talk page commenter, "A lot of bot ops and potential botops think twice before starting a bot. I have talked with several editors who want too but are afraid if they make mistakes that the zero defect mentality will get them banned."

Arbitrator T. Canens responded:


Does it matter if edits are beneficial?

We did not think to ask whether sub-optimal edits are beneficial, as long as they move the project forward, but both Farmbrough and T. Canens identified this as an issue.

Said T. Canens, "It is very clear to me that the committee in both the initial sanction and the subsequent motion intended to ban all forms of automated editing whatsoever from Rich, regardless of whether any particular automated edit is beneficial. In general, this happens when the Committee determines that 1) the disruption caused by the totality of the automated editing outweighs the benefits of said editing and 2) there is no less restrictive sanction that is both workable and capable of preventing further disruption. In this case, for instance, given the high volume of Rich's automated edits, a remedy that only prohibits him from making problematic edits would be impractical."

Farmbrough stated, "What we should be concerned about is the encyclopedic project, is something someone is doing damaging or benefiting the project? If it is damaging we should look at steps to address that, if it is benefiting we should look at ways to improve it further."

Procedural issues about arbitration and enforcement

The Arbitration Enforcement request against Farmbrough was initiated at 10:29, 25 March 2013, and closed less than 13 hours later, at 23:04, with only the accuser and the AE administrator participating. After a request to leave the case open a little bit longer for discussion was declined, discussion continued on Sandstein's and Rich Farmbrough's talk pages.

Farmbrough's block at AE

T. Canens' statement at Farmbrough's Arbcom request that "I think the AE request can proceed as usual", and Richard's subsequent block, received comments at various talk pages ranging from "[it is] somewhat strange that T. Canens should encourage blocking of an editor who has made an appeal to ArbCom" to "the comments from arbitrators seem to say 'block him, we're not going to change the sanction' (T. Canens) and 'we're not going to change the sanction because he's blocked' (Carcharoth and Risker)."

"I was amazed that one arb suggesting Sandstein go ahead was considered authority to do so," Farmbrough told the Signpost. "Even more at the circular argument 'Rich is blocked so the request to remove the provision he was blocked under is moot'".

We asked arbitrator T. Canens why he had Farmbrough blocked while his Arbcom request was still open.


Autonomy of Arbitration Enforcement administrators

There was also some disagreement over the intentions of the arbitration committee with regard to automation and the role of AE.

According to one interpretation of the Farmbrough arbitration case, "it isn't the automated editing itself that is harmful/disruptive, and if there is no harm being done here then the 1 year block does not prevent any problems. So in that sense it is neither punitive nor preventative!" and "the Enforcement By block section says 'may be blocked...' which I can't read any other way than to imply that some discretion is given to administrators to not block or to block for a shorter period when, for example, the infraction was so exceedingly minor or when there is no or very little disruption."

According to another view, "the underlying decision of the Arbitration Committee to consider all automated editing of whatever nature by Rich Farmbrough to be harmful, and to ban all such editing. ... Because Arbitration Committee decisions are binding, AE admins in particular have no authority to question the Committee's decisions; they must limit themselves to executing the decisions."

We asked T. Canens if, under these circumstances, "the arbitration committee needs to clarify their intentions about automation and mass editing". T. Canens replied:


Is there a way forward?

"I just want to get back to editing," says Farmbrough. "Wikipedians do not edit for thanks and barnstars, though they are both nice to receive. It is however a big disincentive to edit, and part of the hostile environment, when there's a constant (and I do mean constant) threat hanging over every editor's head that they're going to have to spend days and weeks fighting off ANI threads and Arbcom cases every time they do something that someone doesn't like."

Given the absence of any other formal mechanism for dealing with automation disputes, that may be exactly what will happen once the block is over.
