The Signpost

From the editor

The ball is in your court

Last month Status Labs, a commercial paid editing company that has been banned on Wikipedia since 2013, was the main topic of this column. You should expect it to be mentioned here for a long time to come.

Following the column, a request for comment (RfC) was held. Over the course of four days, participants supported the proposal that “this RFC asks the Wikimedia Foundation to enforce the Terms of Use against Status Labs violations…” by a count of 100 to 2.

Ordinarily, such a lopsided vote would have prompted a snow close after the first day. But this is an important question: how do we enforce our terms of use against a company that absolutely refuses to recognize the authority of the community, or of the WMF, to enforce our rules? We need to take our time and consider how best to do this. WMF legal needs time to consider the best legal strategy. The Board of Trustees needs to sign off on any legal action. The WMF has been informed of the RfC both by Trustee James Heilman and by me. They’ve promised to inform The Signpost promptly when there is an official announcement.

But the question is not "should we take action?" It is "how do we best take action?" Not taking action threatens the encyclopedia's very existence. If we allow a paid editing company to solicit rich customers to place anything they’d like into Wikipedia and we don’t enforce our rules against paid editing, then we no longer have any rules. Any rich person could put just about anything into Wikipedia and there’s little we could do about it. Wikipedia would no longer be an encyclopedia; rather, it would be an advertising platform for rich people.

So what can we do while we’re waiting for the inevitably slow legal process to work? Here are a few suggestions.

WMF's role

A persistent problem seen at the conflict of interest noticeboard (COIN) is the number of companies which don’t realize that we have rules against advertising and undeclared paid editing (UPE). It’s in everybody’s interest to let them know. The Signpost can only do so much in publicizing Wikipedia's rules. It would be much better if the WMF actively took every opportunity to let companies know via press releases, speeches, and interviews. The WMF knows how to publicize its projects. Please make letting companies know about our rule against UPE a top priority.

The WMF should inform the community via COIN when they have reason to suspect UPE. There is little or no reason to keep this information secret. Editors can then check out the suspicions and come to their own conclusions.

A few cases might be kept under wraps while the WMF learns more about the problem. When they find a company that is clearly breaking our terms of use but wasn't aware of our rules, it could be useful to talk to them informally. Why do they try to advertise on Wikipedia? Is it just the low cost? Or is it the placement on Google search results? Perhaps they were solicited by a known paid editor? What commercial paid editing firm do they use? Finding out the specific reasons for paid editing may help design a program to discourage other advertisers.

Much of the WMF's proposed response to UPE involves developing software that would help identify these editors from the articles they write. First things first, however: there are some simple fixes that might work quickly and inexpensively. Please check with the admins who work in this area. MER-C, for example, suggests improving the CAPTCHA function used at account registration to weed out spam-bots.

Artificial intelligence should be able to help identify UPEs, or at least editors who write like them. A good sample of editors identifying spammy articles is available from Deletion sorting/Companies going back to 2015.

Another method of identifying UPEs is to look at their known characteristics:

  • Almost all advertisers on enWiki use a particular form of the English language that anybody, even a 10-year-old, can identify. We all know when somebody is trying to sell us something, but for whatever reason, advertisers cannot avoid using that lingo.
  • Adverts all point to the company or product that they are advertising. Just follow the links.
  • Advertisers on Wikipedia almost always have many references in the articles they write. Multiple bad references. Get a list of the 1000 most common references in Wikipedia and calculate the percentage of the article's references that don’t appear on that list of good references.
  • Other common characteristics include having a product list in the article, emphasizing the founder's or CEO's genius in running the company, or even having articles on both the company and the CEO submitted at the same time.
  • Identifying PR has a very long list of these characteristics.
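
The reference check in the list above could be sketched roughly as follows. This is only an illustration under stated assumptions: the set COMMON_SOURCE_DOMAINS is a made-up stand-in for a real list of the 1000 most common references, the function names are hypothetical, and matching references by exact website domain is one possible interpretation of "matching", not an established method.

```python
from urllib.parse import urlparse

# Hypothetical stand-in for "the 1000 most common references in Wikipedia".
# A real list would have to be compiled from Wikipedia's own citation data.
COMMON_SOURCE_DOMAINS = {"nytimes.com", "bbc.co.uk", "theguardian.com", "reuters.com"}


def source_domain(url: str) -> str:
    """Return the lowercased host of a URL, without a leading 'www.'."""
    host = (urlparse(url).hostname or "").lower()
    return host[4:] if host.startswith("www.") else host


def uncommon_reference_share(reference_urls, common_domains=COMMON_SOURCE_DOMAINS) -> float:
    """Percentage of references whose domain is not on the common-source list."""
    if not reference_urls:
        return 0.0
    unmatched = sum(
        1 for url in reference_urls if source_domain(url) not in common_domains
    )
    return 100.0 * unmatched / len(reference_urls)


# Invented example: two well-known news sources and two promotional-looking links.
example_refs = [
    "https://www.nytimes.com/2020/01/01/business/example.html",
    "https://prnewswire.example/releases/acme-launch",
    "https://acme-widgets.example/about",
    "https://www.reuters.com/article/example",
]
print(uncommon_reference_share(example_refs))
```

A high percentage would only be a hint that an article deserves a closer look, not proof of UPE. The sketch compares exact domains, so a subdomain of a listed site would count as unmatched unless the list also includes it.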

Community's role

The community and its administrators should realize that we have most of the rules needed to enforce our paid editing policy. If you see advertising or spam you can remove it. You can report any suspected UPE at the conflict of interest noticeboard. You can nominate an article for deletion at WP:AfD. All of this takes time, so there are a few rules we should change to streamline the process of eliminating UPE.

Changing policies

There are many tweaks that we could make to policies and guidelines to streamline the process of showing UPEs off the premises. Go ahead and make proposals to tweak these policies, but let's concentrate on one big change. Our paid editing disclosure policy is an especially strong policy in a few key areas. Every paid editor must disclose their paid status. No ifs, ands, or buts about it. This is the policy of both the WMF and enWiki, and it is not easily changed. Any major change in this policy must undergo an RfC that is equivalent to that of establishing a new core policy. Let's keep it and build on it, adding a new policy on top of it.

The paid editors we are most concerned with are the commercial firms like Status Labs. Let's have an additional policy for commercial editing firms, those that edit Wikipedia as part of a commercial transaction. Wikipedians love to precisely define who is and who isn't covered by a policy. We can do that for commercial firms without changing our paid disclosure policy. We can require that any editor who works with them declare their commercial editing status in one place - on their user page. Thus we can keep track of them much better than paid editors who are allowed to switch between three choices of placing their declaration. We can prohibit commercial editors from working with firms that do not publicly declare in their advertisements that they will follow all our rules against ads, PR, promotion, spam, and UPEs in Wikipedia. We can maintain a blacklist of the commercial editing firms that do not follow our rules and link to them on their websites. We can establish a standard procedure to investigate especially blatant commercial editing and report the results and recommendations to the WMF for further action.

There are many prohibitions we could add to a new commercial editing policy, but let's keep it simple:

  • Nobody may edit Wikipedia who accepts money from, or works in association with, a commercial firm that does not require its employees, contractors, and associates to follow all our rules about advertising, other promotion, and UPE.
  • The commercial firm must let all its potential customers, employees, and affiliates know that strict compliance with our rules is required. The public should be informed via a highly visible link on the firm's website.
  • The firm must cooperate with investigations by the Wikipedia community of UPE abuses.

The future

The Signpost will continue to cover the Status Labs story in detail. We will cover major new paid editing scandals as they appear. Typically there are 3 or 4 each year. Our role is to cover the news and offer our analysis of it, so we don't plan to offer new policy proposals on any regular basis. Members of the community on all sides of the issue are encouraged to submit their opinions for publication and debate.

Discuss this story

Four comments:

  1. The WMF needs to start crawling before they run. They need to fix the entirety of the admin tool package before I consider supporting any attempt by them to deploy any form of machine learning to tackle the problem.
  2. The technical bar to create a new article for spammed subjects should increase a little. The moratorium proposed last month is too extreme and counterproductive. Something like 20-50 edits should do.
  3. The remaining suggestions are sensible, though I very much prefer a total ban altogether.
  4. Other suggestions include increasing sourcing requirements for determining the notability of BLPs to a similar standard to WP:CORP and increasing specific biographical notability guidelines (the various sports notability guidelines are probably the worst - sports players commonly become businesspeople after retirement from sports and thus join the UPE target market).

As I pointed out in November, UPE is an intractable problem because a $10k spend on Wikipedia spamming buys nearly a year's worth of English-speaking third world labour, which is extremely cheap and plentiful. The conclusion that we need to streamline as much as possible the removal of UPE is correct. MER-C 18:57, 1 March 2020 (UTC)

That's a good point #4 I hadn't considered before, especially if you include endorsements in ex-sportspeople's business interests. Just one quickly searched example: Alejandro Villanueva (American football)#Endorsements. Obviously a notable BLP, but is the endorsement encyclopedic? ☆ Bri (talk) 19:04, 1 March 2020 (UTC)
I'd regard that as trivia. MER-C 19:14, 1 March 2020 (UTC)
I removed the section in the Villanueva article. The sole source was military.com, which fails WP:RS. -- John Broughton (♫♫) 19:20, 1 March 2020 (UTC)

One more thing: spam grows exponentially. An attention-seeking entity sees spammy articles about similar attention-seeking entities and decides they want their own. The rate at which spam gets added to Wikipedia is proportional to the amount of spam already there. MER-C 19:18, 1 March 2020 (UTC)

  • If a partial informal approach is being considered, one option is notifying companies that using certain paid-PR companies will automatically bar them from even otherwise legitimate paid editing. While I'm reluctant to give a positive list of potential paid editors, having a firm, very public blacklist might help provide some economic incentive. Nosebagbear (talk) 20:20, 1 March 2020 (UTC)
    • I had mixed feelings when I wrote that the WMF should give out the usernames of all suspected UPE editors. I guess it comes down to "what degree of suspicion", but I'd like to make sure that if another Status Labs situation comes along the entire community is informed about it. In general the position should be to disclose to the community. OTOH, I wrote in the next paragraph that the WMF should investigate some of the claims of UPE in an informal manner - just to get information - they should do this as well - finding out what is going on is hugely important. Hopefully they can find a way to do both that is not contradictory. I would never suggest providing a white-list. I've seen 3 paid editors who credibly claimed to follow our rules, but after watching for a while I'm sure I would never recommend 2+ of them. OK, 1 is sorta ok, but I wouldn't want it on my conscience if I made a mistake!
    • We've already got something of a blacklist going; @Bri: should have the link. Certainly we should maintain and formalize a blacklist. Smallbones(smalltalk) 21:10, 1 March 2020 (UTC)
  • MER-C is dead on about the "exponential" bit, and I suspect a lot of that is about ignorance rather than malice. I have more than once had a spammer come to my talk page protesting a G11 deletion, telling me "But X, Y, and Z look like what I wrote, and I based it on that!". In more than one case, that's led to nominating X, Y, and Z for deletion too (generally successfully), because the spammer was right: Those were PR puff pieces too, and still got into the encyclopedia. I can't even bring myself to fault the paid editor too much in that case; they looked at similar stuff and figured that must be the acceptable way to do things. We've got to get better at ensuring we stick to encyclopedic subjects, and strictly enforce the requirements for sourcing. Seraphimblade Talk to me 21:24, 1 March 2020 (UTC)
  • The saddest thing is that a lot, and I mean a lot, of users are actually indifferent to paid editing - there are diffs, but I'm not going to dig them out now. Paid editing is an area for which there is no way of obtaining any metrics. It's probably enormous and it's probably even happening among the 300+ WMF employees - because it's happened before. The scale of it is so large that an RfC to ban it outright would probably fail. Somewhat indirectly, and perhaps ironically, it also contributed to yesterday's Arbcom decision to desysop another admin. Kudpung กุดผึ้ง (talk) 00:48, 2 March 2020 (UTC)
  • An easy way to help: please comment on discussions at Wikipedia:WikiProject Deletion sorting/Companies. --Piotr Konieczny aka Prokonsul Piotrus| reply here 03:52, 2 March 2020 (UTC)
  • These stories always tend to miss the forest for the trees. One reason our coverage of companies is so bad and so easy to be exploited by spammers is because editors who do quality work frequently can't be bothered with the extreme risk of nonsense deletion: it's rare that there's a household-name major corporation that someone hasn't tried to whack at least once. And so the neverending attacks on legitimate business content that these discussions inevitably wind up inciting (as the above comments demonstrate) make the problem worse by driving anyone not doing NPP from the whole area (editors who could be doing an efficient job of watching for paid editing, much as editors in other areas of interests keep crap out of their own areas). The Drover's Wife (talk) 09:37, 2 March 2020 (UTC)
    • If your point is that there are too many people spending too much time deleting company articles, I have to disagree. As soon as we let up on this, or say something like "we're going to be kinder and gentler on business articles", the flood of corp spam will become a deluge. There are at least a million new businesses in the US each year. They'd all love to get free advertising; most of them will get a write-up in a local newspaper if they want to go to the bother. 90% will be defunct in 5-10 years, and we won't even be able to get a news story in confirmation of their non-existence. There's a reason that routine coverage doesn't count for notability. There's a reason that companies will ignore that.
    • And this isn't about big, obviously notable (BON) companies. I figure there are about 60,000 BONs in the world max. That would include all the actively traded stocks in the US as approximated by the Wilshire 5000, which now has fewer than 4,000 companies in the index. Add in similar-sized private companies (Cargill, Koch Industries, etc.), government-sponsored companies (Fannie Mae, etc.), some other financial businesses, accounting firms, etc. that are (or were) usually partnerships, and that might be 10,000 BONs in the US. Adding in all the BONs in the UK, EU, Japan, China, and India wouldn't multiply the US number by 6. Adding in the top 10 companies in each country of the world only gets another 2,000. So 60,000 companies in the world that we should be able to get good info on. But when I look for these big companies, I'll estimate that half of them are missing here. Why? A lot of them aren't consumer businesses that are looking for advertising. How many businesses have Wikipedia articles? I'll estimate 4.8% of 6,020,000, which rounds up to 290,000. So something over 80% of Wikipedia's business articles are about small, hard to find info on, usually consumer businesses looking for free ads. That's a fairly quick, broad brush approach. There will be lots of exceptions, but it gives you an idea of what we're up against. Smallbones(smalltalk) 19:09, 2 March 2020 (UTC)
      • @Smallbones: - I know what we're up against: I'm an AfC reviewer who, when I do it, is as reject-happy with corporate crap as everyone else. The problem is that, if there's (as you say, and a reasonable estimate) 60,000 notable corporations in the world, we've got a culture that means people are likely to try to whack about 59,000 of those at some stage. Happens all the time. This is a huge disincentive to anyone apart from NPP people being active in the business space. That's why most of those notable companies don't have articles. It's also a huge barrier towards having the number of editors doing quality control that you'd get in other spaces. We need to find a way of stopping people trying to throw the baby out with the bathwater because it's hugely detrimental to our coverage of the area (both in terms of building good content and pruning crap). The Drover's Wife (talk) 23:39, 2 March 2020 (UTC)
  • WP:AfD is one of the best processes for getting rid of spam. Bearian (talk) 15:00, 2 March 2020 (UTC)
  • I dissented last month from the proposal to institute a moratorium on new articles about businesses. This article is much more positive and pragmatic. There are several excellent proposals here, and I commend Smallbones for these ideas. Cullen328 Let's discuss it 08:34, 3 March 2020 (UTC)
  • Don't get your hopes up about the legal system fixing things. Law enforcement doesn't care about this kind of stuff. A civil suit would likely be tough because to get damages you have to show the court that some party was harmed and how. I'm not sure if an injunction would be easy to get or would accomplish much. Given that Status Labs has some well-heeled clients there's also the non-zero risk of someone wealthy getting upset about all this and deciding to try to bleed the WMF dry through legal fees. Fees easily can climb into many millions for any case that drags on. --47.146.63.87 (talk) 06:30, 5 March 2020 (UTC)

Wikipedia:Wikipedia Signpost/2020-03-01/From_the_editor