Google isn't responsible for Wikipedia's mistakes

From the archives

Zarasophos published this opinion piece in the June 2018 edition of The Signpost. While many of his opinions relate to the current proposals for an Enterprise API, please remember that this article predates those proposals. The views expressed in this article are the author's alone and do not necessarily reflect the opinions of The Signpost or its staff.

My work, Google's traffic

If you type "Rizaeddin bin Fakhreddin" into Google, Google will give you a list of links and a small box to the right. The first link will probably be to the English Wikipedia article on bin Fakhreddin, created and written by me; this can easily be checked by going into the page history of the article. But most likely you'll never bother to actually click on the article because of that small box to the right. "Rizaeddin bin Fakhreddin was a Tatar scholar and publicist that lived in the Russian Empire and the Soviet Union", it reads.

I typed that sentence. I also put the birth and death dates onto Wikipedia. I uploaded the picture to Wikimedia Commons and put it into the article – or articles, actually, because I also created the article on the German Wikipedia. But now I find this information directly on Google. There is a link to the Wikipedia article, but that may as well be a result of Father Google's omniscient mercy. Nowhere does the box state that it presents the work of an unpaid volunteer next to Google advertisements. The effect is obvious: In a 2017 study, half of the participants attributed what they found in the Knowledge Graph, which is the name of that small box, not to Wikipedia, but to Google.

Only good enough to blame

The Knowledge Graph has recently been in the news for saying that California Republicans are Nazis. The scandal was reported, discussed, closed, opened again and finally forgotten. Conservatives still think Google is biased against them; Google says the whole thing wasn't its fault.

We regret that vandalism on Wikipedia briefly appeared on our search results. This was not the result of a manual change by Google.
— Google press release

No, obviously it wasn't. None of the content you presented there was. That was all Wikipedia's.

But the interesting thing is that in the public eye, this was still Google's fault. Read through the Twitter thread; none of the enraged commenters there seem to believe that this wasn't an action by a Google employee. "Google: Republicans are Nazis", read the headline on the Drudge Report article exposing the issue, and Wired magazine made a whole story out of making clear that the vandalism itself happened on Wikipedia. And all of that while more Wikipedia editors quickly did the dirty work; they hunted down the specific edit that caused the problem, corrected the vandalism and placed the page under semi-protection to prevent copycats. Meanwhile, the Knowledge Graph is still humming along, the ideology section removed, the rest still filled with Wikipedia data, and Google can be happy until the next scandal.

And we are left with a question: Why do we let this happen? Why do we let a multi-billion dollar company exploit us as uncredited mules, at least until it needs someone to shift the blame onto? Where is the organization that should be responsible for protecting the rights of its volunteer editors – where is the WMF? Google has traditionally been one of the biggest sponsors of the Foundation; for example, it chucked Jimmy Wales a $2m grant in 2010, more than it donated in the whole of last year. A few months later, it acquired the knowledge base Freebase, which was to form the basis for the Knowledge Graph, for an undisclosed sum.

Exploiters of free content should give back

After the recent scandal surfaced, the Foundation took an apologetic stance. "We're sorry", its statement seems to say, "and no, online encyclopedias still aren't a bad thing." But on 15 June, WMF executive director Katherine Maher, writing an opinion piece in Wired, saw the other side: "If Wikipedia is being asked to help hold back the ugliest parts of the internet, from conspiracy theories to propaganda, then the commons needs sustained, long-term support", she says. "The companies which rely on the standards we develop, the libraries we maintain, and the knowledge we curate should invest back. And they should do so with significant, long-term commitments that are commensurate with the value we create."

This is a step in the right direction. At the very least, the platform economies of the world should give something back to the largest source of the information they feed their algorithms with. As Maher concludes, "we shouldn’t be afraid to stand up for our value", but maybe it is time we see Google – and Facebook, and Amazon – not only as partners, but also as the ones making huge profits sustained by our unpaid labor.

28 March 2021 (all comments)

Discuss this story

To be honest, my inclination is to take the opposite opinion of this than the author seems to, and see it as a good thing if people are reusing our text. I write content with the full knowledge that it's licensed under a copyleft arrangement and that it is available for use in a commercial setting. I guess legally speaking, Google and the like are maybe not fulfilling the licence by using our content without direct attribution (although there is a clear link saying "Wikipedia" there), but honestly I see it as a sort of badge of honour when I ask Google Nest a question and it spews my own words back at me, or I find some random stuff about Rwandan bus services, and accompanying map, that I wrote about 15 years ago sitting in a Cambridge published revision guide... I can see why in principle it would be good for Google et al to support the Foundation, but I don't see it as an absolute must and personally I give my time to the project principally to help make the world's knowledge freely and easily available, not because I think Wikipedia itself is the absolute be-all-and-end-all. Cheers — Amakuru (talk) 21:46, 28 March 2021 (UTC)

(regarding attribution, the text above the "Publish changes" button even says "You agree that a hyperlink or URL is sufficient attribution under the Creative Commons license.") ~ ToBeFree (talk) 23:08, 28 March 2021 (UTC)

@ToBeFree: Ah fair enough, that makes sense. Cheers — Amakuru (talk) 23:21, 28 March 2021 (UTC)

I wonder if that is actually correct for off-wiki use. It doesn’t seem consistent with the authorship/source attribution noted by Creative Commons [1] ☆ Bri (talk) 00:14, 29 March 2021 (UTC)

I guess the main point is that the attribution history is clearly and obviously available, which it would be if you click the page link and then click the history link. If that's not sufficient and a direct full attribution is required, then many of our processes such as WP:MERGE and WP:COPYWITHIN would also fall foul of the rules. — Amakuru (talk) 07:21, 29 March 2021 (UTC)

I mean, going from the Creative Commons page, a link of just "Wikipedia" is the book example of an incorrect attribution. It doesn't show the title of the Wikipedia page, at least not in a way that makes it clear that it is the title of a Wikipedia page; it doesn't show any authors, not even in the way of "Wikipedia editors" or something like that; it doesn't even mention the Creative Commons license. Most importantly, I think, the Knowledge box in no way mentions that the content within it is taken from Wikipedia. The link is presented simply as something leading further on, rather than a very necessary link to a source. In my eyes, this seems to be aimed at steering people away from actually clicking the link and having them stay on their respective search platforms as an effective silo of knowledge. This obviously leads to the effect I mentioned in the article where people think some employee at Google wrote the letters in the relevant Wikipedia article and made the conscious decision to edit them to fit their political agenda. This works in part because Google has decided to present itself not as a gateway to knowledge, but as a sole hub of it. And once Wikipedia gets involved in that, I think it's right that the WMF pushes for more recognition of our work in this regard. Zarasophos (talk) 22:57, 30 March 2021 (UTC)

I share the author's concern, but I'm not really sure what we can do in this case. ~★ nmaia ^d 00:50, 29 March 2021 (UTC)

One thing we can do is make sure that we don't delete our article. When patrolling deletion discussions and searching for sources, I often find that our content has been re-used elsewhere and sometimes even attributed to other people, when books are created from our content. By deleting our original, we then destroy the attribution audit trail and others can then claim ownership of our work. Andrew🐉(talk) 13:27, 29 March 2021 (UTC)
Laying on the self-parody a bit heavily, isn't it? --JBL (talk) 15:47, 2 April 2021 (UTC)

Really clear-sighted and important article, Zarasophos. Thanks! (Personally, I found the realisation a few years ago that Google, and to an increasing extent the WMF itself, were beginning to make huge amounts of money from unpaid, volunteer labour a pretty considerable turn-off. One thing I stopped doing at that time was spending hundreds of pounds on reference material ... I started to think that that, at least, was something the Foundation and the various re-users like Google should provide, and to some small extent at least, i.e. the WMF's Wikipedia Library, this has happened. The situation with attribution is even more of an issue with Wikidata, which has a zero-attribution licence. The abortive Knowledge Engine looked like it was heading in the same direction – using Wikimedia volunteer labour as a money-spinner for some of the world's richest companies. Everyone should realise a simple fact — namely that one aspect of contributing here is that you work for free so that Google, Bing, Amazon etc. can make even more money. That's why they support the effort – and what they give is a pittance compared to what they make.) --Andreas JN 466 18:53, 29 March 2021 (UTC)

Thank you! Yes, I'm very interested in seeing how Enterprise shakes out and where we'll end up in a few years. Potentially, if the current monopolised structures on the internet are broken up and a freer net returns, Wikipedia will be able to take advantage of that. Zarasophos (talk) 22:57, 30 March 2021 (UTC)

Excellent article. This is one reason why I refuse to bother with Template:Short description on articles. Their main benefit seems to be to provide content for scrapers like Amazon Alexa and Google, and if those guys want me to add them they can pay me. Blythwood (talk) 20:48, 29 March 2021 (UTC)

I've also had to deal with a four-minute vandalistic edit that managed somehow to enter a Google Knowledge Graph and which perpetuated a falsehood about a living person many days after Wikipedia had been corrected. What we need for dealing with these (hopefully rare) incidents is for WMF to liaise with Google to establish a direct channel of communication with them so that administrators or other trusted editors can immediately flag up gross errors to them directly, and with authority, instead of relying, as I had to, on clicking the Suggest an edit button and hoping someone might eventually get around to reading my plea to remove defamatory content. This would only be needed for really serious breaches as outlined above, but both organisations have a responsibility to ensure lies and libellous statements are rapidly removed and at the moment we're simply blaming the other and not doing much about it. Nick Moyes (talk) 13:57, 30 March 2021 (UTC)

I agree with this, more communication would definitely be good. The information relationship between Wikipedia and dominant search engines shouldn't be one-sided. Zarasophos (talk) 22:59, 30 March 2021 (UTC)

Thank you very much! Zarasophos (talk) 22:57, 30 March 2021 (UTC)
