The Signpost
Single-page Edition
WP:POST/1
31 October 2019

In the media
How to use or abuse Wikipedia for fun or profit
Special report
“Catch and Kill” on Wikipedia: Paid editing and the suppression of material on alleged sexual abuse
In focus
The BBC looks at Chinese government editing
Interview
Carl Miller on Wikipedia Wars
Community view
Observations from the mainland
Arbitration report
October actions
Traffic report
Wrestling with a couple of teenagers, a Nobelist, and a lot of jokers
Gallery
Wiki Loves Broadcast
Recent research
Research at Wikimania 2019: More communication doesn't make editors more productive; Tor users doing good work; harmful content rare on English Wikipedia
Essay
Wikipedia is in the real world
News from the WMF
Welcome to Wikipedia! Here's what we're doing to help you stick around
On the bright side
What's making you happy this month?
 

2019-10-31

How to use or abuse Wikipedia for fun or profit

A fake Nazi death camp in Warsaw

On 4 October, Haaretz published "The Fake Nazi Death Camp: Wikipedia's Longest Hoax, Exposed", with later coverage in The Times of Israel, The Week, and at least nine other news sources in several languages.

For fifteen years, the article on the Warsaw concentration camp, also known as KL Warschau, contained the misinformation that the camp was an extermination camp and that the majority of its victims were non-Jewish Poles. Although the article was revised many times over the years, and the extermination-camp claim was disputed more than once, the misinformation persisted from the article's creation by the now-deceased Halibutt in August 2004 until interventions by K.e.coffman on 5 and 6 May 2019 and by Icewhiz on 27 and 28 August 2019. The misinformation (called a "hoax" by Icewhiz and Haaretz) largely originated in research by the judge and author Maria Trzcińska, whose obscure hypothesis that the camp was an extermination camp targeting non-Jewish Poles was officially discredited in 2007, three years after the Wikipedia article was created. Icewhiz told Haaretz that they investigated the article's claims after reading a May 2019 piece by Christian Davies in the London Review of Books which mentioned "Wikipedia entries amended". They posted an essay on their user page documenting the problems they found on Wikipedia, both in the article itself and in mentions of the Warsaw concentration camp elsewhere, as well as the state of the article prior to K.e.coffman's edits. User:François Robere approached The Signpost in September with a version of Icewhiz's essay edited for publication, but The Signpost declined to publish it.

Icewhiz was involved in several content disputes about antisemitism and misinformation related to Poland in World War II, and is subject to sanctions per Wikipedia:Arbitration/Requests/Case/Antisemitism in Poland. They have since been banned indefinitely for off-wiki harassment pertaining to the Antisemitism in Poland content dispute; see the Arbitration report. Icewhiz states that they brought the story to Haaretz in an attempt to generate reliable coverage of the facts about KL Warschau which could then support their arguments on Wikipedia. Both Icewhiz and Haaretz writer Omer Benjakob claim that this case is just one of many instances of intentional misinformation added by Polish nationalist editors. Benjakob writes that there "seems to be a systematic effort by Polish nationalists to whitewash hundreds of Wikipedia articles relating to Poland and the Holocaust." He links this effort on the English Wikipedia to current Polish nationalist political movements, which he accuses of Holocaust distortion: minimizing the documented complicity of Poles in the Holocaust while promoting Poles as equal or greater victims of it.

The historian and Haaretz contributor Daniel Blatman countered in a 17 October 2019 opinion piece that the false claims in the article persisted through several Polish governments, including some which acknowledged the complicity of Poles in the Holocaust, and thus cannot accurately be described as an attempt by Poland to falsify the Holocaust narrative. The blame for the faulty article lies, Blatman argues, entirely with Wikipedia. User:Poeticbent, who was the most prolific Wikipedia editor of Poland-related articles, including those on Jewish-Polish history, until he retired from Wikipedia in May 2018, is named by Icewhiz as one of the Polish editors intentionally spreading misinformation. He responded to the accusations with an essay posted on his user page. User:Piotrus, another prolific editor of Polish content, is also named in the article, and was interviewed by Haaretz for the piece. However, Piotrus states that the interview was never authorized for publication, and has posted a response on the Polish Wikipedia in which they say the article contains inaccuracies and false statements from Icewhiz that were not corrected by Haaretz.

As a remedy in the Antisemitism in Poland arbitration case, all articles pertaining to Poland in World War II (1933-1945), including those pertaining to the Holocaust, are subject to the guidance applied to the Collaboration in German-occupied Poland article: "Only high quality sources may be used, specifically peer-reviewed scholarly journals, academically focused books by reputable publishers, and/or articles published by reputable institutions. English-language sources are preferred over non-English ones when available and of equal quality and relevance. Editors repeatedly failing to meet this standard may be topic-banned as an arbitration enforcement action."
- 3family6

Swedish embassies lead edit-a-thons

The Swedish government's gender-equality foreign policy was on display at WikiGap edit-a-thons held at Swedish embassies in Japan and Pakistan this month. About 60 other WikiGap events have been held previously (32 of them are shown here).

In brief

Odd bits

  • Vatican edits: The conservative Catholic website LifeSiteNews reports that somebody within Vatican City, possibly in the office of the Secretary of State, has vandalized the article on Taylor Marshall, author of Infiltration: The Plot to Destroy the Church from Within.
  • And for the sequel? Larry Sanger wrote an encyclopedia article about his left thumb on Everipedia, the encyclopedia that promises to pay for your writing with digital wooden nickels. Sanger has now returned his wooden nickels and resigned from his position as Everipedia's CIO without writing an article about his right digitus medius. Sanger's next project looks thumb-what more promising: a decentralized "encyclosphere" supported by the Knowledge Standards Foundation. Not only can anyone edit, but anyone can create their own encyclopedia, with articles on a topic being rated across encyclopedias. And of course anyone can rate the encyclopedia articles.
  • Clueless: Wikipedia “Could Spell the End of Clueless Arguments in Pubs”: However unlikely that seems, it was published on the UK satire site NewsBiscuit.
  • The Moscow Times reported on October 9 that the Great Russian Encyclopedia will be repurposed as the Russian analogue of Wikipedia and designed to serve 15 million users per day. Russian technology site RBC.ru reported that the state budget will fund the new site with about $31 million. The Russian broadcaster RT, which usually has a political or ideological agenda, reported that the website will be "free of any political or ideological agenda." Last month The Signpost reported that Belsat, which has a different political or ideological agenda, gave the same information. The Signpost predicts that the GRE will not replace Wikipedia in Russia, but that it will have a political or ideological agenda.



Do you want to contribute to "In the media" by writing a story or even just an "in brief" item? Edit next month's edition in the Newsroom or leave a tip on the suggestions page.



Reader comments

2019-10-31

“Catch and Kill” on Wikipedia: Paid editing and the suppression of material on alleged sexual abuse

In his new bestseller, Catch and Kill: Lies, Spies, and a Conspiracy to Protect Predators, US journalist Ronan Farrow describes how NBC News, his former employer, tried to shut down his reporting on Harvey Weinstein and other alleged sexual predators, which is credited with helping to kickstart the MeToo movement.[1] The book discusses Farrow's struggle to publish stories on Weinstein, Matt Lauer and others, while allegedly being spied on by Black Cube, a private Israeli intelligence service.[2]

Farrow's book includes allegations that NBC hired a paid editor to whitewash Wikipedia articles, and this article focuses on this set of accusations. Farrow stated, "NBC also hired Ed Sussman, a 'Wikipedia whitewasher', to unbraid references to Oppenheim, Weinstein, and Lauer on the crowdsourced encyclopedia. ... He spun the material in NBC's favor, sometimes weaving in errors ... Other times, he simply removed all mention of the controversies."[3][4]

NBC has called Farrow's accusations against the company a "smear"; NBC employees have called, on air, for an independent investigation.[5][6]

Farrow has a reputation as a reliable source. In May 2018, he was awarded the Pulitzer Prize for Public Service, along with Jodi Kantor and Megan Twohey from The New York Times, for stories exposing the alleged behavior of Weinstein and others.[7] Further, The New Yorker, where he works, has famously good fact checkers.[8] The book itself "was exhaustively vetted by Sean Lavery, a senior fact checker at The New Yorker".[9]


Sussman's work

Ed Sussman has also had an interesting career as a journalist, lawyer, and entrepreneur.[10] He worked at The Wall Street Journal and the Financial Times. He graduated first in his class from Duke Law School and served as a law clerk for two US federal judges. His managerial and entrepreneurial experience includes roles at Inc. magazine, Mansueto Ventures, Fast Company, and Buzzr.com; he now works for a paid editing company called WhiteHatWiki.

"Paid editors must respect the volunteer nature of the project and keep discussions concise. When proposing changes to an article, they should describe the suggested modifications and explain why the changes should be made … No editor should be expected to engage in long or repetitive discussions with someone who is being paid to argue with them."

WP:PAYTALK

Sussman edits now as BC1278 and earlier as Edsussman. He openly acknowledges that he was paid to edit Wikipedia by NBC News and about 50 other companies.[11][12][13] He declares that he is a paid editor on his user pages and generally only edits talk pages where he again declares his paid status. He says that he strictly follows all conflict of interest and paid editing rules, though at Administrators' Noticeboards he has been accused of violating WP:PAYTALK.[14][15]

Noam Cohen, a journalist with extensive experience covering Wikipedia for The New York Times and other publications, has called Sussman's approach “paid advocacy” rather than “paid editing.”[16] If Sussman is advocating for his clients, he is violating Wikipedia's rule WP:NOTADVOCACY.

"Content hosted in Wikipedia is not for:
Advocacy, propaganda, or recruitment of any kind: commercial, political, scientific, religious, national, sports-related, or otherwise."

WP:NOTADVOCACY

Sussman's talkpage approach has, at times, included lengthy and even tendentious discussions. Kashmiri summed up Sussman's talk page style by asking, “may I kindly ask you to be more concise? I agree English is a beautiful language, but requiring other editors to read walls of text from you on every single issue is tad daunting, sorry.”[17]

Sussman addressed the issue of being concise in an interview with The Signpost, saying "I almost always get a way better result when I am concise, and I think I succeed in that in the overwhelming majority of Requested Edits for independent review that I make. In a contentious situation, where there's a lot of back and forth, it's harder to do, although I try. I'm always working on it."[18]

The effects of Sussman's editing can be seen on the talk page of the article on Noah Oppenheim, the President of NBC News. Sussman contributed almost half the total content (48%) to the talk page in 59 edits made over less than three months.[19] The article itself was largely unchanged during that period. One sentence was added about a promotion. A small section headed Allegation of Misconduct was removed and a sentence was added that said that NBC News had no knowledge of misconduct by Lauer until shortly before his firing.

The WhiteHatWiki website describes a case study of an unnamed media executive, showing how Sussman manages disputes on Wikipedia:

Article about a very prominent media executive misrepresented his involvement in a high-profile controversy.
The Wikipedia editor who wrote the section would not agree to an accurate, neutral statement, so we brought in more independent Wikipedia editors into the discussion. Consensus decision agreed with our position and the language was changed. The hostile editor persisted, however, with other unjustified changes, so we began 24×7 monitoring of the article. We wrote extensive explanations of the relevant Wikipedia standards to judge the misleading and biased statements. A full-blown Wikipedia dispute process, involving 10 editors, commenced, and after a vote, the language/incident was removed entirely from the article and an administrator closed the dispute permanently.[20][21]

Sussman declined to name the "very prominent media executive".[22]

Sussman started the article on NBC CEO Andrew Lack via Articles for Creation. He's written 75% of the current content.[23] Until October 9, when the first reviews of Catch and Kill were coming out, there was no mention of Weinstein or Lauer in the article.[3]

The article for creation was reviewed and approved within five hours without any change to the text.[24] Sussman also wrote 48% of the content on the talk page.[25] On the talk page of another article of interest, NBC News, he was the most active contributor, making the most edits (10) and writing 38% of the total content in just 32 days.[26]
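Contribution shares like these are reported by XTools, but the underlying activity can be checked by anyone against the public MediaWiki API. The following sketch is The Signpost's own illustration, not XTools' methodology: it tallies edits per user on a page, which approximates activity by edit count rather than the byte-level authorship share XTools reports.

import requests
from collections import Counter

API = "https://en.wikipedia.org/w/api.php"

def edits_by_user(title):
    """Tally the number of edits each user has made to a page."""
    counts = Counter()
    params = {
        "action": "query",
        "prop": "revisions",
        "titles": title,
        "rvprop": "user",
        "rvlimit": "max",
        "format": "json",
    }
    while True:
        data = requests.get(API, params=params, timeout=30).json()
        page = next(iter(data["query"]["pages"].values()))
        counts.update(r.get("user", "?") for r in page.get("revisions", []))
        if "continue" not in data:  # no more pages of results
            return counts
        params.update(data["continue"])  # follow the API's continuation token

print(edits_by_user("Talk:Noah Oppenheim").most_common(5))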

There have been several long discussions about his paid editing at administrators' noticeboards including these two. At the Conflict of Interest Noticeboard he's been involved in three long discussions.

Other journalists have reported on Sussman's editing, including a March 2019 article by Ashley Feinberg. The Signpost covered that controversy at the time:

Ashley Feinberg at the Huffington Post reports that "Facebook, Axios and NBC" used a declared paid editor, Ed Sussman (BC1278) from the firm WhiteHatWiki, to 'whitewash' their pages. Nevertheless she appeared to stop short of claiming that Sussman broke any Wikipedia rules, except perhaps that he badgered volunteer editors with "walls of text."

Sussman dismisses Feinberg's article as the product of a journalist who knows little of Wikipedia or its rules. He says that Farrow based his reporting on "Wikipedia whitewashing" on the article, telling The Signpost "Farrow did not contact me and the allegation in his book mirrors the HuffPo story, almost point for point … – he shouldn't just be summarizing another article. He should have done his own reporting, including contacting me."[27]

Beyond the articles mentioned by Farrow, Sussman has an extensive portfolio of about 50 clients on Wikipedia. He states on his user page "You can presume any edits I have made for any article are on behalf of the article-subject or their employer, unless I specify otherwise."[12] Relying on that statement, his clients in the last three months have included: Exelon, and related companies, Commonwealth Edison, Baltimore Gas and Electric and PECO Energy Company, as well as Laura Arrillaga-Andreessen (in conjunction with Squirrel678), Judith Genshaft, Noah Kraft, Infront Sports & Media (in conjunction with Tennisstar1995), Alec Oxenford, Robinhood (company), Andrew Lack (executive), and Axios (website).

He has made 114 edits at Articles for Deletion. Recent examples include Ale Resnik (3rd nomination) (see also the first nomination) and Meyer Malka (2nd nomination).

Sussman's view

Sussman says that Farrow's allegations are "just a retread of the thoroughly investigated and discredited smear in the Huffington Post".[28] He says he didn't initiate the removal of information in the Noah Oppenheim article and he does not have any connection to the administrator or other editors who did remove the information. "Farrow picked up on Ashley Feinberg's B.S. accusations about 'networks of friends' to secretly bypass COI -- it's just garbage".[29]

He wishes that the Wikimedia Foundation would protect the system now in place where declared paid editors submit proposed changes on article talk pages, which are implemented only after review by independent editors.[28] Some paid editors, according to Sussman, improve Wikipedia by offering help to paying individuals and corporations who believe they are being mistreated.

I'm helping a client now who has been targeted for attack on Wikipedia by sympathizers of a U.S. designated terrorist organization. The attack has been live for more than two years. I've helped the subjects of articles falsely accused of hate crimes, murder, and corporate malfeasance. Why am I being criticized for transparently assisting clients combat severe biased direct editing?[28]

But he goes beyond just wanting to protect the current system, proposing fundamental changes to the way Wikipedia works.

I'm calling on Wikipedia to freeze direct public editing of all articles. Every edit should be reviewed by experienced editors prior to publication, just as every edit I propose is. It's time for Wikipedia to grow up. Allowing anyone to publish anything, without any prior review, is an open invitation for information warfare.[28]

This proposed system, according to Sussman, would solve some of Wikipedia's most pressing problems.

When it comes to undisclosed editing by subjects of articles, their paid reps, or by editors with agendas and biases, I think Wikipedia has already lost that battle. The only way to remedy it is for every edit by every editor to go through screening before publication -- the same process declared COI editors abide by now. And that's what I think Wikipedia needs to do to take care of the severe problems with disinformation campaigns by governments, companies, terrorist sympathizers, litigants and others - which I get called on to assist with all the time, by the victims, who feel powerless. Why can't all these very effective Talk discussions about policy happen before misinformation is published?[30]

The Signpost asked both Sussman and representatives of Little, Brown and Company, the publisher of Catch and Kill, for comments on the final draft of this article. Sussman's has been edited for length.

Statement from Edward Sussman

Ronan Farrow hasn't done basic fact checking about his Wikipedia accusations. Most tellingly, he didn't even contact me, despite the seriousness of the claims.

Farrow's mistakes are so glaring and the accusations so easy to disprove, that it seems very likely he didn't even read the entirety of the Wikipedia article discussions at the center of the accusations.

I did not direct a “network of friendly accounts” to “launder” changes.

A representative of the publisher stated:

"The discussion of Sussman's Wikipedia whitewashing in Catch and Kill was based on public material including Wikipedia edit records, and on existing reporting from multiple publications including the Huffington Post, which is cited in the book. It was also, like all of the reporting in Catch and Kill, fact checked, in this case with Sussman's employers at NBC. If Mr. Sussman has an issue with his Wikipedia whitewashing activities being disclosed to the press, he should take it up with those who have hired him."


References

  1. ^ Szalai, Jennifer (14 October 2019). "In 'Catch and Kill,' Ronan Farrow Recounts Chasing Harvey Weinstein Story". The New York Times. Retrieved 29 October 2019.
  2. ^ Thomas-Corr, Johanna (25 October 2019). "Catch and Kill by Ronan Farrow review — the bigwigs who backed Harvey Weinstein". The Times. Retrieved 29 October 2019.
  3. ^ a b "Ronan Farrow overcame spies and intimidation to break some of the biggest stories of the #MeToo era".
  4. ^ Farrow, Ronan (October 2019). Catch and Kill: Lies, Spies, and a Conspiracy to Protect Predators. Little, Brown and Company. pp. 399–400, 408.
  5. ^ Farhi, Paul (14 October 2019). "NBC News chief calls Ronan Farrow's book 'a smear' in lengthy new rebuttal". The Washington Post. Retrieved 28 October 2019.
  6. ^ Ellison, Sarah; Farhi, Paul (27 October 2019). "NBC News can't seem to shake Ronan Farrow and the scandal he uncovered". The Washington Post. Retrieved 28 October 2019.
  7. ^ "2018 Pulitzer Prizes". The Pulitzer Prizes.
  8. ^ Hepworth, Shelley (March 8, 2017). "The New Yorker's chief fact-checker on how to get things right in the era of 'post-truth'". Columbia Journalism Review.
  9. ^ Farrow, Ronan (October 2019). Catch and Kill: Lies, Spies, and a Conspiracy to Protect Predators. Little, Brown and Company. p. 415.
  10. ^ "About Us". WhiteHatWiki.com. Retrieved 29 October 2019.
  11. ^ Talk:NBC News Paid editor disclosure
  12. ^ a b User:BC1278 quote:"My name is Ed Sussman... You can presume any edits I have made for any article are on behalf of the article-subject or their employer, unless I specify otherwise."
  13. ^ User Contributions for BC1278
  14. ^ Wikipedia:Administrators'_noticeboard/Archive308#HuffPost_article_on_WP_COI_editing (March 2019)
  15. ^ Wikipedia:Administrators'_noticeboard/IncidentArchive1018#Removal of an RfC on a Talk Page (September 2019)
  16. ^ Cohen, Noam (July 4, 2019). "Want to Know How to Build a Better Democracy? Ask Wikipedia".
  17. ^ Caryn Marooney talkpage
  18. ^ via email, October 24, 2019
  19. ^ Xtools Talk:Noah Oppenheim
  20. ^ WhiteHatWiki
  21. ^ archive
  22. ^ via email, October 29, 2019
  23. ^ Xtools Andrew Lack (executive)
  24. ^ promotion to article
  25. ^ Xtools Talk:Andrew Lack (executive)
  26. ^ Xtools Talk:NBC News
  27. ^ via email October 18, 2019
  28. ^ a b c d via email, October 16, 2019
  29. ^ via email, October 18, 2019
  30. ^ via email, October 24, 2019



Reader comments

2019-10-31

The BBC looks at Chinese government editing

External videos
video icon Wikipedia Wars?, BBC, 24:30[1]

The British Broadcasting Corporation dropped a bombshell on mainland China in its October 5 Click television program.[1] The segment, reported by Carl Miller and titled Wikipedia Wars?, strongly suggests that the Communist government of China is directly editing Wikipedia and perhaps even doxing or harassing editors in Taiwan. Widespread edit warring was also reported. However, as Miller notes in an interview with The Signpost, "We cannot be sure who were behind the edits that we found, or why they were done ... there was nothing that directly tied these edits back to the Chinese government."

Incidents and accusations

As reported by Click, Wikimedia Taiwan board member Jamie Lin said that many incidents are not merely good faith differences of opinion, but instead "control by the [Chinese] Government". According to Lin, editors — not just content — are under pressure, "some [editors] have told us that their personal information has been sprayed [released], because they have different thoughts." On a Wikimedia Telegram channel, according to Lin, a person told a Taiwanese editor that "the policemen will enjoy your mother's forensic report".

Hong Kong resident 1233 confirmed to The Signpost that Hong Kong editors have had similar experiences. "Direct attacks on well-known editors who do not align with a single point of view and shutting down resistance through off-site harassment. This is what some members of the Mainland Chinese working group are doing. The Wikimedia Foundation is ineffective in dealing with off-site harassment – the WMF really can't do anything. The working group manipulates on-site rules to silence whistleblowers and completely ruin how the project runs. This method is so effective that some working group editors practice it without even knowing that it violates civility at its root."[2]

An editor from mainland China, who did not wish to be identified, told The Signpost that "the BBC's coverage is biased and was backed by Wikimedia Taiwan, the anti-China chapter. The coverage just makes accusations, instead of trying to help solve the problem."[2] The opinions of this editor are their own and do not necessarily reflect the views of The Signpost. He continued "Carl Miller interviewed Wikimedia Taiwan, anti-China scholars, but not Chinese Wikimedians themselves."[2]

When questioned by Click about possible Chinese government editing of Wikipedia, Heather Ford, Senior Lecturer in Media at the University of New South Wales, said, "I'm surprised it's taken this long actually [...] [Wikipedia] is a prioritised source of facts and knowledge about the world."

The Chinese story

Lokman Tsui, Assistant Professor at the Chinese University of Hong Kong, told Click that the battle over Wikipedia content comes at a time when China is showing an increasing desire to fix perceived misconceptions about it abroad, "'Telling China's story' is a concept that has gained huge traction over the past couple of years [...] They think that a lot of the perceptions people have of China abroad are really misunderstandings."

Editor 1233 told The Signpost that "the idea of a unique 'Chinese view' or 'Chinese story' on Wikipedia would be as disastrous as making the Foundation a for-profit entity. It would ruin the reputation of the project."[2]

Two papers cited by Click show that the Chinese government might be interested in altering content on Wikipedia to show itself in a more favorable light. In 2016, Jie Ding, an official at the China International Publishing Group, a global media corporation overseen by the Communist Party of China, published the paper "Analysis of the Feasibility of Using Wikipedia to Carry out the Dissemination of Chinese Political Discourse" in International Communications. Ding posits that "there is a lack of systematic ordering and maintenance of contents about China's major political discourse on Wikipedia" and says that it needs to "reflect our voices and opinions in the entry, so as to objectively and truly reflect the influence of Chinese path and Chinese thoughts on other countries and history".

This line of thought is shared by at least two Chinese academics. This year, Li-hao Gan and Bin-Ting Weng published "Opportunities And Challenges Of China's Foreign Communication in the Wikipedia" in the Journal of Social Sciences. They write that "due to the influence by foreign media, Wikipedia entries have a large number of prejudiced words against the Chinese government". To rectify this, they say the Chinese "must develop a targeted external communication strategy, which includes [...] cultivating influential editors on the wiki platform." They conclude with "China urgently needs to encourage and train Chinese netizens to become Wikipedia platform opinion leaders and administrators… [who] can adhere to socialist values and form some core editorial teams."

The unnamed mainland editor quoted above says that "just two academic papers can't represent what China's propaganda department wants. Countless academic papers in China are aimed at foreign media, including the newswires, Twitter, and Wikipedia, with over 20 so far in October."[2]

The Signpost checked a case of edit warring cited by Click. At 14:20 on September 11, User:Xiaolifeidaohank changed the lead sentence of the Taiwan article from "Taiwan, officially the Republic of China (ROC), is a state in East Asia" to "Taiwan [...] is a province in People's Republic of China", with no explanation in the edit summary. User:Kusma reverted the change five minutes later. Xiaolifeidaohank quickly implemented the change again, and then subsequently removed mention of "Republic of China" from the sentence, again without an edit summary. Kusma reverted this, and the exchange happened once more before the article's lead was left in its original form, 11 minutes after the edit war started. The BBC characterized this as "an editorial tug of war that – as far as the encyclopedia was concerned – caused the state of Taiwan to constantly blink in and out of existence over the course of a single day."

According to Click, this single-day event is part of a larger conflict over the politicization of Chinese content across both English and Chinese Wikipedias. This conflict includes a change to the Senkaku Islands article on Chinese Wikipedia to say that they are "China's inherent territory". The territorial status of the islands is currently disputed by Taiwan, the People's Republic of China, and Japan. The Chinese article for 1989 Tiananmen Square protests was altered to describe the event as "the June 4th incident" to "quell the counter-revolutionary riots". Meanwhile, the article for the 2019 Hong Kong protests has seen intense debate over whether to characterize the participants as protesters or rioters. Click identified "almost 1,600 tendentious edits across 22 politically sensitive articles" without specifying the Wikipedia language versions.
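Retracing an exchange like this requires no special access: every article's full revision history is public through the MediaWiki API. The following is a minimal sketch of The Signpost's own, using the Taiwan article from the example above:

import requests

API = "https://en.wikipedia.org/w/api.php"

def revision_history(title, limit=20):
    """Return the most recent revisions of a page, newest first."""
    params = {
        "action": "query",
        "prop": "revisions",
        "titles": title,
        "rvprop": "timestamp|user|comment",
        "rvlimit": limit,
        "format": "json",
    }
    resp = requests.get(API, params=params, timeout=30)
    resp.raise_for_status()
    page = next(iter(resp.json()["query"]["pages"].values()))
    return page.get("revisions", [])

# Print the tug of war over the article's lead, edit by edit.
for rev in revision_history("Taiwan"):
    print(rev["timestamp"], rev["user"], "-", rev.get("comment", ""))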

Fundamental conditions

Click skipped very quickly over two of the fundamental conditions that the Chinese Wikipedia operates under. Editor 1233 told The Signpost that a "systematic 'firewall' against the free flow of information has created a direct and effective blocking of voices from inside the wall. I cannot say for sure that the pro-China voices coming from inside China are cherry-picked. However, the firewall is so effective that most, other than those who are considered 'top dissidents', already have their voices shut out of the outside world."[2]

Some Wikipedia language versions have been blocked in China starting as early as 2004, but the Chinese Wikipedia was most affected. By April 2019, all Wikipedia versions were reported to be blocked. Mainland Chinese editors who wish to edit any version of Wikipedia can try to edit through proxies.

The use of proxies has had an effect on Click's coverage, according to the unnamed mainland editor. The program "seems to target the Chinese government, but anyone who reads it would apply these false statements to ordinary mainland Chinese Wikimedians in general and conclude that all mainland Chinese contributors are sent by the Chinese government."[2]

References

  1. ^ a b "China and Taiwan clash over Wikipedia edits". Click (TV programme) at BBC. 5 October 2019. Retrieved 26 October 2019.
  2. ^ a b c d e f g All interviews by The Signpost were conducted during October 2019

For related coverage, see: Interview and Community view



Reader comments

2019-10-31

Carl Miller on Wikipedia Wars

External videos
video icon Wikipedia Wars?, BBC, 24:30[1]

Carl Miller is the Research Director of the Centre for the Analysis of Social Media (CASM) at Demos, a London think tank. He has reported several segments on the BBC's Click program, including Wikipedia Wars? about possible editing on Wikipedia by the Chinese government. His interests include how social media is changing society, digital politics and digital democracy, information warfare and online disinformation, digital and citizen journalism, and "fake news". He is a Visiting Research Fellow at King's College London.

This interview covers aspects of information warfare, and the strengths and weaknesses of Wikipedia's defenses against editing by governments.

Signpost: So how did you get a job investigating government editing of Wikipedia?

Carl Miller: For almost a decade now I've worked at a think tank called Demos. It's basically a research-focussed charity that works on understanding current, important live political questions.

In 2012, myself and colleagues were convinced that the rise of social media would totally transform society and also how it could be researched. So we founded the Centre for the Analysis of Social Media, a mix of Demos researchers and technologists based at the University of Sussex that would work together to study digital life.

The Centre researched all kinds of things that digital technology was influencing and changing: conspiracy theories, literacy, radicalisation, political campaigning, trust, and major events.

In 2018, I took a step back to try to tie all the changes we were seeing into a larger picture. It became a book, called The Death of the Gods, and focussed on how power was shifting across society in often overlooked or invisible ways.

One of the areas on which the book concentrated was information warfare. This was part of the undeniable rise of the power of states online, and I saw that the interests of states were often colliding with those of online communities and their norms and cultures that had emerged and grown online.

After the book, I teamed up with the BBC to take some of the most important issues and stories I'd come across, and put them on TV. I've gone out to Kosovo to meet fake news merchants. I've tried to recreate myself with data, and looked at how tech can reinvent democracy. And now, of course, I've looked at whether information warfare has extended beyond Twitter and Facebook.

any organised and determined group – state or otherwise – (could) attempt to strategically change the content of Wikipedia over a long span of time. They (could) create and support large groups of people to join its open community ... and actually use the processes that exist to change both the content that Wikipedia hosts and the policies that govern it.

SP: In August 2018 you wrote an article in The New Statesman, Wikipedia has resisted information warfare, but could it fight off a proper attack?, mostly using the example of the Russian government, saying that Wikipedia has good defenses against some types of organized POV-pushing editing, but not against others. John Lubbock, from Wikimedia UK, responded in another article in The New Statesman, It'd be a lot harder for foreign governments to hack Wikipedia than you think, stressing the time it takes to become an administrator, our strong and diverse community of editors, and the transparency of every edit ever made on Wikipedia. Can you describe what types of attacks Wikipedia is prepared for and what types it is not?

CM: I think Wikipedia is stunningly good at protecting itself from vandalism – from outright breaches in its content policies, or patently malicious attempts to deface content. But the possible threat from organised actors is, I think, substantially different. Not vandalism but entryism.

This would be where any organised and determined group – state or otherwise – would attempt to strategically change the content of Wikipedia over a long span of time. They would create and support large groups of people to join its open community, build reputations and prestige within it, run for positions of office and actually use the processes that exist to change both the content that Wikipedia hosts and the policies that govern it.

Of course, as John rightly points out in his article, there will be other editorial communities that will push against this. But it would turn into a long and exhausting struggle, and I'm just much less confident – especially across the non-English Wikipedias – that a volunteer community would necessarily outlast one that has significant levels of state support behind it.

It's probably also worth re-stating the caveats of our investigation at this point. We cannot be sure who were behind the edits that we found, or why they were done. We don't know how widespread the practice is, and there was nothing that directly tied these edits back to the Chinese government. However, I think the evidence that we found points to the possibility of the threat; one I don't think should be ignored.

Wikipedia is one of the most priceless patches of real-estate anywhere on the Internet. It has billions of page views, is highly trusted… Given its value, it is likely there is a range of actors that have an interest in its manipulation.

SP: What countries or government agencies do you suspect have the largest Wikipedia editing programs? the most dangerous editing programs?

CM: We just don't know. Attributing covert state-actor activity on the Internet is almost impossible for independent researchers and journalists at the best of times. Wikipedia is open for researchers to look at, but vast. It may exist, but I certainly haven't seen a detection regime on Wikipedia that is at least partly automated, systematic and multi-lingual. Unless we have that, we have no idea what parts of Wikipedia are being targeted and what kinds of changes are happening, much less who is behind these edits or why.

I think it is very clear, however, that Wikipedia is one of the most priceless patches of real-estate anywhere on the Internet. It has billions of page views, is highly trusted, populates Google's knowledge boxes, is extremely highly ranked by Google's Search, and gives answers to Siri and a number of other digital assistants. It's an important service in its own right, and other even more important services depend on it too. Given its value, it is likely there is a range of actors that have an interest in its manipulation.

SP: When can you say that a country has "gone over the line" in its interactions with Wikipedia? If you were to propose an international anti-infowar treaty, what rules would you include?

CM: I think your readers are far better placed than me to know where this line is – but I think what crosses the line is that it is a country, rather than a volunteer community, that is making the edits. A state strategy of edits for geo-politics seems to run clearly against the basic reasons why Wikipedia was set up in the first place. Unfortunately, as with influence operations elsewhere, the most important distinction is, unhelpfully, the one that is least clear to us: intention.

An international treaty on this – like any international treaty – would have to be predicated on the idea that this kind of activity is mutually damaging. In other words, that it is in the interests of all states to step back from strategic editing activity. Whether that is actually the case, of course, remains to be seen.

SP: There was an interesting moment on the BBC program when you asked Taiwanese editors “What does Wikimedia global think about all of this?” and they started giggling. Were they telling you that the WMF doesn't realize the extent of the problem, that they don't want to address the problem, or that they are powerless to address the problem?

CM: You'll need to speak to the editors themselves to learn exactly what they meant by that. But (and this of course is only my interpretation) I got the sense that, yes, they believe that their complaints had been overlooked, principally because of the problem of language. Much of the editing had to do with the pruning of language to change its implication and even subtle connotation. These changes might be highly significant in Mandarin Chinese, but – I got the sense – were often difficult to communicate to an audience that only spoke English. Some of the editing activity and the abuse Taiwanese Wikipedians received was being lost in translation.

SP: I'm sure you've communicated with people at the WMF. What do they think of your analysis? What would you suggest that they do?

CM: Here, it's worth stating WMF's full response, which we didn't have a chance to do within the broadcast:

Manipulation of Wikipedia for personal or political gain goes directly against the purpose and mission of the site – to be a neutral, reliable source of free knowledge for the world. It violates the trust we have with our readers and volunteer editors to provide the facts based on verified sources.

To address the scale of more than 350 edits per minute, volunteer Wikipedia editors, Wikipedians, have created systems and processes to guard for Wikipedia's neutrality and reliability. Wikipedians regularly review a feed of real time edits; bots spot and revert many common forms of negative behavior on the site; and volunteer administrators (more senior, trusted editors elected by other volunteers with oversight tools) further investigate and address negative behavior to keep articles neutral and reliable. These systems allow many different editors to work together to maintain neutrality. If anyone attempts to intimidate other volunteers or exert control over an article, there are governance structures in place on the site to identify and stop this kind of negative behavior.

Wikipedia's open editing model also means that anyone can go back and evaluate how an article has evolved over time and why. You can see this either by looking at the talk page, where editors discuss changes to an article, as well as the publicly viewable edit history of the article.

As more people edit and contribute to Wikipedia, articles tend to become more neutral and balanced over time. Articles about contentious topics in particular, because they are watched so regularly by editors, tend to become higher quality and more neutral. In the few examples you shared, most of the edits were reverted within minutes by volunteers. This can be seen in the timestamps of the article histories you link to, or by clicking “next edit.” These processes have generally been effective in identifying and responding to edits that don't meet Wikipedia's standards. Consider the article about global warming on English Wikipedia, which is a featured article, the highest quality article based on criteria set by volunteers, and now has almost 300 citations to verifiable sources.

The Wikimedia Foundation does not write, edit, or determine what content exists on Wikipedia. Volunteer editors do. Each language Wikipedia, of which there are roughly 300, has its own community of volunteer editors that set the norms and standards for how information is represented on the site. This model leads to more robust articles that meet the cultural and linguistic nuances of a given language Wikipedia. Chinese Wikipedia, while blocked in mainland China, continues to be edited by a community of nearly 30,000 volunteers around the world each month, which includes Taiwan and Hong Kong where it remains accessible. As with any case identifying potential bias on Wikipedia, we'll pass on these concerns directly to volunteer editors to further evaluate and take action where needed.

They went on to say:

"In terms of what you mention about concerns raised by our Taiwanese community, I wouldn't say it's fair to characterize the Foundation's actions as overlooking concerns. We take issues raised by our communities seriously, and we continue to engage with our Taiwanese community."

That is an extremely dynamic task: as attackers try to camouflage their activity, each platform's defenders constantly have to develop new ways of detecting it. Then they have constant operational interventions and multi-layered challenges to remove manipulative content, ban accounts, and de-rank suspicious activity.

SP: Ultimately, how do you think the problem of governmental editing of Wikipedia will be resolved?

CM: Since the discovery of significant, state-based manipulation of their platforms in around 2016, the tech giants have built information integrity, safety and 'conversational health' teams very rapidly. If you look at what these teams do, they're engaged in a strange kind of arms race with a range of actors – including states – that want to game, manipulate and subvert their platforms. They call the problem 'coordinated and inauthentic' activity.

They use data science and scaled heuristics to model what this activity looks like on the platform at scale, and how it's different from non-suspicious activity. That is an extremely dynamic task: as attackers try to camouflage their activity, each platform's defenders constantly have to develop new ways of detecting it. Then they have constant operational interventions and multi-layered challenges to remove manipulative content, ban accounts, and de-rank suspicious activity.

In one way, the challenge for Wikipedia is the same as for the tech giants, and in another way it is completely different. I think it is possible to build out similar approaches to detecting suspicious editing activity: creating models to identify possible geo-political intent, and finding patterns and anomalies that point to irregular, unusual or inorganic behaviour. At the very least, a system like this would be able to notify Wikipedia's editors about the places they need to patrol the most.

The new part of the challenge, however, is how to do this without the business models of the tech giants behind you. Integrity teams in platforms of equivalent sizes to Wikipedia number from the hundreds into the tens of thousands of full-time staff. Wikipedia's model is completely different.

This I think is an area where charitable foundations and trusts need to step in to support a new, non-profit capability to protect not only Wikipedia but the other parts of the open web that don't have the commercial ability to build out protective teams as part of a salaried staff. A new kind of citizen information integrity needs to be carved out, one that leverages the enormous passion, skill and knowledge of Wikipedia's editor-base with the constantly updating, dynamic kind of detection-to-mitigation tools that the tech giants have developed. That is where I would love the conversation to head next.
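Miller doesn't point to an implementation, but as one illustration of the kind of first-pass anomaly detection he describes, a script could watch Wikipedia's public recent-changes feed for articles drawing clusters of revert-like edits. The sketch below is The Signpost's own, not anything Miller or the WMF has built; the revert-keyword heuristic and the flagging threshold are arbitrary assumptions, not a vetted detection model.

import requests
from collections import Counter

API = "https://en.wikipedia.org/w/api.php"
REVERT_WORDS = ("revert", "undid", "rv ")  # crude heuristic, an assumption

def recent_changes(limit=500):
    """Fetch recent article-namespace edits from the public feed."""
    params = {
        "action": "query",
        "list": "recentchanges",
        "rcnamespace": 0,
        "rctype": "edit",
        "rcprop": "title|comment",
        "rclimit": limit,
        "format": "json",
    }
    resp = requests.get(API, params=params, timeout=30)
    resp.raise_for_status()
    return resp.json()["query"]["recentchanges"]

def revert_hotspots(changes, threshold=3):
    """Flag articles drawing several revert-like edits in one batch."""
    reverts = Counter(
        c["title"] for c in changes
        if any(w in c.get("comment", "").lower() for w in REVERT_WORDS)
    )
    return {title: n for title, n in reverts.items() if n >= threshold}

for title, n in sorted(revert_hotspots(recent_changes()).items(),
                       key=lambda kv: -kv[1]):
    print(f"{n:3d} revert-like edits: {title}")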

SP: What will happen if your solution, or at least some solution, isn't implemented?

CM: This case might represent something bigger: how do the parts of the Internet without a business model respond to the rise of online manipulation?

Either some kind of public-interest, non-profit alternative to the tech giants' mitigation strategies needs to be found, or, in my opinion, we'll see this kind of split emerge where the commercial internet maintains some kind of protection, and everything else becomes more vulnerable.

References

  1. ^ "China and Taiwan clash over Wikipedia edits". Click (TV programme) at BBC. 5 October 2019. Retrieved 26 October 2019.

For related coverage, see: In focus and Community view



Reader comments

2019-10-31

Observations from the mainland

Yan is a resident of the People's Republic of China and an administrator on the Chinese Wikipedia -S

I'm currently in Beijing, finishing a meetup of the Wikimedians of mainland China with over 50 attendees. I'm not writing as the group's official representative, but in some ways I may be fairly typical of an experienced member. As an admin on the Chinese Wikipedia I perform regular maintenance. I've written about China's internet backbones, explaining China's censorship policies and related problems. Other mainland Chinese Wikimedians are working on articles on local histories and monuments – something that desperately needs more contributions.

Imagine a world in which every single human being can freely share in the sum of all knowledge. That’s our commitment.

— The Wikimedia Foundation

Let me break the WMF's commitment down into two parts.

Every single human being can freely share?

"Freely sharing" means both "reading" and "writing". Free knowledge will never be read by its intended audience if we do not distribute it. We need to take government censorship more seriously and consider how best to react to it.

"Every single human being" includes Chinese mainlanders. Yet the WMF's policies are more responsible for denying access to China than the Chinese government is. One example is the "IP block exempt" user flag. Only with this flag can someone use a proxy (VPN) to access Wikipedia. Since all language versions of Wikipedia are blocked in China, proxies are the only way to read or edit Wikipedia. All mainland Chinese Wikimedians must have this flag, and it has to be added by admins on a case-by-case basis. But the WMF removed local checkuser rights on the Chinese Wikipedia, which has since increased stewards' workloads on Meta, making the "IP block exempt" problem even worse for us.

Barring innocent contributors is definitely not what "every single human being" should mean.

The WMF sued the Turkish government for blocking Wikipedia, but it hasn't done anything about censorship in China. We pay for our own VPNs; we pay for our own meetups; but we've received nothing from the WMF's US$100 million annual budget.

The sum of all knowledge?

There are few mainland China-related articles on Wikipedia. The contributors are roughly equally divided among Hongkongers, Taiwanese, and mainlanders. This means that, with the same number of editors, mainland contributors cover a geographic area and population dozens of times larger than those of the other two. The upshot is that the articles on many large mainland cities are less detailed than the article on a town in Taiwan. A city like Beijing, with about the same population as Australia, has only one or two dozen active contributors.

There are double standards on notability. A bus line in Hong Kong can have its own dedicated article, but in the mainland city of Yuhuan, with a population of 400,000, the single active contributor focusing on local articles has difficulty meeting the notability criteria.

So the WMF's "Imagine a world" statement makes little sense in China. It sounds like a propaganda slogan to me – something like the American dream or the Chinese dream, or the Communist party propaganda I see on the street.

Advocacy

The WMF's advocacy doesn't help. People in communities without good English-language skills are the ones most likely to be forgotten. We encyclopedia-writing nerds don't care about those slogans either. We just want somewhere to contribute real solid content.

What has happened in China contradicts the WMF's slogans. China allows uncensored information to be accessed by the privileged few, instead of the masses. The "privileged few" are mostly scholars, college students, people who received higher education, and politicians making policy. That's why we say "Wikipedia has been blocked in mainland China" instead of "Wikipedia has been banned in mainland China": there are no laws explicitly banning Wikipedia and its activities, though the site itself has been blocked. Wikipedia is well known among those with higher education, and their view of the site is quite the opposite of how people think of Wikipedia in the West: they treat it as an academically valuable source, rather than trying to keep it out of academic use. Beijing doesn't care if just a few people know about things like the Tiananmen Square protests.

As ordinary mainland contributors, we care more about preserving the present and past for future generations. Motivations for contributing to Wikipedia differ from editor to editor, but this one is important among mainlanders. When the day comes that China regains access to Wikipedia, we don't want to find that there is almost nothing there about China.

International relations

China has had a troubled history with some countries, but we don't care about ancient history. Take the current senseless Sino-American trade war. Neither the trade war nor the Communist party's propaganda makes Chinese people hate Americans, though they do make us think that some Americans, especially Donald Trump, are jerks. I don't think there is even a need for increased understanding between Chinese Wikimedians and those from the Western hemisphere, since there is little interference and distrust between us. What genuinely alienates Chinese and Westerners are modern-day political problems, biases developed through media exposure and education (Westerners included), and a post-Cold War residue of ideological disputes. Abandoning such political squabbling and focusing on WMF projects is what we should be doing.

Diversity

There are 56 officially recognized ethnic groups in China, with the Han being the largest at over 90% of the population. Mainland editors have never surveyed ourselves by ethnicity or gender, but based on what I personally know, there are Zhuang, Manchu, and Korean-Chinese editors. The Han percentage among editors is higher than 90%, as many ethnic minority groups live in the underdeveloped western parts of China, lacking access to the internet and the awareness required to use Wikipedia.

Political diversity is another question. This is one reason I’m unable to speak for our user group as a whole. We have members who are pro-Beijing and others who are pro-democracy. The whole group doesn't agree on a single political statement, but everybody's fine letting me speak for myself.

Wikimedians of mainland China

An official Wikimedia User Group China (WUGC) was recognized by the Affiliation Committee in 2014 but stopped accepting new members when the Chinese Wikipedia was blocked by China in May 2015. Its public activities since then have been extremely rare.

Unhappy with the situation, another group of mainland Wikimedians, including me, created a new user group in early 2017, naming ourselves the "Wikimedians of Mainland China User Group" (WMCUG). The founding members started the Shanghai bi-weekly meetup, the most frequently held regular meetup ever in mainland China. Most currently active mainland Chinese Wikimedians are already our members, or at least pro-WMCUG.

A law came into effect in China in 2017 barring foreign non-governmental organizations (NGOs), including the WMF, from carrying out activities in China. How to establish a formal branch of the WMF in China is a large topic we can deal with later. But at this moment, we are the de facto user group representing mainland China. Because of the new law, we won't bother getting recognition from the Affiliation Committee for a while.

For related coverage, see: In focus and Interview

As usual, we welcome polite commentary from our readers in the section below. Yan has requested that direct questions to him from readers be set aside at the top of the comments section, with the understanding that he may not be able to respond within 24 hours for technical reasons.



Reader comments

2019-10-31

October actions

Two long-term members of the community have departed as a result of Arbitration Committee actions this month.

"Unblockable"

Eric Corbett, a Wikipedian since 2006, was indefinitely blocked by the Arbitration Committee on September 2. ArbCom member KrakatoaKatie, writing for the committee and announcing the closure of an 18 August 2019 request for an ArbCom case, stated: "The Arbitration Committee has been made aware of and has independently confirmed that Eric Corbett, since his public retirement, has been abusively misusing multiple accounts and disruptively editing while logged out. Eric Corbett's accounts are hereby indefinitely blocked by the Arbitration Committee."

Apparently following banning policy, ten minutes after ArbCom's decision was posted, English Wikipedia administrator Joe Roe made an edit to Corbett's userpage to notify the community of the block. Several edits by others, both administrators and non-administrators, ensued: first a blanking of the page, then several more additions and removals of the templates, and one final addition. After this, on October 9, administrator Floquenbeam removed the templates with the edit summary "busybodies"[1] and protected the userpage from editing. Floquenbeam told The Signpost: "That was a controversial tagging, which was edit warred over last month. It had died down and was quiet for quite a while [...] But now no more drive by shit stirring can happen."

Corbett's name appears frequently in The Signpost's archives including October 28, 2015 Arbitration report and December 30, 2015 Arbitration report; the latter detailing an administrator's privileges being removed by the Arbitration Committee due to actions related to Corbett. Our December 10, 2014 In the media report quoted an author in Slate who had described Corbett as one of "'The Unblockables', a class of abrasive editors who can get away with murder because they have enough of a fan club within Wikipedia, so any complaint made against them would be met with hostility and opprobrium."

Off-wiki harassment results in indefinite site ban

On 1 October, Icewhiz was indefinitely site banned. According to the Arbitration Committee, they have "engaged in off-wiki harassment of multiple editors". Responding to criticism of methods and jurisdiction, Mkdw stated

This case is further complicated by reports in Haaretz and other news sources that Icewhiz was responsible for the exposure of a 15-year-old hoax on Wikipedia about a fake Nazi extermination camp. For further coverage of these reports, see In the media.

Resignation and retirement of former Arbitrator

On October 20, former Arbitration Committee member DeltaQuad, aka Amanda, retired from English Wikipedia; at her own request, she was desysopped (resigned as administrator) and had her other advanced permissions, including oversight, removed. She had recently been acting as an ArbCom clerk. DeltaQuad declined to explain the reason for the resignation to The Signpost.[2]

As of issue deadline, 36 accounts on English Wikipedia hold oversight rights. Oversight is one of the most tightly held system permissions, issued only to the founder, Arbitration Committee members, and a handful of others; it can expunge information from the view of even "ordinary" administrators.

Other matters

Notes



Reader comments

2019-10-31

Wrestling with a couple of teenagers, a Nobelist, and a lot of jokers

This traffic report is adapted from the Top 25 Report, prepared with commentary by Igordebraga (September 22 to October 5, October 13 to 19) and Hugsyrup (October 6 to 12).

All around the world, you've got to spread the word (September 22 to 28)

Most Popular Wikipedia Articles of the Week (September 22 to 28, 2019)

This is a week where our report goes all over the place, as attested by the fact that the only article with millions of views is about a teenage environmental activist. Otherwise, there are crimes (#2), deaths (#3, #6, #7), economic turmoil (#4), movies (#5, #8), TV (#10), and Google Doodles (#9).

For the week of September 22 to 28, 2019, the 25 most popular articles on Wikipedia, as determined from the WP:5000 report were:


Rank Article Views About
1 Greta Thunberg 2,348,736
Described by Samantha Bee as "a 16-year old with an agenda who knows how to hold a grudge (that) terrifies me, and I love her for it", this young Swedish activist has been in the news for her eloquent speeches complaining about the lack of action to control climate change. In the meantime, the internet has been appropriating Miss Thunberg's image in both amusing and demonizing ways.
2 6ix9ine 866,603
In a sharp contrast to our #1, someone with a terrible public image: a rapper who began his court trial alongside ten other members of the Nine Trey Gangsters for charges of racketeering, drug distribution, weapon possession, and conspiracy to commit murder.
3 Deaths in 2019 759,498
We live a dying dream
If you know what I mean
It's all that I've ever known
4 Thomas Cook Group 723,045
The sudden closure of this British travel group not only left 21,000 people without a job, but forced the Civil Aviation Authority to find a way to bring 150,000 tourists back home.
5 Joker (2019 film) 665,946
Joaquin Phoenix as the famed DC Comics supervillain hasn't even hit theaters yet, but has already raised discussions among moral guardians over whether its plentiful violence could inspire copycats.
6 Sid Haig 611,865
Two dead actors, one from Hollywood (who despite a long career had his most success in the 21st century, especially as Captain Spaulding in three Rob Zombie horror flicks) and another from Tollywood (best known as a comedian and with over 600 film credits!).
7 Venu Madhav (actor) 609,604
8 Ad Astra (film) 597,534
Brad Pitt goes after his father Tommy Lee Jones in the orbit of Neptune, in a movie that's a rough cross between 2001 and Apocalypse Now. Ad Astra had positive reviews (this writer wasn't very impressed, but admits the movie is alright) and opened at #2 in the box office behind Downton Abbey.
9 Junko Tabei 574,227
Google celebrated the 80th anniversary of the birth of this Japanese climber (who died in 2016), the first woman to reach the top of Mt. Everest, as well as the highest mountains of every continent.
10 Fleabag 555,826
Upon hearing this was the Emmy favorite, this writer decided to watch it... and, the day after Fleabag won many of the top prizes, including Comedy Series, found he had wasted six hours of his life on all the episodes of this unfunny, vulgar thing that, for every moment that worked, had three which made him think "why am I still watching this?!" And to think that Amazon Prime Video already offers a much better series, The Marvelous Mrs. Maisel, which could have gotten the awards instead.

What a Joker, I play my jokes upon you (September 29 to October 5)

Most Popular Wikipedia Articles of the Week (September 29 to October 5, 2019)

Comic book movies come out every year and still bring in views and related entries to our Top 25 Report. The latest one is Joker, which not only topped the list but also brought along its main star (#3). Still in movies, there are American (#10) and Indian productions (#2), while the rest of our entries are scattershot: famous people dying (#4, #9), holidays (#5), activism (#8), music (#6), and Google Doodles (#7).

Take a look at my report, it's the only one I've got. For the week of September 29 to October 5, 2019, the 25 most popular articles on Wikipedia, as determined from the WP:5000 report were:

Rank Article Views About
1 Joker (2019 film) 3,078,817
DC Comics combined the Marvel adaptations Logan (unorthodox approach adapting superhero comics) and Venom (centering around the villain – although in this one the hero still has a brief appearance) and thankfully hit closer to the former, as Joker is a very well made drama reminiscent of Taxi Driver where a troubled guy loses his mind in what is basically 1970s New York – though still with some laughs, given the protagonist is a clown and director Todd Phillips made a name for himself with works such as The Hangover. Joker had good reviews and opened to a whopping $93 million in the box office.
2 War (2019 film) 1,091,540
What is it good for? Well, at least ₹172.02 crore, as this Bollywood action hit starring Hrithik Roshan (pictured) and Tiger Shroff already made this much in the box office.
3 Joaquin Phoenix 1,048,608
Already nominated for Academy Awards in Gladiator, Walk the Line and The Master, Joaquin Phoenix is currently pegged as an Oscar favorite for our #1, where he's unhealthily thin and amusingly weird and deranged as a clown who descends into a life of crime.
4 Deaths in 2019 742,990
Oh my God, no time to turn
I got to laugh 'cause I know I'm gonna die
Why?
5 Mahatma Gandhi 742,664
Either he or some monk named Siddhartha is the most famous Indian ever, and the holiday celebrated on Gandhi's birthday always brings views to his article.
6 Billie Eilish 690,661
The season opener of Saturday Night Live's 45th season had as the musical guest this singer-songwriter who will only come of age in December, and yet is already guaranteed to be one of the most viewed Wikipedia articles of the year – maybe in the top 10! Along with national exposure, the high views for the performance are probably also fueled by Eilish deciding to go Dancing on the Ceiling.
7 Herbert Kleber 673,197
Google celebrated the 23rd anniversary of this psychiatrist's election to the National Academy of Medicine – coincidentally, four days before the first anniversary of his death – in recognition of his work on drug addiction.
8 Greta Thunberg 663,816
Another teenage girl, though energy-wise a polar opposite to our #6 – while Ms. Eilish sings slow songs calmly, Ms. Thunberg is always shouting to make sure her environmental messages cause the emotional impact they deserve.
9 Diahann Carroll 614,719
The first Black woman to win a Tony Award, Diahann Carroll died at 84, ending a career that also had nominations for the Oscar and the Emmy.
10 Judy Garland 569,100
Judy depicts the last days of the eternal Dorothy Gale, played there by Renée Zellweger.

Jokers, meth cookers, and child stars (October 6 to 12)

Most Popular Wikipedia Articles of the Week (October 6 to 12, 2019)

There are some interesting connections in this week's report. #1 is a popular new film, in which #2 plays #8, who was previously played by #7. #22 is the partner of #2, while #12 is his older brother, a child star who died of a drug overdose, much like #13. That neat mesh of connections covers a good chunk of the week's report, while the remainder comprises a familiar mix of wrestling, Indian movies, Netflix productions, and of course dead celebrities.

For the week of October 6 to 12, 2019, the 25 most popular articles on Wikipedia, as determined from the WP:5000 report were:

Rank Article Views About
1 Joker (2019 film) 3,916,575
It's no surprise that this week's list is going to be dominated by Jokers. The latest, and perhaps most controversial, incarnation of the comic book character (item #1) hit cinemas earlier this month. It features #2 as an isolated, lonely and mentally ill would-be comic who slowly finds his villainous side.
2 Joaquin Phoenix 1,766,842
3 War (2019 film) 1,109,505
We always see a lot of movies on this list, but it's not every week that a non-English film makes it quite so high up. This Hindi-language action thriller stars Tiger Shroff as an Indian soldier assigned to hunt down his former mentor, played by Hrithik Roshan.
4 El Camino: A Breaking Bad Movie 834,546
It wouldn't be a Top 25 without a Netflix original movie or series. This week's #4 completes the story of Jesse Pinkman, played by Aaron Paul and last seen in the finale of Breaking Bad back in 2013.
5 Hell in a Cell (2019) 792,414
The Top 25 isn't all Netflix, wrestling and death, I promise. But of course the latest big wrestling event always has to make an appearance, and this week it was this one that took place on October 6, 2019 at Golden 1 Center in Sacramento, California.
6 Deaths in 2019 736,002
Death and taxes may be life's only certainties, but taxes never seem to appear on the Top 25 for some reason, so I guess we're all a lot more interested in death.
7 Heath Ledger 610,510
Interest in the latest actor to play the Joker seems to have sparked interest in the character himself, as well as the previous incumbent of the role. At the time, many felt that Heath Ledger's portrayal in The Dark Knight in 2008 could never be matched. Whether that's still true is, and will probably remain, a matter of considerable debate.
8 Joker (comics) 589,608
9 Ginger Baker 540,799
The first of this week's recently-deceased celebrities, Peter 'Ginger' Baker was an English drummer and the co-founder of rock band Cream. He died on 6 October 2019 after a short illness.
10 Samuel Little 537,066
Believed to be America's most prolific serial killer, Little claims to have killed 93 people and appears on this list after making headlines for closing a number of cold cases through the chilling portraits he draws of his victims.

The Joker is wild, you know he's got a lot of views (October 13 to 19)

Most Popular Wikipedia Articles of the Week (October 13 to 19, 2019)

While It Chapter Two didn't have the lasting power of the first on our report, the same can't be said of another monster clown, albeit one more human: Batman's nemesis Joker, whose solo movie has topped the list for three weeks and brought along the main actor (#4). More movies can also be found, on Netflix (#8) and Indian theaters (#9) – the country also produced a Nobel winner (#6). Present as always are the recently deceased (#2, #5, #7, #16), politics (#10), and Google Doodles (#3).

For the week of October 13 to 19, 2019, the 25 most popular articles on Wikipedia, as determined from the WP:5000 report were:

Rank Article Views About
1 Joker (2019 film) 2,163,336
Jared Leto apparently tried to block this movie from being made, given Warner Bros. had promised a solo Joker film for him, only to greenlight this unrelated 1970s throwback instead. And now Joker is almost outgrossing Leto's maligned take in Suicide Squad, with much more positive reviews, to boot.
2 Sulli 1,388,588
Ever since "Gangnam Style", K-pop has broken out in surprising ways, specially for those who write and read this report. And this entry related to South Korean music is a sad one, as singer/actress Choi "Sulli" Jin-ri died from a possible suicide at just 25, following cyberbullying taking its toll.
3 Joseph Plateau 1,076,405
Google celebrated this Belgian physicist whose research on optics led to the phenakistiscope, one of those disks that, when spun, create the illusion of a moving image.
4 Joaquin Phoenix 1,031,901
For those who remember the I'm Still Here days, when Joaquin Phoenix had seemingly gone insane and decided to quit acting for rapping, his becoming the certifiably insane Joker in our #1 seemed like natural casting.
5 Elijah Cummings 940,273
The big death of the week: a Maryland politician who served for over two decades in the House of Representatives and this year took over the Committee on Oversight and Reform.
6 Abhijit Banerjee 905,237
Indians getting big views is no surprise here, though this one drew global interest: Banerjee, alongside his wife Esther Duflo and Michael Kremer, won the Nobel Prize in Economics for their experimental approach to alleviating global poverty.
7 Deaths in 2019 746,129
Last night the wife said
"oh, boy when you're dead,
You don't take nothing with you, but your soul, think!"
8 El Camino: A Breaking Bad Movie 712,432
The show Honest Trailers described as "so powerful, you binge-watch it on Netflix; so all-consuming, you push it on your friends, even if they don't watch TV; and so addicting, you can't shut up about it. It's basically like drugs." gets a Netflix epilogue centered on Aaron Paul's character Jesse Pinkman, one that also served as an epitaph for one of its actors, Robert Forster, who died on the very day of its release.
9 War (2019 film) 666,731
This Bollywood action epic starring Hrithik Roshan, Tiger Shroff and Vaani Kapoor is now the highest-grossing Indian movie of the year.
10 Tulsi Gabbard 624,330
Hillary Clinton claimed that Russia was "grooming" a female Democrat to run as a third-party candidate who would help President Trump win reelection via the spoiler effect. Many in the media interpreted this as referring to Hawaii Representative Tulsi Gabbard, who denied that she would run as an independent if not nominated, and was defended by several other candidates.

Exclusions

  • These lists exclude the Wikipedia main page, non-article pages (such as redlinks), and anomalous entries (such as DDoS attacks or likely automated views). Since mobile view data became available to the report in October 2014, we exclude articles that have almost no mobile views (5–6% or less) or almost all mobile views (94–95% or more) because they are very likely to be automated views based on our experience and research of the issue. Please feel free to discuss any removal on the Top 25 Report talk page if you wish.
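For the curious, the filter described above boils down to a simple share computation. Here is a minimal sketch in Python; the function name and exact thresholds are illustrative, not the report's actual tooling:

```python
# A minimal sketch of the mobile-share exclusion heuristic described above;
# the function name and exact thresholds are illustrative, not the report's
# actual tooling.

def is_likely_automated(mobile_views: int, desktop_views: int,
                        low: float = 0.06, high: float = 0.94) -> bool:
    """Flag an article whose share of mobile views falls outside [low, high]."""
    total = mobile_views + desktop_views
    if total == 0:
        return True  # no recorded human views at all
    mobile_share = mobile_views / total
    return mobile_share <= low or mobile_share >= high

# Example: a page with a 2% mobile share would be excluded as likely bot traffic.
print(is_likely_automated(mobile_views=2_000, desktop_views=98_000))  # True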



Reader comments

2019-10-31

Wiki Loves Broadcast

Wikiolo edits on the German Wikipedia and volunteers for Wikiproject WLTV. A longer version of this article originally appeared on the Kurier.
Project logo

The project Wiki Loves TV & Radio (WLTV) of the German Wikipedia is working to ensure that German public service broadcasters will place material under Wikipedia-compatible Creative Commons licenses. The project is ambitious, and the interactions between broadcasters and the wiki community have sometimes been contentious. However, we are proud to report that the public service broadcaster ZDF has published clips and photos from the editorial office of Terra X under the license CC BY 4.0 in a pilot project.

Wikipedia editors are invited to enroll in the project and register materials they would like to include in articles. WLTV will then ask the shows and series of public service broadcasting to release that content under the proper license. Wikipedia will benefit from the high-quality audiovisual material this provides. Wikimedia Germany's demand is "Public money? Public good!" We hope German public service broadcasters will support a strong, up-to-date encyclopedia that supplants the "dubious sources" on the Internet. It is all the more symbolic that the pilot project turned its attention to the topic of climate change – a topic of a future that has already begun.



Reader comments

2019-10-31

Research at Wikimania 2019: More communication doesn't make editors more productive; Tor users doing good work; harmful content rare on English Wikipedia

A monthly overview of recent academic research about Wikipedia and other Wikimedia projects, also published as the Wikimedia Research Newsletter.

Research presentations at Wikimania 2019

This year's Wikimania community conference in Stockholm, Sweden featured a well-attended Research Space, a 2.5-day track of presentations, tutorials, and lightning talks. Among them:


"All Talk: How Increasing Interpersonal Communication on Wikis May Not Enhance Productivity"

Enabling "easier direct messaging [on a wiki] increases... messaging. No change to article production. Newcomers may make fewer contributions", according to this presentation of an upcoming paper studying the effect of a "message walls" feature on Wikia/Fandom wikis that offered a more user-friendly alternative to the existing user talk pages. From the abstract:[1]

"[We examine] the impact of a new communication feature called “message walls” that allows for faster and more intuitive interpersonal communication in wikis. Using panel data from a sample of 275 wiki communities that migrated to message walls and a method inspired by regression discontinuity designs, we analyze these transitions and estimate the impact of the system’s introduction. Although the adoption of message walls was associated with increased communication among all editors and newcomers, it had little effect on productivity, and was further associated with a decrease in article contributions from new editors."


Presentation about a paper titled "Tor Users Contributing to Wikipedia: Just Like Everybody Else?", an analysis of the quality of edits that slipped through Wikipedia's general block of the Tor anonymizing tool. From the abstract:[2]

"Because of a perception that privacy enhancing tools are a source of vandalism, spam, and abuse, many user-generated sites like Wikipedia block contributions from anonymity-seeking editors who use proxies like Tor. [...] Although Wikipedia has taken steps to block contributions from Tor users since as early as 2005, we demonstrate that these blocks have been imperfect and that tens of thousands of attempts to edit on Wikipedia through Tor have been successful. We draw upon several data sources to measure and describe the history of Tor editing on Wikipedia over time and to compare contributions of Tor users to other groups of Wikipedia users. Our analysis suggests that the Tor users who manage to slip through Wikipedia's ban contribute content that is similar in quality to unregistered Wikipedia contributors and to the initial contributions of registered users."

See also our coverage of a related paper by some of the same authors: "Privacy, anonymity, and perceived risk in open collaboration: a study of Tor users and Wikipedians"


Discussion summarization tool to help with Requests for Comments (RfCs) going stale

"Supporting deliberation and resolution on Wikipedia" - presentation about the "Wikum" online tool for summarizing large discussion threads and a related paper[3], quote:

"We collected an exhaustive dataset of 7,316 RfCs on English Wikipedia over the course of 7 years and conducted a qualitative and quantitative analysis into what issues affect the RfC process. Our analysis was informed by 10 interviews with frequent RfC closers. We found that a major issue affecting the RfC process is the prevalence of RfCs that could have benefited from formal closure but that linger indefinitely without one, with factors including participants' interest and expertise impacting the likelihood of resolution. [...] we developed a model that predicts whether an RfC will go stale with 75.3% accuracy, a level that is approached as early as one week after dispute initiation. [...] RfCs in our dataset had on average 34.37 comments between 11.79 participants. As a sign of how unwieldy these discussions can get, the highest number of comments on an RfC is 2,375, while the highest number of participants is 831."

The research was presented in 2018 at the CSCW conference and at the Wikimedia Research Showcase. See also press release: "Why some Wikipedia disputes go unresolved. Study identifies reasons for unsettled editing disagreements and offers predictive tools that could improve deliberation.", dataset, and our previous coverage: "Wikum: bridging discussion forums and wikis using recursive summarization".
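To make the prediction task concrete, here is a minimal sketch, with hypothetical features and toy labels rather than the paper's model or dataset, of a classifier that guesses whether an RfC will linger without formal closure:

```python
# A minimal sketch, with hypothetical features and toy data rather than the
# paper's model, of a classifier predicting whether an RfC will go stale.
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

# Hypothetical per-RfC features measured one week after initiation:
# [number of comments, number of participants, days since last comment]
X = [[34, 12, 1], [3, 2, 6], [120, 40, 0], [5, 3, 5], [60, 20, 1], [2, 1, 7]]
y = [0, 1, 0, 1, 0, 1]  # 1 = lingered without formal closure

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.33, random_state=0, stratify=y)
clf = LogisticRegression().fit(X_train, y_train)
print(clf.score(X_test, y_test))  # share of held-out RfCs classified correctly
```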


See our 2016 review of the underlying paper: "A new algorithmic tool for analyzing rationales on articles for deletion" and related coverage


Presentation about ongoing survey research by the Wikimedia Foundation focusing on reader demographics, e.g. finding that the majority of readers of "non-colonial" language versions of Wikipedia are monolingual native speakers (i.e. don't understand English).


Wikipedia citations (footnotes) are clicked on only once in every 200 pageviews

A presentation about an ongoing project to analyze the usage of citations on Wikipedia highlighted this result among others.


See last month's OpenSym coverage about the same research.


About the "Wikipedia Insights" tool for studying Wikipedia pageviews, see also our earlier mention of an underlying paper.


Harmful content rare on English Wikipedia

The presentation "Understanding content moderation on English Wikipedia" by researchers from Harvard University's Berkman Klein Center reported on an ongoing project, finding e.g. that only about 0.2% of revisions contain harmful content, and concluding that "English Wikipedia seems to be doing a pretty good job [removing harmful content - but:] Folks on the receiving end probably don't feel that way."


Presentation about "on-going work on English Wikipedia to assist checkusers to efficiently surface sockpuppet accounts using machine learning" (see also research project page)


Demonstration of "Wiki Atlas, [...] a web platform that enables the exploration of Wikipedia content in a manner that explicitly links geography and knowledge", and a (prototype) augmented reality app that shows Wikipedia articles about e.g. buildings.


Presentation of ongoing research on detecting subject-matter experts among Wikipedia contributors using machine learning. Among the findings: subject-matter experts concentrate their activity within a topic area, focusing on adding content and referencing external sources, and their edits persist 3.5 times longer than those of other editors. In an analysis of 300,000 editors, 14–32% were classified as subject-matter experts.
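One way to operationalize the "concentration" finding, purely as an illustrative sketch and not the researchers' actual method, is to measure the entropy of an editor's edits across topic areas:

```python
# An illustrative sketch, not the researchers' actual method: measuring how
# concentrated an editor's activity is via the entropy of their edits over
# topic areas (lower entropy = more focused, one possible expert signal).
from collections import Counter
from math import log2

def topic_entropy(edit_topics):
    """Shannon entropy of an editor's edits over topics, in bits."""
    counts = Counter(edit_topics)
    total = sum(counts.values())
    return -sum((n / total) * log2(n / total) for n in counts.values())

focused = ["medicine"] * 45 + ["biology"] * 5
scattered = ["films", "sports", "politics", "medicine", "music"] * 10
print(topic_entropy(focused))    # ~0.47 bits: plausibly a subject-matter expert
print(topic_entropy(scattered))  # ~2.32 bits: a generalist editing pattern
```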


Why Apple's Siri relies on data from Wikipedia infoboxes instead of (just) Wikidata

The presentation "Improving Knowledge Base Construction from Robust Infobox Extraction" about a paper already highlighted in our July issue explained a method used to ingest facts from Wikipedia infoboxes into the knowledge base underlying Apple's Siri question answering system. The speaker noted the decision not to rely solely on Wikidata for this purpose, because Wikipedia still offers richer information than Wikidata - especially on less popular topics. An audience member asked what Apple might be able to give back to the Wikimedia community from this work on extracting and processing knowledge for Siri. The presenter responded that publishing this research was already the first step, and more would depend on support from higher-ups at the company.


"Discovering Implicational Knowledge in Wikidata" (presentation slides)

From the abstract of the underlying paper:[4][5]

"A distinguishing feature of Wikidata [among other knowledge graphs such as Google's "Knowledge Graph" or DBpedia] is that the knowledge is collaboratively edited and curated. While this greatly enhances the scope of Wikidata, it also makes it impossible for a single individual to grasp complex connections between properties or understand the global impact of edits in the graph. We apply Formal Concept Analysis to efficiently identify comprehensible implications that are implicitly present in the data. [...] We demonstrate the practical feasibility of our approach through several experiments and show that the results may lead to the discovery of interesting implicational knowledge. Besides providing a method for obtaining large real-world data sets for FCA, we sketch potential applications in offering semantic assistance for editing and curating Wikidata."


See last month's OpenSym coverage about the same research


The now traditional annual overview of scholarship and academic research on Wikipedia and other Wikimedia projects from the past year (building on this research newsletter). Topic areas this year included the gender gap, readability, article quality, and measuring the impact of Wikimedia projects on the world. Presentation slides


Other events

See the page of the monthly Wikimedia Research Showcase for videos and slides of past presentations.


Other recent publications

Other recent publications that could not be covered in time for this issue include the items listed below. Contributions, whether reviewing or summarizing newly published research, are always welcome.

Compiled by Tilman Bayer and Miriam Redi

"Revealing the Role of User Moods in Struggling Search Tasks"

In search tasks on Wikipedia, people in unpleasant moods tend to issue more queries and perceive a higher level of difficulty than people in neutral moods.[6]

Helping students find a research advisor, with Google Scholar and Wikipedia

This paper, titled "Building a Knowledge Graph for Recommending Experts",[7] describes a method to build a knowledge graph by integrating data from Google Scholar and Wikipedia to help students find a research advisor or thesis committee member.

"Uncovering the Semantics of Wikipedia Categories"

From the abstract:[8]

"The Wikipedia category graph serves as the taxonomic backbone for large-scale knowledge graphs like YAGO or Probase, and has been used extensively for tasks like entity disambiguation or semantic similarity estimation. Wikipedia's categories are a rich source of taxonomic as well as non-taxonomic information. The category 'German science fiction writers', for example, encodes the type of its resources (Writer), as well as their nationality (German) and genre (Science Fiction). [...] we introduce an approach for the discovery of category axioms that uses information from the category network, category instances, and their lexicalisations. With DBpedia as background knowledge, we discover 703k axioms covering 502k of Wikipedia's categories and populate the DBpedia knowledge graph with additional 4.4M relation assertions and 3.3M type assertions at more than 87% and 90% precision, respectively."

"Adapting NMT to caption translation in Wikimedia Commons for low-resource languages"

This paper[9] describes a system to generate Spanish-Basque and English-Irish translations for image captions in Wikimedia Commons.

"Automatic Detection of Online Abuse and Analysis of Problematic Users in Wikipedia"

About an abuse detection model that leverages Natural Language Processing techniques, reaching an accuracy of ∼85%.[10] (see also research project page on Meta-wiki, university page: "Of Trolls and Troublemakers", research showcase presentation)

"Self Attentive Edit Quality Prediction in Wikipedia"

A method to infer edit quality directly from the edit's textual content using deep encoders, and a novel dataset containing ∼ 21M revisions across 32K Wikipedia pages.[11]

"TableNet: An Approach for Determining Fine-grained Relations for Wikipedia Tables"

From the abstract:[12]

"we focus on the problem of interlinking Wikipedia tables for two types of table relations: equivalent and subPartOf. [...] We propose TableNet, an approach that constructs a knowledge graph of interlinked tables with subPartOf and equivalent relations. TableNet consists of two main steps: (i) for any source table we provide an efficient algorithm to find all candidate related tables with high coverage, and (ii) a neural based approach, which takes into account the table schemas, and the corresponding table data, we determine with high accuracy the table relation for a table pair. We perform an extensive experimental evaluation on the entire Wikipedia with more than 3.2 million tables. We show that with more than 88\% we retain relevant candidate tables pairs for alignment. Consequentially, with an accuracy of 90% we are able to align tables with subPartOf or equivalent relations. "

"Training and hackathon on building biodiversity knowledge graphs" with Wikidata

From the abstract and conclusions:[13]

"we believe an important advancement in the outlook of knowledge graph development is the emergence of Wikidata as an identifier broker and as a scoping tool. [...] To unite our data silos in biodiversity science, we need agreement and adoption of a data modelling framework. A knowledge graph built using RDF, supported by an identity broker such as Wikidata, has the potential to link data and change the way biodiversity science is conducted.

"Spectral Clustering Wikipedia Keyword-Based Search Results"

From the abstract:[14]

"The paper summarizes our research in the area of unsupervised categorization of Wikipedia articles. As a practical result of our research, we present an application of spectral clustering algorithm used for grouping Wikipedia search results. The main contribution of the paper is a representation method for Wikipedia articles that has been based on combination of words and links and used for categoriation of search results in this repository. "

"Indigenous Knowledge for Wikipedia: A Case Study with an OvaHerero Community in Eastern Namibia"

From the abstract:[15]

"This paper presents preliminary results from an empirical experiment of oral information collection in rural Namibia converted into citations on Wikipedia. The intention was to collect information from an indigenous group which is currently not derivable from written material and thus remains unreported to Wikipedia under its present rules. We argue that a citation to an oral narrative lacks nothing that one to a written work would offer, that quality criteria like reliability and verifiability are easily comparable and ascertainable. On a practical level, extracting encyclopaedic like information from an indigenous narrator requires a certain amount of prior insight into the context and subject matter to ask the right questions. Further investigations are required to ensure an empirically sound approach to achieve that."

"On Persuading an OvaHerero Community to Join the Wikipedia Community"

From the abstract:[16]

"With an under-represented contribution from Global South editors and especially indigenous communities, Wikipedia, aiming at encompassing all human knowledge, falls short of indigenous knowledge representation. A Namibian academia community outreach initiative has targeted rural schools with OtjiHerero speaking teachers in their efforts to promote local content creation, yet with little success. Thus this paper reports on the effectiveness of value sensitive persuasion to encourage Wikipedia contribution of indigenous knowledge. Besides a significant difference in values between the indigenous community and Wikipedia we identify a host of conflicts that might be hampering the adoption of Wikipedia by indigenous communities."

References

  1. ^ Sneha Narayan, Nathan TeBlunthuis, Wm Salt Hale, Benjamin Mako Hill, Aaron Shaw: "All Talk: How Increasing Interpersonal Communication on Wikis May Not Enhance Productivity". To appear in Proc. ACM Hum.-Comput. Interact., Vol. 3, No. CSCW, Article 101. November 2019.
  2. ^ Tran, Chau; Champion, Kaylea; Forte, Andrea; Hill, Benjamin Mako; Greenstadt, Rachel (2019-04-08). "Are anonymity-seekers just like everybody else? An analysis of contributions to Wikipedia from Tor". 2020 IEEE Symposium on Security and Privacy (SP). pp. 186–202. arXiv:1904.04324. doi:10.1109/SP40000.2020.00053. ISBN 978-1-7281-3497-0. S2CID 211132791.
  3. ^ Im, Jane; Zhang, Amy X.; Schilling, Christopher J.; Karger, David (November 2018). "Deliberation and Resolution on Wikipedia: A Case Study of Requests for Comments". Proc. ACM Hum.-Comput. Interact. 2 (CSCW): 74–1–74:24. doi:10.1145/3274343. ISSN 2573-0142. S2CID 53246624. Closed access icon Author's copy
  4. ^ Hanika, Tom; Marx, Maximilian; Stumme, Gerd (2019-02-03). "Discovering Implicational Knowledge in Wikidata". Formal Concept Analysis. Lecture Notes in Computer Science. Vol. 11511. pp. 315–323. arXiv:1902.00916. doi:10.1007/978-3-030-21462-3_21. ISBN 978-3-030-21461-6. S2CID 59599786. ("more detailed version" of doi:10.1007/978-3-030-21462-3_21)
  5. ^ Hanika, Tom; Marx, Maximilian; Stumme, Gerd (2019). "Discovering Implicational Knowledge in Wikidata". In Diana Cristea; Florence Le Ber; Baris Sertkaya (eds.). Formal Concept Analysis. Lecture Notes in Computer Science. Springer International Publishing. pp. 315–323. doi:10.1007/978-3-030-21462-3_21. ISBN 9783030214623. Closed access icon Author's copy and slides
  6. ^ Xu, Luyan; Zhou, Xuan; Gadiraju, Ujwal (2019). "Revealing the Role of User Moods in Struggling Search Tasks". Proceedings of the 42Nd International ACM SIGIR Conference on Research and Development in Information Retrieval. SIGIR'19. New York, NY, USA: ACM. pp. 1249–1252. doi:10.1145/3331184.3331353. ISBN 9781450361729. Closed access icon (preprint: https://arxiv.org/abs/1907.07717)
  7. ^ Behnam Rahdari, Peter Brusilovsky: Building a Knowledge Graph for Recommending Experts. In KI2KG ’19: 1st International Workshop on challenges and experiences from Data Integration to Knowledge Graphs, August 05, 2019, Anchorage, AK. http://di2kg.inf.uniroma3.it/papers/DI2KG_paper_2.pdf
  8. ^ Heist, Nicolas; Paulheim, Heiko (2019-06-28). "Uncovering the Semantics of Wikipedia Categories". arXiv:1906.12089 [cs.IR].
  9. ^ Alberto Poncelas, Kepa Sarasola, Meghan Dowling, Andy Way, Gorka Labaka, Inaki Alegria: "Adapting NMT to caption translation in Wikimedia Commons for low-resource languages" http://ixa.eus/sites/default/files/dokumentuak/12789/Commons_Captions_NMT__SEPLN_2019.pdf
  10. ^ Rawat, Charu; Sarkar, Arnab; Singh, Sameer; Alvarado, Rafael; Rasberry, Lane (2019-05-21). "Automatic Detection of Online Abuse and Analysis of Problematic Users in Wikipedia". doi:10.5281/zenodo.3101511. {{cite journal}}: Cite journal requires |journal= (help)
  11. ^ Sarkar, Soumya; Reddy, Bhanu Prakash; Sikdar, Sandipan; Mukherjee, Animesh (2019-06-11). "StRE: Self Attentive Edit Quality Prediction in Wikipedia". arXiv:1906.04678 [cs.SI].
  12. ^ Fetahu, Besnik; Anand, Avishek; Koutraki, Maria (2019-02-05). "TableNet: An Approach for Determining Fine-grained Relations for Wikipedia Tables". arXiv:1902.01740 [cs.DB].
  13. ^ Sachs, Joel; Page, Roderic; Baskauf, Steven J.; Pender, Jocelyn; Lujan-Toro, Beatriz; Macklin, James; Compson, Zacchaeus (2019-11-06). "Training and hackathon on building biodiversity knowledge graphs". Research Ideas and Outcomes. 5: e36152. doi:10.3897/rio.5.e36152. ISSN 2367-7163.
  14. ^ Szymański, Julian; Dziubich, Tomasz (2017). "Spectral Clustering Wikipedia Keyword-Based Search Results". Frontiers in Robotics and AI. 3. doi:10.3389/frobt.2016.00078. ISSN 2296-9144.
  15. ^ Gallert, Peter; Winschiers-Theophilus, Heike; Kapuire, Gereon K.; Stanley, Colin; Cabrero, Daniel G.; Shabangu, Bobby (2016). "Indigenous Knowledge for Wikipedia: A Case Study with an OvaHerero Community in Eastern Namibia". Proceedings of the First African Conference on Human Computer Interaction. AfriCHI'16. New York, NY, USA: ACM. pp. 155–159. doi:10.1145/2998581.2998600. ISBN 9781450348300. Closed access icon
  16. ^ Mushiba, Mark; Gallert, Peter; Winschiers-Theophilus, Heike (2016). "On Persuading an OvaHerero Community to Join the Wikipedia Community". In José Abdelnour-Nocera; Michele Strano; Charles Ess; Maja Van der Velden; Herbert Hrachovec (eds.). Culture, Technology, Communication. Common World, Different Futures. IFIP Advances in Information and Communication Technology. Cham: Springer International Publishing. pp. 1–18. doi:10.1007/978-3-319-50109-3_1. ISBN 9783319501093. Closed access icon




Reader comments

2019-10-31

Wikipedia is in the real world

This Wikipedia essay was originally posted by Mangoe on April 13, 2007 - S

What you say in Wikipedia can have serious consequences.

Wikipedia is highly visible on the Internet: any Google or other search engine query on a subject for which Wikipedia has an article is likely to display that article on the first page of results, quite possibly as the first or second result. If you edit that article, anyone interested in the subject will be able to see what you wrote. They will also be able to track your activity across the site, in project and user pages as well as in articles. So anything you say here and anything you do here can have real-world consequences. Consider carefully what you write (or delete); keep in mind that you (and other people) can get hurt and experience real-life consequences, such as legal, employment or security issues.

Wikipedia is a public place

In Wikipedia, everyone can hear you talking.

It is tempting to view Wikipedia as something of a private club, but it is really much more like Hyde Park. In fact, since every edit on Wikipedia is logged and time- and date-stamped with your identity, an even more apt comparison would be talking through a megaphone in a public park while TV news cameras record and transmit your statements to the world.

Anyone who abides by the rules is welcome to edit; anyone with a web browser is welcome to read. Therefore, you should consider that you have about as much privacy as you would if you got on a soapbox in the town square and used a megaphone. The whole world can hear you, including your wife/husband/significant other, your children, your boss, your neighbors, spy agencies, the police, investigative reporters, Rush Limbaugh, Stephen Colbert, The New York Times, and the pope. If you don't want them to read what you're saying, you shouldn't post or edit it here.

Those outside readers, organizations and individuals will also read your words in the context of generally understood meanings, not Wikipedia-specific definitions. Appeal to Wikipedia rules and processes will not save you from misunderstandings or real-world consequences.

Wikipedia is not a role-playing game

Wikipedia is not edited by miniatures.
Editors in Wikipedia are (hopefully) writing about real-world subjects, not creating a fantasy world. Editors are not characters in a game; they are real people. You should not be here to gain experience points, create your own reality, play mind games with others, or engage in satisfying your taste for single combat. If you say something malicious about someone, you're saying it about a real person, and that real person may well get angry with you. Don't visualize your discussion opponents as NPCs in EditQuest or World of Wikicraft or Jimbo's Call; visualize them as someone sitting across the table from you. After all, the Golden Rule isn't just a rule; it's a good idea.

Don't count on your anonymity

Although the true identity of Wikipedia editors is not normally revealed within the site, and efforts to "out" editors are frowned upon, it is impossible to prevent attempts at unmasking editors. From time to time editors have been the subject of such attempts. Wikipedia cannot forestall the consequences of being identified, so the best course may be to edit defensively:

In Wikipedia, your mask can fall off.

Administrators and long-standing members of the community, having developed a high profile, can expect inverse surveillance. If passers-by and other editors think the recipient of your actions is being treated as a punching bag, they'll want to know why you're doing it. And to know why, they'll want to know who you are, even if Wikipedia culture values privacy.

Wikipedia keeps an Akashic record

Edits and discussions are kept forever unless suppressed.

All your contributions to Wikipedia, including comments in talk pages, edits to articles, comments in article for deletion discussions, etc., are kept forever by the wiki software unless suppressed. Anything that you say that has not been deleted by an administrator or via oversight will be available to anyone for research via your contributions page. The aggregation of all these contributions represents your public identity to others and can be used to make an assessment of your personal viewpoints, personality, edit patterns, and motivations.

An editor can ask administrators to delete content the editor now regrets: an offensive speech on a controversial topic, a personal photo on their user page, or their real name. However, even if an admin deletes this content on Wikipedia, it may remain on mirror websites that re-use WP content under license. As well, third parties who monitor Wikipedia and have access to cached "snapshots" of the encyclopedia at various points in time may notice that some content has been deleted by admins. The attempt to suppress the content may in turn stimulate interest in it – the so-called Streisand effect.

Real world conflicts are not different in Wikipedia

If you don't like controversy, you should stay away from editing controversial topics. And if you don't like being tagged with a position on a controversial topic, you should be very wary of editing articles on it. It's not like The Wizard of Id; if you write "The king is a fink!" here, everyone will see you doing it.

Wikipedia's visibility makes it a natural haunt of viewpoint pushers on political and social controversies. Even if you try to be scrupulously careful about avoiding POV edits, other editors working on the same topic may assume that you are a party to the dispute and assign you to one of the various camps. If this offends, annoys, or troubles you, you should consider staying out of the fray. And if being identified with one of the parties to the dispute would be difficult for you in real life, you should consider well the consequences of being identified, and refrain if you feel the stakes are too high.

The bottom line

Wikipedia is not a soapbox; but when people get angry about a cause, they often take their grievances to Wikipedia.

Take responsibility for your actions here, and you will be less likely to be surprised by any undesirable consequences of what you say and do. Use the preview button, and think before pressing "Save page". You can always self-revert, but what you said may remain.

See also

Essays

Articles

Policies and guidelines



Reader comments

2019-10-31

Welcome to Wikipedia! Here's what we're doing to help you stick around

Originally published on the WMF blog on 21 October 2019

Someone signing up to edit a Wikipedia page for the first time may feel a bit like a visitor to a new planet—excited, to be sure, but wildly overwhelmed.

Many new editors must wonder: Just how many millions of articles are in this vast online resource? And which of those millions of articles need that particular person's help?

More than 10,000 users open accounts on Wikipedia each day, and the English language version alone contains nearly six million articles. Many new users give up before they start, however, intimidated by how many different ways there are to get started, and how challenging it can be to learn to edit. Wikipedia wants their help, of course; its success requires an engaged community of contributors, providing wisdom at all levels to enhance the product and enrich the greater good.

With the goal of increasing new users' engagement, the Wikimedia Foundation Research team and their collaborators in the Data Science Lab at EPFL, a technical university in Switzerland, undertook a study to see how Wikipedia might solve this so-called "cold start" problem: how, by asking a few simple questions at the outset of someone's volunteering as an editor, it could nurture and coax that person, guiding them to the places where their expertise could be most helpful, easing them into the experience, and making it satisfying enough to encourage them to stick around. They published the results in the proceedings of the Thirteenth International AAAI Conference on Web and Social Media (ICWSM 2019).

Assembling data without feedback

Wikipedia needs to start by asking questions because of an age-old problem: since the users are only just signing up, Wikipedia doesn't yet know anything about them, and can't immediately guide them to a soft landing in their area of interest.

Moreover, other sites, like Amazon, Facebook, YouTube or Spotify, immediately generate a wealth of data on their users, from things they buy, to things they "like", to things they watch or listen to (and how long they watch or listen), to things they recommend.

That's explicit feedback. Wikipedia, however, has an implicit feedback system. It must infer from its data how a user feels about one of its articles, which is harder to do; if someone edits an article, do they like it? Or do they contribute in order to fix the issues they see in the article, even though they are not necessarily interested in the subject? You can't tell.

To set up the recommender system, the researchers sought to establish a questionnaire for new editors. They had a general architecture in mind: they didn't want to ask too many questions, fearing a burdensome process would turn people away, so they wanted to elicit as accurate a picture of a user's preferences as possible, as quickly as possible.

The main concept was to have pre-loaded lists of articles that users could react to: "I want to edit articles in list A, but not list B". The lists, the researchers knew, would have to have articles that were distinctive enough for the preferences to be meaningful. Once a user identified enough lists, the engine would have the data to extract articles that need editing that—hopefully—align with the user's interests.

With that structure in mind, the researchers considered populating lists based on the content of Wikipedia articles. That's an obvious choice, but it carried a drawback: some words have multiple meanings, so putting such a word in a list wouldn't necessarily answer the question of the user's interest.

The other option was picking topics based on how much editing they had received, which researchers called the "collaborative" method. That, too, had a drawback. Some topics get a lot of edits because they may have a lot of typos, for instance, while important topics may be non-controversial and not generate so many edits.

So the researchers decided to use a little bit of both the content method and the collaborative method, to leverage both of their strengths. That involved a complex series of calculations, ultimately optimized into a series of lists that could be presented to new editors.

One list might include the musicians Elvis Costello, Stevie Nicks, and Eddie Vedder. Another could itemize entries on the cities Sikeston, Missouri; Selma, Alabama; and Baltimore, Maryland. One list could feature scientific terms, such as Apomecynini, Desmiphorini, and Agaristinae, while another list offered other terms, like affective computing, isothermal microcalorimetry, and positive feedback.

Each term in a list would be related to another. As new users chose one list compatible with their interests and rejected others outside their expertise, the system would then learn which types of Wikipedia entries to ask those users to help edit.
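To make the hybrid idea concrete, here is a minimal sketch with invented article names and scores; the study's actual optimization was considerably more involved:

```python
# A minimal sketch, with invented article names and scores, of blending a
# content signal (how well an article matches the lists a user accepted)
# with a collaborative signal (how much editing attention it attracts).
# The study's actual optimization is considerably more involved.
def hybrid_score(content_sim, collab, alpha=0.5):
    """Weighted blend of two signals, each assumed to be scaled to [0, 1]."""
    return alpha * content_sim + (1 - alpha) * collab

candidates = {                      # (content similarity, edit activity)
    "Stevie Nicks":                 (0.9, 0.4),
    "Isothermal microcalorimetry":  (0.2, 0.1),
    "Baltimore, Maryland":          (0.5, 0.9),
}
ranked = sorted(candidates,
                key=lambda a: hybrid_score(*candidates[a]), reverse=True)
print(ranked)  # candidate articles to recommend first
```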

Reaction from users

Getting to that point took a lot of complicated math and computer science, partly because the researchers didn't want to mess with the actual workings of Wikipedia. (The site has a policy: "Wikipedia is not a laboratory".)

Once the surveys were established and the research project fully described on Wikimedia's Meta-Wiki, the researchers launched the experiment. They took all the users who signed up for an account on English Wikipedia from September to December 2018 and divided them into two sets: those with more than three edits, and those with three or fewer edits.

Once the groups were further sliced and diced, they were sent questionnaires of 20 questions each, in which each "question" was actually a comparison of two lists of 20 articles each—an extension of the illustration above. Once the users made their selections, they received 12 lists of recommendations, each with five articles to read or edit.

After meticulously analyzing the responses, the researchers found that the users were generally satisfied with the recommendations. Naturally, they also found room for improvement. For instance, they could do a better job of distinguishing whether users are editors or readers, and tailoring recommendations that way. And they could make recommendations for articles that really need editing, such as stubs or articles of low quality.

Another fix could be a timesaver for the users. Each question took about a minute to answer; the entire process took 20 minutes. That may drive some users away, but the researchers theorized that they could simply allow a user to answer as many questions as they would like.

Long term changes

In this experiment, researchers focused only on making sure the recommendations were of high quality and that the users were satisfied. They did not get at the bigger question the work is designed to answer: will the users stay on the site? In the future, the researchers will need to see whether they can improve retention, but noted that this "will need to be a different and longer experiment".

One alternative to the questionnaires could be establishing connections with other Wikipedia users. If the researchers pair a new user with other new users with similar interests, they might form a group that would collectively contribute to Wikipedia. And if they pair a new user with a veteran editor, that person could serve as a mentor to help them along.

Anything that encourages new users to stay and contribute should also help diversify Wikipedia. "The gender and ethnicity distribution of active Wikipedia users is currently very biased, and having a system for holding the hands of newcomers can help alleviate this problem", the researchers noted.

If you like the idea of that—either teaming with other users, or having a mentor—please let us know, and maybe we can arrange it. Perhaps you'll be part of the next experiment!

And if you're a new user, we look forward to playing 20 Questions with you!


Dan Fost is a freelance writer based in San Rafael, Calif.



Reader comments

2019-10-31

What's making you happy this month?

There are many opportunities to discuss bad news, problems, and concerns in the Wikiverse, and I think that having candid discussions about these issues is often important. Many days I spend more time thinking about problems than about what is going well. However, I also think that acknowledging the good side and taking a moment to be appreciative can be valuable.

I encourage you to add your comments about what's making you happy this month to the talk page of this Signpost piece.

The Commons Picture of the Day for 11 October 2019 was taken in Ukraine. The description is "Meadow at dawn near Desenka railway halt. Ukraine, Vinnytsia Oblast, Vinnytsia Raion." The photo was taken by User:George Chernilevsky.

I thank WMF Engineering and Site Reliability Engineering (specifically Marostegui, Samwilson, and Joe) for timely engagement with a problem that some users encountered in September. See phab:T232698. Also see the Wikimedia Production Excellence newsletter for August 2019.

Also, I would like to thank User:Ата for making rapid translations into Ukrainian on the few occasions that I have requested them, including the Ukrainian translation for this email's subject line. Ата appears to be a frequent translator.

Hello,

This week’s email is coming from a new sender, since Pine always says we’re also allowed to start the thread :)

I am thankful that the software migration from HHVM to PHP7 in production is mostly completed. Many people participated in this work, but some especially prominent (Phabricator) names seem to be Jdforrester-WMF, Krinkle, jijiki, Joe, and Reedy – apologies to the people that I inevitably missed. This migration also unlocks many code style improvements that were previously blocked on HHVM compatibility requirements, and Daimona has been very active here, modernizing code and configuration across lots of source code repositories: thank you!

The Commons Picture of the Day for 15 October 2019 was taken in Germany. The English description is "Gracht castle in Erftstadt, Rhein-Erft-Kreis (Germany)"; there are no other descriptions or captions yet. The photo was taken by User:A.Savin.

Additional translations of the subject line of this email would be appreciated on Meta.

What’s making you happy this week? You are welcome to write in any language. You are also welcome to start a WMYHTW thread next week.

Lucas
( https://meta.wikimedia.org/wiki/User:Lucas_Werkmeister )

PS: Did you know? In Germany, the week starts on Monday, not Sunday! I kept the email subject aligned to Sunday, though, in case referring to the “week of 14 October” would confuse anybody :)


— https://lists.wikimedia.org/pipermail/wikimedia-l/2019-October/093723.html

The bridge from HHVM to PHP7

Preface

User:Lucas Werkmeister started last week's WMYHTW email thread on Wikimedia-l and I was waiting for an Amharic translation, so I decided to delay this content from the week of 13 October to the week of 20 October. The Amharic translation was kindly provided by User:ክርስቶስሰምራ.


English Wikiquote of the Day for 30 September

Silence is an ocean. Speech is a river. When the ocean is searching for you, don't walk into the language-river. Listen to the ocean, and bring your talky business to an end. Traditional words are just babbling in that presence, and babbling is a substitute for sight.  
— Rumi


New affiliate recognitions from the Affiliations Committee


English Wiktionary Words of the Day

"pleach":
(transitive) To unite by interweaving, as (horticulture) branches of shrubs, trees, etc., to create a hedge; to interlock, to plash.

"dojo":
1. (martial arts) A training facility, usually led by one or more sensei; a hall or room used for such training.
2. (by extension) A room or other facility used for other activities, such as meditation or software development.
3. The dojo loach, Japanese weather loach, or pond loach (Misgurnus anguillicaudatus), a freshwater fish native to East Asia.

"orient":
1. (transitive) To build or place (something) so as to face eastward.
2. (transitive, by extension) To align or place (a person or object) so that his, her, or its east side, north side, etc., is positioned toward the corresponding points of the compass; (specifically, surveying) to rotate (a map attached to a plane table) until the line of direction between any two of its points is parallel to the corresponding direction in nature.
3. (transitive) To direct towards or point at a particular direction.
4. (transitive, reflexive) To determine which direction one is facing.
5. (transitive, often reflexive, figuratively) To familiarize (oneself or someone) with a circumstance or situation.
6. (transitive, figuratively) To set the focus of (something) so as to appeal or relate to a certain group. 7. (intransitive) To change direction to face a certain way.


Images from Commons


Product and Technology news


Newsletter news


Off wiki

The 2019 Nobel Peace Prize was awarded to Abiy Ahmed, the current Prime Minister of Ethiopia and a former cyberintelligence officer. He helped to resolve a lengthy armed conflict and has made reforms in Ethiopia's government. See https://www.theguardian.com/world/2019/oct/11/abiy-ahmed-ethiopian-prime-minister-wins-2019-nobel-peace-prize.

Halloween fun


Donation of 2000 medical images

User:Netha Hussain announced that Dr. Yale Rosen, a pathologist, agreed to donate his entire collection of approximately 2000 pathology images to Wikimedia Commons.


New affiliate recognition from the Affiliations Committee


Project milestone for Italian Wiktionary


English Wiktionary Word of the Day for October 14

"Woozle effect: The phenomenon whereby frequent citation of earlier publications leads to a mistaken public belief in something for which there is no evidence, giving rise to an urban myth." Regarding the etymology for this term, Wiktionary says in part: "A reference to the book Winnie-the-Pooh (1926) by English author A. A. Milne (1882–1956), in which the characters Winnie-the-Pooh and Piglet follow their own tracks in the snow, believing them to be the tracks of the imaginary 'Woozle'."


Survival and adventure in Watership Down

Watership Down "is a survival and adventure novel by English author Richard Adams, published by Rex Collings Ltd of London in 1972. Set in southern England, around Hampshire, the story features a small group of rabbits. Although they live in their natural wild environment, with burrows, they are anthropomorphised, possessing their own culture, language, proverbs, poetry, and mythology. Evoking epic themes, the novel follows the rabbits as they escape the destruction of their warren and seek a place to establish a new home (the hill of Watership Down), encountering perils and temptations along the way."

I first learned of these Watership Down quotes from User:OohBunnies!:

"All the world will be your enemy, Prince with a Thousand Enemies. And whenever they catch you, they will kill you. But first, they must catch you; digger, listener, runner, Prince with the swift warning. Be cunning, and full of tricks, and your people will never be destroyed."

"Look. Look. That's the place for us. High, lonely hills, where the wind and the sound carry, and the ground's as dry as straw in a barn. That's where we ought to be. That's where we have to get to."


"The Sound of Her Voice"

Many relationships in the Wikiverse involve remote communication. Someone who influenced my early days in the Wikiverse was User:Sonia, who graciously took me under her wing. I remember her as being intelligent and kind. She left the Wikiverse years ago, and I miss her. As far as I know, Sonia is alive and well, somewhere in the world. I wish that I could have met her in person.

Mindful of Sonia and the many other people that I know through remote communications, I am sharing a video clip from the Star Trek: Deep Space Nine episode "The Sound of Her Voice".

Some background information is necessary here; it contains plot spoilers. (For a more thorough summary of the episode and a commentary by Michelle Erica Green, see https://www.trektoday.com/reviews/ds9/the_sound_of_her_voice.shtml.) In this episode, Captain Benjamin Sisko and the crew of the USS Defiant (NX-74205) receive a distress call from Captain Lisa Cusak, the sole survivor of the destruction of her ship. The Defiant begins a six-day journey to rescue Captain Cusak. The crew members have voice conversations with Cusak during the journey and form friendships with her. Unfortunately, when the Defiant arrives at Cusak's location, she has died. The video clip that I link below shows the end of the episode: Sisko and the crew of the Defiant hold an Irish wake for their friend, and Chief O'Brien shares reflections that I think apply to many friendships, especially long-distance friendships like the ones many of us have in the Wikiverse.

https://www.youtube.com/watch?v=yryqc8RETgE


WMYHTW reflections

I am glad that people generally like these WMYHTW emails, and I appreciate the positive feedback about them. I think that these emails encourage a collegial environment on Wikimedia-l, and I hope that they are good for morale. However, they are time-consuming to write. I have a backlog of Wikimedia emails that I have not read, I want to complete a few of my long-delayed Wikimedia video tutorials, and I have many off-wiki demands on my time. As I wrote in September, I need to reduce the amount of time that I spend writing these pieces. Perhaps shorter contributions from me will leave other people feeling that there is more space for them to make their own WMYHTW contributions.

WMYHTW should not be about me, but I will say that, of the projects that I could do for the community, this has become one of my favorites. Before I started writing WMYHTW emails, I spent a lot of my time being vigilant for problems and wondering what would go wrong next. The habit of writing these pieces has slowly changed how I think. I thank the community members who accept and encourage the WMYHTW initiative, and the Wikimedians on Facebook who gave me the idea for this practice.


Regarding translations

Skillful translations of the sentence "What's making you happy this week?" would be very much appreciated. If you see any inaccuracies in the translations in this article then please {{ping}} User:Pine in the discussion section of this page, or boldly make the correction to the text of the article. Thank you to everyone who has helped with translations so far.
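
For anyone unfamiliar with pinging, a minimal example of such a talk page comment follows (assuming the standard {{ping}} template; the message text here is only illustrative):

{{ping|Pine}} The French translation in this article needs a small correction. ~~~~

The signature matters: the mention notification is delivered only if the comment is signed with ~~~~ in the same edit.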


Your turn

What's making you happy this month? You are welcome to write a comment on the talk page of this Signpost piece.


