Does AI level the playing field for underrepresented subjects? Or perpetuate systemic bias?
The logo of Women in Red, whose WikiProject has been heavily involved in creating articles based on the technology
Wired, Popular Science, The Verge, and others published stories on Quicksilver, a new artificial intelligence tool that finds articles missing from Wikipedia and writes short summaries for them. Users can head to Quicksilver's website to find the released list of 100 notable people.
In a blog post, the people behind the technology described how it works:
[The software can] read 500 million news articles, 39 million scientific papers, all of Wikipedia, and then write 70,000 biographical summaries of scientists. ... We are publicly releasing free-licensed data about scientists that we’ve been generating along the way, starting with 30,000 computer scientists. Only 15% of them are known to Wikipedia. The data set includes 1 million news sentences that quote or describe the scientists, metadata for the source articles, a mapping to their published work in the Semantic Scholar Open Research Corpus, and mappings to their Wikipedia and Wikidata entries.
The technology can also be used to help prevent Wikipedia articles from going "stale" and lagging behind the pace of events. In February 2018, Google announced that it was embarking on a similar project, but The Register described the resulting passages as "a bit difficult to read without clear capital letters at the start of new sentences, and most sentences have the same rigid structure", and the model was criticized for reliability issues. Quicksilver itself presents only short clippings from news articles strung together, and it focuses heavily on the people with the most mentions in the news, but it is a good place to start.
Bias in the big-data sources selected to feed the AI has been pointed out as a potential pitfall. Haaretz published a story titled "The Real Reason Sheldon Adelson's Wife Deserves a Wikipedia Page" about Miriam Adelson, who was listed among the original 100 figures. The story included this observation:
The initial data fed into the program was that of academics from the world of computer science, skewing the results in favor of that field from the outset. More so, a large number of those Quicksilver proposed for articles were American figures from the world of IT, suggesting that the initial dataset provided by the San Francisco-based company reflected its own location as much as their own backgrounds as engineers.
The sentiment was echoed in a "reflection" on-Wiki (permanent link), including this comment from Xcia0069: "Many of the sources are cheap news sites that aren't the most reliable interpretations of the research undertaken [and] a surprisingly high majority of the sampled of 100 scientists are from the USA".
The worries might be moot for us, though, if the output is incompatible with open licensing. The license for the work states: "The data contains sentences from news articles provided for the purpose of computational research and development. Copyright law applies to the text of these sentences which may limit its use."
In brief
Sarah Jeong
NYT journo's white jokes: The Daily Caller discusses an on-Wiki debate over whether to mention Sarah Jeong's "racist tweets" in the article about her. The controversial tweets were brought to the fore as she takes up a position on the New York Times editorial board. The Atlantic describes the "arcane rules" governing inclusion in articles while the Conservative Tribune attacks the neutrality of involved editors.
Deny this: BuzzFeed News reports that YouTube is now excerpting and linking to the Wikipedia article on global warming below videos on the subject, in an effort to combat climate change denial on its platform. In July, a Wikimedia Foundation employee revealed seven articles the video website plans to link to.
Token contribution: Wikipedia rival Everipedia goes live on the blockchain after a previous botched attempt, The Next Web reports. The decentralised encyclopaedia incentivises editors with its own cryptocurrency.
Wikimedia releases clothing: The Wikimedia Foundation announced that it had partnered with Los Angeles–based Advisory Board Crystals to release a Wikipedia-themed shirt. The $85 piece, which is sold out, was described by Vice as "the year's most improbable – and surprisingly wearable – streetwear collab."
Obsessed on Wikipedia: The podcast Obsessed by Joseph Scrimshaw dedicated an episode to interviewing writer Josh Fruhlinger about his experiences as a Wikipedia editor. The interview is mainly a humorous inside-baseball explanation of some of the procedures and subcultures of Wikipedia, such as an agonistic pluralism model for collaboration and a "nerdy rap battle" for supremacy over the article Central Link.
Arbitration Committee: The Arbitration Committee made the news again, with Wired writing about the history and work of "Wikipedia's Supreme Court", specifically referencing a case involving Philip Cross. (Previous Signpost coverage)
Guerrilla Skepticism on Wikipedia: A profile of Susan Gerbic, founder and leader of the Guerrilla Skepticism on Wikipedia (GSoW) project, was published by Voices of Monterey Bay.
Do you want to contribute to "In the media" by writing a story or even just an "in brief" item? Edit next week's edition in the Newsroom or leave a tip on the suggestions page.