The Signpost
Single-page Edition
WP:POST/1
8 May 2023

News and notes
New legal "deVLOPments" in the EU
In the media
Vivek's smelly socks, online safety, and politics
Recent research
Gender, race and notability in deletion discussions
Featured content
I wrote a poem for each article, I found rhymes for all the lists; My first featured picture of this year now finally exists!
Arbitration report
"World War II and the history of Jews in Poland" approaches conclusion
News from the WMF
Planning together with the Wikimedia Foundation
Special report
There Shall Be Seasons Refreshing – Stories from WikiConference India 2023
 

2023-05-08

New legal "deVLOPments" in the EU

European Commission designates Wikipedia a "Very Large Online Platform"

The European Commission has designated the first set of "Very Large Online Platforms" and "Very Large Online Search Engines" under the Digital Services Act

The European Commission has designated Wikipedia a "Very Large Online Platform" (VLOP) under the Digital Services Act. The same designation, used for platforms that reach at least 45 million active users per month, has also been applied to Alibaba AliExpress, Amazon Store, Apple App Store, Booking.com, Facebook, Google Play, Google Maps, Google Shopping, Instagram, LinkedIn, Pinterest, Snapchat, TikTok, Twitter, YouTube and Zalando.

The new designation means that Wikipedia will be required to comply with a set of new legal obligations. Some of these, related to targeted advertising and profiling, clearly don't apply to Wikipedia. However, the following may be relevant:

  • More user empowerment:
    • Users will be able to report illegal content easily and platforms have to process such reports diligently;
    • Platforms need to provide an easily understandable, plain-language summary of their terms and conditions, in the languages of the Member States where they operate.
  • Strong protection of minors:
    • Platforms will have to redesign their systems to ensure a high level of privacy, security, and safety of minors;
    • Special risk assessments including for negative effects on mental health will have to be provided to the Commission 4 months after designation and made public at the latest a year later.
  • More diligent content moderation, less disinformation:
    • Platforms and search engines need to take measures to address risks linked to the dissemination of illegal content online and to negative effects on freedom of expression and information;
    • Platforms need to have clear terms and conditions and enforce them diligently and non-arbitrarily;
    • Platforms need to have a mechanism for users to flag illegal content and act upon notifications expeditiously;
    • Platforms need to analyse their specific risks, and put in place mitigation measures – for instance, to address the spread of disinformation and inauthentic use of their service.
  • More transparency and accountability:
    • Platforms need to ensure that their risk assessments and their compliance with all the DSA obligations are externally and independently audited;
    • They will have to give access to publicly available data to researchers; later on, a special mechanism for vetted researchers will be established;
    • Platforms need to publish transparency reports on content moderation decisions and risk management.

In reaction to the Commission's announcement, the Wikimedia Foundation published a blog post titled "Wikipedia is now a Very Large Online Platform (VLOP) under new European Union rules: Here’s what that means for Wikimedians and readers". This describes some of the work on compliance that has been ongoing since last year and concludes:

While we think our movement is already doing a good job addressing the expectations of Wikipedia being a VLOP, compliance with the EU DSA is nonetheless a journey into uncharted territory that the Wikimedia movement cannot avoid taking.

Further context was also provided in a posting on the Public Policy mailing list by a WMF lead counsel, and in Wikimedia Europe's monthly EU Policy Monitoring report for April (both summarized here). – AK, H

Who speaks for Wikipedia? Mastodon accreditation reverted.

Checkmark on a verified Mastodon account

As reported in our previous issue, Wikipedia recently gained a presence on the federated social network Mastodon, in the form of the @wikipedia@wikis.world account. This happened without the Wikimedia Foundation's involvement, after various community suggestions that the WMF should itself establish such an account, alongside the official @wikipedia Twitter account that it operates, had fallen flat.

In general, if your Mastodon profile links to a website, and the website links back to the same Mastodon account with a rel="me" attribute, then Mastodon will display a "verified" checkmark next to that website link on the account profile. For a period of time the community-controlled[a] @wikipedia@wikis.world account was linked in this way to wikipedia.org, making it the verified Wikipedia account.
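
The link-back condition described above can be sketched in a few lines of Python. This is a minimal illustration, not Mastodon's actual implementation: the helper name `links_back`, the sample HTML and the profile URL form `https://wikis.world/@wikipedia` are assumptions made for the example, and only the standard library's `html.parser` is used.

```python
from html.parser import HTMLParser


class RelMeLinkFinder(HTMLParser):
    """Collects the href of every <a> or <link> tag whose rel contains "me"."""

    def __init__(self):
        super().__init__()
        self.rel_me_links = []

    def handle_starttag(self, tag, attrs):
        if tag not in ("a", "link"):
            return
        attr_map = dict(attrs)
        # rel is a space-separated token list, e.g. rel="me noopener"
        rel_tokens = (attr_map.get("rel") or "").lower().split()
        href = attr_map.get("href")
        if "me" in rel_tokens and href:
            self.rel_me_links.append(href)


def links_back(page_html, profile_url):
    """Return True if the page contains a rel="me" link to profile_url."""
    finder = RelMeLinkFinder()
    finder.feed(page_html)
    return profile_url in finder.rel_me_links


# Hypothetical portal markup and profile URL, for illustration only.
PROFILE_URL = "https://wikis.world/@wikipedia"
portal_with_link = '<a rel="me" href="https://wikis.world/@wikipedia">Mastodon</a>'
portal_after_revert = '<a href="https://wikis.world/@wikipedia">Mastodon</a>'

print(links_back(portal_with_link, PROFILE_URL))
print(links_back(portal_after_revert, PROFILE_URL))
```

In this sketch, removing the rel="me" attribute from the website is enough to make the check fail, which is why the verification ultimately depends on what the wikipedia.org portal serves.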

However, at the end of April, the WMF reverted, for now, the change to the Wikipedia portal that had allowed @wikipedia@wikis.world to be verified.

What this means: Since un-verification, another Mastodon account could claim to represent Wikipedia, but it won't show up as verified.

Nevertheless, the Mastodon account remains active, helping this project to reach the Fediverse/Mastodon audience.

There's some uncertainty about when the wikis.world Mastodon instance or other Mastodon instances will actually notice that the account has been unverified, but long-term the verification does depend on the wikipedia.org portal. – B, M, H

  a. not WMF-controlled

WMF proposes "an AI-assisted Wikipedia browsing experience"

Buried in the Wikimedia Foundation Annual Plan (draft) for Product & Technology is this provocative proposal: "an AI-assisted Wikipedia browsing experience". See meta:Special:Diff/24865392 – B

Brief notes

A Wiki Loves Earth image from Benin




2023-05-08

Vivek's smelly socks, online safety, and politics

A presidential candidate and undeclared paid editing

Vivek Ramaswamy

Vivek Ramaswamy – a long-shot Republican candidate for the 2024 United States presidential election – paid a Wikipedian to edit the article about himself, according to Mediaite (see also The New Republic, Forbes, and Yahoo!). The paid editor, Jhofferman, had earlier declared that he was paid, and even summarized two edits as "at subject's request." Jhofferman is the most active editor of the article, having made 97 (38.5%) of the edits. His most controversial edits removed mentions of Ramaswamy's role in Ohio’s COVID-19 Response Team and of his receipt of The Paul & Daisy Soros Fellowships for New Americans. Paul Soros was the older brother of George Soros, a perceived enemy of the MAGA crowd.

Ramaswamy's campaign, according to HuffPost, stated that the edits simply corrected "factual distortions" on "a number of topics, including family members’ names." Of Jhofferman's 97 edits to the article, The Signpost could only find 3 that "corrected" family members' names: two added "Ramaswamy" as his wife's family name, and one abbreviated his father's given name "Vivek Ganapathy" to V.G.

Even though Jhofferman is not an undeclared paid editor, he was reported at an administrators' noticeboard anyway. Commenters there seemed to be leaning toward an editing restriction for violating our rules, perhaps neutral point of view.

The Signpost can add that Ramaswamy appears to have repeatedly used undeclared paid editors on his biography article, and articles on two of his businesses, Axovant Sciences and Roivant Sciences. These 16 editors who have since been blocked as sock puppets include two members of the well-known Yoodaba sockfarm. One of those socks edited all three articles, the other only two articles. In total seven editors who were later blocked for socking edited the Axovant Sciences article; ten edited the Roivant Sciences article, and seven edited the Vivek Ramaswamy article. The biography article and the Roivant Sciences article were both created by now banned sock puppets from the Jbuffkin sockfarm. The Axovant Sciences article was edited by an anonymous account who declared a conflict of interest for the article and also edited the other two articles.

We remind our readers that the identity of an editor or their employer cannot be definitively proven, even with Wikipedia's near-complete record of edits; for example, an editor may be intentionally trying to embarrass an article subject with a Joe job.

Political candidates who are considering editing articles about themselves through undeclared paid editors or sock puppets should consider themselves notified that these editors are fairly easy to track on Wikipedia. We expect to further report on paid editing and socking by presidential candidates as the campaign progresses. – S

Encyclopaedia exemption to online safety bill?

The BBC reports that Wikipedia will not perform the age checks that will likely be required under the UK's proposed Online Safety Bill, which is meant to protect children. See also [1], [2], [3], [4]

On the surface, the two methods of protecting children are contradictory. The online safety bill could require children to register accounts with their names and ages, and require sites to store this data. According to Rebecca MacKinnon of the WMF, Wikipedia has a "commitment to collect minimal data about readers and contributors", which is a method of protecting all Wikipedia editors. Fortunately, the House of Lords has debated an amendment that would exempt encyclopaedias and other websites "provided for the public benefit". – S

Compromise with paid editors?

"Wikipedia's influence grows" in Axios discusses how companies can help write encyclopedia articles, presenting the viewpoints of a reputation management company as well as those of several unnamed Wikipedians. This is a couple of steps above how Entrepreneur used to write similar articles, which didn't have input from Wikipedians.

Summarizing some of Axios's main points, they tell companies:

  • If a client or company is in the news, communication and PR teams should make sure that the Wiki page is up to date, because it'll inevitably see a surge in views.
  • "You must have a presence on Wikipedia, and you must be able to correct or update your presence there — which is easier said than done," quoting the CEO of the reputation management firm.
  • Flowery brand language or spin, ... makes editors skeptical of PR professionals and corporate communicators.

And they say that Wikipedians say:

  • Create a Wikipedia account, do not attempt to edit anonymously
  • Always disclose conflicts of interest
  • Sourcing is a must. Every edit must be cited by a secondary source — and no, your company website or personal records don't count.

There's some obvious tension in the two viewpoints. What would happen if a company maintained a presence on Wikipedia and tried to keep the article about their company up to date, while avoiding anonymous edits, disclosing their conflicts of interest, and adequately sourcing every edit?

It would be a great advance from the current conditions, of course. But what's going to happen when Wikipedians and the company representatives disagree? Wikipedians almost always disagree about major edits, even among ourselves. Some Wikipedians will want it one way and won't be shy about telling the company employees. The employees generally won't disagree among themselves. They will be thinking about their paychecks and what their bosses want. Who will win?

Axios seems to suggest that a compromise can be reached. The Signpost is skeptical. Would the companies really want to work this way? Would Wikipedians really be willing to compromise so easily? Perhaps more importantly, if everybody could agree on what not to include, we would end up having some really boring articles.

The "Wikipedia's influence grows" claim in the title rests on a recent report by the Washington Post, titled "Inside the secret list of websites that make AI chatbots sound smart". Axios summarizes it as saying that "these tools focused on three key websites — 'patents.google.com No. 1, which contains text from patents issued around the world; wikipedia.org No. 2, the free online encyclopedia; and scribd.com No. 3, a subscription-only digital library.'" However, it is worth noting that, according to the WaPo's numbers, this second-place ranking corresponded to Wikipedia contributing only 0.19% of the tokens in the text corpus in question – much less than, for example, the 3% of the training data for OpenAI's GPT-3 model that is said to have come from Wikipedia.

Axios also states that "According to Wikistats, the data gathering function within the Wikimedia organization, Wikipedia saw 26 billion total page views in March alone. In the last year, the site received 279 billion unique views [sic], which is a 22% increase year over year." However, as cautioned in the small print of Wikistats, "this data shows page views from automated traffic as well as human traffic." When switched to display "human user page views", the same site instead indicates a 4.5% drop from 2021 to 2022 (although for more recent months, the WMF's more detailed "Movement Metrics" analysis indicates a modest year-over-year growth again).

S, H

African Journalism Award: Open the Knowledge

Map showing Francophone Africa
Francophone Africa – too diverse for inclusion?

A number of African media outlets, among them the Nigerian Guardian, are reporting on the WMF's "Open the Knowledge" African Journalism Awards announced on 3 May 2023 in a WMF press release:

Africawide – The Wikimedia Foundation, the non-profit that operates Wikipedia and other Wikimedia projects, is today launching the inaugural Open the Knowledge Journalism Awards. Coinciding with the 30th anniversary of World Press Freedom Day, this year’s awards celebrate the contributions of journalists in Africa who prioritize diversity, equity and inclusion in their reporting.

African journalists living on the continent (including active Wikimedians, but excluding Wikimedia staff) can submit their articles from May 3 to June 30, 2023. Articles must have been published online between January 1, 2022 and June 23, 2023. Admissible topics are:

  • Arts, Culture, Heritage, and Sports
  • Health, Climate Change, and Environment
  • Women and Youth
  • Digital and Human Rights

In what is surely a blow to diversity and inclusion, however, only English-language articles may be submitted. The WMF will not even accept translations of articles published in any other language.

This seems like an unfortunate exclusion, given that close to half of Africa by area is Francophone and there is a significant amount of African journalism in Arabic, Swahili, Portuguese, Afrikaans and many other languages. – AK

Wikipedia too powerful?

Former Arabic Wikipedia administrators Osama Khalid (left) and Ziyad Alsufyani (right), both in prison in Saudi Arabia (see previous Signpost coverage).

The UK Telegraph published a 3000-word article (archive copy) titled "How Wikipedia became too powerful". Featuring interviews with Wikipedians like Rich Farmbrough, WMF CEO Maryana Iskander, Wikimedia UK CEO Lucy Crompton-Reid and Wikipedia co-founder Larry Sanger among others, the article starts out by reminding readers of the two Wikimedians imprisoned in Saudi Arabia:

What is the price of information? For Osama Khalid, one of the hundreds of thousands of volunteers who edit Wikipedia, the tariff was 32 years in jail – his punishment for 'violating public morals' by posting news 'deemed to be critical' of the Saudi regime. Ziyad al-Sofiani, his fellow 'admin' (as senior Wikipedia editors are known) was handed eight years. Their jail terms were reported by activist groups around the same time that Wikimedia, the online encyclopaedia's parent foundation, revealed that it had banned 16 users in the Middle East and North Africa region for 'editing the platform in a coordinated fashion to advance the aim of [external] parties'. Alleged spies for the Saudi government, in other words, trying to manipulate the truth. And these days, if you want to control 'the truth', you want to control Wikipedia.

The article goes on to say that Wikipedia gets almost as many visits as Twitter, about half as many as Facebook – and about ten times as many as the BBC. Wikimedia Foundation CEO Maryana Iskander, however, feels this level of influence is in good hands:

'There are enough checks and balances in the system,' she says. Chief among them, she notes, are the 'human army of truth tellers out there trying to ensure that information remains accurate and reliable and neutral'. And that means the volunteers like Osama Khalid.

The article goes on to provide a well-researched overview of Wikipedia topics such as edit wars, volunteer motivation, Wikipedia's gender imbalance and Wikipedia bureaucracy and ends with a summary of plans for the future growth of Wikipedia:

Iskander says that in future she wants to focus on Wikipedia becoming as comprehensive in its hundreds of other language versions as it is in English. The danger then might not be the danger of regulatory extinction, but of Wikipedia becoming too powerful – a ubiquitous single source of truth. It has already become, for example, the go-to source for smart speakers and voice assistants dishing out information on demand. And Wikimedia is keen to go further by vacuuming up the archives and data at specialist institutions. Iskander thinks it’s a good deal for such institutions, because the size of Wiki's audience brings such huge exposure. 'More eyeballs on your content is going to increase interest.'

It sounds like the take-it-or-leave it power play of information’s market leader. Iskander insists it's all for the greater good. But she does admit that the rapid growth, influence and reach achieved by Wikipedia's decentralised structure means it is a model that is being closely followed. 'I think more governments, corporations, organisations are heading in this direction,' she says. 'Wikipedia stands for a way of human engagement and a way of human interaction that the world would benefit from in other spheres, too.'

AK

In brief

One Montgomery Tower
  • Russian fines: A Russian court fined the Wikimedia Foundation two million roubles ($24,510), according to Reuters, for publishing a well-documented story on Russia's invasion of Ukraine. It was the seventh such fine imposed on the WMF this year, bringing the total to 8.4 million roubles ($103,000). While there is no indication that the WMF is willing to pay the fine, or could even if it wanted to, Digital Affairs Minister Maksut Shadaev said "We are not blocking Wikipedia yet, there are no such plans for now."
    Relatedly, a school teacher from Orsk was sentenced to a fine of 30,000 roubles ($387) for "discrediting the army", after she had printed out and displayed a Wikipedia article about the Russian invasion of Ukraine. As reported by SOTA project (translated into English by the widely read @ChrisO_wiki Twitter account), the teacher "was supposed to prepare materials for an information stand on a 'special military operation' and without reading it she asked her colleague to print out the first text on the subject she found on the Internet", which turned out to be the Wikipedia article. @ChrisO_wiki commented that "as far as I'm aware this is the first time that Wikipedia *users* have been fined for using its content - a concerning precedent given that Wikipedia is one of the few uncensored sources of information still available to Russians."
  • False alarm: The San Francisco Standard reported a bomb threat targeting One Montgomery Tower on April 27. Though the report did not name any specific tenant, the building's 16th floor houses the Wikimedia Foundation headquarters. A KRON-TV followup stated that no device was found.
  • RIP or rip?: AI is tearing Wikipedia apart, according to the alarmist Vice headline. The body of the story is much calmer, often quoting Amy Bruckman, a professor at Georgia Tech. She says that AI should only be used for first drafts of articles and must be checked by real people, but that "I would put the genie back in the bottle, if you let me. But given that that's not possible, all we can do is to check it."
  • WikiConference India: Nearly 200 Wikipedians from India and its neighbors attended the conference in Hyderabad, 27–30 April. [5] [6] See this issue's special report for further coverage.
  • Retraction, or else: The BBC's two-part documentary India: The Modi Question about Indian Prime Minister Narendra Modi is banned in India but easily viewed in the UK. Livemint and Newslaundry report that an Indian court has summoned the BBC, the Wikimedia Foundation, and the Internet Archive about the documentary. The connection of the WMF to the legal action is unclear as there is no copy of the video on Wikipedia or at Wikimedia Commons, though at the end of the English-language article there are links to each part of the documentary. The Internet Archive confirmed that it had removed links to the BBC documentary after the suit was filed but told BoingBoing this was due to a DMCA takedown request filed by the BBC itself. Tweets about the programme had previously been removed from Twitter.
  • Fante Wikipedia: News outlet Modern Ghana reports that Wikipedia is now also available in Fante, a dialect spoken by an estimated two million people in Ghana. It was the mother tongue of Kofi Annan, the seventh secretary-general of the United Nations. The Fante Wikipedia can be accessed at fat.wikipedia.org.



Do you want to contribute to "In the media" by writing a story or even just an "in brief" item? Edit the next edition in the Newsroom or leave a tip on the suggestions page.





2023-05-08

Gender, race and notability in deletion discussions


A monthly overview of recent academic research about Wikipedia and other Wikimedia projects, also published as the Wikimedia Research Newsletter.


"How gender and race cloud notability considerations on Wikipedia" – or do they?

Reviewed by XOR'easter

The March 2023 paper "'Too Soon' to count? How gender and race cloud notability considerations on Wikipedia", by Lemieux, Zhang, and Tripodi,[1] claims to have unearthed quantitative evidence for gender and race biases in English Wikipedia's article deletion processes:

Applying a combination of web-scraping, deep learning, natural language processing, and qualitative analysis to pages of academics nominated for deletion on Wikipedia, we demonstrate how Wikipedia’s notability guidelines are unequally applied across race and gender.

Specifically, the authors

"[...] explored how metrics used to assess notability on Wikipedia (WP:Search Engine Test; “Too Soon”) are applied across biographies of academics. To do so, we first web-scraped biographies of academics nominated for deletion from 2017 to 2020 (n = 843). Next, we created a numerical proxy for each subject's online presence score. This value is meant to emulate Wikipedia's “Search Engine Test,” (WP:Search Engine Test) a convenient and common way editors can determine probable notability before nominating a biography for deletion. [...] We also conducted a qualitative analysis of the discussions surrounding deleted biographies labeled “Too Soon,” (WP:Too soon). Doing so allowed our research team to assess if gender and/or racial discrepancies existed in deciding whether a biography was considered notable enough for Wikipedia. We find that both metrics are implemented idiosyncratically."

However, this makes a manifestly and indefensibly incorrect claim about how Wikipedia editors judge topics for notability. The paper also attempts to back that claim up with misleading quotations and with numbers that are dubious on multiple levels.

Background

On English Wikipedia, the status of being significant enough to warrant an article is, in the community lingo, notability. This is related to, but more specialized than, the everyday idea of "noteworthiness". Notability is evaluated with the aid of various guidelines that codify community norms and experience, prominent among which is the General Notability Guideline (GNG). Also of relevance for the matter at hand is the specialized guideline applicable to scholars, academics, and educators, known by various abbreviations like WP:PROF. This guideline lays out eight criteria, meeting any one of which is sufficient qualification for notability.

Notability is not based on counting raw search-engine results. As the documentation page for "Arguments to avoid in deletion discussions" succinctly puts it:

Although using a search engine like Google can be useful in determining how common or well-known a particular topic is, a large number of hits on a search engine is no guarantee that the subject is suitable for inclusion in Wikipedia. Similarly, a lack of search engine hits may only indicate that the topic is highly specialized or not generally sourceable via the internet. WP:BIO, for instance, specifically states, Avoid criteria based on search engine statistics (e.g., Google hits or Alexa ranking). One would not expect to find thousands of hits on an ancient Estonian god.

Misrepresentation of Wikipedia policies, guidelines, and essays

The problems begin early in the paper. Referring to the article Katie Bouman, Lemieux et al. observe that it was put up for deletion a mere two days after its creation. They write, "Eventually, her page received a “snow keep” decision, indicating that her notability might be questionable but that deleting her page would be too much of an uphill battle to pursue (WP:SNOW)." This is not the meaning of the snowball clause. Instead, invoking WP:SNOW is a statement that the conclusion of a discussion is already obvious: the chance of its changing direction is the proverbial snowball's chance in hell, and letting it run for a deletion debate's typical period of a full week would be a waste of everyone's time. The conclusion is not that Bouman's notability was "questionable", but rather that it was obvious. The deletion debate for Bouman's biography did not end "eventually". It ended one hour and seventeen minutes after it began. Multiple !voters found the nomination sexist, either in intent ("Jealous bros should not cry each time a woman is part of an achievement") or in effect ("Shit like this is exactly why en.wp is such a sausage party").

In their literature review, Lemieux et al. state incorrectly that "Of the more than 1.5 million biographies about notable writers, inventors, and academics on English-language Wikipedia, less than 20% are about women." There are over 1.5 million biographies total; most of them are not about writers, inventors, or academics.[note 1] They provide two sources for this claim. The first, a primary source, states that the figure is for the total number. The second, a 2021 paper by Tripodi,[2] makes the incorrect claim in its abstract and provides no citation for it.[note 2]

Policies, guidelines, and essays are, in Wikipedia jargon, different types of behind-the-scenes documents. The terms denote a descending sequence:

Policies have wide acceptance among editors and describe standards all users should normally follow. [...] Guidelines are sets of best practices supported by consensus. Editors should attempt to follow guidelines, though they are best treated with common sense, and occasional exceptions may apply. [...] Essays are the opinion or advice of an editor or group of editors for which widespread consensus has not been established. They do not speak for the entire community and may be created and written without approval.

Lemieux et al. describe both WP:Search engine test and WP:Too soon as "metrics used to assess notability on Wikipedia". Neither page is such a thing.

The "Search Engine Test"

Lemieux et al. write that the "Search Engine Test" (WP:Search engine test) is "a convenient and common way editors can determine probable notability before nominating a biography for deletion". Despite its supposed convenience and commonality, they provide no examples where it is actually invoked by name. Any attempt to find such examples, or even a cursory inspection of the WP:Search engine test page itself, would reveal the truth: their description of it is completely erroneous. Lemieux et al. claim to emulate the WP:Search engine test by computing "a numerical proxy for each subject's online presence score". The page contains no such procedure. As observed above, none exists, or ever should exist. WP:Search engine test merely explains some uses of search engines and some reasons why their results must be treated with caution. It explicitly states, "A raw hit count should never be relied upon to prove notability", and it provides multiple reasons why. Checking the discussions in which it is invoked is illuminating. For example, in one debate, a !voter admonished, "articles are not assessed based on the number of search results that come up after Googling them. For obvious reasons, this is a poor criteri[on] as you may get tens of thousands of results for inconsequential searches, and some noteworthy topics may not receive a particularly large amount of search results. See Wikipedia:Search_engine_test#Notability. Instead, articles should be assessed based on the criteria such as WP:GNG." In other words, pointing to WP:Search engine test means exactly the opposite of what Lemieux et al. make it out to mean. This misrepresentation of WP:Search engine test also occurs in Tripodi's earlier paper, but it is more prominent in this one.[note 3]

As to its commonality, the numbers are stark enough to be indicative. The page WP:Search engine test was viewed 1,190 times over the year prior to this writing.[supp 5] The page WP:Notability, which includes the actual General Notability Guideline, was viewed 143,574 times over the same period.[supp 6] Even the specialized guideline WP:PROF was viewed 12,675 times, an order of magnitude more than the page that Lemieux et al. say is commonly used.[supp 7] As of this writing, 8,297 other pages link to the WP:PROF guideline, while only 1,090 link to the search engine test page,[supp 8][supp 9] though according to Lemieux et al. the latter is what applies to any subject, not just the niche domain of academic biographies. If the goal is to determine whether or not Wikipedia's self-proclaimed standards are being applied equitably, then the evaluation should consider the standards that are actually being proclaimed.
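
The comparison above is simple arithmetic, and it can be reproduced in a short Python sketch. The counts are the ones quoted in this review; the helper name `ratio_to_baseline` is introduced here purely for illustration.

```python
# Counts quoted in the review: page views over the prior year,
# and the number of other pages linking to each page.
views = {
    "WP:Search engine test": 1_190,
    "WP:Notability": 143_574,
    "WP:PROF": 12_675,
}
incoming_links = {
    "WP:Search engine test": 1_090,
    "WP:PROF": 8_297,
}


def ratio_to_baseline(counts, baseline="WP:Search engine test"):
    """Express every other page's count as a multiple of the baseline page's count."""
    base = counts[baseline]
    return {page: count / base for page, count in counts.items() if page != baseline}


view_ratios = ratio_to_baseline(views)
link_ratios = ratio_to_baseline(incoming_links)

for page, ratio in view_ratios.items():
    print(f"{page}: {ratio:.1f}x the views of WP:Search engine test")
for page, ratio in link_ratios.items():
    print(f"{page}: {ratio:.1f}x the incoming links of WP:Search engine test")
```

On these figures, WP:PROF drew somewhat more than ten times the views of the search engine test page, and the main notability guideline drew more than a hundred times as many.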

A central assertion of Lemieux et al. is that they quantified whether the "Search engine test" is being equitably utilized using a metric they call the "Primer Index", after the software employed to calculate it. This metric is based upon the number of times that an individual is mentioned in a news article and in which context, which evidently is at best an indirect perspective upon any of the WP:PROF criteria. The authors continue: "To confirm the validity of the “Primer Index,” we also created a "Google Index" to approximate the total number of hits that appear when an academic's full name and occupation are searched on Google. Using a custom Google Sheets code, we extracted an academic's full name and occupation from Wikidata and automatically searched Google for every instance of “full name + occupation” for each academic in our dataset on the same day." At this juncture, see again the cautionary words of "Search engine test" itself, and the more popular link for making the same point, the "Arguments to avoid" page quoted above. Lemieux et al. nominally validated their "Primer Index" based on an argument that is literally one of the arguments we tell everyone to avoid. This supposed confirmation rests on no foundation at all. Having made this claim, Lemieux et al. write that "our data indicate that BIPOC biographies who meet Wikipedia's criteria (i.e., above the White Male Keep median Primer Index of 12.00) were among those deleted." Because the "Primer Index" and "Google Index" are completely unmoored from Wikipedia practice, there is no reason to equate passing some threshold of them with meeting "Wikipedia's criteria".

The evident limitations of the "Google Index" are, in fact, an excellent indication of why a guideline like WP:PROF is necessary. Any search of the form "full name + occupation" will omit, or at best provide a paucity of, hits from within books, the text of paywalled journal articles, and bibliographies including citations to the person in question. Furthermore, it will penalize those who are written about in languages other than English. The various criteria of WP:PROF, which start with citation databases but decidedly do not end there, provide a far less superficial understanding of article-worthiness. It is also the case that the consensus-building in AfD's reliant upon WP:PROF allows for greater discernment and flexibility where differences between specializations are concerned. (Lemieux et al. do not differentiate between disciplines, despite the strong likelihood that some fields are more heavily advertised and more charismatic than others. Think of pop psychology versus pure mathematics, for example.)

Even if we set aside the concerns about the basic premises, a problem with the analysis remains. Suppose, for the sake of the argument, that we grant that the "Primer Index" measures something of interest. Lemieux et al. write that white males whose biographies were kept rated significantly higher on the Primer Index than those whose biographies were deleted. They compare the median Primer Indices of these two groups using the Kruskal–Wallis test and report that the test gives a small p-value. In contrast, the p-values for white women, for BIPOC men, and for BIPOC women were all large. Lemieux et al. conclude that "There was no statistically significant difference in the median Primer Index between kept and deleted pages for white women or for BIPOC academics", and thus that the Primer Index is not an accurate predictor of Wikipedia persistence for female and BIPOC academics. But a large p-value on the Kruskal–Wallis test is not itself sufficient to conclude that the distributions being compared are the same. (The documentation for the software the authors use says as much.[supp 10]) Indeed, per their figure 2, the difference in sample medians between kept and deleted biographies for BIPOC women was even larger than that for white men. The apparent differences in the statistics across these groups may be due, in whole or in part, to the large variations in sample sizes: 419 white men, but only 185 white women, 171 BIPOC men, and 69 BIPOC women.[supp 11]
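The point about sample sizes can be illustrated with a small simulation. The following is a minimal sketch, not a reanalysis: it uses synthetic normally distributed scores rather than the paper's data, a hand-written two-group Kruskal–Wallis test in place of the GraphPad Prism software the authors used, and two assumptions that are purely hypothetical, namely a fixed shift of 0.3 standard deviations between "kept" and "deleted" scores and an even split of each group (210 per arm as a stand-in for the 419 white men, 35 per arm for the 69 BIPOC women).

```python
import math
import random


def kruskal_wallis_two_groups(a, b):
    """Return (H, p) for two samples; assumes continuous data with no ties."""
    pooled = sorted([(x, 0) for x in a] + [(x, 1) for x in b])
    rank_sums = [0.0, 0.0]
    for rank, (_, group) in enumerate(pooled, start=1):
        rank_sums[group] += rank
    n = len(pooled)
    h = 12.0 / (n * (n + 1)) * (
        rank_sums[0] ** 2 / len(a) + rank_sums[1] ** 2 / len(b)
    ) - 3 * (n + 1)
    # With two groups, H is compared to a chi-square distribution with one
    # degree of freedom, whose upper tail equals erfc(sqrt(H / 2)).
    p = math.erfc(math.sqrt(max(h, 0.0) / 2.0))
    return h, p


def estimated_power(n_per_group, shift, trials=300, alpha=0.05, seed=0):
    """Fraction of simulated datasets in which the test reports p < alpha."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        kept = [rng.gauss(shift, 1.0) for _ in range(n_per_group)]
        deleted = [rng.gauss(0.0, 1.0) for _ in range(n_per_group)]
        _, p = kruskal_wallis_two_groups(kept, deleted)
        hits += p < alpha
    return hits / trials


# The same true difference (hypothetical shift of 0.3) at two sample sizes.
power_large = estimated_power(n_per_group=210, shift=0.3)
power_small = estimated_power(n_per_group=35, shift=0.3)
print(f"large groups: significant in {power_large:.0%} of simulated datasets")
print(f"small groups: significant in {power_small:.0%} of simulated datasets")
```

In this toy setting the underlying difference is identical in both cases, yet the smaller groups yield a "non-significant" result far more often. That is the sense in which a large p-value by itself cannot establish that two distributions are the same.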

The Titular "Too Soon"

The page Wikipedia:Too soon is an essay, not a guideline and certainly not a policy. It is largely a collection of pointers to other documentation pages with some commentary. The gist is given in the first section: "Sometimes, a topic may appear obviously notable to you, but there may not be enough independent coverage of it to confirm that. In such cases, it may simply be too soon to create the article." Lemieux et al. emphasize this page from their title onward, but most of what they have to say about it is misguided.

Lemieux et al. write: "'Too Soon' is a technical label developed by Wikipedians indicating that a subject lacks sufficient coverage in independent, high-quality news sources to have a page." It is hardly "technical": the meaning is an example of the everyday sense of the phrase. Moreover, the language of the essay is not "news sources", but "independent secondary reliable sources", with a clarifying link to the Wikipedia:Reliable sources guideline. The latter phrasing is more general than the former. A mathematics textbook or a medical review article is most likely independent, secondary, and reliable, but neither is news. Not only do Lemieux et al. drastically oversell the importance of WP:Too soon, they fail to summarize its contents accurately.[note 4] It is puzzling how anyone would come to believe that "too soon" means anything other than what it says on the tin. How, one wonders, would Wikipedia editors then describe topics that look like they will never be notable?

The initialism "AfD" is short for "Articles for deletion", the area in which deletion debates about articles occur. Lemieux et al. follow this usage, and so shall this comment. They write: "First, let us illustrate the usage of WP:Too soon per Wikipedia guidelines. The following excerpt is from the AfD for the biography of a white, male, assistant professor who was nominated for deletion under the tag WP:Too soon: 'Most of the newspaper articles cited in the main article are not directly related to the subject, and apart from this brief article in the Dainik Jagran that borders on being a hagiography of the subject, there's no real coverage for WP:GNG. WP:Too soon perhaps.'" The biography in question was for a Shivendu Ranjan. Moreover, the tag WP:TOOSOON was not used in the nomination. Instead, the nominator stated that Ranjan failed the GNG. The !voter being quoted began their rationale by saying that "there is no evidence that the subject meets WP:ACADEMIC" (another pointer to the same guideline as WP:PROF). Most of the !vote is a point-by-point explanation of why the editor believes that the WP:PROF criteria are not met. The plain reading of the "WP:TOOSOON perhaps" at the end is that the !voter believed Ranjan's situation could change in the future. Further examples of misleading quotation will be noted below.

They go on: "Despite the academic being an assistant professor, the moderator [sic] focused on media coverage, not the career stage, of the subject which is in accordance with the Wikipedia guidelines of the tag WP:Too soon." The essay WP:TOOSOON says nothing specifically about academic career stages. Most of it is about movies. The only profession that is specifically discussed is acting.

Lemieux et al. state that they collected the career stages of each individual designated WP:Too soon, using a pair of research assistants to identify these stages manually. Their stated rationale is that, since perceived notability among academics is highly contingent on their rank (Adams et al., 2019; WP:Notability (academics)), academic careers were scored based on stage, with assistant professors being scored as 1, associate professors with 2, and so on. This methodology is flawed from the outset. It confuses correlation with contingency: one cannot neglect citation metrics and the other success indicators described in WP:PROF when talking about how academic-biography AfD's evaluate career status. The only time that "career stage" as Lemieux et al. think of it factors into a WP:PROF judgment is if the subject has attained the named chair/Distinguished Professor level, because these indicate high levels of accomplishment. Lemieux et al. confuse both the status of WP:PROF versus WP:TOOSOON (guideline versus essay) and the logical roles those pages play in the very !votes they quote. WP:TOOSOON is not the means by which notability is evaluated; instead, invoking it is a means of speculating about the background of why the actual notability guideline is not met and noting that the situation may change.[note 5]

The claim that Wikipedia does not count trainees, research scientists, and/or government workers as “academics” is flatly untrue. Plenty of IEEE Fellows work in industry and are notable per WP:PROF#C3, for example. And when articles on students appear at AfD, the community files them with the other AfD's on academics and educators. Students and trainees are academics and are evaluated as such. They often fall short of notability, not because of the label "student", but because students infrequently stand out by the criteria that are actually relied upon. Because Lemieux et al. erroneously believe that Wikipedia regards all these people as "non-academics," they score all biographies thereof with a 0 on their career-stage scale, regardless of the subject's actual career stage. For example, a graduate student at a university would be scored the same as a senior research scientist at a major corporation. Upon these meaningless numbers, Lemieux et al. then do bad arithmetic, computing an "average career stage [...] by calculating the sum of the career stage scores and dividing this by the total number of entries". Career stage is a qualitative or categorical variable, so its numerical mean is unlikely to be meaningful. What job is 0.76 of the way between assistant and associate professor?[supp 16]
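A toy example shows why averaging such codes is fragile. In the sketch below, all data are invented and the stage labels for codes 3 and 4 are assumptions made for illustration; nothing is taken from the paper's dataset. Two made-up groups are compared under the 1, 2, 3, 4 coding and under a second coding that preserves exactly the same ordering of stages but spaces them differently.

```python
from statistics import mean, median

# Hypothetical career-stage codes for two invented groups.
# Assumed labels: 1 = assistant professor, 2 = associate professor,
# 3 = full professor, 4 = named chair.
group_a = [1, 1, 4]
group_b = [2, 2, 3]

# A second coding that keeps the same order of stages but changes the spacing.
# Nothing about an ordinal scale privileges one spacing over the other.
alternative_coding = {1: 1, 2: 2, 3: 3, 4: 10}


def relabel(scores, mapping):
    """Apply an order-preserving recoding to a list of stage codes."""
    return [mapping[s] for s in scores]


recoded_a = relabel(group_a, alternative_coding)
recoded_b = relabel(group_b, alternative_coding)

mean_original = (mean(group_a), mean(group_b))
mean_recoded = (mean(recoded_a), mean(recoded_b))
median_original = (median(group_a), median(group_b))
median_recoded = (median(recoded_a), median(recoded_b))

print("means, original coding:  ", mean_original)
print("means, recoded:          ", mean_recoded)
print("medians, original coding:", median_original)
print("medians, recoded:        ", median_recoded)
```

Under the first coding group B has the higher mean; under the second, group A does, even though no individual's rank has changed. The medians give the same answer under both codings, which is one reason order-based summaries are usually preferred for ordinal data.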

David Eppstein, a computer-science professor, is a longtime participant in academic-bio AfD's and a member of the Women in Red project which aims to improve Wikipedia's biographical coverage of women. He writes,

[S]ince the article quotes me as invoking TOOSOON, perhaps I should explain what I generally mean by it. It is never the actual reason for a delete opinion, at least from me. In the cases under discussion, my choices are always grounded in notability guidelines and policies, not essays. When I use TOOSOON, it is not intended to strengthen the case for deletion. Maybe it is the opposite: it is a ray of hope in an otherwise negative opinion. If I think someone is not likely to ever be notable, I am probably just going to say delete, and explain why. If I think an academic does not currently meet our notability standards, but is on a trajectory on which they might well eventually do so, years later, I will say TOOSOON. We often see re-creations of the same articles, years apart, and including this in an opinion is a suggestion that if we discuss the same case again sometime we should check their accomplishments again more carefully instead of relying on past opinions.[supp 17]

In fairness, this was written in response to Lemieux et al., but it is also a natural implication of the everyday meaning of the phrase "too soon": not now, but maybe later. For an example of this involving David Eppstein and others, see Wikipedia:Articles for deletion/Charles Steinhardt.

Quote mining

The quotes given after "These examples from AfD discussions all failed to mention the presence or depth of media coverage" are misleadingly presented. Two of the three are from Wikipedia:Articles for deletion/Kate Killick. One of the !votes quoted actually began, "Sadly, fails WP:NPROF." This is a significant omission, since it removes the actual reason the editor gave for believing the article should be deleted. Moreover, that AfD did discuss the presence or depth of media coverage, insofar as editors noted that there wasn't any. ("The Irish Times reference is an opinion piece of which Killick's name is only mentioned once among many, many other names." "I cannot locate any significant coverage from reliable sources that indicate notability.") The third quote is also truncated; the original !vote concluded, "Tiny citations on GS do not pass WP:Prof and lack of independent in-depth sources fails WP:GNG."[note 6]

Another !vote they quote also began with a rationale that Lemieux et al. do not reproduce: "The only form of notability claimed in the article is academic, but our standards for academic notability explicitly exclude student awards. Merely having written a few review papers is inadequate for notability; the papers need to be heavily cited, and here they appear not to be."

Misrepresentation of career status and other AfD aspects

The sentence immediately after the blockquote that includes snippets from the K. Killick AfD is "Our dataset revealed that men at similar early career stages were present on Wikipedia." Their example is Colin G. DeYoung, who at the time of his AfD had an h-index of 44. That is not "similar" to a postdoc who coauthored a respectable but unremarkable number of papers (no more than 10, according to Web of Science). At the times of their respective AfD's, Killick had possessed a PhD for roughly six years, and DeYoung had possessed his for roughly thirteen. Without making any suppositions about the worth of their research on some notional absolute scale of intellectual achievement, it is safe to say that these AfD's were not for individuals at "similar" stages of their careers.

"For example, Tonya Foster, a professor of creative writing and Black feminist scholar at San Francisco State University had a high Primer Index of 41 yet her Wikipedia page was deleted." It is worth mentioning that the article Tonya Foster does exist today. At the time of the AfD in 2017, the consensus was that WP:AUTHOR was not met, but the article was recreated in July 2020 without complaint. Foster did not join San Francisco State University until 2020, at which point she was named one of the George and Judy Marcus Endowed Chairs,[supp 21] so the argument for her passing WP:PROF got significantly stronger. The AfD process can hardly be faulted for failing to consider evidence that would not exist for another three years.

"Another example is the late Sudha Shenoy, an economist and professor of economic history at the University of Newcastle, Australia, who had a high Primer Index of 198 yet her page was also deleted." According to her profile at the Mises Institute, she was a "lecturer" at Newcastle,[supp 22] and "lecturer" in Australia is a lower academic rank than a professor. In any case, that AfD looked at WP:PROF and WP:GNG and found that neither was met; citation counts were low, no other academic notability criteria could be argued for, and the sources about her were unreliably published. This data point appears to be more an indictment of the "Primer Index" than anything else.

Lemieux et al. mention the drama surrounding the article about nuclear chemist Clarice Phelps. They state that her biography was deleted three times in the span of one week. While its existence was indeed contentious, that specific claim is not in the cited source,[supp 23] or in the source upon which that website relied.[supp 24] Examining the deletion log for the article[supp 25] and the "article milestones" list at Talk:Clarice Phelps indicates that there is no one-week span that could match. Instead, it was deleted once in February 2019 and twice in April, before being incubated as a draft page and then restored for good in February 2020.[supp 26][supp 27][supp 28] The cause of diversity on Wikipedia is a marathon, not a sprint; the inaccurate timeline and the omission of the eventual success make for a misleading portrayal of the challenge that is of no help in resolving it.

Conclusion

Whatever revolutionary claims have been made on its behalf, Wikipedia has a fundamentally institutionalist character. It layers on top of existing academic and journalistic systems of legitimacy. The same rhetorical fences that keep Wikipedia from being a toxic waste dump of advertising and conspiracy theories also mean that it is a bad place for social change to begin. The encyclopedia can only be as non-sexist as the least sexist institution. The question of which articles should exist and what they should say is a question of content moderation at scale, a task that is "impossible to do well".[supp 29] The failure modes of Wikipedia's written rules and subcultural practices deserve study. But ill-conceived studies can lead to ill-conceived advocacy that makes real problems no easier to solve.

Lemieux et al. misrepresent Wikipedia policies, guidelines, and essays; the content of deletion debates; news reporting; and the prior academic literature. How these errors could have transpired is, in a word, baffling. How they passed through peer review is likewise a puzzle (and a discouraging sign), but pass through they did.[note 7] The problems are too pervasive to be addressed by an erratum or an expression of concern. The literature on the important subject of systemic bias in Wikipedia would be best served by a retraction and a careful re-examination of the editorial process.

Briefly


Other recent publications

Other recent publications that could not be covered in time for this issue include the items listed below. Contributions, whether reviewing or summarizing newly published research, are always welcome.

Compiled by Tilman Bayer

"Towards a Digital Reflexive Sociology: Using Wikipedia's Biographical Repository as a Reflexive Tool"

From the abstract:[3]

"[...] we employ Wikipedia as a ‘reflexive tool’, i.e., an external artefact of self-observation that can help sociologists to notice conventions, biases, and blind spots within their discipline. We analyse the collective patterns of the 500 most notable sociologists on Wikipedia, performing structural, network, and text analyses of their biographies. Our exploration reveals patterns in their historical frequency, gender composition, geographical concentration, birth-death mobility, centrality degree, biographical clustering, and proximity between countries, also stressing institutions, events, places, and relevant dates from a biographical point of view. Linking these patterns in a diachronic way, we distinguish five generations of sociologists recorded on Wikipedia and emphasise the high historical concentration of the discipline in geographical areas, gender, and schools of thought."

"How can the social sciences benefit from knowledge graphs? A case study on using Wikidata and Wikipedia to examine the world’s billionaires"

From the abstract:[4]

"This study examines the potentials of Wikidata and Wikipedia as knowledge graphs for the social sciences. The study demonstrates how social science research may benefit from these knowledge bases by examining what we can learn from Wikidata and Wikipedia about global billionaires (2010-2022). [...] We show that the English Wikipedia and, to a lesser extent, Wikidata exhibit gender and nationality biased in the coverage and information about global billionaires. Using the genealogical information that Wikidata provides, we examine the family webs of billionaires and show that at least 15% of all billionaires have a family member also being a billionaire."

"Can you trust Wikidata?"

From the abstract:[5]

"The present work aims to assess how well Wikidata (WD) supports the trust decision process implied when using its data. WD provides several mechanisms that can support this trust decision, and our KG [Knowledge Graph] Profiling, based on WD claims and schema, elaborates an analysis of how multiple points of view, controversies, and potentially incomplete or incongruent content are presented and represented."

"A Study of Concept Similarity in Wikidata"

From the abstract:[6]

"In light of the adoption of Wikidata for increasingly complex tasks that rely on similarity, and its unique size, breadth, and crowdsourcing nature, we propose that conceptual similarity should be revisited for the case of Wikidata. In this paper, we study a wide range of representative similarity methods for Wikidata, organized into three categories, and leverage background information for knowledge injection via retrofitting. We measure the impact of retrofitting with different weighted subsets from Wikidata and ProBase. Experiments on three benchmarks show that the best performance is achieved by pairing language models with rich information, whereas the impact of injecting knowledge is most positive on methods that originally do not consider comprehensive information. The performance of retrofitting is conditioned on the selection of high-quality similarity knowledge. A key limitation of this study, similar to prior work lies in the limited size and scope of the similarity benchmarks. While Wikidata provides an unprecedented possibility for a representative evaluation of concept similarity, effectively doing so remains a key challenge."

Matching non-notable Wikidata "orphans" to Wikipedia sections

From the abstract and paper:[7]

"We present a transformer-based model, ParaGraph, which, given a Wikidata entity as input, retrieves its corresponding Wikipedia section. To perform this task, ParaGraph first generates an entity summary and compares it to sections to select an initial set of candidates. The candidates are then ranked using additional information from the entity’s textual description and contextual information. Our experimental results show that ParaGraph achieves 87% Hits@10 when ranking Wikipedia sections given a Wikidata entity as input. [...]

This mapping between Wikipedia and Wikidata is beneficial for both projects. On the one hand, it facilitates information extraction and standardization of Wikipedia articles across languages, which can benefit from the standard structure and values of their Wikidata counterpart, e.g., for populating infoboxes. On the other hand, Wikipedia articles are routinely updated, which in turn keeps Wikidata fresh and useful for online applications. However, the Wikipedia editorial guidelines require that an entity be notable or worthy of notice to be added to the encyclopedia, which does not hold for all Wikidata entities. [...] We refer to the remaining entities, which do not have an article in [any language] Wikipedia, as orphans. In the absence of a textual counterpart, orphans often suffer from incompleteness and lack of maintenance. Our present effort stems from the observation that a substantial number of orphan entities are indeed represented in Wikipedia, but not at the page level; orphan entities are often described within existing Wikipedia articles in the form of sections, subsections, and paragraphs of a more generic concept or fact. For example, the English Wikipedia does not have a dedicated page about “Tennis racket”, it is instead embedded in the “Racket” page as a section, whereas it can be found as a standalone (orphan) entity on Wikidata (“Q153362”)."

References

  1. ^ Lemieux, Mackenzie; Zhang, Rebecca; Tripodi, Francesca (March 29, 2023). ""Too Soon" to count? How gender and race cloud notability considerations on Wikipedia". Big Data & Society. 10. doi:10.1177/20539517231165490. S2CID 257861139.
  2. ^ a b Tripodi, Francesca (2021-06-27). "Ms. Categorized: Gender, notability, and inequality on Wikipedia". New Media & Society. doi:10.1177/14614448211023772. S2CID 237883867.
  3. ^ Beytía, Pablo; Müller, Hans-Peter (2022-09-15). "Towards a Digital Reflexive Sociology: Using Wikipedia's Biographical Repository as a Reflexive Tool". Poetics: 101732. doi:10.1016/j.poetic.2022.101732. ISSN 0304-422X. (author's link)
  4. ^ Daria Tisch, Franziska Pradel: How can the social sciences benefit from knowledge graphs? A case study on using Wikidata and Wikipedia to examine the world’s billionaires (submission to Semantic Web – Interoperability, Usability, Applicability, under review)
  5. ^ Veronica Santos, Daniel Schwabe and Sérgio Lifschitz: Can you trust Wikidata? (submission to Semantic Web – Interoperability, Usability, Applicability, under review)
  6. ^ Filip Ilievski, Kartik Shenoy, Hans Chalupsky, Nicholas Klein and Pedro Szekely: A Study of Concept Similarity in Wikidata (submission to Semantic Web – Interoperability, Usability, Applicability, under review), Code
  7. ^ Natalia Ostapuk, Djellel Difallah, and Philippe Cudré-Mauroux. “ParaGraph: Mapping Wikidata Tail Entities to Wikipedia Paragraphs.” In: 2022 IEEE International Conference on Big Data, BigData, 2022. slides, Dataset: Ostapuk, Natalia; Difallah, Djellel; Cudré-Mauroux, Philippe (2022-11-25), Wikidata dump extension (enwiki section links), Zenodo

Supplementary references

  1. ^ Matei, S. A.; Dobrescu, C. (2011). "Wikipedia's "Neutral Point of View": Settling Conflict through Ambiguity". The Information Society. 27 (1): 40–51. doi:10.1080/01972243.2011.534368. S2CID 27479715.
  2. ^ Gauthier, Maude; Sawchuk, Kim (2017). "Not notable enough: feminism and expertise in Wikipedia". Communication and Critical/Cultural Studies. 14 (4): 385–402. doi:10.1080/14791420.2017.1386321. S2CID 149229953.
  3. ^ Luo, Wei; Adams, Julia; Brueckner, Hannah (2018-08-30). "The Ladies Vanish? American Sociology and the Genealogy of its Missing Women on Wikipedia". Comparative Sociology. 17 (5): 519–556. doi:10.1163/15691330-12341471.
  4. ^ "Lois K. Alexander Lane, version of 19 March 2016".
  5. ^ "Wikipedia:Search engine test". pageviews.wmcloud.org. Retrieved 2023-04-12.
  6. ^ "Wikipedia:Notability". pageviews.wmcloud.org. Retrieved 2023-04-12.
  7. ^ "Wikipedia:Notability (academics)". pageviews.wmcloud.org. Retrieved 2023-04-12.
  8. ^ "Wikipedia:Notability (academics)". xtools.wmflabs.org. Retrieved 2023-04-12.
  9. ^ "Wikipedia:Search engine test". xtools.wmflabs.org. Retrieved 2023-04-12.
  10. ^ "Interpreting results: Kruskal–Wallis test". GraphPad Prism. Retrieved 2023-05-07.
  11. ^ "Wikipedia talk:WikiProject Women in Red, version of 7 May 2023".
  12. ^ "Wikipedia:Too soon". pageviews.wmcloud.org. Retrieved 2023-04-13.
  13. ^ "Wikipedia:Notability". xtools.wmflabs.org. Retrieved 2023-04-13.
  14. ^ "Wikipedia:Too soon". xtools.wmflabs.org. Retrieved 2023-04-13.
  15. ^ Adams, Julia; Brückner, Hannah; Naslund, Cambria (2019). "Who Counts as a Notable Sociologist on Wikipedia? Gender, Race, and the "Professor Test"". Socius. 5: 1–14. doi:10.1177/2378023118823946. S2CID 149857577.
  16. ^ For the meaninglessness of applying averages to ordinal variables, see, e.g., Wilson, Thomas P. (March 1971). "Critique of ordinal variables". Social Forces. 49 (3): 432–444. doi:10.2307/3005735. JSTOR 3005735.
  17. ^ "Wikipedia talk:WikiProject Women in Red, version of 13 April 2023".
  18. ^ Roberts, Justin (2020-07-02). Slavery & Abolition. 41 (3): 686–688. doi:10.1080/0144039X.2020.1790769. ISSN 0144-039X. S2CID 221178536.
  19. ^ Sklansky, Jeffrey (Spring 2021). Journal of Social History. 54 (3): 973–975. doi:10.1093/jsh/shz115.
  20. ^ Rhode, Paul (March 2020). The Journal of Economic History. 80 (1): 293–294. doi:10.1017/S0022050720000029. S2CID 214003587.
  21. ^ "Creative Writing Department announces new George and Judy Marcus Endowed Chairs". SF State News. 2020-04-10. Retrieved 2023-04-12.
  22. ^ "Sudha R. Shenoy". Mises Institute. 20 June 2014. Retrieved 2023-04-11.
  23. ^ Sadeque, Samira (2019-04-29). "Wikipedia just won't let this Black female scientist's page stay". The Daily Dot.
  24. ^ Jarvis, Claire L. (2019-04-26). "Wikipedia's Refusal to Profile a Black Female Scientist Shows Its Diversity Problem". Slate.
  25. ^ "All public logs: Clarice Phelps". Retrieved 2023-04-12.
  26. ^ "Clarice Phelps: version of 7 February 2020".
  27. ^ Page, Sidney (2022-10-17). "She's made 1,750 Wikipedia bios for women scientists who haven't gotten their due". The Washington Post. Retrieved 2023-04-12.
  28. ^ Khan, Arman (2022-11-18). "I've Made More Than 1,700 Wikipedia Entries on Women Scientists and I'm Not Yet Done: British scientist Jessica Wade has made one Wikipedia entry every day since 2017". Vice. Retrieved 2023-04-12.
  29. ^ Masnick, Mike (2019-11-20). "Masnick's Impossibility Theorem: Content Moderation At Scale Is Impossible To Do Well". Techdirt. Retrieved 2023-04-15.

Notes

  1. ^ A manual survey of 100 random biographies found only 25 that could meet those criteria, and this casts a net more widely than the remit of WP:PROF.
  2. ^ Later in the review, they write, "For academic biographies on Wikipedia, notability is achieved through the significant impact of one's scholarly work on society, the winning of prestigious academic awards, or the holding of important leadership positions at an academic institution or academic journal board." This conflates multiple criteria that WP:PROF lists separately, in a way that may obscure how they operate. WP:PROF#C1 concerns significant impact in their scholarly discipline (emphasis added). WP:PROF#C4 asks for a significant impact in the area of higher education, affecting a substantial number of academic institutions. And influence within society at large, outside academia in their academic capacity, is WP:PROF#C7. These are each evaluated in different ways, as the guideline details. Of the sources cited for this point, Matei and Dobrescu[supp 1] do not discuss any notability guideline at all. Gauthier and Sawchuk[supp 2] and Luo et al.[supp 3] discuss the page WP:Notability but do not mention the existence of a guideline specialized to scholars and academics, despite its relevance to their subject matter.
  3. ^ Tripodi's description of a case where a woman's purported significance is easily verifiable using the search engine test[2] is factually inaccurate. The biography of Lois K. Alexander Lane was not pushed out of the main space; it was a draft article not yet in the main space.[supp 4] This draft did not contain, as Tripodi writes, links to seven credible sources independent of the subject, including The Washington Post and the Smithsonian. It contained two sources, one a Washington Post item and the other a webpage at the Smithsonian, and it linked to those two sources a total of seven times. The article Lois K. Alexander Lane has, contrary to Tripodi's statement, never been nominated for deletion.
  4. ^ To the point about importance, note that WP:Too soon received a median of 106 views per month over the year prior to this writing,[supp 12] versus 11,924 for WP:Notability. 1,664,013 pages link to the latter, and 1,393 link to the former.[supp 13][supp 14]
  5. ^ The Adams et al. paper[supp 15] only studies sociologists, and so whether its conclusions generalize further across academia is an open question. They report (Table 3) a correlation between career stage and the probability of having a Wikipedia page, but they do not disentangle career stage from citation metrics or other indicators. Emeritus professors are likely to have done more than assistant professors. Adams et al.'s data is from October 2016 and may be outdated in various aspects, which it is beyond the scope of this comment to determine. Perhaps worth noting in this context, however, is their tentative conclusion that "pages about women were not more likely to be deleted than pages about men" and "the main story is that women are less likely to appear in the first place".
  6. ^ In the time since that AfD, the book mentioned in it has accumulated additional reviews.[supp 18][supp 19][supp 20] In the present circumstances, the biography of the author might be refactored into a page about the book, rather than deleted. Compare, e.g., Wikipedia:Articles for deletion/Daisy Deomampo, Wikipedia:Articles for deletion/Aaron Fox (musicologist), and Wikipedia:Articles for deletion/Alam Saleh.
  7. ^ For related commentary, see Tilman Bayer's 25 July 2021 "Recent research" column in the Wikipedia Signpost. The concerns about confounding factors raised there are echoed here. For example, newly-created articles might be scrutinized more closely; articles begun by well-meaning novices might be more likely to be nominated for deletion in good faith, even if they turn out salvageable.





2023-05-08

I wrote a poem for each article, I found rhymes for all the lists;
My first featured picture of this year now finally exists!

Li Fu Lee by Underwood & Underwood, restored by Adam Cuerden, my first featured picture of the year, after a rather rough few months.

This Signpost "Featured content" report covers material promoted from 1 to 15 April.

If you plan to write featured article summaries in poetry (maybe that's just me) it's really convenient to have some of them be on songs. All you have to do is take the opening bit of the song, rewrite it to be about itself, and there you go: Self-referential music. And you only have to keep the rhyme scheme as good as the original, which means I can let myself get away with slant rhymes for once, which I feel guilty about using otherwise.

On a personal note, this issue marks my first featured picture of the year (see above), after a very rough first three months. This being me, there's three others by me this issue, and it's looking good for that or more next issue. Have to catch up somehow, eh?

Adam Cuerden

Ten featured articles were promoted this period.

Mecca in panorama, eighteen fifty-five
'Tis pity how little within it survives.
Hajj: Journey to the Heart of Islam, nominated by MartinPoulter
A journey to Mecca in artefacts arrayed
At the British Museum where they were displayed;
"I Don't Wanna Cry", nominated by Heartfox
Once again we sit and listen
To Mariah Carey's song.
Latin torch song, sung so sweetly,
Baby, look what it's become.
It can make a million record sales
And earn her some change.
It's a Billboard hit, and Heartfox brought it
Up to featured article, hooray!
(Cue chorus)
Li Rui (politician), nominated by Ganesha811
Li Rui was a member of the Communist party,
He defied Chairman Mao, and for following his heart, he
Was imprisoned and rejected, but restored when Mao died,
Then wouldn't help nepotism, so was once more denied.
The Next Day, nominated by zmbro
Bowie hid his work
Then a sudden release
You never knew that
That he could do that
Just sudden release
 
Planning it for months he
Kept everyone quiet
He bided his time
To The Next Day
Just sudden release.
Ernest Roberts:
As a nineteen-oh-eight photograph, this would stand right near the top,
Had someone, for the article, not chose this awful crop.[1]
Ernest Roberts (Australian politician), nominated by Peacemaker67
He fought in the Boer War, he wrote for newspapers,
And then he joined in on political capers.
For Adelaide, for Labor he won many an election
But he was struck down in his prime, which would cause them all dejection.
Portland Spy Ring, nominated by SchroCat
The Admiralty Underwater Weapons Establishment:
A successful place for Soviet lavishment.
They paid out the money, the secrets they got
And five separate spies then would each get their cut.
Diodorus scytobrachion, nominated by FunkMonk
A silesaurid dinosauromorph
Of which few bones have yet come forth.
"Made You Look" (Meghan Trainor song), nominated by MaranoFan (a.k.a. NØ)
Trainor wrote a doo-wop song
TikTok danced it in a huge throng:
A dance challenge for all day long,
It's Made You Look.
Battle of the Trebia, nominated by Gog the Mild
Hannibal and Romans again, so go put on your tunic
And get settled in for a war that is Punic!
We're coming in hot with a Roman defeat:
Surprise attacks from the rear really turn up the heat.
1867 United States Senate election in Pennsylvania, nominated by Wehwalt
The Democrats supported a Republican, but that one, Cowan, didn't win.
The caucus of Republican legislators voted for Cameron (not Curtin).
... It's hard in a short poem to cover 19th-century elections,
So go and read the article — or at least some key selections.

Twelve featured pictures were promoted this period, including the images at the start and bottom of this article.

Seven featured lists were promoted this period.

The Seattle Sounders won three Open Cups in a row
The first three chances they had — Wow, way to go!
List of Seattle Sounders FC seasons, nominated by SounderBruce
SounderBruce writes on Seattle Sounders;
I wonder where he got his name?
Still, Bruce is one of those all-rounders,
This list his latest claim to fame.
Alia Bhatt filmography, nominated by Krimuk2.0
No more just Bollywood star alone,
With her upcoming role in the film Heart of Stone.
GLAAD Media Award for Outstanding Documentary, nominated by PanagiotisZois
GLAAD selects the best of film that has LGBT within,
But why do so many of this set lack articles on them?
List of Billboard Latin Pop Airplay number ones of 1999, nominated by Magicandude (a.k.a. Erick)
Yet another featured list we have to put within the hoard
Of yearly lists of music as reported by Billboard.
List of early-diverging flowering plant families, nominated by Dank
The eudicots and monocots won't be found within this list.
But of flowering plants you'll find the rest of all those that now exist.
List of roles and awards of Angeline Quinto, nominated by Pseud 14
A Filipina actor/singer, she won herself great fame,
With mostly Filipino things I don't know well enough to name.
She did Four Sisters and a Wedding and "Patuloy Ang Pangarap",
But I don't know the subject, so this poem is quite crap.
Timeline of the Warren G. Harding presidency, nominated by Thebiguglyalien
After World War I, his presidency tried
To make things better. And then ... he died.
NGC 6530 by NASA, another of our new featured pictures.

Notes

  1. ^ The author of this featured content has opinions on crops that make things look like they were taken from an American school yearbook.




2023-05-08

"World War II and the history of Jews in Poland" approaches conclusion

World War II and the history of Jews in Poland

Continuing from previous Signpost coverage: the case for World War II and the history of Jews in Poland was accepted on 13 March. The proposed decision is to be posted a few days after Signpost publication, by 11 May 2023 according to clerks. Content posted in the Analysis phase (now closed) included quotes cross-posted from Wikipediocracy and invocation of WP:BLPCRIME. The proposed decision page, which is meant to include principles, findings of fact, and remedies, is still a mere set of templates.




2023-05-08

Planning together with the Wikimedia Foundation

Summary

The Wikimedia Foundation has remained in a period of transition. It welcomed new leadership last year, including a new Chief Executive Officer and a new Chief Product and Technology Officer. The Foundation has navigated conversations with our global communities on a range of important issues, from a future charter defining roles and responsibilities to how we raise shared resources through banner fundraising. This year's Annual Plan gives greater clarity on multi-year strategic issues that don't have quick fixes, as well as more granular information on how the Foundation operates. As noted in the first part of this Signpost series, feedback from our many stakeholders is welcome and appreciated.

Where we are today

Last year we decided to focus our efforts at the Wikimedia Foundation on radically changing how we do our work. This ranged from organizing our work regionally, to respond to the varying needs of communities globally, to refreshing our values at the Foundation to improve our own levels of collaboration. This is putting us in a better position to more meaningfully shift what we do – especially as the world around us changes in more unexpected ways and we assess how to have more collective impact on common goals toward the 2030 strategic direction.

Once again, we must first consider the changing world around us, what it needs from us, and how we must adapt to it. We are grounding this annual plan in multi-year strategic planning to consider longer-term shifts to the Wikimedia movement's revenue, product and technology, and roles and responsibilities. External trends show that social platforms continue to displace traditional search engines, and that artificial intelligence threatens even more disruption to the digital world. The legal landscape on which our global movement relies is changing significantly after decades of relative stability. In response to continuing threats like mis- and disinformation, lawmakers are attempting to regulate internet platforms in ways that could fundamentally endanger our mission. These threats and increasing polarization create new reputational risks for our projects and work. Continued uncertainty in the global economy is accelerating the need to assess the trajectory of our revenue streams, and make new investments that can support growth in resources to fund our collective work and ambitions.

Our approach for the future

For the second consecutive year the Wikimedia Foundation is anchoring its annual plan in the movement's strategy to advance equity. Our intention is to connect the Foundation's work even more deeply with the Movement Strategy Recommendations, to make even deeper progress toward the 2030 Strategic Direction. We remain driven to do this through collaborative planning with others in the movement who are also implementing the recommendations. This is made more actionable by deepening our regional focus, so that the Foundation's support better meets the needs of communities in all regions of the world. The upcoming Movement Charter is expected to give more clarity on roles and responsibilities, possibly through new collaborative structures like hubs and a Global Council. We intend to continue our collaboration with the charter process to advance equity in decision-making for our movement.

This year the Foundation is recentering its plan around Product and Technology, emphasising our unique role as a platform for people and communities collaborating on a massive scale. The bulk of this effort – called "Wiki Experiences" – recognizes that volunteers are at the heart of the Wikimedian process of sensemaking and knowledge creation. So this year, we are prioritising established editors (including those with extended rights, like admins, stewards, patrollers, and moderators of all kinds, also known as functionaries) over newcomers, to ensure that they have the right tools for the critical work they do every day to expand and improve quality content, as well as their management of community processes. Managing the platform effectively also requires the Foundation to address large-scale infrastructure and data needs that may extend beyond the specific Wiki Experiences of the projects. This work is described as "Signals & Data Services." And in a category called "Future Audiences" we must accelerate innovations that engage diverse audiences as editors and contributors.

Trade-offs and choices

The financial model the Wikimedia movement has relied on for most of its historic growth (banner fundraising) is reaching some limits. New funding streams to complement this – including Wikimedia Enterprise and the Wikimedia Endowment – will take time to develop. They are unlikely to fund the same levels of growth in the coming few years as we've seen in banner fundraising over the past decade, especially given an uncertain global economic outlook.

In response to these trends, the Foundation slowed growth last year compared with the previous three years. We're now making internal budget cuts that involve both non-personnel and personnel expenses, to ensure we have a more sustainable trajectory in expenses for the coming few years. Despite these budget pressures we will grow overall funding to movement partners, including expanding grants to take into account global inflationary costs, support newcomers to the movement, and increase funding for conferences and movement events.

This plan involves more funding in all regions while prioritising proportionally larger growth in underrepresented regions. To enable this growth for affiliates and newcomers, some grant programs (like the Research and Alliances funds) will need to be smaller. As we assess the Foundation's core capabilities we recognize that there are activities where others in the movement may be better placed to have meaningful impact, and are exploring pragmatic ways to move in that direction in the year ahead.

To be more transparent and accountable, this annual plan includes detailed financial information, notably on the structure of the Foundation's budget, as well as how the Foundation's departments are organized, and global guidelines and compensation principles.

Goals

The Wikimedia Foundation has four main goals in 2023−2024. They are designed to align with the Wikimedia movement's Strategic Direction and Movement Strategy Recommendations, and continue much of the work identified in last year's plan. These goals are:

In this mission together

For the overall draft of the annual plan we're inviting collaboration on-wiki in over 20 languages, in live virtual conversations, and at in-person community events. You can share feedback on-wiki until Friday, May 19.

Read the full draft of the Wikimedia Foundation's annual plan





2023-05-08

There Shall Be Seasons Refreshing – Stories from WikiConference India 2023

Participants at WikiConference India 2023 in Hyderabad
The Kashmiri Wikipedia's story

The clouds had set in, a cold breeze had joined them, and then came the pleasant showers. Hyderabad is located on the very hot and dry Deccan plateau. Here, April is a time of summer heat, but all of nature joined in to make the third edition of WikiConference India, held in Hyderabad this April, an exciting, enriching and enhancing experience. For the languages and communities in India and its subcontinent, the seven-year wait to host another national conference and the challenges of the Covid pandemic all stood down, giving way to the drama of dedication, determination and deliverance by the communities.

Technologies are often said to solve every problem ... If so, learn about the Kashmiri Wikipedia's story. Keyboards don't support all characters, machine translation is still sub-par, and there is little digital content on the internet. That's not all – the scenic, beautiful high-altitude mountain state of Kashmir, where the language is native, has internet shutdowns. Yet, Kashmiri Wikipedia never disappoints. The content is growing, and so are the readership and editor base. User:Amire80 and the larger language community also helped to get the Universal Language Selector working. Let there be problems, User:511KeV and the larger Kashmiri community will always find solutions and make a bigger history someday. The challenges will be overcome, so mark this space.

Welcoming Angika Wikipedia

Listen to some more! Angika is a language spoken in Bihar, Jharkhand, West Bengal and parts of Nepal. If India falls under the Global South, these regions are further south amongst the Global South. The language was in incubation for nearly twelve years and only went live on 22 March 2023. Can a twelve-year wait be imagined? That's the art of resilience, which is almost a curiosity for today's generation of volunteers. As machine learning, automation and faster technological solutions became ever more commonplace in the movement, the word 'resilience' began to disappear. User:Angpradesh and the larger Angika community have shown the willingness and commitment to defy the odds and succeed. Meanwhile, Gondi, Kolami and many other languages are also writing their history and expect to succeed very soon.

While Angika work stretches back to the first WikiConference in 2011 and before, learn about Mr Bharathesha Alasandemajalu and Dr Vishwanatha Badikana, who made Tulu Wikipedia active around the time of WikiConference India 2016 in Chandigarh. This time they came with even more exciting developments. Heard of Arebhashe? It is a dialect of Kannada with a history of more than 500 years. At the invitation of the Karnataka State Arebhashe Samskrithi Mattu Sahitya Academy, they have compiled the first-ever lexicon of 18,000 words in 950 pages, as well as the first-ever encyclopedia of about 450 pages and still counting. This is a remarkable achievement for an endangered and underrepresented language, and you can read more about the story behind it here.

A dictionary and encyclopedia for an endangered language

There are more stories to hear, there are more stories to appreciate. Doteli is one, Santali is another. Wikisource in India has its own fairytale from stories on digitization to audio books to new technologies. The writing space will fall short, the time space will be limited but the stories will not finish. These are communities from India and its subcontinent.

It is the same community who suffered the worst of the Covid pandemic and asked for help with vaccination, protection kits and expert counselling. The very same community which repeatedly has hardware and bandwidth challenges. What did some of these and other stories show at WikiConference India 2023? Whatever the challenge, hard or soft, it is the communities who always make the difference. There are no two ways about it: communities in the Indian subcontinent will continue to rise.



