The Signpost
Single-page Edition
WP:POST/1
19 October 2024

News and notes
One election's end, another election's beginning
Recent research
"As many as 5%" of new English Wikipedia articles "contain significant AI-generated content", says paper
In the media
Off to the races! Wikipedia wins!
Contest
A WikiCup for the Global South
Traffic report
A scream breaks the still of the night
Book review
The Editors
Humour
The Newspaper Editors
Crossword
Spilled Coffee Mug
 

Image: WMF Board 2024 election results.svg (Nadzik, CC BY-SA 4.0)

One election's end, another election's beginning

Preliminary results for 2024 WMF Board election

Board elections results

Christel Steigenberger
Maciej Artur Nadzikiewicz (left)
Victoria Doronina
Lorenzo Losa

The 2024 election for the Board of Trustees of the Wikimedia Foundation is now complete. Preliminary results were announced earlier this week. 12 candidates ran for 4 Community- and Affiliate-selected Trustee seats in this election. The vote was conducted using a single transferable vote system, which transfers votes from eliminated candidates after each round to the voter's next preferred candidate.
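
For readers unfamiliar with how such a count proceeds, here is a minimal Python sketch of a generic single transferable vote tally. It is an illustration only: the Droop quota, the fractional surplus transfers, the alphabetical tie-breaking and the names `top_choice` and `stv_count` are assumptions of this sketch, and the exact counting rules used in the WMF election are not described in this article.

```python
from fractions import Fraction


def top_choice(prefs, hopeful):
    """Return the highest-ranked candidate on a ballot who is still in the race."""
    return next((c for c in prefs if c in hopeful), None)


def stv_count(ballots, seats):
    """Tally ranked ballots and return the elected candidates in order of election."""
    hopeful = {c for ballot in ballots for c in ballot}
    # Every ballot starts with weight 1; the weight shrinks once it helps elect someone.
    piles = [(list(ballot), Fraction(1)) for ballot in ballots]
    quota = len(ballots) // (seats + 1) + 1  # Droop quota (an assumption of this sketch)
    elected = []
    while len(elected) < seats and hopeful:
        # If the remaining candidates no more than fill the open seats, elect them all.
        if len(hopeful) <= seats - len(elected):
            elected.extend(sorted(hopeful))
            break
        tally = {c: Fraction(0) for c in hopeful}
        for prefs, weight in piles:
            choice = top_choice(prefs, hopeful)
            if choice is not None:
                tally[choice] += weight
        ranked = sorted(tally)  # alphabetical order makes tie-breaking deterministic
        leader = max(ranked, key=tally.get)
        if tally[leader] >= quota:
            # Elect the leader and pass on only the surplus share of each of their ballots.
            keep = (tally[leader] - quota) / tally[leader]
            piles = [
                (prefs, weight * keep if top_choice(prefs, hopeful) == leader else weight)
                for prefs, weight in piles
            ]
            elected.append(leader)
            hopeful.remove(leader)
        else:
            # Nobody reached the quota: eliminate the weakest candidate, whose ballots
            # count for their next surviving preference in the following round.
            hopeful.remove(min(ranked, key=tally.get))
    return elected


# Hypothetical ballots: C is eliminated first, and C's votes transfer to B.
ballots = [["A"]] * 4 + [["B", "C"]] * 3 + [["C", "B"]] * 2
print(stv_count(ballots, seats=1))
```

In this made-up example no candidate reaches the quota of 5 in the first round, so the last-placed candidate is eliminated and the two ballots that ranked them first move to their second preference, which decides the seat.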

The four candidates with the most votes were:

  • Christel Steigenberger
  • Maciej Artur Nadzikiewicz
  • Victoria Doronina
  • Lorenzo Losa

These four winning candidates will need to pass a background check and have all other bylaw requirements confirmed before they can be officially appointed at the board's December meeting.

Two of them (Doronina and Losa) are already on the board and were re-elected. The election came not long after the board's controversial refusal to ratify the "Movement Charter" after it had been worked on over several years (Signpost coverage: "Wikimedia community ratifies Movement Charter, Wikimedia Foundation rejects ratification"). As one Wikimedian noted on the Foundation-l mailing list: "Victoria and Lorenzo, who were greatly associated with the Movement Chart BoT veto some months ago, were reelected by the community, despite many predictions that they would suffer a big backlash for making public their positions, and a number of people was quick to predict their certain removal from office, especially Victoria."

A visual representation of the vote is given by the Sankey diagram above. Incoming Board member Nadzik also analysed some voting statistics on this page on Meta.

Congratulations to the winners and our thanks to the other candidates who ran in the election: Bobby Shabangu, Deon Steyn, Erik Hanberg, Farah Jack Mustaklem, Lane Rasberry, Mohammed Awal Alhassan, Rosie Stephenson-Goodknight, and Tesleemah Abdulkareem. – Sb, H, S

Administrator elections trial begins

After a year of fewer RFAs, a flood of candidates has signed up for the upcoming trial of administrator elections. The call for candidates closed on 14 October, and 35 candidates will be seeking election (or re-election) as administrators. The full list of candidates can be found on the respective AELECT subpage.

There were a total of 14 RFAs this year through October. Counting the 35 AELECT candidates, this brings 2024's total to 49, reversing a multi-year trend of fewer admin candidates in successive years. The last year with more candidates was 2015, with 53 over the year.

However, it is not clear how many of said candidates will end up passing. As far as is known, the previous record for the number of active RFAs at once is 28 on 6 December 2005, per this discussion in October 2007. The elections will also use private voting via SecurePoll, which may skew the results.

The discussion phase will be open for 72 hours, from 22 to 24 October. Anyone may ask questions at the election subpages, and candidates are "encouraged" to reply. After these three days, the candidate pages will be closed for discussion and voting will begin: it runs from 25 October to 31 October, closing at 23:59 UTC.

Admin elections and other RFA reform efforts were last covered by The Signpost in the 26 September issue. – S

Administrators still in decline

Note: This section was prepared before the large number of applicants emerged for the administrator elections trial.

In prior Signpost coverage we discussed the declining number of active administrators:

Now, on 7 October, Rick Bot has recorded 418 active administrators, another record low count. The monthly averages of data reported by the bot daily were:

2024
Period       Daily average
January      465.7
February     473.3
March        448.2
April        438.3
May          438.3
June         434.6
July         435.4
August       432.5
September    425.4
October      421.0 [a]
  [a] As of 10 October

Active admins, 2011 through mid 2019 (chart by Widefox)

For comparison, using the same methodology, the average number of active administrators in the past several years (also shown in a chart created earlier through mid 2019) was:

Other years
Period       Daily average
2017         543.0
2018         528.6
2019         515.1
2020         508.3
2021         487.2 [b]
2022         470.0
2023         464.1
  [b] Data anomaly October-November, only 300 data points in 2021

A trend of declining numbers of active administrators is apparent in both the monthly and annual sets of data. If the count held at October's level (421.0, as of the 10 October data cutoff) for the rest of the year, 2024 would end about 43 below the 2023 average of 464.1. The next-largest annual decline in this table was about half that: 21.1, between 2020 and 2021. – B
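
The arithmetic behind this comparison can be checked with a few lines of Python. This is only a sketch using the figures from the two tables above; the variable and function names are ours, and it assumes that the 2024 figure is obtained by comparing the October average (421.0) with the 2023 average.

```python
# Annual daily averages of active administrators, copied from the table above.
annual = {
    2017: 543.0, 2018: 528.6, 2019: 515.1, 2020: 508.3,
    2021: 487.2, 2022: 470.0, 2023: 464.1,
}
october_2024 = 421.0  # monthly average as of the 10 October data cutoff


def yearly_changes(averages):
    """Change in the average from each year to the next, rounded to one decimal."""
    years = sorted(averages)
    return {(a, b): round(averages[b] - averages[a], 1) for a, b in zip(years, years[1:])}


changes = yearly_changes(annual)
largest_pair = min(changes, key=changes.get)  # most negative year-over-year change
drop_2024 = round(annual[2023] - october_2024, 1)

print(largest_pair, changes[largest_pair])
print(drop_2024)
```

Under that assumption, the 2024 drop comes out at roughly twice the largest earlier year-over-year decline in the table.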

CEO of the WMF's Brazilian affiliate signs an op-ed supporting research on disinformation

On September 27, Brazilian Wikimedian João Alexandre Peschanski was revealed to be one of more than 150 international researchers who co-signed a collective op-ed that deemed disinformation to be "one of the biggest short-term threats to humanity", and called for more protection and easier access to online data for those who study it and its perceived impact on public opinion.

This is a notable bit of news, since Peschanski—also known on-wiki as Joalpe—currently serves as the executive director of the Wiki Movement Brazil User Group, an association founded in 2013 and regularly affiliated to the Wikimedia Foundation since 2019.

The op-ed is available in several languages, including Spanish, German, Italian and Portuguese, and has been re-published by several media outlets. – O

Product and Technology Advisory Council appointed

The Product and Technology Advisory Council (PTAC) was appointed this week. This council was introduced as a one-year pilot program as part of the Board of Trustees' three proposals in lieu of the Movement Charter.

Apart from WMF CPTO Selena Deckelmann and other WMF staff, the council includes:

According to the initial proposal, the 8 volunteer members would include 5 at-large technical contributors and 3 Wikipedia volunteers, at least one of whom would be from the English Wikipedia. While it's not clear which category each selection was for, 4 volunteers have English Wikipedia as their most edited project: Sohom Datta, TheDJ, GorillaWarfare (also admin and OS), and Benjamin Mako Hill.

PTAC was last covered by The Signpost in the 14 August issue. – S

News from WMF

The WMF has published two bulletins since our last Signpost issue: one for late September and one for early October.

Of note were The Power of the Commons, an event hosted by the Wikimedia Foundation and other organizations as part of The Summit of the Future at the Headquarters of the United Nations, and a change to the Wikidata Query Service.

Highlights from the October issue include an e-mail from the "Wikimedia Foundation Board Governance Committee" and its Movement Charter debrief process (Spoilers: They'll collect feedback until January 2025).

More notably, nominations are currently open for positions on AffCom, Ombuds commission and the trust and safety Case Review Committee. AffCom plays an advisory role in official accession of new affiliates to the movement; the ombudsfolk are tasked with investigating complaints about infringements of the WMF privacy policy and related policies by CheckUsers, oversighters and others with access to sensitive user data; and the case review committee represents the community for appeals of certain trust and safety office actions. Applications for the Affiliations Committee close on November 18, 2024, and applications for the Ombuds commission and the Case Review Committee close on December 2, 2024.

The Wikimedia Foundation has filed an amicus brief in a Mexican case known as Richter v. Google. The case has drawn comparisons to Section 230 in the United States. Although Mexico does not have statutory Section 230 protections, the Internet Society says that the country is subject to similar content moderation restrictions and protections due to international treaties with the U.S. and Canada.

B, S

Knowledge Equity Fund: Final grantees announced, reports from 2023 grantees

Introducing the Round 3 Knowledge Equity Fund grantees

The Wikimedia Foundation announced a new round of grants from its Knowledge Equity Fund (see previous Signpost coverage), awarding "13 organizations in 10 countries, supporting work to address knowledge gaps and create and share new knowledge" with $1.362 million in total.

It also published impact reports for some of the Round 2 grantees announced in August 2023, and noted some highlights from three such reports:

Another round two grantee doesn't seem to have provided reports yet (but the Foundation "will share the rest of the final reports as we receive them"):

In the recent announcement, the Foundation notes that

Through multiple community conversations that we hosted in 2023, we heard feedback from volunteers about the goals and impact of the Knowledge Equity Fund which led us to make some key changes for our upcoming rounds of funding.

These changes include

  • More consistent and clear communication about the Knowledge Equity Fund, its grants and impact
  • Opportunities for movement groups to also receive grants for work they are doing to address knowledge equity
  • Clearer measures of impact for Knowledge Equity Fund grants

The announcement also states that

With the conclusion of Round 3, the Fund now has $815,000 USD left [from the $4.5 million it was set up with in 2020]. The Equity Fund will run one last “round” in the next 4 months, where we will choose a handful of the most impactful grantees from the first rounds and provide them with a final “top up grant” to deepen their work with the movement and ensure that the content they create is present on the Wikimedia projects.

AK, H

Brief notes




Image: Ouroboros-Zanaq 00.svg (Eugenio Hansen, CC BY-SA 4.0)

"As many as 5%" of new English Wikipedia articles "contain significant AI-generated content", says paper

A monthly overview of recent academic research about Wikipedia and other Wikimedia projects, also published as the Wikimedia Research Newsletter.

"As many as 5%" of new English Wikipedia articles "contain significant AI-generated content"

Figure 1 from the paper: "Using two tools, GPTZero and Binoculars, we detect that as many as 5% of 2,909 English Wikipedia articles created in August 2024 contain significant AI-generated content. The classification thresholds of both tools were calibrated to maintain a FPR of no more than 1% on a pre GPT-3.5 Wikipedia baseline, as indicated by the red line."

A new paper titled "The Rise of AI-Generated Content in Wikipedia"[1] estimates

"that 4.36% of 2,909 English Wikipedia articles created in August 2024 contain significant AI-generated content"

In more detail, the authors used two existing AI detectors (GPTZero and Binoculars), which

"reveal a marked increase in AI-generated content in recent[ly created] pages compared to those from before the release of GPT-3.5 [in March 2022]. With thresholds calibrated to achieve a 1% false positive rate on pre-GPT-3.5 articles, detectors flag over 5% of newly created English Wikipedia articles as AI-generated, with lower percentages for German, French, and Italian articles. Flagged Wikipedia articles are typically of lower quality and are often self-promotional or partial towards a specific viewpoint on controversial topics."

The researchers also conducted a small qualitative analysis "to better understand the motivations for using LLMs to create Wikipedia pages", by manually inspecting the edit histories of a smaller subset, namely "the 45 English articles flagged as AI-generated by both GPTZero and Binoculars" (corresponding to 1.5% of those 2,909), and looking at the other contributions of the editors who created them. In this rather small sample, they identify four different such motivations:

Advertisement

One prominent motive is self-promotion. Of the 45 flagged pages, we identify eight that were created to promote organizations such as small businesses, restaurants, or websites. [...]

Pages Pushing Polarization

[...] we also identify pages that advocate a particular viewpoint on often polarizing political topics. We identify eight such pages out of the flagged 45. One user created five articles on English Wikipedia, detected by both tools as AI-generated, on contentious moments in Albanian history [see also figure 2, below ...] In other cases, users created articles ostensibly on one topic, such as types of weapons or political movements, but dedicated the majority of the pages’ content to discussing specific political figures. We find two such articles that espouse non-neutral views on JD Vance and Volodymyr Zelensky.

Machine Translation

[...] We find three cases where users explicitly documented their work as translations, including pages on Portuguese history and legal cases in Ghana. [...]

Writing Tool

Other pages, which are often well-structured with high-quality citations, seem to have been written by users who are knowledgeable in certain niches and are employing an LLM as a writing tool. Several of the flagged pages are created by users who churn out dozens of articles within specific categories, including snake breeds, types of fungi, Indian cuisine, and American football players. [...]

Figure 2 from the paper: "The activity of this user, who was flagged for instigating an ‘Edit War,’ reveals that within a single day, they created three articles (red border), all identified as AI-generated. Notably, at 13:00 (green border), the user edited the outcome of ‘War in Dibra’ from ‘Mixed Results’ to ‘Victory’ and removed key text, just an hour before creating a new page titled ‘Uprising in Dibra.’ That page (see Figure 3) has since been deleted by moderators."

Why these findings seem important

These are among the first research results providing a quantitative answer to an important question that Wikipedia's editing community and the Wikimedia Foundation have been weighing since at least the release of ChatGPT almost two years ago. (Cf. previous Signpost coverage: Community rejects proposal to create policy about large language models, "AI is not playing games anymore. Is Wikipedia ready?", and in this issue: "Keeping AI at bay – with a little help from volunteers", summarizing media coverage of WikiProject AI Cleanup). The "Implications of ChatGPT for knowledge integrity on Wikipedia" were also the topic of a research project conducted in 2023-2024 by UT Sydney researchers (funded by a $32k Wikimedia Foundation grant) which just published preliminary results where "Concerns about AI-generated content bypassing human curation" are highlighted as one of the challenges voiced by Wikipedians.

The new study's numbers should be valuable as concrete evidence that generative AI has indeed started to affect Wikipedia in this manner (but might also be reassuring for those who had been fearing Wikipedia would be overrun entirely by ChatGPT-generated articles).

But how much can we rely on them?

That said, there are several serious concerns about how to interpret the study's data, and unfortunately the authors (a postdoc, a graduate student and an undergraduate student from Princeton University) address them only partially.

First, the researchers made no attempt to quantify how many of the articles from their headline result ("4.36% of 2,909 English Wikipedia articles created in August 2024 contain significant AI-generated content") had also been detected (and flagged or deleted) by Wikipedians. Inspecting the aforementioned smaller subsample of 45 (1.5%) articles where both detectors agreed, they found that

"Most of the 45 pages are flagged by moderators and bots with some warning, e.g., 'This article does not cite any sources. Please help improve this article by adding citations to reliable sources' or even 'This article may incorporate text from a large language model.'"

Even for this smaller sample though, we are not told what percentage of AI-generated articles survived.

A snake devours its tail.
Has the AI-Wikipedia ouroboros begun to devour itself? Or is it still being starved on a much more meager diet than the paper's results might make one believe?

In other words, the paper is a rather unsatisfactory read for those interested in the important question of whether generative AI threatens to overwhelm or at least degrade Wikipedia's quality control mechanisms - or whether these handle LLM-generated articles just fine alongside the existing never-ending stream of human-generated vandalism, hoaxes, or articles with missing or misleading references (see also our last issue, about an LLM-based system that generates gene articles with fewer such "hallucinated" references than human Wikipedia editors). Overall, while the paper's title boldly claims to show "The Rise of AI-Generated Content in Wikipedia", it leaves it entirely unclear whether the text that Wikipedia readers actually read has become substantially more likely to be AI-generated. (Or, for that matter, the text that AI systems themselves read, considering that Wikipedia is an important training source for LLMs - i.e. whether the paper is evidence for concerns that "The ouroboros has begun".)

Secondly and more importantly, the reliability of AI content detection software - such as the two tools that the study's numerical results are based on - has been repeatedly questioned. To their credit, the authors are aware of these problems and try to address them, for example by combining the results of two different detectors, and by using a comparison dataset of articles created before the release of GPT-3.5 in March 2022 (which can be reasonably assumed to be virtually free of LLM-generated text). However, their method still leaves several questions unanswered that may well threaten the validity of the study's results overall.

In more detail, the authors "use two prominent detection tools which were suitably scalable for our study". The first tool is

"GPTZero [...] a commercial AI detector that reports the probabilities that an input text is entirely written by AI, entirely written by humans, or written by a combination of AI and humans. In our experiments we use the probability that an input text is entirely written by AI. The black-box nature of the tool limits any insight into its methodology."

The second tool is more transparent:

"An open-source method, Binoculars [...] uses two separate LLMs [...] to score a text s for AI-likelihood by normalizing perplexity by a quantity termed cross-perplexity [...] The input text is classified as AI-generated if the score is lower than a determined threshold, calibrated according to a desired false positive rate (FPR). [...] For our experiments, we use Falcon-7b and Falcon-7b-instruct [as the two LLMs, following the recommendation of the authors of the Binoculars paper.] Compared to competing open-source detectors, Binoculars reports superior performance across various domains including Wikipedia"
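
To give a rough idea of what "normalizing perplexity by cross-perplexity" can mean, here is a toy Python sketch. It is not the Binoculars implementation: the two-word vocabulary, the probabilities and the function names are made up, and we assume that perplexity is the mean negative log-probability one model assigns to the observed tokens, that cross-perplexity is the mean cross-entropy between the two models' next-token distributions, and that the same model supplies both the perplexity and the first distribution. Which of the two real models plays which role is likewise an assumption here.

```python
import math


def log_perplexity(observed_probs):
    """Mean negative log-probability that a model assigned to the tokens actually seen."""
    return -sum(math.log(p) for p in observed_probs) / len(observed_probs)


def log_cross_perplexity(dists_a, dists_b):
    """Mean cross-entropy between two models' next-token distributions, per position."""
    total = 0.0
    for dist_a, dist_b in zip(dists_a, dists_b):
        total -= sum(dist_a[tok] * math.log(dist_b[tok]) for tok in dist_a)
    return total / len(dists_a)


def ratio_score(tokens, dists_a, dists_b):
    """Perplexity under model A, normalized by the A-versus-B cross-perplexity."""
    observed = [dists_a[i][tok] for i, tok in enumerate(tokens)]
    return log_perplexity(observed) / log_cross_perplexity(dists_a, dists_b)


# Toy two-word vocabulary; both hypothetical models strongly expect "the" everywhere.
dists = [{"the": 0.9, "cat": 0.1}] * 3
predictable = ratio_score(["the", "the", "the"], dists, dists)
surprising = ratio_score(["cat", "cat", "cat"], dists, dists)
print(predictable < surprising)
```

In this toy setup, text that the models find highly predictable gets the lower score, which matches the quoted rule that a text is classified as AI-generated when its score falls below the threshold.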

The "superior performance" of the Binoculars tool (online demo) for the Wikipedia "domain" sounds very reassuring, with both precision and recall at or near a perfect 100% according to figure 3 in the "Binoculars" paper[supp 1]. But it refers to the performance on a 2023 dataset called "M4",[supp 2] where the AI "articles" to be detected were generated in a rather simplistic manner. ("We prompted LLMs to generate Wikipedia articles given titles, with the requirement that the output articles contain at least 250 words", see also the results for e.g. ChatGPT). It seems unwise to assume that this is representative of all the ways in which actual editors try to use AI to generate new articles in August 2024. Indeed the authors explicitly acknowledge this in a different part of the "Rise" paper, pointing out they did not attempt to "simulat[e] the various ways Wikipedia authors might use LLMs to assist in writing—taking into account different models, prompts, and the extent of human integration, among other factors." As a small illustration of potential issues with this, the few concrete examples of articles detected as AI-generated that are included in the paper (figure 2, see above) all start with an infobox - something which ChatGPT can certainly generate if explicitly prompted to do so, but which seems to be absent from most or all of the examples in the M4 dataset.

What's more, as is evident from Figure 1 (above), both tools disagreed frequently, with GPTZero being much more detection-happy than Binoculars in English, French, and German, but much less in Italian. The authors acknowledge that "the tools we use are primarily for detecting AI-generated content in English. While GPTZero supports Spanish and French, it is not designed for other languages."

As mentioned in the paper's abstract (see above), the authors try to control for false positives by calibrating both detectors to a 1% false positive rate on the control dataset (of presumably AI-free Wikipedia articles from March 2022). A technical issue that appears to have been overlooked here is that this 2022 dataset was generated (by Hugging Face) from the source wikitext dumps using the well-known "mwparserfromhell" Python package, whereas the authors obtained their August 2024 articles by scraping the text rendered by the Wikipedia API and applying some of their own cleanup steps. LLM-based text classification tools can sometimes be quite sensitive to minor formatting aspects.
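
As an illustration of what such a calibration involves, here is a minimal Python sketch. The passages quoted here do not spell out the authors' exact procedure, so this shows one common approach (picking a cut-off from the sorted baseline scores) under stated assumptions: the baseline scores are made up, and the function names are ours.

```python
def calibrate_threshold(baseline_scores, target_fpr=0.01, higher_is_ai=True):
    """Choose a cut-off that flags at most `target_fpr` of the (assumed human) baseline."""
    n = len(baseline_scores)
    allowed = min(int(target_fpr * n), n - 1)  # how many baseline texts may be flagged
    # Sort so that the most "AI-like" scores come first.
    ordered = sorted(baseline_scores, reverse=higher_is_ai)
    # Only scores strictly beyond this value get flagged, so at most `allowed` do.
    return ordered[allowed]


def is_flagged(score, threshold, higher_is_ai=True):
    """Apply the calibrated cut-off to a new text's score."""
    return score > threshold if higher_is_ai else score < threshold


def flagged_share(scores, threshold, higher_is_ai=True):
    """Fraction of texts whose score falls on the AI side of the threshold."""
    return sum(is_flagged(s, threshold, higher_is_ai) for s in scores) / len(scores)


# Made-up baseline scores standing in for a detector run on pre-GPT-3.5 articles.
baseline = [i / 1000 for i in range(1000)]

# A detector where higher means "more likely AI" (as with a probability output) ...
high_cut = calibrate_threshold(baseline, 0.01, higher_is_ai=True)
# ... and one where lower means "more likely AI" (as described for Binoculars).
low_cut = calibrate_threshold(baseline, 0.01, higher_is_ai=False)

print(high_cut, flagged_share(baseline, high_cut, True))
print(low_cut, flagged_share(baseline, low_cut, False))
```

Note that a cut-off chosen this way only guarantees the 1% rate on the baseline texts themselves. Whether the same rate holds for texts that were prepared differently, or that come from a different period, is a separate question.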

More importantly, it seems rather adventurous to assume that the articles from that March 2022 dataset are comparable in all relevant properties to the newly created articles from August 2024 (i.e. that the 1% false positive calibration on the former will mean a 1% false positive rate on the latter). The authors are clearly aware of this concept drift problem, but only make a very perfunctory attempt to address it:

"One concern is that pre-March 2022 pages may be more polished due to years of editing. However, we observe that a higher number of edits weakly correlates with a higher AI-detection score for pre-March 2022 articles (Appendix D), suggesting that the FPRs for those articles may even be inflated. While the base assumption cannot be watertight, we observe a relatively consistent distribution of page categories between the two data pools, and we rely on the consistency of our chosen tools’ reported FPRs."

For many people, the fact that additional edits by human Wikipedia editors make the AI detection score go up in both detectors might increase skepticism about their overall validity. But the authors take it as an argument strengthening their paper's overall "Rise of AI-Generated Content" claim, by alluding to the possibility that its estimate might be too low. At various other points of the paper the authors likewise express awareness that their measurement method is subject to substantial errors and uncertainties, but claim that these can only go in their favor (i.e. could only mean that the actual rate of AI-generated articles is higher than their estimated lower bound). And there are other issues that likewise make one wonder a bit about the stringency of the peer review process that the paper has undergone, for example its claim that "The Wikipedia data we collect is under a Creative Commons CC0 License."

The study has only been published as an arXiv preprint at the time of writing. But according to a remark in the accompanying code, it has been accepted at the "NLP for Wikipedia Workshop" at next month's EMNLP conference.

The authors have commendably published the code used for the paper (although not under an open source license). Unfortunately, though, for readers who might want to replicate part of the paper's quantitative or qualitative analysis (or check whether some of the AI-generated articles it detected have slipped through Wikipedia's New pages patrol), none of the data underlying the paper's main results is being published (even though they were based entirely on public information from Wikipedia):

"Detecting AI may have unexpected negative consequences for people implicated as having generated that text. We have therefore been encouraged to omit any identifying information in the specific pages we discuss; however, we will provide more specific data to researchers upon request provided that it not be disseminated further."

But these concerns did not stop the authors from discussing the aforementioned concrete examples in a way that makes it very easy to identify involved users. (One reader of the paper has already done so, pointing out the specific longstanding sockpuppeting case that the editor featured in figure 2 was involved in.)

Briefly

Other recent publications

Other recent publications that could not be covered in time for this issue include the items listed below. Contributions, whether reviewing or summarizing newly published research, are always welcome.

"Filling Gaps in Wikipedia: Leveraging Data-to-Text Generation to Improve Encyclopedic Coverage of Underrepresented Groups"

From the abstract:[2]

"This paper presents a new tool to support efforts to fill in these gaps by automatically generating draft articles and facilitating post-editing and uploading to Wikipedia. A rule-based generator and an input-constrained LLM are used to generate two alternative articles, enabling the often more fluent, but error-prone, LLM-generated article to be content-checked against the more reliable, but less fluent, rule-generated article."


"Wikipedia’s socio-technical vision is over-determined by consensus" and "Wikipedia should strengthen its democratic commitment by engaging with dissensus"

From the abstract:[3]

"Wikipedia is composed from consensus. [...] While it is often positioned as a self-evident good, its usage on Wikipedia is not without concern. In this paper I mobilize Chantal Mouffe’s (2000) feminist critical political theory and Johanna Drucker’s (2014) methods of interface analysis to raise important questions about the relationship between consensus and peer production. [...] I identify the multitude of ways that Wikipedians perform consensus: not only through understanding and decision-making, but also through acts of composing, showing, processing, closing, and calculating. However, because Wikipedia’s socio-technical vision is over-determined by consensus, its political design is ill-equipped to address the political conditions of pluralist societies. As a result, I identify the reasons why Wikipedia should strengthen its democratic commitment by engaging with dissensus. By conducting this research, I demonstrate how consensus has transitioned from a democratic ideal into an interface and why it should be re-imagined within peer production projects."

"A feminist affective analysis of student writers' engagement with the 'be bold' guideline"

From the abstract:[4]

"Working from a feminist affective framework, this article reports on a study of student-editors' experience with Wikipedia-based writing, using their reactions to a key editing guideline, 'Be bold,' as an entry-point for examining their affective experience. The 'Be bold' guideline, which encourages would-be editors to 'just go for it,' is nearly as old as the English language version of Wikipedia itself yet has received little critical attention. Drawing on survey and focus group interviews from participants at the undergraduate and graduate level, this study's findings provide new understandings of novice editors’ affective experiences in Wikipedia while offering a critical analysis of the 'Be bold' guideline."

See also a post by one of the authors in the "Wikipedia Weekly" Facebook group, with discussion


"Wikipedia’s Indian problem: settler colonial erasure of native American knowledge and history on the world’s largest encyclopedia"

From the abstract:[5]

"This article details settler colonial erasure of Native American and Indigenous histories, knowledges, and philosophies on [English] Wikipedia. I show that long-time Wikipedia editors follow the settler colonial logic of elimination to omit Native histories from Wikipedia’s American history pages; block Native and allied editors from adding scholarship that centers Native experience; and ban Native and allied editors from the website so that settlers can lay claim to digital space. To do so, I concentrate on Wikipedia’s United States and American history pages, and I detail editor discussions regarding Native histories on Wikipedia’s talk pages, noticeboards, and off-Wikipedia message boards where editors congregate. I supplement this information with interviews of Wikipedia editors engaged in editing these topics. [...] I ultimately provide suggestions for the Wikimedia foundation to combat settler colonial erasure on Wikipedia."

See also a "Public response to the editors of Settler Colonial Studies" by Wikipedia admin User:Tamzin in the June 8 Signpost issue, arguing that the paper's author had failed to disclose a conflict of interest and that the paper contained multiple factual errors.


The 2019 integration of Google Translate made Wikipedia editors more productive

From the abstract:[6]

"This study examines the impact of integrating Google Translate into Wikipedia's Content Translation system in January 2019. Employing a natural experiment design and difference-in-differences strategy, we analyze how this translation technology shock influenced the dynamics of content production and accessibility on Wikipedia across over a hundred languages. We find that this technology integration lead to a 149% increase in content production through translation, driven by existing editors become more productive as well as an expansion of the editor base. Moreover, we observe that machine translation enhances the propagation of biographical and geographical information, helping to close these knowledge gaps in the multilingual context."

See also mw:Wikimedia_Research/Showcase#July_2024

"Blocks Architecture (BloArk): Efficient, Cost-Effective, and Incremental Dataset Architecture for Wikipedia Revision History"

From the abstract:[7]

"[The] Wikipedia Revision History (WikiRevHist) [dataset ... can be a] valuable resource[] for NLP applications. [...] we report Blocks Architecture (BloArk), an efficiency-focused data processing architecture that reduces running time, computing resource requirements, and repeated works in processing WikiRevHist dataset. BloArk consists of three parts in its infrastructure: blocks, segments, and warehouses. On top of that, we build the core data processing pipeline: builder and modifier. The BloArk builder transforms the original WikiRevHist dataset from XML syntax into JSON Lines (JSONL) format for improving the concurrent and storage efficiency. The BloArk modifier takes previously-built warehouses to operate incremental modifications for improving the utilization of existing databases and reducing the cost of reusing others' works. In the end, BloArk can scale up easily in both processing Wikipedia Revision History and incrementally modifying existing dataset for downstream NLP use cases. The source code, documentations, and example usages are publicly available online and open-sourced under GPL-2.0 license."

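The XML-to-JSON-Lines step described in the abstract can be pictured with a small standard-library sketch. It does not use BloArk's own code or API; the sample XML is a heavily simplified, hypothetical stand-in for a real revision-history dump, and `revisions_to_jsonl` is an illustrative name.

```python
import json
import xml.etree.ElementTree as ET

# Simplified, made-up page record; real dumps carry namespaces and many more fields.
SAMPLE_XML = """<page>
  <title>Example</title>
  <revision><id>1</id><text>First draft</text></revision>
  <revision><id>2</id><text>Second draft</text></revision>
</page>"""


def revisions_to_jsonl(xml_string):
    """Emit one JSON object per revision, one object per line."""
    page = ET.fromstring(xml_string)
    title = page.findtext("title")
    lines = []
    for rev in page.iter("revision"):
        record = {
            "title": title,
            "revision_id": int(rev.findtext("id")),
            "text": rev.findtext("text"),
        }
        lines.append(json.dumps(record))
    return "\n".join(lines)


jsonl = revisions_to_jsonl(SAMPLE_XML)
print(jsonl)
```

Because every line is an independent JSON object, a file in this shape can be split and processed in parallel without parsing a whole XML tree, which is the kind of concurrency benefit the abstract alludes to.
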
"Historical Narratives in Different Language Versions of Wikipedia"

From the abstract:[8]

"The article compares selected entries on Wikipedia concerning significant historical events in three language versions: Belarusian, Lithuanian, and Polish. [...] I apply the method of ideological critique to investigate whether national values influence the objectivity of Wikipedia articles written in local languages. A comparison of multilingual Wikipedia entries reveals the prevalence of “local” points of view on controversial historical events."


"Community Vital Signs: Measuring Wikipedia Communities’ Sustainable Growth and Renewal"

From the abstract:[9]

"After 2007, researchers started to observe that the number of active editors for the largest Wikipedias declined after rapid initial growth. Years after those announcements, researchers and community activists still need to understand how to measure community health. In this paper, we study patterns of growth, decline and stagnation, and we propose the creation of 6 sets of language-independent indicators that we call “Vital Signs” [formerly available at https://vitalsigns.wmcloud.org/ ]. Three focus on the general population of active editors creating content: retention, stability, and balance; the other three are related to specific community functions: specialists, administrators, and global community participation. [...] We present our analysis for eight Wikipedia language editions, and we show that communities are renewing their productive force even with stagnating absolute numbers; we observe a general lack of renewal in positions related to special functions or administratorship."



"Characterizing, Detecting, and Predicting Online Ban Evasion" on Wikipedia based on data from past sockpuppet investigations

From the abstract:[10]

"we conduct the first data-driven study of ban evasion, i.e., the act of circumventing bans on an online platform, leading to temporally disjoint operation of accounts by the same user. We curate a novel dataset of 8,551 ban evasion pairs (parent, child) identified on Wikipedia [by "Wikipedia moderators", at Wikipedia:Sockpuppet investigations], and contrast their behavior with benign users and non-evading malicious users. We find that evasion child accounts demonstrate similarities with respect to their banned parent accounts on several behavioral axes — from similarity in usernames and edited pages to similarity in content added to the platform and its psycholinguistic attributes. We reveal key behavioral attributes of accounts that are likely to evade bans. Based on the insights from the analyses, we train logistic regression classifiers to detect and predict ban evasion at three different points in the ban evasion lifecycle. Results demonstrate the effectiveness of our methods in predicting future evaders (AUC = 0.78), early detection of ban evasion (AUC = 0.85), and matching child accounts with parent accounts (MRR = 0.97)."

See also earlier coverage on research about sockpuppets on Wikipedia
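
The MRR figure quoted in the abstract is a mean reciprocal rank. As a rough sketch of how such a score is computed (with made-up account names and rankings, not the paper's data or code):

```python
def mean_reciprocal_rank(ranked_candidates, true_parents):
    """Average of 1/rank of the correct parent account in each ranked list.
    A query whose correct parent is missing from its list contributes 0."""
    total = 0.0
    for candidates, truth in zip(ranked_candidates, true_parents):
        if truth in candidates:
            total += 1.0 / (candidates.index(truth) + 1)
    return total / len(true_parents)


# Hypothetical rankings of candidate parent accounts for four child accounts.
rankings = [["A", "B", "C"], ["B", "A", "C"], ["C", "B", "A"], ["A", "C", "B"]]
truths = ["A", "A", "C", "C"]
mrr = mean_reciprocal_rank(rankings, truths)
print(mrr)  # 0.75
```

An MRR close to 1 means the correct parent account is usually ranked first or very near the top.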

References

  1. ^ Brooks, Creston; Eggert, Samuel; Peskoff, Denis (2024-10-10), The Rise of AI-Generated Content in Wikipedia, arXiv:2410.08044 (accepted at the "NLP for Wikipedia Workshop" at EMNLP 2024) / code
  2. ^ Mille, Simon; Pronesti, Massimiliano; Thomson, Craig; Lorandi, Michela; Fitzpatrick, Sophie; Huidrom, Rudali; Sabry, Mohammed; O'Riordan, Amy; Belz, Anya (September 2024). "Filling Gaps in Wikipedia: Leveraging Data-to-Text Generation to Improve Encyclopedic Coverage of Underrepresented Groups". In Saad Mahamood, Nguyen Le Minh, Daphne Ippolito (eds.). Proceedings of the 17th International Natural Language Generation Conference: System Demonstrations. Tokyo, Japan: Association for Computational Linguistics. pp. 16–19. / Code
  3. ^ Jankowski, S. (February 2022). "Making Consensus Sensible: The Transition of a Democratic Ideal into Wikipedia's Interface". Journal of Peer Production. 15. Peer reviews
  4. ^ Vetter, Matthew; Jiang, Jialei; Othman, Mahmoud; Vetter, Mercy (2024-04-18). "Navigating the emotional terrain of Wikipedia writing: A feminist affective analysis of student writers' engagement with the "be bold" guideline". Computers and Composition. 72: 102850. doi:10.1016/j.compcom.2024.102850. (closed access) / Freely available version
  5. ^ Keeler, Kyle (2024). "Wikipedia's Indian problem: settler colonial erasure of native American knowledge and history on the world's largest encyclopedia". Settler Colonial Studies: 1–22. doi:10.1080/2201473X.2024.2358697. ISSN 2201-473X.
  6. ^ Zhu, Kai; Walker, Dylan (2024-01-28), The Promise and Pitfalls of AI Technology in Bridging Digital Language Divide: Insights from Machine Translation on Wikipedia, Rochester, NY, doi:10.2139/ssrn.4708614, SSRN 4708614
  7. ^ Li, Lingxi; Yao, Zonghai; Kwon, Sunjae; Yu, Hong (2024-10-06), Blocks Architecture (BloArk): Efficient, Cost-Effective, and Incremental Dataset Architecture for Wikipedia Revision History, arXiv, doi:10.48550/arXiv.2410.04410 / Code, documentation
  8. ^ Kubś, Jakub (2021). "Historical Narratives in Different Language Versions of Wikipedia". Academic Journal of Modern Philology (12): 83–94. doi:10.34616/ajmp.2021.12. ISSN 2299-7164.
  9. ^ Miquel-Ribé, Marc; Consonni, Cristian; Laniado, David (January 2022). "Community Vital Signs: Measuring Wikipedia Communities' Sustainable Growth and Renewal". Sustainability. 14 (8): 4705. doi:10.3390/su14084705. ISSN 2071-1050. / Code, data
  10. ^ Niverthi, Manoj; Verma, Gaurav; Kumar, Srijan (2022-04-25). "Characterizing, Detecting, and Predicting Online Ban Evasion". Proceedings of the ACM Web Conference 2022. WWW '22. New York, NY, USA: Association for Computing Machinery. pp. 2614–2623. doi:10.1145/3485447.3512133. ISBN 9781450390965. / Data
Supplementary references and notes:
  1. ^ Hans, Abhimanyu; Schwarzschild, Avi; Cherepanova, Valeriia; Kazemi, Hamid; Saha, Aniruddha; Goldblum, Micah; Geiping, Jonas; Goldstein, Tom (2024-10-13), Spotting LLMs With Binoculars: Zero-Shot Detection of Machine-Generated Text, arXiv:2401.12070
  2. ^ Wang, Yuxia; Mansurov, Jonibek; Ivanov, Petar; Su, Jinyan; Shelmanov, Artem; Tsvigun, Akim; Whitehouse, Chenxi; Afzal, Osama Mohammed; Mahmoud, Tarek (2024-03-09), M4: Multi-generator, Multi-domain, and Multi-lingual Black-Box Machine-Generated Text Detection, arXiv:2305.14902





File:Bret Hanover.jpg
anon.
PD
2024-10-19

Off to the races! Wikipedia wins!

"Politics of perception" persistence is perplexing, but proof is pending

The Independent Journal Review, borrowing from The Daily Caller, recently claimed that your Wikipedia donations might be funding "feminism and racial justice", instead of just keeping the lights on. The Commune Mag and OpIndia then joined the fray, alleging that Wikimedia's finances are tied to shadowy donors. Karah Rucker of Straight Arrow News listed left-leaning initiatives like Art+Feminism and Black Lunch Table, programs we've previously highlighted in The Signpost. Meanwhile, in yet another familiar critique of Wikipedia's alleged political leanings, Voz branded the platform “Wokepedia”, calling it the world's largest encyclopedia in one sentence and saying it "now resembles a sort of Orwellian authority" in the next.

At this point, it's like watching a rerun. Every few weeks, a new outlet accuses the free encyclopedia of the mind-numbing non-napping known as wokeness, all part of the politics of perception. Next, Matt Walsh, a commentator for The Daily Wire (not the most reliable source, to be honest), described a scene from his own movie Am I Racist?, where a supposed "white-guilt" group tries to get him arrested by reading the Wikipedia article about him to the police. See previous Signpost coverage on Walsh's commentary. – JSG, S

A Hoosier hurrah! Indy Wiki Conference video

WISH-TV reported on the latest edition of WikiConference North America, which took place in Indianapolis from October 3 to October 6. The station interviewed Nigerian Wikimedian James Popoola—a frequent contributor to Wikidata and The WikiVibrance Project—Justin Clark, the digital initiatives director at the Indiana State Library, and the conference's organizer, Dominic Byrd-McDevitt. As further highlighted in the article, the mayor of Indianapolis, Joe Hogsett, proclaimed October 4 as "Wikipedia Day", while the IU Indianapolis Library received $280,000 from the local Library Fund in order to improve information on Indiana's digital heritage on Wikipedia. – S

Keeping AI at bay – with a little help from volunteers

Devex reported on how Wikipedia's army of human volunteers is being hailed as its greatest weapon against the rise of artificial intelligence. Interviewed by the platform, Wikimedia Foundation CEO Maryana Iskander emphasized that while AI churns out a lot of "slop" by prioritizing speed over accuracy, Wikipedia's crowdsourced approach has kept it a beacon of reliable information. Despite the hype surrounding AI's quick progress, Iskander noted that Wikipedia's human editors remain cool and confident, because, well, it turns out that good old-fashioned community curation still works.

Additionally, 404 Media has picked up on the formation of WikiProject AI Cleanup (see previous Signpost coverage), created by volunteers to tackle a growing problem: AI-generated content that introduces errors or misleading information into articles. ExtremeTech also highlighted the surge of "AI-generated garbage," including a fabricated article about a non-existent Ottoman fortress and incorrect information added to existing articles. Even with the challenges posed by AI, at least this human editor is confident that community-driven curation will remain the best guarantor of quality. – JSG

Massive pay to play

Ashley Rindsberg shows "How Wikipedia is Becoming a Massive Pay-to-Play Scheme" at Pirate Wires, writing that "a boomtown industry feeding an insatiable demand for services like article creation, editing, management and deletion has emerged." One of his first targets is Pakistani company Abtach, which is reportedly tied to "at least 130 different Wikipedia editing front companies that operate under domains like Wikicreatorsinc.com, Wikicreation.services, Wikipedia Pro, Wikipedia Legends, and USAwikispecialists.com." See this story in The Signpost for more details.

For examples of slightly more sophisticated paid editing, Rindsberg provides links to articles about British investment immigration consultancies, Canadian frozen foods producers, cellulite-busting self-massage accessories, custom T-shirt retailers, Swedish online travel agencies, German disinfectant brands, industrial waste management companies, RegTech software firms, as well as packaging producers, electronic device recyclers, and self-storage chains. That's just what he considers "black hat" paid editing.

The more sophisticated "white hat" editors are linked to articles on Bain & Co, Yelp, Qualcomm, Kaspersky Lab, software company Forcepoint, the RSA Conference, as well as a New York Times exec and corporate clients like Reddit, MetLife, Accenture, Intel, IBM, Hubspot, Hilton, Vox Media, Dick's Sporting Goods, United Airlines, Amdocs, Gallup, Allergan, Breyers, Vimeo and Waymo.

Along the way, he mentions several of the better known scandals involving paid editing, including MyWikiBiz (see related Signpost coverage), Legal Morning (see previous Signpost coverage) and WhiteHatWiki (see previous Signpost coverage). It's an excellent introduction to paid editing on Wikipedia, and you shouldn't be surprised if you see some of these names again.

In the words of Rindsberg, "the question this leaves us asking is whether we can really apply the historic term 'encyclopedia' to a sprawling network of thousands of articles carefully pruned by the PR departments of billion dollar companies, or if Wikipedia is something else entirely." – S

From clickbait to culture

A crossword published in USA Today by constructor Ada Nicolle featured an unexpected nod to Wikipedia. One clue highlighted Annie Rauwerda, founder of the popular Depths of Wikipedia accounts on Instagram and other social media platforms, which showcase quirky and obscure content from the site. Other clues celebrated Icelandic-Canadian heritage, such as a reference to Gimli, Manitoba—the Canadian town with the highest population of Icelanders outside of Iceland itself.

In a recent interview with The Michigan Daily, Rauwerda reflected on how her account has grown into a hub for curious readers and Wikipedia enthusiasts, emphasizing the thrill of discovering, and correcting, obscure content. She also discussed the unique challenges of maintaining the account while navigating Wikipedia's complex editing rules, and the role her platform plays in demystifying the editing process and encouraging new contributors. From clickbait headlines to deep dives into obscure history, Depths of Wikipedia has evolved into a cultural phenomenon that brings new visibility to the weird and wonderful corners of our beloved encyclopedia. – JSG

Kamala Harris accused of plagiarizing from Wikipedia in co-authored book from 2009

According to a report by Stefan Weber, an Austrian media researcher noted for his work as a "plagiarism hunter", current US Vice President and presidential candidate Kamala Harris "copy-pasted a Wikipedia article" into her 2009 book Smart on Crime (co-authored with Joan O'C. Hamilton), alongside plagiarism from other sources. The allegations were publicized by US conservative activist Christopher Rufo in what may be intended as an October surprise ahead of the November 5 presidential election. (Rufo and a co-author had previously investigated plagiarism in academic publications by Harvard University president Claudine Gay, contributing to her resignation.)

The New York Times summarized a different plagiarism expert's initial reaction as stating that "the errors were not serious, given the size of the [book]." However, a day later the same expert (Jonathan Bailey, who runs a site called "Plagiarism Today") followed up to clarify that his statement had been based on the examples provided in Rufo's post only, and that after reviewing the full dossier by Weber, he judged "the case [to be] more serious than I commented to the New York Times" although he still maintained that "the pattern points to sloppy writing habits, not a malicious intent to defraud." Specifically, Bailey stated that

"The most serious allegation concerns Wikipedia. Harris’ book contained roughly two paragraphs copied from Wikipedia without citation. To be clear, that is plagiarism. It’s compounded by the fact that Wikipedia is typically not seen as a reliable source, and, according to Weber, there was an error in the information."

The Harris campaign rejected the plagiarism allegations outright as "a partisan attempt to weaponize a 15-year-old work", as summarized in a Washington Post article. Conversely, Harris' political opponent JD Vance seized on them by posting on Twitter/X "I wrote my own book, unlike Kamala Harris, who copied hers from Wikipedia." The WaPo article also reported an unnamed source as claiming that while Harris' involvement in the book included "reviewing drafts", she "was not involved in the formatting of outside excerpts and citations".

The Wikipedia article in question is Midtown Community Court, which according to Weber was plagiarized in this revision. – H

In brief

No one has uploaded a photo of racehorse Wikipedia yet, so here is Rysdyk's Hambletonian, great-great-etc. grandsire of Wikipedia
Not yet! Still nobody has uploaded a photo of racehorse Wikipedia. So here is Bret Hanover, his great-great-great grandsire
A pacing racehorse. May we ask anybody who enjoys Ontario racetracks to take and upload a photo?
  • Wikipedia paces to victory: In an unexpected twist, Wikipedia isn't just for late-night research dives anymore: it's also the name of a harness racing horse! Standardbred Canada reports the nobly named Standardbred pacer, a three-year-old son of Mcwicked, came from five lengths back in the stretch to win an Ontario Sire Stakes Gold division race, overcoming Crush Kill Destroy, Unrivaled Hanover, Legal Attack and Chain Gang. The next time someone says “Wikipedia is fast,” they won't just be talking about the servers.
  • Camille Herron's race for wiki supremacy: Multiple magazines and news outlets, including Runner's World, Canadian Running Magazine, Women's Health, and Athletics Illustrated all reported that ultrarunner Camille Herron went from trails to keyboards, getting flagged for editing her own Wikipedia page to emphasize her achievements, as well as removing some of the achievements in the article about Kilian Jornet — turning this into a race of reputation, rather than endurance. Wikipedia's neutrality got a workout as Herron's edits sparked controversy over athletes curating their own narratives. Actions have consequences, though—Herron was swiftly dropped by her primary sponsor, Lululemon, making this yet another high-stakes lesson in online reputation management gone wrong.
  • Recent research piece picked up: The Arabian Post picked up on a study comparing how people perceive credibility between Wikipedia, ChatGPT, and Alexa. While their coverage didn't mention The Signpost directly, it did link back to our Recent research piece from two issues ago. So, even though we weren't named, The Signpost served as a bridge between the original Nature article and international media coverage.
  • Wikipedia "winner" goes viral!: A viral screenshot of a Wikipedia page claiming Abhijeet Sawant as the winner of Bigg Boss Marathi season 5 has stirred up social media, leading to debates on whether it's true or just another case of Wikipedia vandalism. The buzz even got coverage by The Free Press Journal, which pointed out the ongoing confusion.
  • Goa's newest archive: Indian outlet Goamankat Times highlighted local Wikipedian Tanmay Pereira Naik's efforts to expand and improve articles related to Goa on Wikipedia, which include everything from documenting local history and politics to covering underrepresented topics like traditional cuisine and Goan personalities.
  • Wikipedia training camp: Nigerian newspaper The Punch reported that a group of 20 participants attended a Wikipedia editing workshop in Nigeria, hosted by the Tyap Wikipedia User Group. The workshop aimed to improve digital literacy and expand representation of Nigerian topics on the encyclopedia.
  • Portland City Auditor reopens Gonzalez case: After initially finding insufficient evidence that Portland, Oregon mayoral candidate Rene Gonzalez broke campaign finance law by hiring a firm to burnish his Wikipedia page, the city auditor reopened the case. KOIN reports that "Gonzalez's office had paid a company called Codename Enterprises, operating under WhiteHatWiki, to assist with edits and contentious matters on Gonzalez's Wikipedia page." See also the report in The Oregonian and previous Signpost coverage here, here, and here. The next installment of this long-running saga is expected with the release of the revised City Auditor's report on or before Halloween.
  • "Under the radar" page move rankles: Scholar Asaf Romirowsky criticised the article on Israeli apartheid in an interview for the Jewish Journal, saying that the Wikipedia article's title and opening paragraph are "a work of fiction"; the Jewish Journal piece also claims that the Requested move to the current title was accomplished "under the radar". This is not the first page move controversy reported by The Signpost; find more about previous instances here and here.
  • Spotlight on Bangladeshi Authors: An article in The Daily Star explored how Wikipedia categorizes prominent Bangladeshi literary figures like Abbasuddin Ahmed and Humayun Kabir. The piece highlighted issues with labeling writers based solely on linguistic or geographical ties, which sometimes results in misclassifications—such as identifying them only as "Bengali" or "Indian", rather than acknowledging their heritage. The article argues for more precise representation on Wikipedia to honor these authors' contributions to the cultural and literary history of Bangladesh, and to avoid losing the nuances of their identities due to historical geopolitical changes.
  • High interest in 5784: In its year-end Internet culture wrapup, the Jerusalem Post noted that "Israel's Wikipedia entry... [got] approximately 14,769,946 views in a year, more than Israel's entire population." Other Wikipedia pageviews were noted, as well as Google search and YouTube popularity, to gauge interest in various Israel-related topics (and some Jewish public figures in the US and elsewhere).
Who's been trolling who?
  • If sockfarms don't succeed, try, try again: University of Sydney researcher Olga Boichak writes in Foreign Policy about "How Russia Invaded Wikipedia", commenting on the nationwide splinternet and the creation of Ruviki, the government-backed fork of ru.wiki.
  • The silent majority?: The Wikipedia article for the 2024 Silent Hill 2 remake was recently locked after persistent disruptive editing aimed at lowering review scores. According to IGN, the edits skewed the ratings from various sources to make the remake seem less well-received. In a move straight out of the game's own playbook, admins had to put the page under "lock and key"—much like the puzzles and locked doors players navigate in Silent Hill 2—to keep the chaos at bay.
  • Are press releases news?: Business Standard and NewsTap both covered the Wikimedia Technology Summit 2024, with the former publishing a press release and the latter apparently running a lightly rewritten version of it. The summit was hosted with IIIT Hyderabad and focused on enhancing inclusivity and innovation within Wikipedia and other Wikimedia projects. The press release highlighted sessions on developing tools to better support diverse communities and promoting equal participation in tech development.
  • Check the stats first: UnHerd reported that Wikipedia recently renamed its article on the UK's grooming gang scandals to "Grooming gang moral panic in the United Kingdom", sparking controversy. The change, which was based on discussions citing sources like a 2020 Home Office report, was intended to reflect how the media and public framed the issue. Critics argue that the new title downplays the severity of the abuse, causing debate over whether Wikipedia is succumbing to political correctness at the cost of historical accuracy. The Wikipedia article highlights another government report and statistics that show that White men in the UK sexually abuse children more than Muslims or South Asians.
  • The court reads: Following the last issue's coverage of the ongoing legal case between India's Asian News International and the Wikimedia Foundation, a dedicated article was written and published on 10 October 2024. However, the article was mentioned in court on 14 October, and wasn't exactly well-received, as reported by Bar and Bench and Live Law. On 16 October, the court pretty much ordered the deletion of the article, which had been snow-kept at a short-lived AfD.



Do you want to contribute to "In the media" by writing a story or even just an "in brief" item? Edit our next issue in the Newsroom or leave a tip on the suggestions page.





File:HMS Assistance (1850).jpg
Thomas Sewell Robins
public domain
2024-10-19

A WikiCup for the Global South

In early April, a few of us in the WikiCup thread of the Wikimedia Discord were discussing the problem of systemic bias on Wikipedia regarding postcolonial, economically emerging, and/or non-Anglophone countries. This is a common topic of complaint among both Wikipedians and non-Wikipedians, and for good reason; it's still a major weak spot for the English Wikipedia community. Epicgenius, one of the judges of this year's WikiCup, first suggested a contest focusing on developing countries on April 5th. Over the next few days, we brainstormed different ways to run this hypothetical contest: the name, the inclusion criteria, and scoring. Quite a few people chimed in, and Ixtal and I decided to BOLDly set up a project page, aiming for July–September to run the contest. Soon thereafter, we decided to recruit a third coordinator, TechnoSquirrel69; that ended up being a very good decision.

Initially we were thinking about calling it the "Global South WikiContest", but we soon realized that "global south" is too nebulous and subjective of a grouping, so we settled on the IMF's list of developing countries as a basis. We also decided to give bonus points for "least-developed" countries and for "higher-level" articles about broad topics relating to these developing countries, as both of those factors tend to be more challenging to write about. Some fun themed awards were also of course necessary.

While planning this back in April, we weren't expecting more than maybe twenty participants. Interest picked up in the Discord as July approached, and we refined the details on the contest's talk page. When we actually opened up for signups and TechnoSquirrel69 set up a watchlist notice, we ended up with around eighty participants, far more than we were prepared for. This actually ended up being fine as many of those who signed up didn't submit anything, but it was very exciting to us coordinators, as it indicated a desire to work on systemic bias.

The course of the contest was largely hiccup-free, and we didn't have to decline very many submissions. We weren't able to get a bot up and working in time, so we manually updated the leaderboard as submissions came in. As I mentioned above, I was very happy that we had brought on TechnoSquirrel69 as a third coordinator, because I moved across the Atlantic Ocean at the end of August and was unable to properly keep up with submissions and discussion for a time.

After the contest wrapped up on September 30th, we opened a discussion on the talk page to get some feedback, and we've already gotten some great criticism, praise, and ideas.

I'm very happy we did this experimental contest. We ended up with three new featured articles (Qalaherriaq, Siege of Baghdad, and Genghis Khan!), ten new featured lists, 88 new good articles, and a truckload of article reviews and DYK nominations, all relating to parts of the world underserved by our encyclopedia—that's nothing to turn one's nose up at! I'd like to express my appreciation for my fellow coordinators, and Chaotic Enby for making our special "belt buckles" inspired by the map's resemblance to one.

GreenLipstickLesbian's write-up

Here are some super-basic stats! There's some overcount for sure (I am merely human, and didn't put in any effort to split dual-GA nominations and dual-DYK nominations), but here's some basic estimates for everybody to enjoy! (In terms of creations. Sorry, reviewers, the overcount would have gotten too bad if I'd included your activities!)

Overall, members of the drive managed to create or substantially improve content concerning a grand total of 86 different countries. 86! That's pretty impressive and, coincidentally, the telephone country code for China. A pretty amazing coincidence, considering which country's coverage members of the drive improved the most.

The top five most-improved countries were:

  1. China, with 1 featured list, 6 GAs, and 13 DYK appearances
  2. India, with 1 featured list, 15 GAs, and 1 ITN appearance
  3. Philippines, with 5 GAs and 7 DYK appearances
  4. North Korea, with 2 GAs and 9 DYK appearances
  5. Kiribati, with 10 GAs and 4 DYK appearances

and then, more individually:

In terms of GAs submitted:

  1. India with 15
  2. Tied: China, Kiribati, and Haiti with 6
  3. Philippines with 5

In terms of DYKs submitted:

  1. China with 13
  2. North Korea with 9
  3. Philippines with 7

But just because we improved those countries so much doesn't mean we forgot about the rest. Members of the drive found time to get either a single GA, FA, or DYK for 49 individual countries. Most impressive of those were the two countries that got "only" a featured article: Iraq and Greenland, brought to you by AirshipJungleman29 and Generalissima. If you haven't already, you should take the time to read both of their articles. They're some of the best that Wikipedia has to offer.

Members of the drive also managed to get 3 important deaths (those of presenter Aparna from India, wrestler Afa Anoaʻi from American Samoa, and former PM of Lebanon Salim Al-Huss) featured In The News. These appearances were brought to us by two editors in particular: Vacant0 and Jaguarnik. Everybody else, this is who you have to beat next year!

We also got plenty of lists: 10 in total, covering 8 different countries! (Three of these lists covered Ukraine.) To the uninitiated such as myself, the featured list process may appear somewhat strange, but not to editors such as MPGuy2824, with their impressive 3 FLs and 7 FLRs, Vanderwaalforces, who wrote Wikipedia's first-ever Nigeria-related FL, Dantheanimator with the 3 Ukraine-related FLs, and 48JCL with their fascinating List of World Heritage Sites in Botswana.

Let's look at a breakdown by continent:

The continent we worked on the most was Africa, with 36 countries represented. Second place was Asia with 19, followed by the Americas with 15. Oceania got 10, and Europe came in last with 7 countries represented. I bet no other Wiki-drive has ever had that result before!

In terms of content, a special shout-out has to go to Vigilantcosmicpenguin for their series about abortion in various African countries: not only did they bring Abortion in Africa through the DYK process, but they brought 11 different country-specific articles through as well. Not satisfied with that, they also had to pause and turn Abortion in Sierra Leone into a Good Article too! In terms of eliminating systemic bias, the importance of their contributions cannot be overstated.

And while we're talking about Olympic feats, let's pause and appreciate the efforts of Arconning. They made 1 Olympic-related Featured List, 7 Olympic-related good articles, and then went above and beyond to get 3 of those articles through the DYK process as well. Talk about gold medalists!

But speaking of sports, BeanieFan11 saw the need to improve the articles on athletes from developing nations, and they more than rose to the challenge. In between their GAs on NFL members and DYKs on various athletes (did you know that Olympic judoka Valentin Houinato, from Benin, is also a journalist?), they also substantially expanded or created articles on three different politicians, one of them being the current Prime Minister of Equatorial Guinea, Manuel Osa Nsue Nsua.

While we're learning so many cool things, let me tell you all how our coverage of Kiribati improved so much. Thebiguglyalien is responsible for five of our new Kiribati-themed good articles, and all four of its DYK appearances. And did you know how they got all those DYK appearances in just one hook? Find out for yourself here!

In a similar vein, if India placed so high, it's only because of dedicated writers like Magentic Manifestations, who made India-related articles about a wide array of topics, from dairy engineers to crocodile trusts, into some Good-capital-G Articles.

And while I may not have focused on reviews too much, I am going to take a minute to highlight the contributions of our most prolific reviewer, Simongraham. They wrote Good Articles on seven different species and genuses of arthropods (mostly jumping spiders, from what it seems), and two different Soviet ships. Those with arachnophobia beware, but those who create species stubs, be even warier, for this is the type of editor we all wish we had more of.

Hopefully all the spider-phobes haven't left by now, because who isn't going to love this next topic? That's right, who wants to read about chocolate? Yue managed to write three articles on chocolate production, chocolate smuggling, and a chocolate manufacturer, and they even made one into a Good Article. But they didn't stop there! Those of you interested in the flags of the world, you need look no further than their new Good Articles on the flags of Togo, North Korea, São Tomé and Príncipe, or Rwanda.

This WikiCup may not have seen many (or, actually, any) Good Topics, but one user came close: Chipmunkdavis! They wrote five Good Articles about fisheries in the Philippines, after finding out just how lacking we were. Generalissima's been prodding them to turn the entire thing into a Good Topic. And who knows, hopefully she'll have convinced them by next year? No pressure or anything, but all eyes are on you now, Chipmunkdavis!

Di did what they do best and told us all about the time "that German officials exiled the Samoan king from his own kingdom". In light of the recent Olympics, Riley1012 helped to expand our coverage of artistic gymnasts from around the world. TheNuggeteer brought you coverage of Philippine storms, Averageuntitleduser wrote Good Articles on important Haitian women, and Cambalachero proved that you can write a Good Article on pop-culture topics after all! PerfectSoundWhatever introduced us to the life and works of Kenyan musician KMRU, Noorullah21 wrote an impressive 1k+ word GA on the Khalji Revolution, and Skyshifter got meta and treated us to a hoax article! As in, a good article about a hoax article. You'll just have to read it for yourself. Fritzmann2002 specialised in article and list reviews, but took the time to write some good articles on plants from Syria and Turkey.

Some of us were busy during the contest, however, but in a drive such as this, literally every attempt counts. With that in mind, let's take some time to appreciate Sohom Datta's review of a GA about voting in India, TappyTurtle's review of the article about the Wikipedia hoax, Zanahary's review of an article about an Indian god of war, SunTunnels's article about a South African speed-climber, and Queen of Hearts's article on a crab named after a League of Legends character.

I'd also like to say thank you to our lovely co-ordinators, Sawyer777, Ixtal, and TechnoSquirrel69. It may sound cheesy, but it's true—none of this would have happened without your ideas and dedication.




File:Essen_-_Neue_Isenburg_15_ies.jpg
Frank Vincentz
GFDL
2024-10-19

A scream breaks the still of the night

This traffic report is adapted from the Top 25 Report, prepared with commentary by Igordebraga, Vestrian24Bio, CAWylie, Ollieisanerd and Rahcmander.

You wanna know about atrocity, atrocity (September 15 to 21)

Rank Article Class Views Image Notes/about
1 Lyle and Erik Menendez 3,493,715 Oh look, another case of criminals topping this list after a Netflix show. The Menendez brothers were the sons of an entertainment executive who one day in 1989 decided to repeatedly shoot their parents with 12-gauge shotguns. They passed as innocent so they could spend the inherited fortune, but the police eventually caught on and arrested them, and their 1996 trial ended in life sentences. Monsters: The Lyle and Erik Menendez Story (#8) tells this story, with Cooper Koch as Lyle and Nicholas Chavez as Erik.
2 Sean Combs 1,507,229 As recently sung by Kesha, "wake up in the morning like fuck P. Diddy!" Last year this rapper's former partner Cassie Ventura filed a lawsuit accusing him of sexual assault (and at a certain point security footage of him beating her in a hotel was leaked), other people took the opportunity to also denounce Combs, in March some properties of his were raided by Homeland Security, and now "Puff Daddy" has been downright indicted by a federal grand jury in Manhattan, charged with sex trafficking and racketeering, with allegations his commercial enterprises were used for felonies of their own. Combs has pled not guilty and was twice denied bail, so he remains in a detention center while awaiting trial.
3 Deaths in 2024 976,004 Don't they ever have to worry?
Don't you ever wonder why?
It's a part of me that tells you
Oh, don't you ever, don't ever say die
Never, never, Never Say Die! Again!
4 Tito Jackson 818,581 "I'm Michael Jackson! You are Toto!"
"You mean Tito! Toto is what we ate last night for dinner."
So, right after a jokey lyric in the obituary comes an addition to it. 15 years after the most famous member left this world, breaking Wikipedia along the way, another of The Jackson 5, who in 2012 reunited under their second name The Jacksons, went to attend the Great Gig in the Sky, namely Tito Jackson, at the age of 70. Along with performing, Tito helped push more of the Jackson family into the music business as his sons formed their own group, 3T.
5 Laura Loomer 700,884 This U.S. congressional reject remains in the orbit of politics, namely as an influencer to Donald Trump. On September 10, during the second presidential debate, Trump claimed that Haitian immigrants were stealing pets in Springfield, Ohio, and eating them. This partly arose from Loomer's perpetuating the rumor on social media.
6 Beetlejuice Beetlejuice 693,903 Like on Stranger Things, Winona Ryder plays a mother who sees her child get dragged into a separate dimension. Only this time around it's a daughter who goes to the afterlife, forcing Lydia Deetz to get help by summoning Beetlejuice (mwahahaha!), still pining for her after 36 years. Along with Ryder and Michael Keaton, there's also the return of Catherine O'Hara as Lydia's stepmom (but not Jeffrey Jones as her father, who gets a gruesome off-screen death given the actor's criminal record) and most importantly of director Tim Burton, along with the addition of Jenna Ortega as the daughter. Reviewers and audiences alike appreciated how Beetlejuice Beetlejuice retained the funny and outlandish tone of the original, and the film has led the North American box office for three weeks straight while earning over $300 million worldwide.
7 Shōgun (2024 TV series) 675,653 This FX historical drama set in 17th century Japan garnered much praise earlier this year, and has now followed up by converting 18 of its 25 Primetime Emmy nominations into wins, making it the most awarded single season ever, with the wins including Outstanding Drama Series and Best Actor and Actress in Drama to Hiroyuki Sanada and Anna Sawai. This probably speaks well of the incoming seasons based on the rest of the Asian Saga. (No other Emmy winners came close to entering this week's list, so for the record: Baby Reindeer, which like Shōgun appeared a lot in the Report in the first half of the year, took home Best Limited or Anthology Series, and Best Comedy went to Hacks, beating The Bear, which even the ceremony's opening monologue questioned as a comedy.)
8 Monsters: The Lyle and Erik Menendez Story 617,889 Season 1 of this Ryan Murphy true crime show sullied our humble list by plastering a cannibal dead for nearly 30 years all over our records page, and downright as the most viewed page of the year. So I must express relief that, in spite of providing another #1 subject, this time around the numbers won't go as high as Dahmer's did.
9 The Greatest of All Time 614,818 Kollywood Thalapathy Vijay's penultimate film before he moves on to politics, an action thriller released two weeks ago that has already shattered many records for a Tamil film. It has so far grossed over 450 crore (US$54 million) and has become the highest-grossing Tamil film of 2024 and the third highest-grossing Indian film of 2024. It also became the third highest-grossing film in Tamil Nadu, the fifth highest-grossing Tamil film overseas, the fifth highest-grossing Tamil film of all time, the 11th highest-grossing South Indian film of all time and the 29th highest-grossing Indian film of all time. It became the fourth film of the actor to reach the 300 crore (US$36 million) mark after Bigil (2019), Varisu (2023) and Leo (2023); the actor also became the only Tamil actor to have three consecutive films do so. It became the sixth Tamil film to reach the 400 crore (US$48 million) mark within 11 days of its release. The actor's last film, tentatively titled Thalapathy 69, has now been officially announced as a political action thriller and is scheduled to be released in October 2025.
10 Agatha All Along (miniseries) 605,989 The 11th television series in the MCU produced by Marvel Studios, via its new Marvel Television label, had its 2-episode premiere last Wednesday, with new episodes set to premiere every week and a 2-episode finale on Devil's Night. It follows up WandaVision, which ended with the villainous witch Agatha Harkness being put under a spell that made her think she was a character in an old sitcom. After 3 years, Agatha, still played with scenery-chewing gusto by Kathryn Hahn, gets her own show, where she is taken out of that spell by a goth kid and decides to recruit a new coven to recover her powers before some old enemies (who in the comics are her grandchildren!) come to confront her.

See into my eyes, you'll find where murder lies (September 22 to 28)

Rank Article Class Views Image Notes/about
1 Lyle and Erik Menendez 7,386,763 These American brothers who were convicted in 1996 of the murders of their parents managed to top the list thanks to the second season (#5) of the Netflix biographical anthology series which was based on their life story (and if you click the article you might notice the page name is now spelled without an accent).
2 Maggie Smith 3,006,977 Dame Margaret Natalie Smith died at the age of 89, leaving behind a storied career that included two Oscars (Best Actress for The Prime of Miss Jean Brodie and Best Supporting Actress for California Suite) and memorable roles like a Greek Goddess in Clash of the Titans, an elderly Wendy Darling in Hook, a stern nun in Sister Act, a stern witch in the Harry Potter movies, and a witty countess in Downton Abbey.
3 Sean Combs 1,968,290 Last night, I couldn't even get an answer. Well, the answers regarding the rapper also known as Puff Daddy and (P.) Diddy are turning out to be quite unsavory. His success in music and business (he's possibly a billionaire) led to wild "white parties" full of sex and drugs, and many crimes that sent Combs to court, most notably hundreds of accusations of sexual misconduct. The press and social media are also digging up who used to hang out with Combs.
4 Hassan Nasrallah 1,140,433 The leader of #7, a Lebanese militant group designated as a terrorist organization by many countries, was killed when the group's headquarters were bombed by Israel.
5 Monsters: The Lyle and Erik Menendez Story 1,034,454 The second season of the American biographical crime drama anthology television series Monster centers on the 1989 murders of José and Kitty Menendez (portrayed by Javier Bardem, pictured, and Chloë Sevigny), who were killed by their sons, #1 (portrayed by Cooper Koch and Nicholas Alexander Chavez). It premiered last week on Netflix and achieved massive viewership numbers, despite being subject to controversies as well. Which means it's just like season 1 two years ago, only this time the Wikipedia views aren't high enough to revolt the writers of this Report.
6 Deaths in 2024 992,961 Jean, Jean, you're young and alive
Come out of your half-dreamed dream
And run, if you will, to the top of the hill
Open your arms, bonnie Jean
7 Hezbollah 807,182 The Israel–Hamas war is close to completing its first year, to the chagrin of everyone not named Benjamin Netanyahu. The Zionist state took the opportunity to retaliate against constant bombings by the militant group ruling over neighboring Lebanon, with actions like the Lebanon pager explosions and the airstrike that killed #4. The international community fears this will escalate into another major armed conflict, and some are already predicting a WW3.
8 Murder of Felicia Gayle 751,513 A journalist being found dead from 43 stab wounds in her own home in 1998 should be shocking in and of itself. Marcellus Williams was charged, convicted and sentenced to death for the killing in 2001, but his appeals since then, coupled with DNA evidence supporting his innocence and Gayle's family not wanting him to be executed, did not sway the State of Missouri (state seal shown) from ending his life on September 24.
9 Devara: Part 1 751,115 As two Indian films leave the list, a new one enters. Written and directed by Koratala Siva (pictured), this Tollywood action-drama film starring N. T. Rama Rao Jr in his 30th film as a lead actor was released last Friday. This film also marks Saif Ali Khan, Janhvi Kapoor and Shruti Marathe's Telugu acting debuts. The film opened to mixed reviews from critics and grossed over 250 crore (US$30 million) in just 3 days against a budget of 300 crore (US$36 million).
10 Daniel Dubois 677,941 "Are you not entertained?", the IBF heavyweight champion shouted to the crowd of 96,000 at Wembley Stadium, after defeating opponent Anthony Joshua on September 21.

The death call arises, a scream breaks the still of the night (September 29 to October 5)

Rank Article Class Views Image Notes/about
1 Lyle and Erik Menendez 4,565,092 These American brothers who were convicted in 1996 of the murders of their parents managed to top the list for the third consecutive week, thanks to a Netflix show based on their lives and crime.
2 Kris Kristofferson 2,788,936 The American country music star and actor died on September 28 aged 88. Kristofferson had hits like "Me and Bobby McGee" (which topped the charts when covered by Janis Joplin), was a member of the country group the Highwaymen, and his acting roles include 1976's A Star Is Born, Pat Garrett and Billy the Kid, Blade and the 2001 Planet of the Apes. His death leaves Willie Nelson as the last surviving member of the Highwaymen.
3 Joker: Folie à Deux 1,771,197 While an HBO Max series based on a Batman villain missed out, a film based on another Batman villain has made it to the list. The 2019 Joker film (which is unrelated to both the old DCEU and the upcoming new Rebooted DCU, thus being labeled as DC Elseworlds) had a sequel greenlit within a month of its release as it made over $1 billion and went on to win two Academy Awards, while also managing to be nominated for Best Picture, the first DC film to do so. The sequel has Joaquin Phoenix reprising his role as Arthur Fleck/Joker along with the addition of Lady Gaga as Harley Quinn, and somehow the marketing sold Bonnie & Clyde but the film itself was Chicago, a crime/courtroom musical (except in not being a comedy, in spite of the protagonist being a clown!). Just about every critic questioned what director Todd Phillips was doing, and audiences are also not as enthralled by Folie à Deux: expectations for the opening weekend's numbers are low, and with a hefty budget that may be as high as $200 million, a big box office bomb is possible.
4 Sean Combs 1,310,093 The rapper who once bragged that "It's All About the Benjamins" now needs to see if his fortune estimated near the billion mark can save him from a long stay in prison – hasn't fully worked yet, he was twice denied bail – for a laundry list of accusations, including sex trafficking, racketeering, and overall creation of a criminal enterprise. In the meantime the public discovers which celebrities used to be friends with "Puffy", and musicians like Kesha, Maren Morris and Joe Jonas are removing references to him from their songs.
5 Devara: Part 1 1,211,130 This Tollywood action-drama film starring N. T. Rama Rao Jr (pictured) in his 30th film as a lead actor was released last week and opened to mixed reviews from critics and has so far grossed over 370 crore (US$44 million) against a budget of 300 crore (US$36 million).
6 Pete Rose 1,188,107 Pete Rose, who died at 83 and was also known as "Charlie Hustle", was named to the Major League Baseball All-Century Team for a career that included three World Series titles, most notably two as part of the Cincinnati Reds team nicknamed "Big Red Machine". Yet his legacy was tarnished when in 1989 an investigation discovered Rose, then the manager of the Reds, was betting on the team's own games, leading to a permanent ban from MLB and subsequent ineligibility for the Baseball Hall of Fame.
7 Deaths in 2024 1,088,743 From #2's work:
Yesterday is dead and gone and tomorrow's out of sight.
Lord, it's bad to be alone.
Help Me Make It Through the Night.
8 Jimmy Carter 1,040,620 The 39th American president already became the longest-lived U.S. president in 2019, and now, defying expectations after nearly two years in hospice care, Carter celebrated his 100th birthday on October 1, becoming the first former president to reach the age of 100.
9 Dikembe Mutombo 993,594 A Congolese basketballer who died at the age of 58 of a brain tumor. He had a great proficiency in blocks and rebounds, earning him the nickname "Mount Mutombo" and four NBA Defensive Player of the Year Awards; the one thing he couldn't get was a championship ring, getting as far as the 2001 NBA Finals as part of the Philadelphia 76ers. Dikembe Mutombo was also known for extensive humanitarian work, including financing a hospital in his hometown of Kinshasa which he named after his mother.
10 John Amos 891,050 This American actor died at age 84 in August, but it wasn't announced until October 1 (even his own daughter didn't know). Amos was famous for portraying the adult Kunta Kinte in the TV miniseries Roots, as well as his role as patriarch James Evans Sr. in the sitcom Good Times, a character which was killed off because of Amos' dissatisfaction with how African Americans were portrayed.

Exclusions

  • These lists exclude the Wikipedia main page, non-article pages (such as redlinks), and anomalous entries (such as DDoS attacks or likely automated views). Since mobile view data became available to the Report in October 2014, we exclude articles that have almost no mobile views (5–6% or less) or almost all mobile views (94–95% or more) because they are very likely to be automated views based on our experience and research of the issue. Please feel free to discuss any removal on the Top 25 Report talk page if you wish.




File:Erasmus of Rotterdam MET DP815934.jpg
Albrecht Dürer
PD
2024-10-19

The Editors


Stephen Harrison in 2023

The Editors by Stephen Harrison, 412 pp.
Published by Inkshares, August 13, 2024
ISBN 978-1-950301-67-6

Smallbones

Smallbones was the editor-in-chief of The Signpost from March 2019 through April 2022; he now writes the Disinformation report column, while also contributing to In the media. Disclosure: he received an electronic review copy of the book and in August received a signed paperback copy.

Stephen Harrison is a journalist and beat reporter who writes about Wikipedia and is likely the favorite such journalist of many of the editors of this encyclopedia. He knows his subject. He knows our complicated rules, our sometimes vicious politics, and the importance of getting to know the people involved.

His work has been published in Slate, The Washington Post, The New York Times, and other media outlets. Now, though, he has also released his debut novel, The Editors.

It's not the first work of fiction about Wikipedia, but it is the first I've seen that is not science fiction, speculative fiction, nor likely to be seen as overly literary. Harrison is writing for normal curious readers about the actions of Wikipedians dealing with an extraordinary – but not unimaginable – situation.

The novel's encyclopedia is called Infopendium, and has slightly different rules than Wikipedia. But people who have never edited Wikipedia, or even thought of what goes on behind the article pages, will still learn about how the Wikipedia community works.

The Editors is a suspense novel, and a real page-turner: at 412 pages, you might be tempted to finish it in one or two days. If you take this book to the beach, be sure to take along a big bottle of SPF 50+ sunscreen.

Despite the fast-paced story, Harrison's ability to develop interesting characters – much of it done on-the-run – might be his most surprising skill. Several of his black-hat editors are presented very sympathetically; the careful reader can understand their motivations. Perhaps this is to keep the reader guessing about how the action will be resolved. Perhaps it's due to the influence of Wikipedia's oft repeated principle, "assume good faith". Or perhaps he just recognizes that all editors, like all people, are flawed mixtures with different cultural backgrounds, beliefs, interests, abilities, experiences and blind spots.

I'll also note that all readers are flawed and have different experiences. This review has turned out to be a very personal experience for me, as there are several situations in the book that feel familiar. Other readers may draw on their experiences of other situations. Wikipedians might see reflections of themselves in the book and of people they know, as well as situations they have worked in.

But make no mistake: the novel is not a disguised version of a specific event or of a specific editor's actions, or a roman à clef where a simple key will unlock the meaning of the whole story. Rather it is about situations such as the pandemic – that many editors have participated in over long periods – or situations such as conflict of interest editing that occur every day.

The characters are not cheeky portrayals of real individual editors, though you might wonder for a couple of pages after they are introduced, until the same character reminds you of a different editor you know. They might be a combination of several editors who have acted in similar ways in similar situations, or even just Harrison playing with the idea of "what would a character like User:Real Person do if their interests, beliefs, or experiences were different?"

I'll only reveal the seemingly apparent identity of one such real-world editor: Jimmy Wales, the founder of the encyclopedia in both the novel (as User:Prospero) and in real life, gives a keynote speech at Infopendium's annual conference. But after the speech, Prospero's actions will surprise almost any Wikipedian who thought they knew Jimbo. Harrison keeps you guessing.

Wikipedians have, however, seen many of the situations that pop up in the novel: for example, a young editor struggling through a request for adminship or attending the annual Wikimania conference.

Paid editing happens nearly every day on Wikipedia. Every active long-time editor has seen how Wikipedia reacted to the pandemic. We've all seen news items covered here, especially celebrity deaths. Regular readers of The Signpost should know how billionaires attempt to rewrite Wikipedia.

The novel's action starts right from the first sentence of the prologue. Harrison has reported on this phenomenon called "deaditing": a very notable person dies, and editors compete to report the death, revise the article, and just change the "is" in the lead section to "was".

Pay close attention here! Almost all the major characters in the novel are introduced in the first seven pages, as well as several of their sockpuppets (multiple deceptive accounts used by a single editor).

The first six chapters introduce a freelance journalist who is both a hard-bitten, experienced and dedicated hack, and a flighty young woman worried that she can't make a career in the dying newspaper business. While she might represent the entire industry, she doesn't strike me as a copy of any journalist that I've met. Perhaps it's her indecisiveness. She meets one of the heroes of the prologue, as well as the "editor of the year" and many others at the Global Infopendium Conference.

The conference is thrown into chaos during Prospero's welcoming speech by a hack of the conference's computer system, while the editor of the year is insulted on everybody's screens. A black-hat editor is reintroduced to readers, as is a newly-minted billionaire.

And then the action really starts, as the very existence of Infopendium is at risk. There's a wild road trip across North America, some gunplay, a break-in, and some government intimidation of editors. There are even a couple of romances. It's not our usual daily routine by any means.

You might think that editors would never engage in gunplay... but you'd be wrong. Do governments intimidate editors? Unfortunately, it's fairly common. Long road trips are also fairly common, especially among photographers. So, yes, the situations are all well within the realm of possibility.

Casual Wikipedia readers, newbies, and disinformation journalists will learn more from this novel than from any edit-a-thon, instructional video, or how-to-edit manual. Experienced Wikipedians will enjoy the fresh view of ground they know well and of editors they feel like they might know.

According to two interviews, Harrison has already begun working on a new novel, this time about the U.S. Federal Reserve, where he formerly worked. I expect it to be similarly well-researched and grounded in reality (if not specific facts), as his epigraph modestly suggests, "This is a reported work of fiction."

He explains this sentence to the first interviewer with an example: he wants to write "a story based on themes that we're finding in our lives. Some of the fiction that I've always admired the most would be Tom Wolfe. The Bonfire of the Vanities is deeply reported."

He quotes William Faulkner to the second interviewer: "The best fiction is far more true than any journalism." In that sense, this is a very good piece of fiction.

Many Wikipedia editors will likely have their own personal reactions to the novel, but please remember that the book is not about you, it's not about me. It's about us, the editors, the Wikipedia community.

Enjoy!

Sgerbic

Editor Susan Gerbic reading with helpers Hamilton, Imogene and Ariadne
Sgerbic leads the editing group Guerrilla Skeptics on Wikipedia and was interviewed twice by Harrison, the journalist who is the author of the book. Disclosure: she received an advance copy of the book for review.

An announcement came across my desktop recently that Slate journalist Stephen Harrison had published a book called The Editors. His beat is Wikipedia (yes, that is a thing), so I was a bit surprised he would be writing about Wikipedia editors, but hadn’t reached out to interview my team members. Harrison knows about the Guerrilla Skeptics on Wikipedia (GSoW) project, as he has interviewed me twice, most recently in 2023 for this article. To my surprise, I received an email from Harrison a few weeks ago, asking if I would be interested in an advance copy of his book. Of course I would! Keeping my ego in check and not asking why he didn’t interview any of my amazing editors for a book about Wikipedia editors, I gave him my home address, and soon I received a paperback advance reader copy. I found a sunny spot on the couch, the obligatory cats joined me and I settled in to see whom Harrison had chosen to interview. Caution: some vague mild spoilers are contained below.

I was surprised to learn that this is not a book interviewing Wikipedia editors; it is rather a novel about a group of editors who seek to keep misinformation off the fourth most viewed website, Infopendium. This group is called The Misinformation Patrol. Sound familiar?

With my three cats taking turns keeping me warm and settled on the couch, I finished the 414 pages mostly in a single five-hour reading session – yes, it was just that good. And completely not what I was expecting. It's Infopendium and not Wikipedia. Harrison selected new tenets, "Aim for Neutrality", "We Need Better Sources", "Anonymity Is Fundamental" and "Keep Developing", directly inspired by Wikipedia's five pillars, like "Wikipedia is an encyclopedia... written from a neutral point of view", "free content", "respect and civility", and most important (and yet confusingly frustrating at times) "Wikipedia has no firm rules".

Everything else was very familiar. Discussions, edit history, administrator elections, ambiguous rules, conflict of interest issues, sock-puppets, paid editing problems, conventions, and edit-a-thons. But more than all of this, what Harrison really understood was the passion of the editing community: he has clearly been talking to real people.

The novel centers around user Alex718, who runs The Disinformation Patrol, and the journalist Morgan Wentworth (who is clearly Harrison's alter ego), focusing on the uncovering of a paid editing mystery of billionaire Pierce Briggs, who treats Infopendium as an extension of his media empire and seeks to control articles to protect his reputation and to influence the public. Briggs understands the power of Wikipedia... er, Infopendium, I mean. The story introduces other editors with their own agendas, such as DejaNu, a librarian whose mission is to create articles for women and people of color, despite being one of those "the ends justify the means" people who will bend the rules in order to right wrongs of the past. And to a community of people hellbent on following rules, DejaNu is a polarizing character (I didn't like her at all). Another editor, turtle~dragon, who has moved from Texas to China and started a business editing for pay, moves the storyline along: he begins working for Briggs, using sockpuppet accounts to influence articles, until the Chinese government finds out and forces him to cover up a novel illness appearing in Wuhan.

And this is where Harrison's expertise with this platform really shows; a crowd-sourced, all-volunteer community can't be stopped from its mission, whether it is Infopendium or Wikipedia. The idea of writing an online encyclopedia that anyone can edit is just insane: it should not exist, it should be a mess, it can't possibly maintain itself without ads and masses of paid employees! Yet it does, and it thrives; issues with the software and head-butting with other editors notwithstanding, it works.

I saw the story of Morgan Wentworth – a freelance journalist who, out of desperation for a story, attends an Infopendium convention at Columbia University in New York – and the rest of the characters developed by Harrison as reflections of the Wikipedia editors he has interviewed over the years for Slate, VICE Sports, The Washington Post, The New York Times, OneZero and other media. A good chunk of his published articles are storylines in this novel: very clever. For instance, when in the novel editors turtle~dragon and KonaYeziq get wrapped up in censorship in Beijing, Uyghurs are sent to camps, and then there are the beginnings of a novel virus that the Chinese government tried to cover up, I knew Harrison's articles "The Coronavirus is Stress-Testing Wikipedia’s Systems", "Why China Blocked Wikipedia in All Languages" and "Why Wikipedia Banned Several Chinese Admins" were necessary research.

The inspiration behind character DejaNu, the librarian who wins the Editor of the Year award for her unwavering tenacity to right the wrongs of the past, by politically correcting articles to current standards and encouraging new editors to sign up, probably came from Harrison's articles "How Wikipedia Became a Battleground for Racial Justice" and "Closing Wikipedia’s Gender Gap". Character Alex718, whose unhygienic habits and obsession with Infopedium led him to create articles for trains and public transit since the age of seven, is probably based on this story Harrison wrote: "Wikipedia's Terrific Subway Railfans" about editors Ryan Ng and Shaul Picker, who have made hundreds of thousands of edits to Wikipedia, mostly about New York’s subways.

Harrison begins the novel with several pages of edit history for the article on billionaire Pierce Briggs, and introduces the cast of editors, setting them on a spiral of intrigue to unmask who is behind the edits that seek to control information and spread misinformation. Personally, I found this an intriguing way to begin: maybe it's just me, but as I read the back-and-forth of the edits, I found myself in editor mode thinking, "How would I have handled some of these edits myself?" It was quite realistic.

Though the characters do become a bit of a caricature of an obsessive Wikipedia editor with a savior complex – think lots of missed meals, lack of relationships and days without showering – Harrison understands the passion, the desire to be fair and neutral, the rule-following and the free-information-to-the-masses attitude that I see in the Wikipedia editing community. It shows that Harrison has done his homework. At the end of the novel, the heroes of the story, who start out with many different agendas, find that they all share the same agenda. The storyline is fun, there is mystery and intrigue, and the characters' individual stories and agendas seem plausible. I'm not giving away too much, because there are many twists and mysteries for even the non-Wikipedia editor to enjoy.

Coming back to GSoW for the moment, the group referenced in the novel, the Misinformation Patrol, focused on general problems with all articles, was made up of experienced editors who were recruited to join the Patrol and were operating anonymously with each other on Infopedium. Very different from the GSoW – we specialize in articles on science and pseudoscience. We also have created our own training program (that generally takes four months to complete), operate in many languages, and we are gathered off-Wikipedia in a private Facebook group called "The Secret Cabal", which allows us to know each other in real life. What Harrison is describing is more akin to WikiProjects, which exist all over Wikipedia, but are repositories of ranked lists of things to do, and are mostly dormant. Good intentions, but in my opinion, the anonymity of the users and the lack of mentorship and leadership are what kill these projects. GSoW not only trains one-on-one, but about half of our new people have never coded or worked in IT; we shun wall-of-text instructions and give lots of personalized feedback and mentoring. It's a different "vibe", which hopefully will keep us focused on our goals of making the best encyclopedia – the same goal that Harrison’s characters have with Infopedium.

One of Harrison's storylines involves a paid editing scheme from the viewpoint of a paid editor who himself falls victim to his "customer". Personally, I have dealt with several cases of someone in real life paying for an edit, then being blackmailed for more money to protect the edits from being removed. Something I often warn professionals of is that there is really no shortcut to having a Wikipedia article written about you, since it stands on its own or not, and money paid or special favors aren't going to help. Also different was the "training" that DejaNu tried to do at her library, which is something I suspect happens at Edit-A-Thons: brand new editors creating brand new Wikipedia articles in a couple of hours with little instruction. Although I suspect this happens, it is not how I run my team. We start with minor edits and, after much instruction over weeks and many small changes, build up to rewriting a stub article; never ever would we instruct someone new to create a brand new article in a few hours.

Overall, this is a fun read, with a very interesting sub-text and very well researched. I suspect Harrison would love to thumb his nose at a tennis-playing billionaire who treats Wikipedia as his own social network, and travel across country in a Scooby-doopedia Mystery Machine to uncover who is behind the misinformation. I wonder what Wikipedia founder Jimmy Wales – if he ever reads this novel – would think of his character Gerald Budd’s home in Palo Alto with "half-eaten microwavable meals covering the coffee table" and "(T)iny gray hairs fram[ing] the drain of the bathroom sink, clinging stubbornly to the porcelain surface". But as I've never met Wales, I don't know; maybe he would get a good laugh out of it?

Just for the fun of it, if Harrison created characters and storylines from the people he has interviewed over the years, which of these characters might I have influenced? On page 34, while Morgan Wentworth is walking past a series of editors, she overhears these two conversations: first, a man in a fedora who says, "If a page presents false balance, that actually undermines the truth", and then a woman with a French accent who says, "We cannot return to the past and compose 'better sources' for those people who lacked power". If you blend those two random editors and add in a lot of imposter syndrome that someone like me could be running a project like GSoW, I think Harrison has me figured out. Plus my pure joy of training new editors.

Harrison’s The Editors and my GSoW project have more in common than not. We both revere this important encyclopedia, remove misinformation, admire journalists, and respect the pillars of editing and the editors themselves. Hopefully this review inspires you to pick up a copy.

WereSpielChequers

Disclosure: WereSpielChequers attended a book launch event and received a complimentary copy of the book.

The Editors is set in our world before and during the COVID lockdowns. But instead of Wikipedia, there is a similar online encyclopaedia, called Infopedium.

If the works of JRR Tolkien can be oversummarised as an exploration of the trope of the hero as orphan in a faux medieval setting, then Mr. Harrison's work is an estrangement of daughters in near modern times. But I'm sure readers of this publication will be more interested in the similarities and contrasts between the Infopedium of the book and Wikipedia. Imagine, if you will, a project as dominant on the Internet as Wikipedia was when twenty years old, but with the shonky business continuity practices of a much younger Wikipedia, long before dual sites, possibly even before that time when Wikipedia lost its main anti-vandalism defence for a weekend because the relevant volunteer needed his spare server for other purposes.

There are significant commonalities ranging from a very similar hell week, to sockpuppetry and malicious interference from foreign governments. But there are also marked differences, including the age profile of retirees and schoolkids, as if the smartphone had not been invented (one assumes that Infopedium, unlike Wikipedia, has cracked mobile editing). While it isn't explicitly stated that Infopedium has decided to standardise on American English, its US focus and complete absence of any characters from the rest of the English speaking world are rather suggestive. The narrow cast of characters, repeatedly crossing each other's paths in very different parts of the site, gives the feel of a much, much smaller community, more akin to that of the Georgian or Welsh language Wikipedias than what we are used to on EN Wiki. Another unsubtle difference between the two lies in editor motivation: yes, we've had our obsessives and hagiographers (a subset of POV warriors), but there are also people who edit Wikipedia because we enjoy it, and many of us manage to balance editing with other interests. An RfA candidate who appeared to be editing in all possible waking hours would likely be told to take a break, as the expectation at RfA is that you answer questions in order. But it isn't just a seven-day "open book" exam; it is one where it is normal for the candidate to absent themselves for twenty hours at a time.

Of course, it is possible that almost every character in this book is at least loosely based on a real current or former member of this community. But somehow I doubt I'd be the only Wikipedian who reads this book and concludes that they wouldn't be an Infopedian. All that said, in accordance with edicts from my wife, I now have to pick two books from my library to go to the charity shop because I've decided to keep this one, and yes, I'd buy the sequel, were there to be one.




File:20111110-OC-AMW-0035 - Flickr - USDAgov.jpg
U.S. Department of Agriculture
CC BY 2.0
2024-10-19

The Newspaper Editors

See also: Wikipedia:Identifying blatant advertising

Complaining about The Signpost and its self-righteousness is an exhausting task. Its poor editorial judgment continues, and it's almost unbelievable that, once again – contrary to WP:NOTPROMO – the newspaper mentions this same book. What a surprise, given its consistent blatant disregard for core Wikipedia policies. It's somewhat impressive how consistently it showcases this particular novel. We've moved well beyond subtle promotion; it's full-blown product placement now, complete with a link to the article about the publisher, which is, you've guessed it, tagged as containing promotional content. There is, needless to say, a double standard, because if this book weren't a novelization of Wikipedia, it wouldn't even get a footnote.

Man reading newspaper.
Researchers researching in their research facility for their research report
Figure 1: A graph from the researchers' research report.

Many world-renowned researchers have researched this and produced many scathing research reports that found astonishing evidence that The Signpost violated Wikipedia's policies and guidelines regarding neutrality throughout the pre-order marketing window of this book. Through cutting-edge analysis of its coverage in their prestigious research facilities, worrying patterns of promotion, sensationalism, and undue weight in its reporting on this book were exposed in their bombshell research findings (shown in Figure 1). Wikipedia's policies and guidelines on maintaining a neutral point of view are made crystal clear, so it's scandalous that the newspaper — which is hosted on Wikipedia — ignored them when covering this book.

This unrequited love is made even clearer by the fact that the book doesn't even mention the newspaper. It takes real dedication to promote a single book this often, and it's reassuring to see that some things never change. Forfeiting neutral coverage in favor of incessant mentioning of this book is, to me, the beginning of the end of Wikipedia as we know it. If Wikipedia can no longer enforce a neutral point of view on its pages, it empowers editors to use the newspaper and its talk page subscriptions to launch outrageous promotional campaigns. It's a mockery of the need for neutrality to host this on Wikipedia. Imagine editing for Wikipedia, strictly adhering to policies and guidelines, and then seeing how the media elite of this website blatantly ignore all of these policies.




File:Child-Messy-8207.jpg
David R. Tribble
CC BY-SA 3.0
2024-10-19

Spilled Coffee Mug

...if it doesn't look like anything, squint harder. As always, click on the squares to type your letters, but don't press enter. You've been warned.

[Crossword grid: interactive puzzle with numbered squares 1–22; the layout is not reproduced in this text edition.]


Across

Proud owner of "the only man-made talk page that can be seen from space"   
(-ing) Eventual fate of death-defying articles   
(acronym) The don't-be-a-meanie policy   
11  (shortcut) Guideline that determines whether Michael Jordan and LeBron James are notable or not, circular structures   
14  (acronym) 2012 bill Wikipedia blacked-out for   
15  Pasta recipe #2: Put the water to a ____, Across 5 to preference.   
16  (shortcut) When you have 10,000 edits and still add unsourced statements   
17  Troublemaker's fish   
18  Acronym to use when the subject goes by a lot of names   
20  (shortcut) MOS section on cross-referencing articles   
22  (acronym) Corp department, the bane of reviewers   
 

Down

First part of the Spanish Wikipedia's url   
___wig, a copyvio finder, named after a bug   
Shortcut to the WikiProject for Amsterdam, Anne Frank, and Van Gogh   
(acronym) A gaggle of geese Good Articles   
Fact panel of the kind that led to a 2013 Arbcom case   
The famous P versus __ problem in computer science   
(acronym) The designation where you have to worry about MOS:LAYOUT but not MOS:DAB   
10  Splitting, but tastier (and good for pastas)   
11  Shorthand for a template for closing discussions, now expanded to also mean collapsing comments   
12  Pasta recipe #1: ____ water into a pot.   
13  Where spiders build their nets, a source type   
14  (acronym) List articles, but fancier   
19  (acronym) The first news agency to have reported on 9/11, known for its namesake stylebook   
21  (acronym) 8 down, but all of the MOS pages matter this time   




Wikipedia:Wikipedia Signpost/Single/2024-10-19