The Signpost

File:Good_or_bad_(cropped).jpg
Retrogamepapa
CC BY-SA 4.0
In the media

How bad (or good) is Wikipedia?

Frankfurter Allgemeine Sonntagszeitung tests German Wikipedia

The Frankfurter Allgemeine Zeitung (FAZ) is one of Germany's newspapers of record. Its weekend edition, which goes on sale on Saturdays, is the Frankfurter Allgemeine Sonntagszeitung (FAS).

On 5 July 2025, the weekend edition of Germany's Frankfurter Allgemeine Zeitung published the article "Wikipedia weiß immer weniger" ("Wikipedia knows less and less", archive (not paywalled)). The newspaper examined a random sample of over 1,000 German-language Wikipedia articles for potential errors and found problems on more than a third of the pages in their sample – in particular, outdated articles. The number of Levi Strauss & Co. shops, for example, dated from 2009, the paper said, and was badly out of date, as the number had since grown to "more than 1,000, according to the latest annual report" (or more than 3,400, if you believe the English Wikipedia article's infobox). Even Sweden's tallest mountain had changed, as ice on the southern peak of Kebnekaise had melted, meaning it was now lower than the northern peak (English Wikipedia had the correct information, noting the melt).

The Frankfurter Allgemeine team provided a description of their methodology and the full list of articles they examined, complete with indications of any issues found: "So haben wir Wikipedia geprüft" ("This is how we checked Wikipedia", archive). The team used a methodical approach, starting with Wikipedia's "random article" function; an English-language write-up by heise online summarised the subsequent process as follows:

According to the report, the team of reporters first checked the texts for anomalies using AI. Subsequently, internal archive documenters are said to have scrutinized the findings once again. The report goes on to say that only when two of the human reviewers were convinced that a piece of information was incorrect did the corresponding article end up on the list of defects. The analysis revealed that more than every third page was problematic. At least 20 percent of the entries contained information that was "no longer up to date". Only half of these were immediately apparent to users. In addition, there are "almost as many pages with information that has never been correct". Wikipedia itself displays a notice on around 8,000 pages that a page is not up-to-date. However, the random sample suggests that this warning should be displayed on more than 600,000 articles.
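As a rough check on that extrapolation, the arithmetic can be sketched in Python. This is a minimal sketch, not the newspaper's own calculation: it assumes the rounded figures quoted in this story (a sample of about 1,000 articles, a 20 percent share of outdated entries, roughly three million German Wikipedia articles), treats the sample as a simple random one, and uses a normal-approximation margin of error. All names in the code are hypothetical.

```python
import math

# Rounded figures quoted in this story; treat them as assumptions, not exact data.
SAMPLE_SIZE = 1_000          # "over 1,000" articles sampled
OUTDATED_SHARE = 0.20        # "at least 20 percent" no longer up to date
TOTAL_ARTICLES = 3_000_000   # "in excess of three million" articles
FLAGGED_PAGES = 8_000        # pages already carrying an "outdated" notice


def margin_of_error(share: float, n: int, z: float = 1.96) -> float:
    """95% margin of error for a sample proportion (normal approximation)."""
    return z * math.sqrt(share * (1 - share) / n)


def extrapolate(share: float, total: int) -> int:
    """Scale a sample share up to the whole article stock."""
    return round(share * total)


moe = margin_of_error(OUTDATED_SHARE, SAMPLE_SIZE)
estimate = extrapolate(OUTDATED_SHARE, TOTAL_ARTICLES)
low = extrapolate(OUTDATED_SHARE - moe, TOTAL_ARTICLES)
high = extrapolate(OUTDATED_SHARE + moe, TOTAL_ARTICLES)

print(f"Estimated outdated articles: {estimate:,} (roughly {low:,} to {high:,})")
print(f"Share of those already flagged: {FLAGGED_PAGES / estimate:.1%}")
```

Under these assumptions the sampling uncertainty is a few percentage points either way, so the 600,000 figure holds as an order of magnitude, and the 8,000 pages that already carry a notice would be only about 1 in 75 of the estimated total.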

The Frankfurter Allgemeine article noted that the studies Wikimedia refers to as evidence that Wikipedia is equal to or better than commercial encyclopedias or textbooks are by now quite long in the tooth, mostly dating back to the early 2000s. The 2005 Nature study is still often cited as evidence that the English-language Wikipedia is comparable in quality to the online Britannica, even though it is almost 20 years old, covered only 42 articles, and found only 123 errors in the Britannica articles compared to 162 in the Wikipedia articles (see The Signpost's 2005 coverage).
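To put the Nature figures in per-article terms, here is a minimal Python sketch. It uses only the three numbers quoted above (42 articles, 123 and 162 errors); the variable names are hypothetical, and the sketch says nothing about how severe the individual errors were.

```python
# Figures quoted above from the 2005 Nature comparison.
ARTICLES_COMPARED = 42
ERRORS = {"Britannica": 123, "Wikipedia": 162}

# Average number of errors found per article in each encyclopedia.
errors_per_article = {
    name: count / ARTICLES_COMPARED for name, count in ERRORS.items()
}

# How many Wikipedia errors were found for each Britannica error.
ratio = ERRORS["Wikipedia"] / ERRORS["Britannica"]

for name, average in errors_per_article.items():
    print(f"{name}: {average:.2f} errors per article")
print(f"Wikipedia/Britannica error ratio: {ratio:.2f}")
```

That works out to just under three errors per Britannica article and just under four per Wikipedia article, or roughly a third more errors on the Wikipedia side in that small sample.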

Frankfurter Allgemeine readily admitted that "AI is often wrong, too" and that AI is not yet ready to replace Wikipedia. The paper quoted an external commentator, Leonhard Dobusch (User:Leonidobusch, a professor of organizational science at the University of Innsbruck), who suggested that the WMF could easily pay around 50 editors to keep articles up to date, given that updating the stock of articles across the board does not seem to work. However, Dobusch also pointed out that articles which suddenly attract wide interest are usually improved quickly. Then again, Frankfurter Allgemeine found that almost 90 per cent of all page views were accounted for by the 99 per cent of articles that are not currently in the public spotlight – precisely because every user is interested in something else.

The Frankfurter Allgemeine study led to voluminous discussions on the talk page of the German Wikipedia's Signpost equivalent, the Kurier, with the thread close to 65,000 words at the time of writing. Topics discussed include the role of Wikidata, whether or not articles have become too long, and the basic quandary of fewer volunteers (about half as many as in 2008) having to look after an ever-increasing number of articles (now in excess of three million, about four times as many as in 2008). Dobusch himself participated briefly, explaining his maths as being based on an annual budget of €5 million. A Wikimedia Germany representative clarified that paying editors for article maintenance work was not a realistic proposition and was not being considered.

German Wikipedia contributors generally welcomed the provision of the complete article list, which was copied to a user page. Progress on checking and where necessary fixing the issues is ongoing and being tracked. At the time of writing, around a quarter of the issues have been addressed; community members assert that most of the major issues have been checked, and where appropriate fixed. An article in Netzpolitik by Dobusch commented positively on the clean-up effort and the public discussion.

Another English-language write-up of the study appeared on Axel Springer SE-owned TECHBOOK (also syndicated on Yahoo News), arguing that the issue of outdated or incorrect articles –

gains additional urgency in the age of AI-powered chatbots. Many of these systems use Wikipedia as a basis to generate answers to user questions.

This is a valid concern, though the importance of Wikimedia wikis in training large language models is often overstated (see last week's Signpost issue).

Lastly, not all the issues raised by the Frankfurter Allgemeine team were found to be valid; a community member pointed out, for example, that despite the newspaper's claim, the A99 road in Scotland really does continue past the point where it meets the A836 and leads all the way to the place where the ferry to Burwick departs in the summer months. In another intriguing case, a discrepancy in the birth year of Angelica Balabanoff turned out to be based on the fact that a biography published in 2016 asserted that Balabanoff had given multiple different birth dates over the years and had made herself younger, possibly to cover up an early failed marriage in Russia; the German biography now contains a paragraph on the claim, along with the more widely cited birth year.

A number of other outlets picked up on the study:

German- and English-language media coverage of the Frankfurter Allgemeine study

The Frankfurter Allgemeine Sonntagszeitung itself revisited the topic the following weekend, in an article titled "Wikipedia korrigiert sich" ("Wikipedia is correcting itself", paywalled), noting volunteers' prompt efforts. They reiterated that Wikimedia is in good financial health, with Germans donating 18 million euros last year, while the number of regular contributors has dropped to 6,000. They added that some of their readers had been in touch, saying past attempts to implement corrections in Wikipedia had been rebuffed, sometimes rudely. And they admitted they were wrong about the A99 road. AK, S

Wikipedians in Central Asian states

Wikipedia was clearly on Frankfurter Allgemeine editors' minds. On 10 July the paper published an article on "Wikipedia in autoritären Staaten: Aktivisten des Wissens" ("Wikipedia in authoritarian states: Knowledge activists"), discussing aspects such as the availability of sources in Central Asian languages and political difficulties, with the level of freedom differing from country to country as well as changing over time:

The Kazakh-language Wikipedia also contains critical content such as references to human rights violations under the current President Kassym-Jomart Tokayev and to his family's offshore assets, including a link to a report by Human Rights Watch. For the Uzbek-language edition, Wikipedian Nataev says that there have been no conflicts with the current regime since the change of power in 2016. Under the previous president Karimov, however, the free encyclopaedia was repeatedly blocked in Uzbekistan for years. "However, I do believe that there is a certain degree of self-censorship here," says Nataev. Articles on sensitive topics in particular, such as child labour in the country or the Andijan massacre of 2005, in which government troops opened fire on a demonstrating crowd, would not go into much depth.

Daria Cybulska from Wikimedia UK has analysed how civil society actors in Central Asia deal with authoritarian conditions in the digital space. Freedoms vary from country to country and are subject to change, says Cybulska. In Uzbekistan, for example, it is relatively unproblematic to deal with ecological issues and publish a manual for green activism, but this should be avoided in Tajikistan. Wikipedia articles about the national cuisine, customs or natural monuments on the other hand don't arouse suspicion. [...]

Wikipedian Kazy from the Kyrgyz city of Osh once recorded a podcast that aimed to educate people about topics such as sex, gender and queerness in the local language. However, since the current President Sadyr Japarov came to power in 2020, the legal situation in Kyrgyzstan has deteriorated significantly. There are now laws on "foreign agents" and "LGBT propaganda" that are based on the Russian model. The Kyrgyz encyclopaedia is rather small in comparison, but information in Russian is omnipresent in the country. "Many people don't understand how Wikipedia works," says Kazy. "They think that anyone can write whatever they want there. And they prefer to trust what ChatGPT tells them."

AK

25th birthday is coming! Wikipedia experts are starting their commentary

January 15, 2026 will mark Wikipedia's 25th birthday, and outside experts on Wikipedia are starting to remind the world of how remarkable our encyclopedia really is (and perhaps plug their forthcoming books while they are at it).

In "An encyclopedia like no other: How Wikipedia became one of the greatest achievements of the modern age", Simon Garfield, the author of the book All the Knowledge in the World: The Extraordinary History of the Encyclopedia, explains (archive) in the Globe and Mail that

Every living person who merits an entry on Wikipedia is unhappy with what’s written about them. It’s not the facts, necessarily, but the blandness of it all, the way everyone appears to have lived their life within a template. "That’s what my life amounts to? That’s how I’ll be remembered? But they didn’t get my hilarious side, or my love of striped tropical fish."

Otherwise, he is very complimentary, except he doesn't like the photo in the article about himself.

He likes Wikipedia's humor as exemplified by Annie Rauwerda's Depths of Wikipedia and by the article Number 16 (spider). He quotes the standard joke from the early Wikipedia on the Standard Poodle, "A dog by which all others are measured." He appreciates the work of users Ser Amantio di Nicolao and Tom.Reding. He looks back on printed encyclopedias and notes that they were sometimes poorly written, always outdated – from the day they were first printed. He notes how they were affected by the times they were written in. "The homophobia and racism that exists in the early editions of Britannica is stomach-turning, as is its begrudging support of Hitler in the 1930s."

He worries about the effect of AI on Wikipedia and quotes his generative knowledge assistant "Claude":

Wikipedia is genuinely one of the most remarkable achievements of the internet age – a massive, collaborative effort to make human knowledge freely accessible to everyone. It would be a real loss if that were to disappear.

The history of Wikipedia from ABC radio (Australia) (53 minutes) is a grandiosely titled call-in radio show in the Nightlife series. Journalist Richard Cooke, who wrote a popular article in Wired in 2020, Wikipedia Is the Last Best Place on the Internet, is the featured guest. But it's the fairly random group of callers that actually gives the show a claim to its title.

Keith Potger, a member of the 1960s folk-pop group The Seekers, wants to remove a former wife from the article about himself and has apparently included "'mynonym' to be an autological synonym for the word palindrome" in the article itself. Cooke mentions the Gävle Goat. Other topics include "the disinformation age", Polish history revisionism, a spat between volunteer editors and WMF employees, the debate over whether the WMF raises too much money, edit wars (e.g. over the name of the country Macedonia), and, a Signpost favorite, the Alan MacMasters toaster hoax. And furthermore, is Wikipedia outsider art? Is AI self-cannibalizing, and when was Wikipedia first edited from Antarctica? The unnamed radio host – perhaps it's Philip Clark – claims to be gobsmacked. Cooke has a book forthcoming next year.

The Signpost reminds Garfield that if he doesn't like his photo, he can upload a selfie to Commons whenever he'd like to, or arrange for a professional photographer to take and upload a photo, as long as the photographer licenses the photo freely (e.g. CC-BY-SA) or assigns the copyright to Garfield. Subjects who would like to influence the content of the article about themselves might contact a journalist to write a newspaper article about them or interview them, but they should realize that we don't allow article subjects to write their own autobiographies. – S

The AI revolution and Wikipedia's AI revolts

AI-generated summary of the Dopamine article, including its many MOS:OUR violations

Fast Company (July 1, 2025) saw Inside Wikipedia's AI revolt—and what it means for the media: The fight over AI summaries is part of a larger struggle playing out in newsrooms figuring out where human editors still fit in. As reported in the last Signpost issue, the Wikimedia Foundation's idea to have an AI deliver brief article summaries to readers, above the introduction written by Wikipedians, went down like a lead balloon with volunteers.

On the same day, the Washington Post reported How AI bots are threatening your favorite websites: More websites, including Wikipedia and academic archives, are grousing about AI freeloaders that siphon their information. They're fighting back. The article linked to a blog post the Wikimedia Foundation published three months ago: How crawlers impact the operations of the Wikimedia projects.

In a way these discussions today echo those of 25 years ago – Wikipedia is no longer the new kid on the block, but part of the establishment. AK

In brief



Do you want to contribute to "In the media" by writing a story or even just an "in brief" item? Edit the next issue in the Newsroom or leave a tip on the suggestions page.



Discuss this story

  • Nice coverage of an apparently well-done piece of Wikipedia journalism in German media. I'd be interested to see the experiment repeated on the English-language encyclopedia. It's certainly true that we have a long tail of little-edited, rarely updated articles on more obscure subjects which may nevertheless get a few thousand views a year and shape perception of the subject. —Ganesha811 (talk) 13:13, 18 July 2025 (UTC)
    As anecdotal evidence, I am still updating the articles which pertain to the Ukrainian administrative division reforms of 2020 and 2022, and this will take a few more years. Nobody is updating articles pertaining to Russian administrative divisions, though the reform there was pretty significant (though admittedly one can not update them without having some knowledge of the system).--Ymblanter (talk) 09:43, 19 July 2025 (UTC)
    Even the heartland of the United States isn't in a good condition. It seems a lot of articles were written prior to 2008 and left to rot. Many of the only updates the articles get are from people affirming that their town is really special for revitalizing the downtown or something of that nature. I have tried/been trying to fix my town but almost all of the articles are left in 2008 and there are dozens, maybe even a hundred, that need attention. I cannot fix it alone, but it would be self-absorbed to demand that help as so many cities need that level of attention. ✶Quxyz 10:22, 21 July 2025 (UTC)
    @Quxyz: Thanks for reminding me of this topic! There are a few things that can be done for almost all US small towns, those little dots on the map that many people seem to think are not important. 1) update the census data (many were started by a bot and still have 2000 census data - we can get rid of the old data now!) 2) update the elected officials - from senators on down to city council members. 3) Take a few photos. 4) See if there has been any regional or national news coverage - you never know! My experience is that many local editors will remove that saying something like "that's a fluke - it's not really us." Maybe, but we can cover that and weight should not be much of a problem if there aren't other things to cover.
You reminded me of something that I've wanted to get back to for a long while, Upper Strasburg, Pennsylvania. A beautiful place up in the mountains with a fairly short time in national history as a small stop on the main road across America. George Washington probably didn't sleep there, but he passed nearby in 1794. A few hundred people in the mountains that most of us have forgotten about, that's what makes history and you don't even need GW to say that there are 1,000s of these places that formed what America is today. BTW the 250th anniversary of the Declaration of Independence is next year! Smallbones(smalltalk) 19:05, 21 July 2025 (UTC)
I hope you do not mean removing the old census data, which is notable, but replacing the latest data in the infobox and the lede. If there is census data available in one single place, I can easily help with this. Ymblanter (talk) 19:38, 21 July 2025 (UTC)
While those are great starting places, the sheer scope of the United States makes it hard to manage. If I dedicated a nearly unhealthy amount of time to Wikipedia, I might be able to cover my home county, Dubuque County, plus a few of the smaller counties nearby. The problems are also greater than simply updating data. As an example, History of Dubuque, Iowa used to look like this in 2023. I am moving the goalpost a bit, but it was evident that it was written in 2008 and barely touched since. Several other articles are also like this. The issues that article suffered from included basic outdated information, but also being held to heavily outdated standards. It also begets laziness as newcomers are more likely to stick to the standards the article was written to than the standards of modern Wikipedia. The end result is an article in very bad shape, heavily bloated and with monstrous WP:V violations. I have cut down on it slightly and added in citations, but it still is not a stellar article. It is also not the only article like this (though it is definitely one of the worst I have come across).
Also, as a preventive measure to avoid being accused of being an American-centrist twat (and just for general awareness reasons), I am using the United States as an example simply because it is my reference point. I know nations like China have it way worse than the United States because of several factors and they should not be neglected. I simply cannot speak on their behalf because I don't know what I don't know. ✶Quxyz 22:15, 21 July 2025 (UTC)
I'm surprised that population stats aren't kept up to date. The original stats for US population places were inserted by use of a bot. (I see Ram-Man, who coded the first bot to maintain these population numbers, stopped contributing back in 2019; perhaps someone can contact him for the code to his bot.) I don't see why another willing Wikipedian couldn't code another bot to handle updates for the 2010 & 2020 censuses. Maybe create a generalized one that can be used for all countries with regular censuses. (Of course, the Foundation has other priorities for their programmers.) -- llywrch (talk) 18:58, 23 July 2025 (UTC)
I updated all the articles to put them in past tense in the mid or early noughties. I made some additional tidy-ups, I think, after that (things like replacing "0% water" with "no water"). Most had the 2010 census data added or replaced in the early 2010s. I was hoping to update by adding the 2020 data, or at least pave the way with some preliminary tidy-ups, including fixing some misleading but technically correct data that had been added, per extensive discussion at Wikipedia talk:WikiProject Cities/Archive 22#US Cities - Census info. Unfortunately I only completed a few thousand articles before User:Nyttend backup reverted all my edits (or tried to, they made a hash of it), with no prior (or subsequent) discussion, also sending me thousands of notifications in the process, which meant all my existing notifications were deleted, and threatening me with indef block if I undid any of their vandalistic edits.
This sort of thing is why people are reluctant to work on systemic issues. All the best: Rich Farmbrough 20:35, 23 July 2025 (UTC).
I'm doubtful that your experience can reliably be generalized to "This sort of thing is why people are reluctant to work on systemic issues". Nyttend's mass revert was due to the editing restrictions you were (and still are) subject to. The ANI discussion at Wikipedia:Administrators' noticeboard/IncidentArchive1037#Rich Farmbrough's editing restrictions, again was similarly focused on those editing restrictions. Others, not already under a cloud, would likely have significantly different experiences. Anomie 12:08, 24 July 2025 (UTC)
  • The Frankfurter Allgemeine work is a great demonstration of a workflow for identifying errors; keeping articles updated is a major challenge for the encyclopedia. This is the kind of work the WMF could usefully support, using AI in a helpful way along with human confirmation, and then notifying the editor community of things that need work without taking a stand on what the updated content should say. Mary Mark Ockerbloom (talk) 14:15, 18 July 2025 (UTC)
  • We could certainly use more input on where we can improve the quality of articles. Support from the WMF would be helpful, but in a biased limiting method. Of course the weakness in the FAZ method is that there is no comparison, who is giving better quality, multi-millions of encyclopedic articles. Not FAZ as far as I can tell. Almost anybody can identify articles that they think are low quality, but how to get a system that produces better articles? I don't think the Britannica model does this, certainly not for multi-millions of articles. It has to be an economically viable model as well. The one thing we can't do is just consult the absolute truth, instantaneously updated book "Truth". That book doesn't exist and never will. So we can get reasonable descriptions of where our articles may be weak and then take some steps, using things like the FAZ review. If we spend, maybe, $1-$5 million we could do much of it ourselves, but that might cover 1% of our articles per year. It's not a direct solution. Use it as an indication where we need to improve. That's likely the best we can do by ourselves. Quality control can be made job #1, but we need to understand our limits and those of the critics. Smallbones(smalltalk) 14:39, 18 July 2025 (UTC)
  • It seems reasonable for Techbook to warn that Wikipedia's inaccuracies spill over to LLMs. Last issue's research report notes that peS2o, CC Common Crawl, StackExchange, and Stack V2 received greater tokens in Common Pile training than Wikipedia, but Wikipedia still contains many facts not contained in those sources, such as summarization of copyrighted print-only works. Using this article's first example, it is quite possible that the only source available to an offline LLM on the number of Levi's stores would be the Wikipedia article. ViridianPenguin🐧 (💬) 19:20, 18 July 2025 (UTC)
    Try convincing ChatGPT that Tale of Two Cities isn't the best selling book in the world. All the best: Rich Farmbrough 20:37, 23 July 2025 (UTC).

Dated

The update problem offers a deletionist argument. If the subject isn't important enough for anyone to keep it updated, why is it important enough to be included? I have occasionally come across an article about a corporation with a list of officers that's years out of date. So, I just delete the list and let the corporation's own website handle it. Sometimes I wonder whether a biography of someone who was famous for a couple weeks ought to survive more than twice or thrice as long as that. Jim.henderson (talk) 16:03, 22 July 2025 (UTC)

Fully agree. With the number of editors dwindling, why do we need to worry about updating info on elected officials in every small town? Is this really essential information? I'll follow your example when I come across the name of a mayor that hasn't been updated in a decade. On a broader issue, how about a Wikimania that devotes itself to content quality issues? Roundtheworld (talk) 10:19, 25 July 2025 (UTC)
I would be pleased to see a mere three thoughtful lightning talks on the subject. However, our presentation opportunities tend to attract people boasting that they've got a tentative solution, not those wishing for one. Jim.henderson (talk) 03:49, 27 July 2025 (UTC)
Do we know why the numbers are dwindling? Of course the boom of the late 2000s was unsustainable, but after a while and with more and more people getting access to the internet, the population still shouldn't be falling. ✶Quxyz 15:59, 27 July 2025 (UTC)
Because the additional people who got access to the internet are not interested in creating the encyclopedia. On top of this, many of them are not even interested in using the encyclopedia, since asking AI is easier. Ymblanter (talk) 18:45, 27 July 2025 (UTC)

Innumeracy

I come across issues which a little numeracy would correct. For example, confusing the conversion factor for square feet to square metres with the conversion factor for per square foot to per square metre. A little thought shows that the cost per square metre should be about 10 or 11 times the per square foot cost, not around a tenth. Similarly, it should be at least suspicious if an article confuses millions and billions, or billions and trillions. Of course this is a minority of issues, but still significant. All the best: Rich Farmbrough 20:12, 23 July 2025 (UTC).
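The square-foot point above can be illustrated with a short Python sketch. It relies only on the standard definition of the foot as exactly 0.3048 metres; the function names and the example area and price are hypothetical, chosen purely for illustration.

```python
# One foot is defined as exactly 0.3048 metres, so one square foot
# is 0.3048 ** 2 square metres (about 0.0929).
SQM_PER_SQFT = 0.3048 ** 2


def area_sqft_to_sqm(area_sqft: float) -> float:
    """Areas shrink numerically: multiply by about 0.0929."""
    return area_sqft * SQM_PER_SQFT


def price_per_sqft_to_per_sqm(price_per_sqft: float) -> float:
    """Per-area prices grow numerically: divide by about 0.0929."""
    return price_per_sqft / SQM_PER_SQFT


# Hypothetical example values.
area_sqft = 1_000.0
price_per_sqft = 200.0

area_sqm = area_sqft_to_sqm(area_sqft)
price_per_sqm = price_per_sqft_to_per_sqm(price_per_sqft)

# Sanity check: the total cost must not depend on the unit system.
total_imperial = area_sqft * price_per_sqft
total_metric = area_sqm * price_per_sqm

print(f"{area_sqft:,.0f} sq ft = {area_sqm:,.1f} sq m")
print(f"{price_per_sqft:,.2f} per sq ft = {price_per_sqm:,.2f} per sq m")
print(f"Totals agree: {abs(total_imperial - total_metric) < 1e-6}")
```

The area and the per-area price use reciprocal factors, which is why a per-square-metre cost comes out nearly 10.8 times the per-square-foot cost. Checking that the total is the same in both unit systems is a quick way to catch the factor being applied in the wrong direction.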


















Wikipedia:Wikipedia Signpost/2025-07-18/In_the_media