Most Wikipedia editors work within a single language. The knowledge that exists in French but not in Spanish, or in German but not in Portuguese, is largely invisible to them — and to the hundreds of millions of readers who only access Wikipedia in those languages. Translating well-sourced articles across language editions can often have higher marginal impact than writing new content from scratch. The sourcing is already done, the structure is there, and the gaps are known. That was the idea behind OKA, the Open Knowledge Association.
Three years, 80 translators, and 10,000 published articles later, the experiment has attracted both interest and serious scrutiny. This article reflects on what we've learned, what we've had to improve, and what this experiment suggests about a broader question for the Wikimedia movement: what role should Wikipedia play in an information ecosystem increasingly shaped by AI?
The workflow used by OKA translators looks like this: a machine-generated first draft, a sentence-by-sentence human review of that draft, verification of claims against the cited sources, and adaptation of the article to the target Wikipedia's conventions before publication.
Translation on Wikipedia is rarely mechanical. Articles often require adaptation to different citation practices, templates, and sourcing expectations across language communities. In practice, the most time-consuming step remains human verification.
OKA compensates translators with hourly pay, rather than per article. There are no quotas and no bonuses tied to volume. The intention is to support careful work rather than reward speed.
Translators operate as independent editors. OKA provides guidance and funding, but editorial decisions remain theirs. Participants disclose their paid status on their user pages in accordance with Wikimedia’s paid editing policies.
Early in the project we relied mostly on traditional machine translation tools. When newer language models became available, we found that they often produced clearer first drafts and handled complex sentence structures better. This changed where translators spent their time. Instead of rewriting awkward machine output, they could focus more on verification: checking claims against sources, ensuring citations were correct, and adapting the article to local context.
In late 2025 we conducted a structured evaluation of AI-assisted translation across 119 articles and 10 language pairs, supported by Wikimedia CH. The study analyzed 1,068 hours of translation work and tracked all corrections made between AI drafts and final published articles.
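To make the method concrete, here is a minimal sketch, assuming plain-text draft and final versions, of how one could quantify how much of an AI draft survives human review. It is an illustration using Python's standard difflib, not our study's actual pipeline, and the naive sentence splitter is a placeholder that real multilingual tooling would replace.

```python
# Illustrative sketch only; our study's actual pipeline is not shown here.
# Idea: given an AI draft and the final published text, estimate what
# fraction of draft sentences were changed or removed during human review.
import difflib
import re

def split_sentences(text: str) -> list[str]:
    # Naive splitter for the example; real multilingual tooling would
    # need per-language sentence segmentation.
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]

def correction_rate(draft: str, final: str) -> float:
    """Fraction of draft sentences that did not survive review unchanged."""
    draft_sents = split_sentences(draft)
    final_sents = split_sentences(final)
    matcher = difflib.SequenceMatcher(a=draft_sents, b=final_sents)
    unchanged = sum(block.size for block in matcher.get_matching_blocks())
    return 1.0 - unchanged / max(len(draft_sents), 1)

draft = "The bridge opened in 1921. It spans the river. It is 300 m long."
final = "The bridge opened in 1921. It spans the Rhone. It is 300 m long."
print(f"{correction_rate(draft, final):.0%} of draft sentences changed")
```

Aggregated over many articles, a measure like this makes it possible to compare models and language pairs on how much human correction their drafts require.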
Several patterns emerged from that analysis, and two observations stand out. First, publishing automated output without review would clearly be unacceptable: human verification is essential. Second, the translation process often improves the original. Translators regularly identify unclear wording, outdated phrasing, or ambiguous citations while working through the text. In that sense, translation sometimes acts as a form of maintenance for the encyclopedia.
This analysis also showed substantial variation between models, which informed subsequent adjustments to the tools OKA advised translators to use.
Any organized editing effort at scale inevitably attracts attention. The community discussion that preceded a recent article in 404 Media was at times more heated than productive. But several concrete findings were real: specific articles with fabricated citations, formatting breakages, instances where translators clearly hadn't read their own output. Those findings convinced me to invest more in verification, even where it costs efficiency and translator autonomy.
The most serious case involved a translator who went beyond straightforward translation and added content not present in the source — including a citation that pointed to a source page that had nothing to do with the subject. This wasn't a mistranslation. It was an editorial decision that went wrong, and the AI draft made it easier for a bad decision to slip through unnoticed. The risk, I came to understand more clearly, isn't primarily in translation itself. It's in the moments where translators act as editors: adding content, filling gaps, inferring.
Even rare mistakes matter for an encyclopedia built on trust.
It is worth being clear about proportion. Well under 1% of the 10,000 articles we have published were ever flagged for issues of the kind described above — and in most of those cases, the problems affected only a small portion of the article itself. If the quality problems were systematic, we would have seen deletions at scale or waves of editor suspensions. Neither happened. The community discussion identified real issues that needed addressing — but it was a discussion about raising standards, not evidence of a project producing broadly unreliable content. I think it's important to say that plainly, not to deflect criticism, but because the proportion matters when evaluating whether AI-assisted translation can work at all.
What also struck me about the discussion — despite its heat — is that it demonstrated something important: Wikipedia's governance functioned. Editors identified problems, discussed them publicly, and implemented restrictions through a transparent process. I would prefer the community to be more open to AI-assisted experimentation rather than defaulting toward restriction, and the process was messy and at times hostile. But it produced a real outcome through a legitimate mechanism, and that is more than exists anywhere else when AI systems consume this same content without any community having a say.
Following that discussion, several changes were introduced to strengthen verification. These measures complement the existing requirement that translators manually review every sentence. No single safeguard is perfect. The goal is to layer multiple checks.
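As an illustration of what one automatable layer might look like, the sketch below flags citations whose linked page never mentions the article's subject, the failure mode in the fabricated-citation case described earlier. The function and its logic are hypothetical, not our actual safeguards, and such a check can only queue work for a human reviewer, never clear it.

```python
# Hypothetical example of one automatable layer, prompted by the
# fabricated-citation case above: flag citations whose linked page never
# mentions the article's subject. Not our production tooling, and no
# substitute for a reviewer actually reading the source.
import re
import urllib.request

def page_mentions_subject(url: str, subject_terms: list[str]) -> bool:
    """Crude check: does the cited page contain any subject term at all?"""
    with urllib.request.urlopen(url, timeout=10) as resp:
        html = resp.read().decode("utf-8", errors="replace")
    text = re.sub(r"<[^>]+>", " ", html).lower()  # strip tags, roughly
    return any(term.lower() in text for term in subject_terms)

# A citation that fails this check is not proven fabricated; it is
# queued for a human to read the source and decide.
```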
Some editors have raised concerns about incentive distortion. Introducing funding can change behavior.
For that reason, compensation was designed to be hourly rather than per article. There are no quantity targets, and monitoring output serves primarily to detect anomalies, not to reward productivity.
Interestingly, unusually high output has sometimes served as an early signal that closer review was needed. That reinforced the importance of tracking quality indicators rather than celebrating volume.
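For readers who want the idea in concrete terms, a minimal sketch of such an anomaly signal follows. The contributor names and the two-standard-deviation threshold are invented for the example, and this is not our monitoring system; the point is only that volume statistics can route attention toward review rather than toward reward.

```python
# Illustrative sketch, not our monitoring system: treat unusually high
# weekly output as a trigger for closer quality review, not as a reward.
from statistics import mean, stdev

def flag_for_review(weekly_counts: dict[str, int], z: float = 2.0) -> list[str]:
    counts = list(weekly_counts.values())
    if len(counts) < 2:
        return []
    mu, sigma = mean(counts), stdev(counts)
    if sigma == 0:
        return []
    return [who for who, n in weekly_counts.items() if (n - mu) / sigma > z]

weekly = {"t01": 4, "t02": 5, "t03": 3, "t04": 6, "t05": 5,
          "t06": 4, "t07": 5, "t08": 6, "t09": 4, "t10": 21}
print(flag_for_review(weekly))  # ['t10']: schedule a closer review
```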
Another recurring question concerns where translators come from.
Roughly half of the articles produced so far have been published in the Spanish and Portuguese Wikipedias, where large communities still face significant coverage gaps in foundational topics. Many of our translators are multilingual editors who move fluidly between language communities — a translator based in Latin America might translate a French article into English. The fact that OKA funding goes further in some regions than others is a real part of the model: the same budget supports more contributors, which means more coverage. We think that's worth saying plainly rather than obscuring.
Many translators participate part-time alongside other activities. Some combine translation work with university studies. Others do it as a complement to other professional work that may pay better but is less personally engaging. The flexible structure allows contributors to participate at different stages of their careers.
Participation can also create pathways into the broader Wikimedia movement. Translators are encouraged to engage with local Wikimedia communities, and in some cases OKA provides financial support to help participants attend regional Wikimedia events.
In several cases, translators have later moved on to better-paid professional opportunities after gaining experience with translation, sourcing, and collaborative editing through the program. We see that as a positive outcome and encourage it.
Seen from this perspective, the program is not simply producing translated articles. It can also function as an entry point for new contributors to learn Wikipedia’s editorial practices and become involved in the wider Wikimedia ecosystem.
The roundtable report published last year by Wikimedia CH and Open Future — a discussion in which I participated alongside around twenty Wikimedians, AI researchers, and data governance experts — framed a question I keep returning to: should Wikipedia try to remain a destination for human readers, or adapt to becoming ground truth for AI systems?
The data behind this question is already real. Wikipedia has seen an 8% decrease in human traffic alongside 50% growth in overall traffic attributed to bots. AI tools increasingly access Wikipedia in real time as a live reference rather than directing users to visit it. The concern isn't hypothetical. Wikipedia could become what the roundtable called "highly used but politically weak infrastructure" — indispensable to AI systems but invisible to human users, underfunded, and increasingly unable to defend the public interest.
My view is that this shouldn't be a forced choice — and the reason is not sentimental. Wikipedia cannot prevent AI systems from using its content, nor should it try: that reuse is a feature, not a bug, of the open license. But Wikipedia's value to those AI systems depends entirely on the content remaining human-curated, sourced, and verifiable. The moment the human editorial community degrades — because editors stop coming, because the site feels irrelevant, because there is no strong base of content to improve on — Wikipedia loses its value in both directions simultaneously. You cannot have the ground truth layer without the living community that produces and maintains it.
This is also why I think the translated articles OKA produces matter beyond their direct value to readers. A well-translated, well-sourced article in Spanish or Portuguese is a foundation. It attracts human editors who can improve it, correct it, and extend it — editors who might not have started from scratch but will engage seriously with something that already exists. Maintenance work — fixing citations, updating outdated content, improving structure — is genuinely valuable and often unpopular among volunteer editors. Paid programs can help absorb some of that load, not as a replacement for volunteer editing but as infrastructure that makes volunteer editing more productive.
What does follow from this, I think, is that Wikimedia communities need to engage actively with how AI tools are used within Wikipedia — rather than either avoiding them entirely or watching external systems deploy them without accountability. The roundtable noted that Wikipedia was itself a force of disruptive innovation when it emerged. Unlike institutions that had to navigate the shift from analogue to digital, the Wikimedia movement has never faced a moment where its fundamental ways of working were seriously challenged. That history may make it harder to recognize the urgency now.
Tools can improve the quality of a first draft; they can't decide what belongs in an encyclopedia. OKA is one small experiment in using these tools under community governance, with transparency about what works and what doesn't. Some approaches have worked. Others haven't. What I've become convinced of is that these experiments must happen openly and remain subject to community oversight — not because that is comfortable, but because that is what keeps Wikipedia's value intact on both sides of the equation.