Most Wikipedia editors work within a single language. The knowledge that exists in French but not in Spanish, or in German but not in Portuguese, is largely invisible to them — and to the hundreds of millions of readers who only access Wikipedia in those languages. Translating well-sourced articles across language editions can often have higher marginal impact than writing new content from scratch. The sourcing is already done, the structure is there, and the gaps are known. That was the idea behind OKA, the Open Knowledge Association.
Three years, 80 translators, and 10,000 published articles later, the experiment has attracted both interest and serious scrutiny. This article reflects on what we've learned, what we've had to improve, and what this experiment suggests about a broader question for the Wikimedia movement: what role should Wikipedia play in an information ecosystem increasingly shaped by AI?
The workflow used by OKA translators looks like this: a machine-generated first draft, a sentence-by-sentence human review of that draft, verification of claims against the cited sources, and adaptation of the article to the target Wikipedia's conventions before publication.
Translation on Wikipedia is rarely mechanical. Articles often require adaptation to different citation practices, templates, and sourcing expectations across language communities. In practice, the most time-consuming step remains human verification.
OKA compensates translators with hourly pay, rather than per article. There are no quotas and no bonuses tied to volume. The intention is to support careful work rather than reward speed.
Translators operate as independent editors. OKA provides guidance and funding, but editorial decisions remain theirs. Participants disclose their paid status on their user pages in accordance with Wikimedia’s paid editing policies.
Early in the project we relied mostly on traditional machine translation tools. When newer language models became available, we found that they often produced clearer first drafts and handled complex sentence structures better. This changed where translators spent their time. Instead of rewriting awkward machine output, they could focus more on verification: checking claims against sources, ensuring citations were correct, and adapting the article to local context.
In late 2025 we conducted a structured evaluation of AI-assisted translation across 119 articles and 10 language pairs, supported by Wikimedia CH. The study analyzed 1,068 hours of translation work and tracked all corrections made between AI drafts and final published articles.
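To make the method concrete, here is a minimal sketch, assuming plain-text draft and final versions, of how one could quantify how much of an AI draft survives human review. It is an illustration using Python's standard difflib, not our study's actual pipeline, and the naive sentence splitter is a placeholder that real multilingual tooling would replace.

```python
# Illustrative sketch only; our study's actual pipeline is not shown here.
# Idea: given an AI draft and the final published text, estimate what
# fraction of draft sentences were changed or removed during human review.
import difflib
import re

def split_sentences(text: str) -> list[str]:
    # Naive splitter for the example; real multilingual tooling would
    # need per-language sentence segmentation.
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]

def correction_rate(draft: str, final: str) -> float:
    """Fraction of draft sentences that did not survive review unchanged."""
    draft_sents = split_sentences(draft)
    final_sents = split_sentences(final)
    matcher = difflib.SequenceMatcher(a=draft_sents, b=final_sents)
    unchanged = sum(block.size for block in matcher.get_matching_blocks())
    return 1.0 - unchanged / max(len(draft_sents), 1)

draft = "The bridge opened in 1921. It spans the river. It is 300 m long."
final = "The bridge opened in 1921. It spans the Rhone. It is 300 m long."
print(f"{correction_rate(draft, final):.0%} of draft sentences changed")
```

Aggregated over many articles, a measure like this makes it possible to compare models and language pairs on how much human correction their drafts require.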
Several patterns emerged from that analysis, and two observations stand out. First, publishing automated output without review would clearly be unacceptable: human verification is essential. Second, the translation process often improves the original. Translators regularly identify unclear wording, outdated phrasing, or ambiguous citations while working through the text. In that sense, translation sometimes acts as a form of maintenance for the encyclopedia.
This analysis also showed substantial variation between models, which informed subsequent adjustments to the tools OKA advised translators to use.
Any organized editing effort at scale inevitably attracts attention. The community discussion that preceded a recent article in 404 Media was at times more heated than productive. But several concrete findings were real: specific articles with fabricated citations, formatting breakages, instances where translators clearly hadn't read their own output. Those findings convinced me to invest more in verification, even where it costs efficiency and translator autonomy.
The most serious case involved a translator who went beyond straightforward translation and added content not present in the source — including a citation that pointed to a source page that had nothing to do with the subject. This wasn't a mistranslation. It was an editorial decision that went wrong, and the AI draft made it easier for a bad decision to slip through unnoticed. The risk, I came to understand more clearly, isn't primarily in translation itself. It's in the moments where translators act as editors: adding content, filling gaps, inferring.
Even rare mistakes matter for an encyclopedia built on trust.
It is worth being clear about proportion. Well under 1% of the 10,000 articles we have published were ever flagged for issues of the kind described above — and in most of those cases, the problems affected only a small portion of the article itself. If the quality problems were systematic, we would have seen deletions at scale or waves of editor suspensions. Neither happened. The community discussion identified real issues that needed addressing — but it was a discussion about raising standards, not evidence of a project producing broadly unreliable content. I think it's important to say that plainly, not to deflect criticism, but because the proportion matters when evaluating whether AI-assisted translation can work at all.
What also struck me about the discussion — despite its heat — is that it demonstrated something important: Wikipedia's governance functioned. Editors identified problems, discussed them publicly, and implemented restrictions through a transparent process. I would prefer the community to be more open to AI-assisted experimentation rather than defaulting toward restriction, and the process was messy and at times hostile. But it produced a real outcome through a legitimate mechanism, and that is more than exists anywhere else when AI systems consume this same content without any community having a say.
Following that discussion, several changes were introduced to strengthen verification. These measures complement the existing requirement that translators manually review every sentence. No single safeguard is perfect. The goal is to layer multiple checks.
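As an illustration of what one automatable layer might look like, the sketch below flags citations whose linked page never mentions the article's subject, the failure mode in the fabricated-citation case described earlier. The function and its logic are hypothetical, not our actual safeguards, and such a check can only queue work for a human reviewer, never clear it.

```python
# Hypothetical example of one automatable layer, prompted by the
# fabricated-citation case above: flag citations whose linked page never
# mentions the article's subject. Not our production tooling, and no
# substitute for a reviewer actually reading the source.
import re
import urllib.request

def page_mentions_subject(url: str, subject_terms: list[str]) -> bool:
    """Crude check: does the cited page contain any subject term at all?"""
    with urllib.request.urlopen(url, timeout=10) as resp:
        html = resp.read().decode("utf-8", errors="replace")
    text = re.sub(r"<[^>]+>", " ", html).lower()  # strip tags, roughly
    return any(term.lower() in text for term in subject_terms)

# A citation that fails this check is not proven fabricated; it is
# queued for a human to read the source and decide.
```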
Some editors have raised concerns about incentive distortion. Introducing funding can change behavior.
For that reason, compensation was designed to be hourly rather than per article. There are no quantity targets, and monitoring output serves primarily to detect anomalies, not to reward productivity.
Interestingly, unusually high output has sometimes served as an early signal that closer review was needed. That reinforced the importance of tracking quality indicators rather than celebrating volume.
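For readers who want the idea in concrete terms, a minimal sketch of such an anomaly signal follows. The contributor names and the two-standard-deviation threshold are invented for the example, and this is not our monitoring system; the point is only that volume statistics can route attention toward review rather than toward reward.

```python
# Illustrative sketch, not our monitoring system: treat unusually high
# weekly output as a trigger for closer quality review, not as a reward.
from statistics import mean, stdev

def flag_for_review(weekly_counts: dict[str, int], z: float = 2.0) -> list[str]:
    counts = list(weekly_counts.values())
    if len(counts) < 2:
        return []
    mu, sigma = mean(counts), stdev(counts)
    if sigma == 0:
        return []
    return [who for who, n in weekly_counts.items() if (n - mu) / sigma > z]

weekly = {"t01": 4, "t02": 5, "t03": 3, "t04": 6, "t05": 5,
          "t06": 4, "t07": 5, "t08": 6, "t09": 4, "t10": 21}
print(flag_for_review(weekly))  # ['t10']: schedule a closer review
```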
Another recurring question concerns where translators come from.
Roughly half of the articles produced so far have been published in the Spanish and Portuguese Wikipedias, where large communities still face significant coverage gaps in foundational topics. Many of our translators are multilingual editors who move fluidly between language communities — a translator based in Latin America might translate a French article into English. The fact that OKA funding goes further in some regions than others is a real part of the model: the same budget supports more contributors, which means more coverage. We think that's worth saying plainly rather than obscuring.
Many translators participate part-time alongside other activities. Some combine translation work with university studies. Others do it as a complement to other professional work that may pay better but is less personally engaging. The flexible structure allows contributors to participate at different stages of their careers.
Participation can also create pathways into the broader Wikimedia movement. Translators are encouraged to engage with local Wikimedia communities, and in some cases OKA provides financial support to help participants attend regional Wikimedia events.
In several cases, translators have later moved on to better-paid professional opportunities after gaining experience with translation, sourcing, and collaborative editing through the program. We see that as a positive outcome and encourage it.
Seen from this perspective, the program is not simply producing translated articles. It can also function as an entry point for new contributors to learn Wikipedia’s editorial practices and become involved in the wider Wikimedia ecosystem.
The roundtable report published last year by Wikimedia CH and Open Future — a discussion in which I participated alongside around twenty Wikimedians, AI researchers, and data governance experts — framed a question I keep returning to: should Wikipedia try to remain a destination for human readers, or adapt to becoming ground truth for AI systems?
The data behind this question is already real. Wikipedia has seen an 8% decrease in human traffic alongside 50% growth in overall traffic attributed to bots. AI tools increasingly access Wikipedia in real time as a live reference rather than directing users to visit it. The concern isn't hypothetical. Wikipedia could become what the roundtable called "highly used but politically weak infrastructure" — indispensable to AI systems but invisible to human users, underfunded, and increasingly unable to defend the public interest.
My view is that this shouldn't be a forced choice — and the reason is not sentimental. Wikipedia cannot prevent AI systems from using its content, nor should it try: that reuse is a feature, not a bug, of the open license. But Wikipedia's value to those AI systems depends entirely on the content remaining human-curated, sourced, and verifiable. The moment the human editorial community degrades — because editors stop coming, because the site feels irrelevant, because there is no strong base of content to improve on — Wikipedia loses its value in both directions simultaneously. You cannot have the ground truth layer without the living community that produces and maintains it.
This is also why I think the translated articles OKA produces matter beyond their direct value to readers. A well-translated, well-sourced article in Spanish or Portuguese is a foundation. It attracts human editors who can improve it, correct it, and extend it — editors who might not have started from scratch but will engage seriously with something that already exists. Maintenance work — fixing citations, updating outdated content, improving structure — is genuinely valuable and often unpopular among volunteer editors. Paid programs can help absorb some of that load, not as a replacement for volunteer editing but as infrastructure that makes volunteer editing more productive.
What does follow from this, I think, is that Wikimedia communities need to engage actively with how AI tools are used within Wikipedia — rather than either avoiding them entirely or watching external systems deploy them without accountability. The roundtable noted that Wikipedia was itself a force of disruptive innovation when it emerged. Unlike institutions that had to navigate the shift from analogue to digital, the Wikimedia movement has never faced a moment where its fundamental ways of working were seriously challenged. That history may make it harder to recognize the urgency now.
Tools can improve the quality of a first draft; they can't decide what belongs in an encyclopedia. OKA is one small experiment in using these tools under community governance, with transparency about what works and what doesn't. Some approaches have worked. Others haven't. What I've become convinced of is that these experiments must happen openly and remain subject to community oversight — not because that is comfortable, but because that is what keeps Wikipedia's value intact on both sides of the equation.