The Signpost

Disinformation report

Croatian Wikipedia: capture and release

A decade-old case of "project capture" of the Croatian Wikipedia (Hr.WP) by nationalist administrators may have been resolved with the help of a report published this month on Meta by the Wikimedia Foundation titled "The Case of the Croatian Wikipedia". The report was authored anonymously, presumably to avoid harassment, and is an independent view of an expert on the subject matter.

The admins, led by Kubura, inserted disinformation and used sockpuppeting and other abusive tactics, according to a separate RfC which globally banned him last November. Blablubbs, who participated in the RfC, said that Kubura had an "army of socks". Blablubbs decided to help at the RfC "partially because of the whitewashing ... and partially because of draconian crackdowns on dissent inside the project".

The report linked the admins to Croatian nationalist positions through their downplaying of the UN war crime convictions of Croatians who fought in the 1991–1999 Yugoslav Wars, their use of biased, unreliable sources, and their support of the World War II era Nazi-puppet state, the Independent State of Croatia (NDH), as well as the military group, the Ustaše, which the report calls "terrorist".

The report echoes earlier accounts of administrator abuse, including a 2019 article in The Signpost, "The curious case of the Croatian Wikipedia", Croatian and international news stories going back to 2010, and complaints by Wikipedia editors starting about 2007. The report concludes that "Hr.WP had been dominated by ideologically driven users who are misaligned with Wikipedia’s five pillars, confirming concerns about the project’s integrity from the global community."

Articles are being re-written and disaffected editors are rejoining the project. The report notes this progress but warns that the transformation is not complete and that the banned admins may use new accounts to try to recapture the project.

The report also observes that this case highlights a "significant weakness in the global Wikimedia community and – by extension – Wikimedia Foundation platform governance."

The report

The WMF began its planning for the report in November 2020 as the RfC on banning Kubura was in progress, but the author's investigation started in February 2021. He is an external expert on the subject matter and provides three recommendations to the WMF. Jan Eissfeldt, Global Head of Trust and Safety at the WMF, told The Signpost that the author is a native speaker of Serbo-Croatian with "decades of relevant international experience analysing ... patterns of organised disinformation." The report states "opinions expressed in this report are those of the author and do not necessarily reflect the official policy or position of the Wikimedia Foundation."

The Croatian, Serbian, and Bosnian Wikipedias are unusual in that they all separated starting in 2003 from the Serbo-Croatian Wikipedia, which continued to exist. All these languages are mutually intelligible variants of Serbo-Croatian, which is termed "pluricentric".

The report states that

this structure enabled local language communities to sort by points of view on each project, often falling along political party lines in the respective regions. The report asserts, furthermore, it deprived the newly-created communities of editorial diversity that normally guides and underpins the traditionally successful process of editorial consensus in other pluricentric language projects.
— Report, p.2

The limited number and diversity of editors on the new Croatian Wikipedia allowed it to become politicized, and allowed Kubura, his sockpuppets, and followers to capture the administrative structure of the project.

Evidence of this capture is shown in the report's section on "Key findings and case studies" (pp. 15–35), which makes up one-third of the report. It includes subsections on:

  • "Measuring disinformation," which uses a sample of articles on 32 people convicted of war crimes committed during the Yugoslav wars and compares how these crimes were covered in 8 Wikipedia language versions. Though the sample size is low, Hr.WP clearly covered Croatian war criminals differently than the other language versions.
  • "Illustrative examples," or case studies from the above subsection
  • "Propaganda," examining a dozen specific articles
  • "Questionable sources," which examines eleven sources used in the project and the frequency of their use, classifying them as self-published sites, those without any declared editorial standards, openly extremist sites, etc.

This section is the core of the analysis and may set the standard if any future reports of this type are needed.

Based on these findings the report makes three recommendations to the WMF and the Serbo-Croatian communities:

  1. that the Croatian community seek to "continue re-establishing a robust local governance system, requesting oversight and support from the rest of the Wikimedia movement as needed;"
  2. that they seek to unify the selection of admins and functionaries with other Serbo-Croatian communities; and
  3. that they explore a full reunification into the original Serbo-Croatian language project.

Adding some urgency to these recommendations, the report warns that, as currently constituted, Hr.WP is at risk of being recaptured by nationalist editors and admins.

An additional observation – strengthen global governance

Following the recommendations, the report's author makes a statement that goes beyond the Serbo-Croatian community and the Hr.WP disinformation problem.

The evident failure of the Meta RfC system to resolve the structural misalignment of Croatian language Wikipedia and the lack of an adequate alternative pathway to resolution, points at the significant weakness in the global Wikimedia community and – by extension – Wikimedia Foundation platform governance. This is a problem for public and regulatory confidence in the self-governance model provided for within the framework of the Foundation’s policies ... While devising possible pathways to address this identified bigger challenge is beyond the scope of this disinformation assessment, it strikes the author of this evaluation as increasingly important to resolve in the light of heightened regulatory scrutiny of user-generated platform models, including Wikipedia ...
— Report, p.14

The Signpost asked Jan Eissfeldt of the WMF for his reaction to the report's observation. He recognized that disinformation is a growing problem, and emphasized that the WMF would work with the communities as openly as possible. For cases where safety is a potential problem, they might need to work through stewards or other trusted users. "We are investing in our movement's capacity to identify and respond to all kinds of influence operations, including those led by government actors, to ensure the accuracy of the information shared on Wikimedia projects. An example of this was the taskforce we put together ahead of the U.S. presidential election."

While not promising to start any new program to systematically evaluate disinformation problems, he said "the Foundation aims to conduct project evaluations, in collaboration with volunteers in the Wikimedia movement, to explore potential issues in projects openly and transparently."

How well did the WMF respond in the case of the Croatian Wikipedia? He says that the WMF "did not adequately understand some of the unique risks now identified in the report," in particular the risks of having separate Wikipedias for parts of pluricentric language communities.

Discuss this story

  • Eh? How on Earth did "may have been resolved with the help of a report published this month on Meta by the Wikimedia Foundation" apply? It hasn't, yet, done anything at all? The partial resolution was done by various efforts by Community members, including global bans, some new good admins on their side and the further removal of 3 of the worst eggs by the hr-community. Nosebagbear (talk) 20:46, 27 June 2021 (UTC)
Well I did write "*may have been* resolved *with the help* of a report" (new emphasis) Yes the admins, folks at the RfC, the communities, etc deserve all the credit. Make that ten times all the credit (from 10 years worth of trying with little help from the WMF). At first when I saw the report, I thought "Here comes the WMF trying to grab some credit". I don't think so now. The official reason for publishing it was "for the sake of transparency" which aligns with what WMF employees are usually allowed to say. The report itself I think does deserve some credit - not for this round of the battle - but for the next time. So there's a method available to deal with this type of thing that might only take 3-4 years, rather than 10. But I shouldn't get started on this ... Smallbones(smalltalk) 22:13, 27 June 2021 (UTC)
So if it was "may help resolve future hr-wiki issues" then that would be fine, but saying "may have been resolved" without knowing if it has seems...premature, at best Nosebagbear (talk) 22:51, 27 June 2021 (UTC)
Well "*may have been*" does mean "*may have been*" Words do matter in journalism. But at the same, you do have me wondering whether I was being completely honest in my reporting here. There's a difficult step in writing a straight news story - and I do believe this needed to be a straight news story before any real analysis could be done. The step is to try to give up all your biases and POVs at the start, including some of your skepticism and assume the folks you're dealing with are acting in good faith. Then as the facts become clearer you can start asking deeper questions. Perhaps I could have concluded that this was a show report with no real meaning. But I didn't and still don't believe that. In any case I got the report on Thursday and some good info from the WMF on Friday, and more or less made sense of a 60 page report in 1200 words (or whatever), so I'm not going to be too hard on myself. Believe it or not, this is fun. Anybody who wants to try it, The Signpost needs some good news reporters. Smallbones(smalltalk) 02:15, 28 June 2021 (UTC)
  • +1. The Kubura ban was enacted on November 28th, 2020, the new administrators were elected between November 2020 and January 2021, the desysops/decrats were completed on March 1st, and the Meta RfC was closed on March 20th. The report was published on June 14th (just under two weeks ago), so I fail to see how it has helped anything, unless I'm missing something. Giraffer (talk·contribs) 22:04, 27 June 2021 (UTC)
    +100 @Giraffer excellent points. WMF did not make any explicit or strong efforts to handle this in any significant way, aside from now (post-festum) content research (which is only part of the problem, but the one WMF is focused on due to reputation)... @Smallbones this article needs much more updates. --Zblace (talk) 19:06, 5 July 2021 (UTC)
  • "... the risks of having separate Wikipedias for parts of pluricentric language communities." There are quite a few of these communities, are there not? The decision as to whether something is a "dialect" or a separate "language" is always political as well as linguistic, and the decision as to whether there should be a separate Wiki for some dialect/language is equally fraught. What controls do we have, if any, to prevent other Wikis from being splintered as Serbo-Croatian was? Bruce leverett (talk) 22:22, 27 June 2021 (UTC)
    There was no split; the other three Wikipedias were founded later, starting from zero. Serbo-Croatian is a language as much as Scandinavian: it's good for linguistic book-keeping, but not a language taught in schools. The three Scandinavian countries have not three but four wikis (2×N, D, S ), yet no one calls that a failure. In the end, this wasn't about languages, but about people and their abuse of power. And that can happen in any Wikipedia, even the polycentric ones. Ponor (talk) 13:44, 28 June 2021 (UTC)
  • Very few would agree that the split of these language communities should have been done - and it is rare, with it not occurring really post-2010. However, merging these communities now would be functionally impossible. Merging is a tough task, and doing that for many thousands, if not tens of thousands, of different articles and differing policies would be a nightmare. Nosebagbear (talk) 22:51, 27 June 2021 (UTC)
  • I would amend the report’s recommendation to "strengthen global governance" to "strengthen global governance for small projects". We all know the dangers of this when applied to large ones like enwiki. —pythoncoder (talk | contribs) 00:51, 28 June 2021 (UTC)
  • This is welcome news, although I note that the long-term POV-pushing on hr will take a long time to address. As someone who edits in the Balkans space on en wiki a lot, but who can read Latin script sh, I agree with Nosebagbear that the split of sh wiki was a very bad idea and enabled a lot of nationalist claptrap, but that it is essentially now impossible to put the pieces back together. We should robustly examine any such proposals in future. Peacemaker67 (click to talk to me) 03:25, 28 June 2021 (UTC)
  • I remember the original Signpost article about the Croatian mess. Glad to finally see some progress on this issue, though I share concerns about what pythoncoder alludes to—that this will only be used by the WMF as evidence that they need to assert themselves over communities that don't really need their oversight. Their lack of attention on this issue over a decade until enough enWiki users joined the chorus of complaints speaks volumes about what their priorities are. -Indy beetle (talk) 04:51, 28 June 2021 (UTC)
  • As discussed in length on the discussion page of the report on Meta, i think that this whole report is a PR stunt and that the „expert“ who wrote it is a fraudster. It's quite telling that neither the author nor any of the Wikimedia people tried to answer any of the questions or take part in a serious debate. The so called expert has no idea of the inner life of the 4 projects. The purpose of the report was to whitewash Wikimedia and LangC in front of an audience which has very little to do with hr,srwiki&co whatsoever. It's a propagandistic device. The report and it's conclusions thus should be rejected in it's entirety. And Wikimedia should apologise for a smear campaign. --Ivan VA (talk) 07:59, 28 June 2021 (UTC)
    • I believe it's Ivan who said this elsewhere, and I still do not believe the report whitewashes LangCom in anyway. They don't have the authority in their remit to reunite projects against their will, and the fault for the original unwise split is laid very clearly at their doorstep. Wikimedia is everyone involved, so you need to clarify if you mean the WMF Nosebagbear (talk) 10:23, 28 June 2021 (UTC)
      • @Nosebagbear: As i said in the discussion (and answered to u there as well), the people who order this kind of report and, not for the first time, play the same music like in this report. I call them Wikimedia. I rly don't know how else to call them, coz i don't know the inside of the organisation. Anyway, let's call them „the guys who have the power to order such an report and then put the Wikimedia signature on it“. As for the merit, it's a clear whitewash. As i already have said to u on the discuss. page, they play the double game. They say we'd like to see them merge, but it's up to them not us. Meaning, if these communities (sr.wiki&co.) don't listen to the advice of the heads of the (wiki) movement, and that of the 7 scientists at LangC, then the kind of stuff/encyclopedia they are creating there is suspicious, odd etc. And from there is the explicit leap to the (bad) N word — nationalists. Basically labeling the efforts of hundreds of contributors as nationalistic trash. And, as the reports says, if these people reject and rebel against the label, it just reinforces the statement made in the report. And for what? For a group of people who claim to the world that they always act enlightened, to show a clean cheek and hide the dirty stuff in their closet. And, by my standard, look even more stupid than they did when making that decision to split. Coz of defiance to logic (exploring the implications) — whole communities have been built around that decision. And community reasoning is different than that of 7 linguists.
      • As for the 2nd merit, u said fault for the original unwise split...Who said, in reality, it was unwise? I can easily bring u the arguments to refute that. As im familiar with the situation on sr.wiki and in Serbia (and Bosnia somewhat), i can tell u that sr.wiki is a success story. U can compare the market stats yourself. Sr.wiki is 10 times or more being read daily in these 2 countries than Sh.wiki. It's in the top 10 visited sites every month. The market doesn't lie. The customer has a right to choose which product he want's to buy/use. So, as a product, it beats the competition (sh.wiki) easily...And that's why i said it's a label (nationalistic), the WP people use. Meaning, has no substance. How do these people know that 1 wikipedia would be better than the 4 existing? The market tells a different story. Perhaps the readership base would go buy another product? Secondly it's a label coz it tendentiously exacerbates an argument made by a small number of people on that merger discussion which starts now-and-then, mostly on Meta pages. That the merger doesn't take place coz the editors are overwhelmingly nationalists. I have to disappoint u, but that's not the reality of the most of the opinion. It's conservatism regarding the projects. As i told u, it's a success story. Why jeopardise that? Risk a leap into the dark? For what? U can make a cost-benefit analysis yourself with the assumptions i wrote above, as a poker player, and u'll see that the benefits are unknown, the loses present (with a varying impact) and that the status quo is too good to be changed. It's not vitriol what keeps the merger from happening, it's the success of the project. And the broad social legitimacy it enjoys. It's going too good to change anything. --Ivan VA (talk) 11:40, 28 June 2021 (UTC)
        • That more people read a distinct project now doesn't have any bearing on whether it shouldn't have been split to start, unless you have some way of clearly demonstrating that no-one would read it had it always been a singular combined project. So that reasoning is a wash. Wikimedia is the movement (projects, affiliates, Foundation etc), the Wikimedia Foundation (WMF) is the actual non-profit whose Trust & Safety team commissioned the report. The report has a number of examples and reasoning on why having the project split is a bad idea, not to mention the duplication of effort. You also seem to then morph your argument into saying you shouldn't now be merged...which isn't something I propose, as you should know from our discussions on meta and if you read my comments above. I assume you did read the previous discussion before commenting on it? I'd also note that lots of people reading something isn't really grounds for not changing something because it is a source of conduct issues - that's akin to the argument that productive editors should never be blocked. One primary point about being a nonprofit is we aren't obligated to make choices depending on what drives the most traffic. Nosebagbear (talk) 12:20, 28 June 2021 (UTC)
          • U didn't go in to the merit of my argument about the volume of traffic, u just recanted why it shouldn't be taken into account. U didn't go into the *why*, but if it is good or bad. As for your response, i didn't say that **LangC** should take traffic volume into their account when granting projects. As for the argument itself and the motive behind it, i think it's a noble one; (btw, i use the markets analogy as a metaphor, not meaning literally, i know this is a non-prof but none the less). A lot of ppl who become editors (me, partly included) do it for the goal of giving. They want to share knowledge with the world. My motivation for writing would drop if i knew no one reads the articles. So i want the the stuff i'm writing about to be read. I'm quite sure i'm not alone in this. Which brings me to the *why*. U have it black on white that sr. and hr.wiki have a much much bigger readership than sh.wiki. Theres no manipulation. All 3 options are on the table, on display, transparent, and people choose freely to opt much more for sr. wiki than for sh.wiki. No coercion no force. Obviously there is something in reality (i indicated what in the discussion on Meta) which makes people go much more to sr. than sh.wiki. As for the report, i doesn't go into that something. The report is a travesty. All what is being said in the report on this issue is completely false, has no value whatsoever. I said what i think about that, don't wanna repeat it. As for the actual impact of this report or the discussion we are having right now...there is none. The purpose of the report was for WMF to extract themselves from the merger debate and put the blame of a result they (and some people around them) see as unsatisfactory onto the communities badmouthing them. Implying, that it's up for the communities to decide if they wanna merge or not. Since this is the case, i rly don't see any point in this discussion being taken here. 
If u want to propose a sr&sh wiki merger, there are official procedures u can do that. Go to the sr.&sh. wiki village pumps, make a call for a debate, it will probably last for a month, then it will got to the ballet box, and we'll see what happens. --Ivan VA (talk) 15:12, 28 June 2021 (UTC)
Croatian and Serbian languages are not variants of the so-called Serbo-Croatian language. There was a great political effort in the past to make Croatian and Serbian one language, and Croats and Serbs one nation. Neither occurred; regrettably if you ask some people today. A number of those people with regrets exist even among linguists; and also among wikipedia editors - mainly among the editors of Serbo-Croatian wiki.
You have there, though, closely related languages: something like Swedish, Norvegian and Danish, which are closely related, but NOT variants of the same language. The clue to understanding the "complication" with Serbo-Croatian is in Bosnia and Herzegovina (it is, between the territories of Croatia and Serbia) - where for a very long time the schools thought the children (Croats, Serbs, Bosnians) the same language. Well, this language ("Serbo-Croatian") is - more or less the modern Bosnian language. Speakers of Croat and Serbian languages understand Bosnian, too (like speakers of Slovak understand Czech language, but the attempt to make Czechoslovak language - failed). However, only Croats from Bosnia and Hercegovina and only Serbs from Bosnia and Hercegovina tend to be proficient in the Bosnian language.
I will cite the examples from here: https://hrcak.srce.hr/30869
A slight difference is demonstrated by:
Sr. (Serbian) Sačekaj minut da uporedim tvoja i moja dokumenta.
Cr. (Croatian) Pričekaj minutu da usporedim tvoje i moje dokumente.
‘Wait a minute so I can compare your documents and mine’.
Greater differences are demonstrated by the following:
Sr. Što ga biješ?
Cr. Zašto ga tučeš?
‘Why are you beating him?’
Sr. U januaru sam rešio da uradim sve što me ranije mrzelo.
Cr. U siječnju sam odlučio učiniti sve što mi se ranije nije dalo.
‘In January I decided to do everything I didn’t feel like doing before’
We could even make up similar or identical phrases that have different meanings in the two languages, or in fact only one of them, while in the other they may sound as nonsense: suprotni pol Sr. ‘opposite sex’ Cr. ‘opposite pole’; Zemljina osa Sr. ‘Earth’s axis’; Cr. ‘Earth’s wasp’ ; prava stvar Sr. ‘straight thing’, Cr. ‘real thing’.
Sr. Odojče igra na zraku. ‘An infant is dancing on the ray’.
Cr. Odojče se igra na zraku. ‘A piglet is playing on the air’.
Or: Pravi zrak igra svoju igru.
Sr. ‘Straight ray is dancing its dance/Real ray is playing its game’.
Cr. ‘Real air is playing its game’.
Kad počinje slovenski čas?
Sr. ‘When does the Slavic lesson begin?’
Cr. ‘When does the Slovenian moment begin?’ RadioElectrico (talk) 15:22, 6 July 2021 (UTC)

Wikipedia:Wikipedia Signpost/2021-06-27/Disinformation_report