The Signpost

In focus

WikiLoop DoubleCheck, reviewing edits made easy

Macruzbar formerly worked for the Wikimedia Foundation, as the communications lead in the Community Engagement department. She now works in the Google Open Source program office, where she leads the community engagement program.

An authority, according to Clay Shirky, "is a person or institution who has a process for lowering the likelihood that they are wrong to acceptably low levels." Do you want to develop into such an authority? Do you want to review other people's edits, using scores from ORES while helping to improve the ORES edit quality prediction model? WikiLoop DoubleCheck helps you do this by making the peer-review process of Wikipedia a collaborative effort that anyone, including non-registered users, can join. It is an open-source, crowd-sourced counter vandalism tool for Wikipedia and Wikidata. WikiLoop DoubleCheck is built on web technology and can be launched quickly from either a desktop or a mobile phone without installing any software. Its goal is to lower the barriers for editors who want to help patrol Wikipedia revisions.

What is WikiLoop?

WikiLoop is an umbrella program for a series of technical projects intended to contribute datasets and editor tools from the technical industry back to the open knowledge world. This program was originally conceived as a virtuous circle: providing data and tools to enhance human editors' productivity, and making Wikipedia's editorial input more machine-readable for open knowledge institutions, academia and researchers interested in advancing machine learning technology. It originated at Google as the missing link in the data loop: the Knowledge Graph relies on the open knowledge source being healthy. This is why the program focuses its efforts on editor tools that can improve the content quality of Wikipedia.

Learn more about the program on its page on Meta. You can also try the tool now, and leave feedback or comments on the tool's talk page on English Wikipedia.

The encyclopedia that anyone can review

WikiLoop DoubleCheck in action.

WikiLoop DoubleCheck (WLDC) works on a different premise compared to tools like STiki and Huggle, which both require rollback permission to use. WLDC intends to move to a tiered, trusted model: just like Wikipedia aspires to be the encyclopedia that anyone can edit, with permissions given to editors based on account seniority and their editing activity, WikiLoop DoubleCheck explores how to grant everyone an ability to review and label a revision with their opinion, while allowing higher-tiered (trusted) editors (such as admins or those with WP:Rollback permissions) to conduct faster and more powerful actions (e.g., direct-revert) with the tool. It allows anonymous users or less-experienced (or not-yet-trusted) editors to review and conduct actions with lower risks, while gradually building up their reputation using the tool.

Using DoubleCheck also helps to improve the ORES prediction model. While the tool displays scores from ORES and other anti-vandalism tools, like STiki and Huggle, there is also a feedback loop: tags added by editors on the tool are sent back through a route called JADE that improves this machine learning model with each revision.
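As a rough illustration of that feedback loop, the sketch below collapses reviewer labels into one training example per revision. The record layout, the label names, and the `to_training_examples` helper are assumptions made for this example; they are not the tool's actual schema or the JADE interface.

```python
from collections import Counter
from dataclasses import dataclass

# Hypothetical label names; the real tool's vocabulary may differ.
LABELS = {"looks_good", "not_sure", "should_revert"}


@dataclass(frozen=True)
class Judgement:
    rev_id: int      # revision being reviewed
    reviewer: str    # user name or anonymous identifier
    label: str       # one of LABELS


def to_training_examples(judgements):
    """Collapse reviewer judgements into (rev_id, is_damaging) pairs.

    A revision is skipped when its reviewers are tied or when
    "not_sure" is the most common label.
    """
    by_rev = {}
    for j in judgements:
        if j.label not in LABELS:
            raise ValueError(f"unknown label: {j.label}")
        by_rev.setdefault(j.rev_id, Counter())[j.label] += 1

    examples = []
    for rev_id, counts in sorted(by_rev.items()):
        ranked = counts.most_common()
        top_label, top_count = ranked[0]
        tied = len(ranked) > 1 and ranked[1][1] == top_count
        if tied or top_label == "not_sure":
            continue
        examples.append((rev_id, top_label == "should_revert"))
    return examples
```

Skipping tied and "not sure" revisions is one possible design choice; a real pipeline might instead keep them as a separate class.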

Visit the WikiLoop DoubleCheck web app to start reviewing content on Wikipedia. For the direct-revert feature which is available to more experienced editors, visit DoubleCheck on WMF Labs, which is hosted on the Wikimedia Foundation's Cloud VPS.

Building WikiLoop DoubleCheck together

WikiLoop DoubleCheck didn't always have that name. About a year ago, when the prototype of the tool was launched and shared with the English Wikipedia community, several editors raised concerns about the tool's original name: Battlefield. With input from users Sadas, Xinbenlv, ElanHR, Nizil Shah, ToBeFree, Nick Moyes, and others who suggested new names, and after a community vote, the tool was recently renamed DoubleCheck.

If you would like to get involved and contribute to WikiLoop DoubleCheck, here are two things you can do:

Other resources

Discuss this story

Initial reactions

It sounds like a useful tool, but sorry to say, the article is rather incomprehensible for a layman. A dense combination of PR babble with techtalk. Taking it seriously, I have re-read it 3 times but could not make heads or tails of it: how can I use it, and how specifically will it allow me to improve Wikipedia? If the author wishes, I can comment on the text nearly line by line, but I have to be sure that I was heard, otherwise I'd rather waste my time on something equally useless, such as writing up something like "Administrative-command system" nobody seems to care about :-) Staszek Lem (talk) 22:38, 2 August 2020 (UTC)[reply]

@Staszek Lem: thanks for leaving your feedback. I am Xinbenlv, the lead developer of this tool. I understand you hoped the explanation would be clearer, and we will definitely work harder to improve our communication. In the meantime, please don't hesitate to visit the tool page when you have time and try the tool yourself, and see if using it makes things clearer! xinbenlv Talk, Remember to "ping" me 23:07, 2 August 2020 (UTC)[reply]
I see. You are basically saying "bug off, we know better" in a well-rounded PR way. I understand you are from google. If a customer of a small company received this kind of reply to his suggestion of help with an improvement, they would drop your tool on the spot. Staszek Lem (talk) 23:21, 2 August 2020 (UTC)[reply]
Not at all, we sincerely appreciate your feedback and we pledge to improve our communication. What I mean is that we will need some time to plan better instructions, such as video recordings, workshops, or better text descriptions in the future. But please absolutely feel welcome, no "bug off". Any feedback is great, we are all ears here! xinbenlv Talk, Remember to "ping" me 06:37, 3 August 2020 (UTC)[reply]
  • I agree that it could have been communicated better, but the tool itself is self-explanatory and seems very useful for those doing anti-vandalism. (t · c) buidhe 05:03, 3 August 2020 (UTC)[reply]
@Buidhe: thank you, we hope it makes your reviewing easier. Any suggestion for how we could improve it is appreciated! xinbenlv Talk, Remember to "ping" me 06:37, 3 August 2020 (UTC)[reply]
I have tried it right now, and I liked it, but after reading the text like WikiLoop is an umbrella program for a series of technical projects intended to contribute datasets and editor tools from the technical industry back to the open knowledge world - I was kinda hesitant to click "try the tool now", just like my mom fears to click anything on Skype. I was not sure I wanted to try "a series of technical projects to contribute datasets", because the first thing popped in my brain was "GitHub". Staszek Lem (talk) 05:27, 3 August 2020 (UTC)[reply]
Thank you @Staszek Lem:, I totally understand that sentiment and I am a Wikipedian myself. xinbenlv Talk, Remember to "ping" me 06:37, 3 August 2020 (UTC)[reply]
Two things are definitely missing: the "Undo judgement" button and "Edit". Both have the same workaround I quickly found: there is a "rev." link leading directly into Wikipedia, so I consider the "Edit" functionality covered. (Still, minor on-the-fly edits would be handy, but that's sugar.) But "Undo judgement" is tool-internal, and if the tool's scores are based on some kind of "human-assisted machine learning", then my wrong "judgement" may skew it. And in this case the "undo" has a certain importance. Staszek Lem (talk) 05:40, 3 August 2020 (UTC)[reply]
Thank you, I filed this feedback as issue#317, and technical design and implementation updates will show up there. Feature-wise, we originally thought that "undo judgement" could be done by clicking "Not sure". Here is our reasoning: even though we know that "undo judgement" means "delete my judgement on this revision" and "Not sure" means the judgement will be stored as "not sure", we originally wanted to keep only "Not sure", because we think that if a reviewer cares enough to undo, they may also want to keep the judgement as "not sure". The difference is that "not sure" means the revision is at least not obviously vandalism or damaging, and such non-obviousness is also useful for some machine learning researchers. As for your worry that a "wrong judgement" may skew the model, I really appreciate your sense of responsibility. We want to assure you that, even though we are working on supplying our data to other Wikimedia movement efforts such as ORES / mw:JADE and en:WP:ClueBotNG, I think an occasional wrong assessment of an individual revision is within acceptable tolerance for training the machine learning model. Unless, of course, a reviewer happens to be acting in bad faith and continuously supplies reversed assessments - just as people can vandalize Wikipedia by editing, allowing everyone to review means we need to find a way to keep reviewing from being vandalized too. We will publish our proposal for a trusted user model in the upcoming weeks. Please stay tuned. xinbenlv Talk, Remember to "ping" me 18:10, 3 August 2020 (UTC)[reply]
Sorry, I have already deleted the comment you are responding to (but you restored it, probably an edit conflict), because after some time the "Undo" button suddenly started appearing. Probably it was a glitch in my browser. I am running an old version of Linux with Chrome at home, because I am too lazy to do upgrades. Staszek Lem (talk) 20:17, 3 August 2020 (UTC)[reply]
Minor nitpicking: the "i" icon does not have a tooltip. And "Active Users" is the only all-caps tooltip. Staszek Lem (talk) 05:55, 3 August 2020 (UTC)[reply]
filed as issue#318, and issue#319, will address soon xinbenlv Talk, Remember to "ping" me 18:10, 3 August 2020 (UTC)[reply]
I just fixed this. Thank you! xinbenlv Talk, Remember to "ping" me 22:10, 3 August 2020 (UTC)[reply]
My window shows "index feed", whatever it means, but "Featured feeds" does not list it, so after switching to "ores feed" I cannot get back to the default one via the GUI. Fortunately I managed to accomplish this via the browser's History functionality (BTW, clicking the "History" widget of the tool gave me an "Application error" screen, again recoverable through the browser's History). Staszek Lem (talk) 06:26, 3 August 2020 (UTC)[reply]
I am glad that you find the tool interesting and have started to try it out. Certainly there are many features we could do better and probably many bugs we need to fix. It's 11:35pm in my timezone, so I will come back to carefully read your feedback tomorrow and put it into our bug and feature trackers to start working on it. We value your feedback a lot. xinbenlv Talk, Remember to "ping" me 06:37, 3 August 2020 (UTC)[reply]
@Staszek Lem: You do have very good software acumen. Yes, there are issues with the index feed and other feeds. In fact, the index feed is the default feed, Version 1 of our feed mechanism. The other featured feeds are a newer version, Version 2 of the feed mechanism. They are currently undergoing fast iteration and are sometimes buggy. I have filed the behaviors you described as issue#319. Thank you @Staszek Lem:, you won the "champion of user feedback!", if only I had the WP:CIR to create a better barnstar for such awards. We develop software but we really need users like you who give us feedback like this! xinbenlv Talk, Remember to "ping" me 18:10, 3 August 2020 (UTC)[reply]
xinbenlv Talk, Remember to "ping" me 21:30, 3 August 2020 (UTC)[reply]
  • However useful the tool may be, this description is not. I read through it several times and still had no idea what it does and how (I gather Staszek Lem, above, had the same problem). The only sentence that seemed to communicate something helpful was It is an open-source, crowd-sourced counter vandalism tool for Wikipedia and Wikidata. - everything else feels ancillary or even obfuscatory. If you want people to just try the tool and "get it", well maybe that works, but for those who read this piece trying to find out whether they should try it, it's probably a miss. --Elmidae (talk · contribs)
Thank you for the feedback. We will iterate our way of communication based on this feedback. Thank you @Elmidae:! xinbenlv Talk, Remember to "ping" me 23:40, 3 August 2020 (UTC)[reply]

I noticed the introduction mentions ORES' article quality model, but from reading the whole piece it seems it instead uses ORES' edit quality prediction models? The latter is what predicts reverts and bad faith edits (depending on the model), whereas the former predicts article quality classes (such as the English Wikipedia's content assessment ratings). Cheers, Nettrom (talk) 02:56, 4 August 2020 (UTC)[reply]

@Nettrom:, good catch! We actually only use the edit quality prediction model, not the article quality prediction model. @Macruzbar: could you help update the text: in "improve the ORES article quality prediction model", change "article" to "edit". xinbenlv Talk, Remember to "ping" me 04:18, 4 August 2020 (UTC)[reply]
Done! Thank you for catching that, @Nettrom:. Macruzbar (talk) 22:16, 4 August 2020 (UTC)[reply]

"ORES scores Considered Harmful"

This seems to be another interface for recent changes. I tried it a couple of times. The first time, the ORES prediction was wrong, saying it was bad faith when it wasn't. The second time, it was some sort of WikiData change, which was incomprehensible. What makes the tool useless for me is that there's no context or filter – it's just a stream of arbitrary, random changes. As it takes time to digest the context for each change, this is not efficient. Only button-pushing gnomes are likely to use this and the result seems likely to be low value-added. Andrew🐉(talk) 20:22, 4 August 2020 (UTC)[reply]

Well, they write they are working on feed customization. I was planning to suggest reusing the existing filter-bots, such as User:AlexNewArtBot/PolandSearchResult. Also, you will be surprised to learn how many BP-gnomes-patrollers are around. BTW I suggest excluding wikidata from standard feeds and putting it into a dedicated feed, because only wikidata buffs can make sense of it. And personally, I think wikidata is over-engineered to the degree of incomprehensibility, which explains your observation (and mine as well, but I simply disregarded it). Staszek Lem (talk) 20:38, 4 August 2020 (UTC)[reply]
I would not be at all surprised at the number of button-pushing gnomes as it's already my observation that this sort of low-grade busywork dominates the Wikipedia edit stream. Typically, I start an article which requires some research and care to draft the text. You then get a stream of edits in which gnomes make minor tweaks or run scripts to do things like fiddle with the length of dashes, tinker with the categories or just amend the amount of whitespace. The worst are the editors with high edit counts who will find any excuse to make another edit and so boost their score. Giving such editors a tool like this is dangerous as they will be inclined to follow the ORES recommendation, regardless of its accuracy, and just punch the buttons as fast as they can to maximise their score. Andrew🐉(talk) 22:18, 4 August 2020 (UTC)[reply]
As if they are not doing this right now. I have the same experience: I barely manage to save a new stub and get slapped with half dozen of ridiculous hatnotes. It is just as easy to hit "undo" using Twinkle. Although I see your point about the score: maybe it is a good idea to hide it, forcing human brain to make the unbiased decision as an independent check against the "AI/Borg takeover". Staszek Lem (talk) 00:21, 5 August 2020 (UTC)[reply]
@Andrew Davidson:, @Staszek Lem:: thank you for your feedback. Let me summarize what I learned and put it into our issue tracker for follow-up. Some points I will answer directly; for others I will file bugs to follow up in development:
1. ORES Prediction is wrong(1) - this is actually part of the reason we created WikiLoop DoubleCheck: AI will never (at least for the foreseeable future) be as good as a human being. In the end our tool assists human Wikipedian reviewers in reviewing revisions; we didn't build a bot, nor do we intend to. WikiLoop DoubleCheck only provides the ORES score as a reference. Meanwhile, ORES is a score developed by the WMF. We look forward to other 3rd-party scoring systems providing even more scores in the future.
2. ORES Prediction is wrong(2) - another use of WikiLoop DoubleCheck is to harness editors' assessments and provide them to machine learning algorithms to better train the models. In the interest of transparency and usefulness, we make the data one click away to download from the home page.
3. No context or filter - revisions show up random and arbitrary: filed as issue#323. Yes, we start with a pure recent-changes feed so that new reviewers can jump in and start reviewing with the least experience required, though this also yields the least reliable assessments. You probably raise this because you are a more experienced and advanced reviewer who already uses other tools such as the watchlist and filtering. We plan to provide such functionality and more, allowing reviewers to review topics matching their interests and domain expertise. Stay tuned.
4. reuse the existing filter-bots: filed as issue#324. This is new to me. Thank you, I will look into them.
5. exclude wikidata from standard feeds and put it into a dedicated feed: filed as issue#325. Agreed, thank you for pointing this out.
6. The worst are the editors with high edit counts who will find any excuse to make another edit and so boost their score. In its current state, the tool itself doesn't provide faster editing than discovering a revision on one's watchlist and reverting it on the Wikipedia page. We do allow direct revert, but it currently requires ROLLBACK permission just like other tools. In the future, we plan to work on features that cross-check review accuracy between users (part of the reason for the name "DoubleCheck"), and also to give more trustworthy reviewers more power while reducing or ignoring the weight of reviewers whose assessments are lower in quality or accuracy, or even vandalism, based on other reviewers' opinions. Stay tuned for this part as well.
7. forcing human brain to make the unbiased decision: agreed. The current index feed is version 1 and will soon be deprecated. The newer version is the featured feed, such as http://doublecheck.wikiloop.org/feed/covid19, which requires an extra click on "show judgement" before it shows other reviewers' judgements and AI scores as a reference. Thus we hide the score when reviewers don't explicitly ask for it, to foster unbiased decisions, while still providing it as an option when reviewers want it. We will, however, store whether a judgement was made with such references shown, so it can be looked up and filtered out when doing machine learning training. xinbenlv Talk, Remember to "ping" me 01:58, 5 August 2020 (UTC)[reply]
Again thank you very much for your feedback and we understand there is still a long way to go to make it more useful and powerful for experienced and advanced reviewers. xinbenlv Talk, Remember to "ping" me 01:58, 5 August 2020 (UTC)[reply]
You're welcome. It's good to see that the observations are being noted and followed up. Andrew🐉(talk) 11:09, 5 August 2020 (UTC)[reply]
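
Two ideas from the reply above, storing whether a judgement was made with scores shown (point 7) and cross-checking reviewers against each other (point 6), can be sketched as follows. This is a minimal illustration only: the `StoredJudgement` fields, the label strings, and both helper functions are hypothetical and do not reflect the tool's real data model.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class StoredJudgement:
    rev_id: int
    reviewer: str
    label: str          # hypothetical values, e.g. "looks_good" or "should_revert"
    scores_shown: bool  # True if scores/peer judgements were revealed first


def blind_judgements(records):
    """Keep only judgements made before any score or peer opinion was shown."""
    return [r for r in records if not r.scores_shown]


def agreement_rate(records, reviewer):
    """Fraction of a reviewer's revisions where their label matches the most
    common label among the other reviewers of the same revision.

    Returns None when nobody else reviewed any of the same revisions.
    """
    mine = {r.rev_id: r.label for r in records if r.reviewer == reviewer}
    others = {}
    for r in records:
        if r.reviewer != reviewer and r.rev_id in mine:
            others.setdefault(r.rev_id, Counter())[r.label] += 1
    if not others:
        return None
    agreed = sum(
        1
        for rev_id, counts in others.items()
        if counts.most_common(1)[0][0] == mine[rev_id]
    )
    return agreed / len(others)
```

A low agreement rate on its own does not prove bad faith; in a sketch like this it would only flag a reviewer for a closer look.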

"Rat race" against bots

During prolonged usage, several times when I clicked "revert" I landed on a page where I saw that someone else had already done it. I do not mind if some quicker-minded Wikipedian beats me to the punch, but I hate the idea of competing with artificial intelligences :) Why don't you filter the feeds through the existing anti-vandal 'bots before pushing them to the live meat? So that I waste less of my editing time. Staszek Lem (talk) 17:31, 4 August 2020 (UTC)[reply]

@Staszek Lem: Could you point me to the revision ids so we could look into it? Thank you! xinbenlv Talk, Remember to "ping" me 06:20, 5 August 2020 (UTC)[reply]

"WMF" part of tool unreachable

Xinbenlv: As of this moment the link you provided for the version of the tool for trusted users is unreachable. Asaf (WMF) (talk) 02:16, 12 August 2020 (UTC)[reply]

@Asaf (WMF):: Hi ~ Thank you. It currently only works with HTTP rather than HTTPS, so if you click on the original link it should work directly, unless the browser changed HTTP to HTTPS automatically. xinbenlv Talk, Remember to "ping" me 21:17, 19 August 2020 (UTC)[reply]
@Xinbenlv: oh, indeed! Is there a plan to switch to HTTPS? These days, it feels very wrong to use HTTP, especially for a service provided by Google. Asaf (WMF) (talk) 09:07, 20 August 2020 (UTC)[reply]
WikiLoop DoubleCheck's production instance, https://doublecheck.wikiloop.org, supports HTTPS. The instance deployed on WMF Cloud VPS, http://wmf.doublecheck.wikiloop.org, will soon move to HTTPS as well; there are, however, some technical challenges we need to resolve first. xinbenlv Talk, Remember to "ping" me 17:06, 20 August 2020 (UTC)[reply]

Wikipedia:Wikipedia Signpost/2020-08-02/In_focus