The Signpost

Technology report

On the indestructibility of Wikimedia content

WMF wiki content now almost indestructible


The content of Wikimedia wikis has recently moved significantly closer towards indestructibility, it was announced this week by WMF developer and data dumps specialist Ariel Glenn.

Masaryk University, in the Czech Republic, is one institution now mirroring Wikimedia dumps.

Specifically, data from all Wikimedia wikis is now being successfully replicated to three non-WMF sites around the globe: C3L in Brazil, Masaryk University in the Czech Republic and the servers of Your.org in the United States. Each site holds ("mirrors") at least five monthly snapshots ("dumps") of the publicly available wikitext-based content of all of the many hundreds of Wikimedia wikis. Your.org also hosts a copy of all previous dumps and will hold a single snapshot of all publicly viewable media. Moreover, Glenn reports, "getting the bugs out of the mirroring setup [has made it] easier to add new locations" and to provide the latest snapshots to already established mirrors. As reported at the time, the first dump mirror came online in October last year, but this is the first time so many have been available concurrently.
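As a rough illustration of the "at least five monthly snapshots" rule, the sketch below shows how a mirror operator might decide which dumps to retain and which are safe to prune. It is not the actual WMF or mirror tooling: the function names, the `minimum` parameter and the example dates are all hypothetical.

```python
from datetime import date


def snapshots_to_keep(snapshot_dates, minimum=5):
    """Return the `minimum` most recent snapshot dates, newest first.

    `minimum=5` mirrors the "at least five monthly snapshots" figure in
    the report; the function itself is a hypothetical illustration.
    """
    return sorted(set(snapshot_dates), reverse=True)[:minimum]


def snapshots_to_prune(snapshot_dates, minimum=5):
    """Return the older snapshot dates that fall outside the retained set."""
    keep = set(snapshots_to_keep(snapshot_dates, minimum))
    return sorted(d for d in set(snapshot_dates) if d not in keep)


# Hypothetical monthly dump dates, November 2011 through May 2012.
example_dates = [
    date(2011, 11, 1), date(2011, 12, 1), date(2012, 1, 1), date(2012, 2, 1),
    date(2012, 3, 1), date(2012, 4, 1), date(2012, 5, 1),
]

kept = snapshots_to_keep(example_dates)
pruned = snapshots_to_prune(example_dates)
```

A site that archives every previous dump, as Your.org is reported to do, would simply never prune.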

Increasing the number of mirrors, which is made possible by the free licensing of Wikimedia wikis, helps to ensure that content is sufficiently accessible and geographically diverse to survive natural and artificial disasters. While multiple websites do host live copies of the English and other major Wikipedias, dump mirroring is particularly useful for protecting the content of smaller wikis, which do not enjoy such protection. The same was once true of the English Wikipedia, whose 2001 articles were long thought to be lost until old backups were uncovered in December 2010. Theoretically, dump mirrors could also offer better download speeds at times of peak usage, but that is unlikely to be a primary use case for Wikimedia wikis.

Of course, not everyone is so concerned at the possibility that Wikimedia's content might be destroyed in the immediate future, dump mirrors or no dump mirrors. As WMF Lead Platform Architect Tim Starling commented in a 2011 discussion of forking Wikipedia, "the chance of [WMF financial collapse] appears to be vanishingly small, and shrinking as the Foundation gets larger. If there was some financial problem, then we would have plenty of warning and plenty of time to plan an exit strategy. The technical risks (meteorite strike etc.) are also receding as we grow larger". That discussion focussed rather less on the technical aspects of making Wikimedia content indestructible, and more on allowing separate communities to emerge if Wikimedia communities broke up.

In brief

Signpost poll
You can now give your opinion on next week's poll: Which of the following do you consider the greatest threat to Wikipedia?

Not all fixes may have gone live to WMF sites at the time of writing; some may not be scheduled to go live for many weeks.

  1. JYBot, modifying, adding and removing interwiki links. At the time of writing, 16 BRFAs are active. As usual, community input is encouraged.

Wikipedia:Wikipedia Signpost/2012-05-21/Technology_report