The Signpost

Technology report

Wikimedia down for an hour; What is: Wikipedia Offline?

Wikimedia wikis down for an hour

As noted in last week's "Technology Report", Wikimedia wikis underwent a scheduled downtime of one hour on Tuesday 24 May at around 13:00–14:00 UTC. The downtime means that, for this calendar year, the Foundation has already missed its previously aired targets of limiting downtime to just 5.256 minutes per annum (equivalent to 99.999% uptime) or 52.6 minutes (99.99% uptime). However, the work does appear to have succeeded in reducing the number of out-of-date pages served to readers, along with other similar problems.
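The arithmetic behind these uptime targets can be checked with a short sketch. It assumes a 365-day year, and the names `MINUTES_PER_YEAR` and `downtime_budget_minutes` are illustrative only, not taken from any Wikimedia tooling.

```python
# Minutes in a year, assuming a non-leap (365-day) year.
MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600


def downtime_budget_minutes(nines):
    """Return the yearly downtime allowed by an uptime target of `nines` nines.

    nines=5 corresponds to 99.999% uptime, nines=4 to 99.99%.
    Each extra nine divides the allowed downtime by ten.
    """
    return MINUTES_PER_YEAR / 10 ** nines


outage_minutes = 60  # the one-hour scheduled downtime on 24 May

for nines in (5, 4):
    budget = downtime_budget_minutes(nines)
    status = "missed" if outage_minutes > budget else "still achievable"
    print(f"{nines} nines: {budget:.3f} minutes allowed per year, target {status}")
```

On these assumptions, a single 60-minute outage exceeds both yearly allowances, which is why both targets are already out of reach for the year.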

During the downtime, designed to allow the operations team sufficient time to "update the router software and tune the configuration", access to Wikimedia sites was intermittent. The episode and associated issues were alluded to by cartoonist Randall Munroe in his comic strip xkcd (see also this week's "In the news" for more details). Wikimedia developers enjoyed dissecting the technical aspects of the cartoon on the wikitech-l mailing list.

What is: Wikipedia Offline?

Many Wikipedia editors can now access the Internet from multiple locations: at home, at work, even on the go with smartphones. In 2010, however, only 30% of the world had any access at all to the Internet, and even that figure is skewed upwards by the high rates of availability found in the developed world (source: CIA World Factbook). By contrast, 50% of the world has used a computer, according to Pew Research. Since the Wikimedia Foundation's aim is to "encourage the growth, development and distribution of free, multilingual content", either the remaining 70% will have to be supplied with the Internet so that they can access the online versions of Wikimedia wikis, or the wikis will have to be provided in an offline-friendly format. The "Wikipedia Offline" project, then, is a WMF initiative aimed at spreading its flagship product freely to the two billion people who use a computer but cannot access the Internet.

There are two parts to the challenge. The first is ensuring that Wikipedias exist in as many languages as possible: the proportion of people for whom a Wikipedia exists in a language they speak was recently estimated at above 98% (foundation-l mailing list), and about 82% have a Wikipedia in their native tongue (also foundation-l). The second is the technical challenge of supplying the information. The Foundation's current strategy is to continue making the raw data of Wikipedias available via so-called "dumps", while simultaneously supporting open-source programs that can process these files. In combination, this will allow whole Wikipedias either to be downloaded when an Internet connection is available or to be shipped on DVDs or other portable media. This runs alongside the Foundation's existing project to select the most useful articles from a given Wikipedia, thereby condensing an encyclopedia onto a single CD.
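To give a flavour of what "processing a dump" involves, here is a minimal sketch that streams pages out of an XML export one at a time. It is not the code of any particular offline reader. The sample data is a simplified, hypothetical fragment modelled on the MediaWiki XML export layout; the namespace version in it and the helper names `local_name` and `iter_pages` are assumptions made for illustration.

```python
import io
import xml.etree.ElementTree as ET

# Hypothetical, heavily simplified stand-in for a real dump file.
SAMPLE_DUMP = b"""<mediawiki xmlns="http://www.mediawiki.org/xml/export-0.10/">
  <page>
    <title>Malaria</title>
    <revision><text>Malaria is a mosquito-borne disease.</text></revision>
  </page>
  <page>
    <title>Cholera</title>
    <revision><text>Cholera is an infection of the small intestine.</text></revision>
  </page>
</mediawiki>"""


def local_name(tag):
    """Drop the '{namespace}' prefix that ElementTree puts on tag names."""
    return tag.rsplit("}", 1)[-1]


def iter_pages(stream):
    """Yield (title, text) pairs, parsing the stream incrementally."""
    for _event, elem in ET.iterparse(stream, events=("end",)):
        if local_name(elem.tag) != "page":
            continue
        title, text = None, ""
        for child in elem.iter():
            name = local_name(child.tag)
            if name == "title":
                title = child.text
            elif name == "text":
                text = child.text or ""
        yield title, text
        elem.clear()  # drop the finished page's children to limit memory use


pages = list(iter_pages(io.BytesIO(SAMPLE_DUMP)))
for title, text in pages:
    print(f"{title}: {len(text)} characters")
```

Reading incrementally matters because a full dump is far too large to load into memory at once; with a real file, the `io.BytesIO` wrapper would be replaced by an opened (and typically decompressed) file object.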

While "dumps" are largely tried and tested (though recent work has focussed on improving their regularity and reliability), there have also been efforts to enable the export of smaller "collections" of articles, for example those relating to major health issues faced by developing countries. This was made possible in part by a new export format (ZIM, developed by the openZim project) that can be read by some offline readers. However, ongoing efforts focus mainly on the second half of the strategy: the provision of a good-quality reader capable of displaying offline versions of wikis. A number of possible readers were tested, and the "Kiwix" reader was selected in late 2010; the Foundation has since devoted time to improving its user interface, including by translating it. There is also competition from other readers, including "Okawix", the product of the French company Linterweb. User:Ziko blogged last week about the differences he found between the two. The area is moving so fast that it is unclear which, if either, will become the standard.

See also: Wikimedia strategy document, update on Wikimedia's progress (as of March 2011).

In brief

Not all fixes may have gone live to WMF sites at the time of writing; some may not be scheduled to go live for many weeks.

Wikipedia:Wikipedia Signpost/2011-05-30/Technology_report