The Signpost

News and notes

One decade of Wikisource; FDC recommendations raise serious questions

Editor's note: last week's issue, which would have been published on 27 November, was skipped.

Wikisource turns ten

The sister project Wikisource, the digital library that hosts free-content primary sources, is now a decade old. Wikisource, which now has versions in 63 languages, is the sixth type of project to reach its ten-year milestone and will be the last until 2016.

Working on Wikisource is fundamentally different from working on Wikipedia. Most editors start by uploading a PDF or DjVu file of a source work; there is no notability standard beyond the work having been professionally published, and the Proofread Page extension supplies text generated by optical character recognition (OCR), which then has to be proofread. Translations of these works and author bibliographies are also accepted, while original writings are directed to Wikibooks. The project also offers interwiki links to relevant articles in the Wikimedia-verse, annotation, different editions of the same works, metadata, and ease of classification.

Highlights on the English Wikisource include items as varied as poetry, laws, constitutions, US Supreme Court decisions, modern novels, short stories, children’s literature, science fiction, and scientific papers. Wikisource also has extensive author indexes and featured texts such as A Jewish State (1896; 1917 translation).

Project Sourceberg, as Wikisource was first known, arose in 2003 because of edit wars on the English Wikipedia over the inclusion of primary sources. The name did not last long; several subdomains and a vote later it was renamed "Wikisource". The project has since developed its own community, has forged collaborations in its own right with prestigious institutions such as the US National Archives and Records Administration, and has organized the transcription of major portions of very large works like the Dictionary of National Biography and Popular Science Monthly. There are 61 active Wikisource projects and two closed ones. The Haitian Wikisource was closed because it was a tiny jumbled mess; the Old English Wikisource was closed because Old English is a dead language.

John Vandenberg has had an active presence on the English Wikisource for many years. He told the Signpost that among the strengths of Wikisource are its simplicity of use for new contributors, and that disputes are rarely about content, the bane of Wikipedia politics. "Instead, community debates tend to have concerned stylistic faithfulness to the original—or more technically, the provenance of the material."

Vandenberg says that many contributors are dedicated librarians and archivists. "Some ten multilingual users travel between the main versions (the English, the French, and the German Wikisources), providing at least some cohesion between the sites", he points out. The French site has historically emphasised reader-friendliness, with much attention given to the look of the pages. The German site has been more concerned about faithfulness to sources, and it was that project that first introduced the Proofread Page technology, in 2008, which allows much more control over the uploading of text and images of a range of file-types. At the same time, the German community banned what had become the mainstay of Wikisource uploads on all language versions: what is colloquially known as "dumping". The English site still allows dumping, but encourages the use of the new technology. Interestingly, he says, this occurred at around the same time that the main Wikipedias started insisting on the proper verification of claims in articles.

A significant challenge nowadays, says Vandenberg, is textual criticism—adding annotations to a text—which needs developer input to integrate it into the wiki system. "There's a good application called TEI (Text Encoding Initiative) for academics that allows contributors to add a semantic layer on top of raw text; but it needs to be made compatible so that it maintains the features of a wiki and at the same time doesn't become too complicated for new users."

Having met the major milestone of a ten-year anniversary, Wikisource editors have been commemorating it with a proofreading contest; this includes prizes for the winners funded by the UK Wikimedia chapter. Over this long period of time, lessons have been learned, and there have been major accomplishments—but what does this achievement mean to the editors who work there, and where will they go from here?

AdamBMorgan points to the Dictionary of National Biography and Popular Science Monthly transcriptions as major victories for Wikisource, but believes that the site must "de-mystify" itself to the general public. Inductiveload added the 1911 Encyclopædia Britannica as a major achievement, though that gigantic reference work is also not fully transcribed yet. Acélan, from the French Wikisource, noted that all 16,000 pages of the famous Encyclopédie are completely transcribed there, only needing to be validated.

The future of Wikisource appears bright. Tpt sees the coming introduction of the VisualEditor as a potential point of success for the small project, noting that it will "make very easy for anyone to proofread" and facilitate the introduction of an export tool with "the adoption of a powerful metadata management system based on technologies built for Wikidata". Zyephyrus put it more succinctly: for him, the future hinges on whether the project will complete its mission of "the complete library accessible to all humans on Earth."

In addition, the new Wikisource Community User Group was recently approved by the Affiliations Committee. The group plans "to support the Wikisource community in international communication tasks, outreach to external groups, coordination of software tools development, and facilitate fundraising according to its member needs", but what do regular users of the site think? Remarking on one of Wikisource's largest stumbling blocks, Viewer2 wonders if in trying to help and "inject some kind of sanity into the copyright strait-jacket", the organization "might just [be] occupied forever". John Carter hopes that it can help publicize the little-known site; if new editors come in bringing transcriptions of, for example, local and regional histories, that could be just the niche that Wikisource can fill and thrive in.

The site's contributors are upbeat, too: Maury, who is retired in real life, told us that it was a question of doing good for others, not just yourself. "Why carry knowledge to the grave when it, like real life itself, can be applied to building to better the world?" And has the site reached its full potential? As Carter stated, "The scope of this site is, really, only limited to the scope of the printed word and other historic works."

FDC recommendations raise questions about clarity of metrics, rationale

The FDC's third six-monthly round of annual grants: what the applicants asked for (blue) and what they are likely to get (red), both calibrated on the left vertical axis; the percentage of their bid that the FDC will recommend (transparent bars) is calibrated on the right axis.

The Wikimedia Foundation's volunteer Funds Dissemination Committee has published its recommendations to the Board of Trustees on 11 new applications for annual grants by 11 WMF-affiliated organisations. The announcement comes after the FDC-related staff revealed their assessments and comments on the applications last month. The maximum total budget for the current and upcoming March rounds is US$6M. In this round $4.4M has been recommended, leaving a maximum of $1.6M for the second and final round in 2013–14. The FDC reports that a total of $1.4M is likely to be requested in March.

Most returning applicants received significant increases over last year's allocations, despite the FDC's concerns about rapid growth in budgets and staffing, underspending, and planning. In particular, the staff ratings in this round were sharply reduced compared with those a year ago for four returning chapters—the UK, Germany, Switzerland, and Israel—the first three of which are large European entities. There has been debate about value for money in the traditional chapter model, with warnings by the Foundation's executive director, Sue Gardner, that the FDC is "disproportionately chapters-centric", and her questioning of the cost–benefits of "setting up bricks-and-mortar institutions ... alongside sometimes difficult dynamics between staff and community".

Evolving context

The current round is occurring in a changing environment for funding. This is throwing up challenges for a multilayered, intricate system that is little more than a year old and is likely to factor into how the FDC, and WMF grantmaking more broadly, evolve over the next few years. Affiliated organisations are now returning for a second annual grant, which was always going to bring into serious play what is known as the "guardrails" guideline. Spelled out in the FDC's framework, this specifies that from year to year an applicant's funding should be within the range of 80–120% of their previous year's funding; this is for the sake of stability in both affiliate organisations' finances and FDC outlays. In the FDC's first year, the guideline was loosely based on the amount of WMF funding applicants had received in the previous year through other means. Likewise, this year the benchmarks for Serbia and India, newcomers to the FDC process, were established on the basis of non-FDC Foundation grants for the 2012–13 financial year.

At a high-profile WMF Metrics Meeting just before the deadline for applications, FDC support staff raised concerns that most of the bids for the current round were well over the maximum 20% increase allowed under the guardrails guideline; only the Netherlands' bid was within the allowed increase, at a full 20%. Our reporting of these figures prompted one chapter to email complaints to the Signpost's editor-in-chief that the cited increase in their application bid was distorted by fluctuations in the US dollar exchange rate; we understand that these complaints were taken up with FDC staff.

A turbulent year for some chapters has also called into question how accountable FDC funding should be in relation to standards of governance and transparency. There have been further conflict-of-interest issues for the WMUK board, despite the joint WMF–WMUK inquiry into governance in the chapter last year in the wake of Gibraltargate. There appear to be electoral irregularities and conflict-of-interest problems concerning the board of the Indian chapter. And the management of the German chapter received a scathing report by the chapter's auditors concerning financial procedure and a lack of detail in the annual plan.

Complications: the guardrails, exchange rates, and underspending

The Signpost faced difficulty in comparing how the chapters had been treated in relation to each other, to last year's funding, and to the FDC's written assessments. It appears that the figures are complicated by two factors. The first is the exchange-rate issue. The FDC's statement about this is unclear—that recommended funding is now "in requested currencies; the amount in US dollars is for comparative purposes only (using recalculated conversion rates from 1 October 2013)". When we queried what this means, FDC member Anders Wennersten confirmed that local currencies were used in applying the guardrails guideline. The figures supplied to the Signpost—in local currencies—do not include the exchange rates used to arrive at last year's funding as the benchmark, and seem to involve other factors as well.

The second complication is that several applicants significantly underspent their FDC allocation in the 2012–13 financial year, which was the subject of repeated criticism in the assessments (the word "underspend" and its variants appear 10 times in the FDC's recommendations). The FDC's comments about the German chapter (WMDE), for example, are highly critical: "WMDE does not propose any clear solution to the fact that it has a significant carry-over of $675,000 from its 2013 budget. Briefly stating that it plans to allocate this amount to software development in 2014 is insufficient. The amount proposed is equal to the annual budget of several Wikimedia organizations combined and cannot be treated lightly. ... [WMDE] often chooses to rely on a more general and enigmatic overall outcomes assessment, which is somewhat problematic for an organization this size. ... This large requested amount of two million US dollars ["$2.4 million" in the next sentence] does not have a clear rationale."

The Signpost initially assumed that WMDE's funding had been cut by 2.2% from last year's grant in straight US$ terms ($1.75M vs $1.79M). In contrast, the FDC's recommendations cited "an effective increase of 20% over the previous FDC allocation". Information provided to us by the FDC cites a change of −6%. Wennersten told us that "we have an unresolved issue with operating reserves". Last year, for example, Wikimedia Germany underspent FDC funding by US$225,000 (a calculation that had to be teased out of the chapter's total underspend from all sources of $665,000). In practice, Wennersten said, the FDC expects WMDE to finance its 2013–14 activities partly from that $225,000; however, it is still unclear how this was factored into the chapter's allocation this year.
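
As a rough illustration of the straight-dollar comparison only, the sketch below computes the year-on-year change from the two US$ figures quoted above and checks it against the 80–120% guardrails range. The function names are hypothetical, and the calculation deliberately ignores local currencies, exchange rates, and underspend, all of which the FDC says it takes into account; it is not a reconstruction of the FDC's own method.

```python
def percent_change(previous: float, current: float) -> float:
    """Return the change from `previous` to `current` as a percentage."""
    return (current - previous) / previous * 100.0


def within_guardrails(previous: float, current: float,
                      low: float = 0.80, high: float = 1.20) -> bool:
    """Check whether `current` is within 80-120% of `previous`.

    The 0.80 and 1.20 bounds come from the guardrails guideline;
    treating both ends as inclusive is an assumption of this sketch.
    """
    ratio = current / previous
    return low <= ratio <= high


# Straight US$ figures for WMDE as quoted in the article.
wmde_previous = 1.79e6  # last year's grant
wmde_current = 1.75e6   # this round's recommendation

change = percent_change(wmde_previous, wmde_current)
print(f"Change in straight US$ terms: {change:.1f}%")
print("Within guardrails:", within_guardrails(wmde_previous, wmde_current))
```

On these inputs the change rounds to −2.2%, which is the figure we initially assumed; the FDC's own "effective increase of 20%" and −6% figures evidently rest on inputs not captured here.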

We put it to FDC chair Dariusz Jemielniak that the Committee had been staunchly critical of WMDE and that this did not seem to match the funding allocation to the chapter. His response was twofold:

The underspend situation is yet more complex, according to FDC member Sydney Poore, who told the Signpost that:

Given the multiple factors involved, we are unable at this stage to provide a graph showing how each applicant's funding related to the 80–120% guideline.

In brief

Wikipedia:Wikipedia Signpost/2013-12-04/News_and_notes