The Signpost
Single-page Edition
WP:POST/1
28 May 2014

News and notes
The English Wikipedia's second featured-article centurion; wiki inventor interviewed on video
Featured content
Zombie fight in the saloon
Traffic report
Get fitted for flipflops and floppy hats
Recent research
Predicting which article you will edit next
 

2014-05-28

The English Wikipedia's second featured-article centurion; wiki inventor interviewed on video

Banksia integrifolia, Casliber's first featured article credit.
Diplodocus was his first solo featured article
His ninth featured article was the top-importance lion.
Grus (constellation) (upper left) marks Casliber's 100th featured article. Most of the other constellations visible in this image were also brought to featured status by him.

With the promotion to featured article (FA) of Grus (constellation) on 17 May, Casliber became Wikipedia's second featured-article centurion, following Wehwalt's groundbreaking achievement last December. Cas's first FA, Banksia integrifolia, a group effort, was promoted on 16 November 2006. His first solo project, Diplodocus, followed in January 2007; he has rarely been off the FAC page since. Quite apart from his regular and meticulous content work, Cas has contributed to many other aspects of the Wikipedia project – I'm always seeing his name, contributing, helping, leading. I caught up with him recently, and he graciously agreed to answer a few of my questions.


First, many congratulations on achieving the rare feat of 100 featured articles, an awesome accomplishment. Without wishing to breach your anonymity, can you reveal a little bit about yourself?

I am a psychiatrist from Sydney and have always been interested in birdwatching and native (Australian) plants. Sadly I have a brown thumb.

When did you begin editing WP, and what brought you here in the first place?

I first started reading it in around 2005, and began editing in May 2006 – mainly to improve my trivia knowledge, as I had been in a few trivia competitions and gameshows and won prizes. The first fact that I learned this way that I then got asked about was the name of the third book of the Old Testament. As an atheist, I'd never known this before. Sadly, on TV I was beaten to the buzzer, but was still chuffed that I had only learnt it the night before. I went on The Einstein Factor, where a contestant had to pick an esoteric subject to be quizzed on. I was on three times – first time I chose horned dinosaurs, which is why a lot of my early edits were on these. Second time I went on I chose poisonous mushrooms.

Yes, I've been studying your featured article log; a fascinating medley: the predominant subjects are flora, fauna and (more recently) constellations, but occasionally, oddities turn up – a dinosaur, a novel (The Historian), a medical article. You clearly have a wide panorama of interests; do you have any specific method for choosing your subjects, or, like me, do you tend to follow your instincts?

Enthusiasm plays a huge part; one really needs the drive and interest with any particular topic to "take it all the way". I loved The Historian and enjoyed working on it, but its main driver was Wadewitz (talk · contribs), and I was a happy sidecar-rider really. I have written a lot on plants and animals local to my area, and have been meaning to do some Australian towns and cities (as well as football teams) but have never gotten round to it. Medical articles are important, but they are...well, they are a bit like work really. I encourage everyone to do a big, broad article that can be a real Odyssey – some of my most enjoyable (and proud) moments are watching articles like vampire, lion and white stork grow and become something really grand to read. Betelgeuse was a surprise here too: I was buffing it when along came Sadalsuud (talk · contribs), who was a real juggernaut in finding and adding material.

I see you mentioned Wadewitz there, sadly no longer with us – I, too, enjoyed working with her in my early WP years. We never formally co-produced an article, though we talked of it from time to time. Are there other editors who particularly helped you in your early days, that you'd like to acknowledge now?

Hesperian (talk · contribs) and Gnangarra (talk · contribs) have been great with banksia editing, and were the first folks to welcome me. Along the way, Jimfbleak (talk · contribs) has been there for lots of bird collaboration, and Sasata (talk · contribs), who is always a thorough and eternally good-natured reviewer, for fungi collaboration. There have been loads of friendly folks at the dinosaur, birds, fungi and astronomy wikiprojects (too many to mention but all members can consider themselves acknowledged here!).

I started a year or so after you, and I think FAC has changed a lot since then. I think standards have risen considerably – it takes me far longer than it used to to put an FA together. What differences in the FAC procedure have you experienced? Do you agree that FA standards have risen?

I joined just as inline cites were really becoming obligatory (which I think was a very good thing). FAC has become more rigorous overall; I recall there being many more simple supports, with no extended comments, which makes one wonder how detailed a look-over was done. I do worry about low numbers of reviewers, and miss some of the thorough goings-over that I think are essential to keeping standards high.

Now, writing featured content is only a part of your overall WP contribution. You are an active reviewer, an admin, you've served on arbcom and, of course, you are the prime mover behind the annual core contest (which I've been happy to judge from time to time). Which of these various roles have you found most rewarding?

Writing content really – it's definitely the most relaxing to do in downtime. Reading really good prose is also enjoyable. Nutting out conclusions to things can be good as well – I tried to do that on arbcom: punt content discussion back to the community, such as hopefully playing a role in giving a shove for West Bank naming and Abortion advocacy movement coverage to reach a conclusion.

How do you see your contribution to WP over the next few years? More generally, how do you see Wikipedia developing? Are there any basic changes that you would like to see implemented?

I see Wikipedia at a crossroads. The novelty of being newfangled is wearing off, as evidenced by the dropoff in new editors (I think the increasing rigour of editing rules is partly responsible for this, but this is essential in the evolution of the 'pedia). I have tried with the Core Contest to kick-start improvement of broad subject articles, but their sheer size and breadth, and greater likelihood of conflicting opinions, make getting these to featured standard exponentially harder. I saw the core contest as a means of improving our "core portfolio" of articles. Also, I wonder whether we should worry more about article maintenance, and perhaps use semi-protection more liberally? Generally, we need more reviewers, but I am unsure how best we do this. I feel that carrots are better than sticks, which is why I have been coming up with ideas for contests – I think they've been good for engagement and collaboration too. And folks should keep an eye on the Stub contest to improve stubby bits – fluffy stubs with a few too many words to be easily expanded for DYK, but which really need some tidying ...

Well, I see that you have plenty of ideas, and it is refreshing to see someone who has kept their enthusiasm and is still thinking ahead. It has been a pleasure to talk to you, Cas, and I'm sure all your fellow editors join me in hoping that we'll see and hear plenty more from you in the future.


List of Casliber's featured articles

100. Grus (constellation)

Inventor of the wiki turns 65

Ward Cunningham's birthday interview by the WMF

Howard G. ("Ward") Cunningham, who turned 65 last week, has special distinction in these realms as the developer of the first wiki. An American computer programmer, his profound innovation was first installed on the Internet in March 1995. Cunningham remains a dynamic professional force: after a career in the corporate sector, since 2011 he has been "Co-Creation Czar" for CitizenGlobal, an innovative video and photo crowdsourcing platform that enables organizations to easily collect and analyze eyewitness media and data. He is also Nike's first Code for a Better World Fellow.

One of Cunningham's memorable quips is: "the best way to get the right answer on the Internet is not to ask a question, it's to post the wrong answer," which has come to be known as Cunningham's law. Its author is reported to have said: "Wikipedia may be the most well-known demonstration of this law."

As part of his birthday celebrations, WMF's Victor Grigas published an interview with Cunningham originally recorded in 2011. The following quotes are drawn from the significant statements he makes in the video:


In brief

Tim Moritz Hector, new chair of Wikimedia Germany's board – overseeing more than 50 staff and an annual budget in the region of $10 million.

2014-05-28

Zombie fight in the saloon

Wikipedia editor Sven Manguard's work is quite underappreciated a lot of the time, most likely because people haven't heard of it yet: He's developed good relationships with game companies, and is thus able to get full-resolution screenshots released under a Creative Commons license for use on Wikipedia and elsewhere. This screenshot comes from the game Charlie Murder, a beat 'em up game where the members of the titular band deal with supernatural forces summoned by a vengeful ex-bandmember.
This Signpost "Featured content" report covers material promoted from 18 May through 24 May. Quoted material is from the relevant lists and articles; see the lists and articles for attribution.

Seven featured articles were promoted this week.

Marquee Moon is a new featured article. The critically acclaimed album "has an urban nocturne theme and lyrics with references to lower Manhattan" (East Village pictured).
Most tracks of the award-winning Beatles album Sgt. Pepper's Lonely Hearts Club Band were recorded at Abbey Road Studio Two.
A new graduate of No. 1 Flying Training School RAAF and his wife celebrate with a kiss
  • WINC (AM) (nominated by Neutralhomer) "WINC (1400 AM) is a broadcast radio station licensed to Winchester, Virginia, United States. The station carries a news, talk, and sports format. WINC serves Winchester along with Frederick and Clarke Counties in Virginia. Launched on June 26, 1941, by Richard Field Lewis, Jr., WINC was Winchester's first radio station. It remained in the hands of the Lewis family until sold to North Carolina-based Centennial Broadcasting in 2007. The station's current format, established in 1996, consists mostly of conservative talk programs and top-of-the-hour news from Fox News Radio. Sports programming from Virginia Tech is also broadcast. Prior formats heard on WINC include middle of the road music, adult contemporary, and classic hits."
  • Marquee Moon (nominated by Dan56) "Marquee Moon is the debut album by American rock band Television. It was released in February 1977 by Elektra Records. By 1974, the band had become a prominent act on the New York music scene and generated interest from a number of record labels. They rehearsed extensively in preparation for the album and, upon signing to Elektra, recorded most of the songs in single takes. Television's frontman Tom Verlaine produced the album with recording engineer Andy Johns at A & R Recording in September 1976. For Marquee Moon, Verlaine and fellow guitarist Richard Lloyd eschewed contemporary punk rock's power chords in favor of rock and jazz-inspired interplay, melodic lines, and counter-melodies. Verlaine's lyrics for the album combined urban and pastoral imagery, references to lower Manhattan, themes of adolescence, and influences from French poetry. He also used puns and double-entendres to give his songs an impressionistic quality." The album "has... been viewed by critics as one of the greatest albums of the American punk rock movement and a cornerstone of alternative rock. The band's innovative post-punk instrumentation on the album strongly influenced the indie rock and new wave movements of the 1980s, as well as rock guitarists such as John Frusciante, Will Sergeant, and The Edge."
  • Sgt. Pepper's Lonely Hearts Club Band (nominated by GabeMc) "Sgt. Pepper's Lonely Hearts Club Band is the eighth studio album by the English rock band the Beatles. Released on 1 June 1967, it was an immediate commercial and critical success, spending 22 weeks at the top of the albums chart in the UK and 15 weeks at number one in the US. Time magazine declared it "a historic departure in the progress of music" and the New Statesman praised its elevation of pop to the level of fine art. It won four Grammy Awards in 1968, including Album of the Year, the first rock LP to receive this honour."
  • Thorpe affair (nominated by Brianboulton) "The Thorpe affair of the 1970s was a British political and sex scandal that ended the career of Jeremy Thorpe, the leader of the Liberal Party and Member of Parliament (MP) for North Devon. The scandal arose from allegations by Norman Josiffe (otherwise known as Norman Scott), that he and Thorpe had shared a homosexual relationship in the early 1960s, at a time when such relationships were illegal in the United Kingdom."
  • Audie Murphy (nominated by Maile66) "Audie Leon Murphy (20 June 1925 – 28 May 1971) was one of the most decorated American combat soldiers of World War II, receiving every military combat award for valor available from the U.S. Army, as well as French and Belgian awards for heroism. The 19-year-old Murphy received the Medal of Honor after single-handedly holding off an entire company of Germans for an hour at the Colmar Pocket in France in January 1945, then leading a successful counterattack while wounded and out of ammunition."
  • Orel Hershiser's scoreless innings streak (nominated by TonyTheTiger) "During the 1988 Major League Baseball (MLB) regular season, pitcher Orel Hershiser of the Los Angeles Dodgers set the MLB record for consecutive scoreless innings pitched. Hershiser pitched 59 consecutive innings in which opposing hitters did not score a run against him. During the streak, he averted numerous high risk scoring situations... The streak spanned from the sixth inning of an August 30, 1988 game against the Montreal Expos to the tenth inning of the September 28, 1988 game against the San Diego Padres. The previous record of 58 innings was set by former Dodger pitcher Don Drysdale in 1968; as the team's radio announcer, Drysdale called games as Hershiser pursued his record. Commentators have described this streak as among the greatest individual streaks in sports and among the greatest records in baseball history."
  • No. 1 Flying Training School RAAF (nominated by Ian Rose) "No. 1 Flying Training School (No. 1 FTS) was a school of the Royal Australian Air Force (RAAF). It was one of the Air Force's original units, dating back to the service's formation in 1921, when it was established at RAAF Point Cook, Victoria. By the early 1930s, the school comprised training, fighter, and seaplane components. It was re-formed several times in the ensuing years, initially as No. 1 Service Flying Training School (No. 1 SFTS) in 1940, under the wartime Empire Air Training Scheme. After graduating nearly 3,000 pilots, No. 1 SFTS was disbanded in late 1944, when there was no further requirement to train Australian aircrew for service in Europe."

For other featured article news, please see the accompanying Signpost report on Casliber becoming Wikipedia's second featured article centurion.

Lower Cache River Swamp is a National Natural Landmark in Illinois, United States

Three featured lists were promoted this week.

  • Axis order of battle for the invasion of Yugoslavia (nominated by Peacemaker67) "The Axis order of battle for the invasion of Yugoslavia includes a listing (or order of battle) of all operational formations of the German Wehrmacht and Waffen-SS, Italian Armed Forces and Hungarian Armed Forces that were involved in the World War II invasion of Yugoslavia which commenced on 6 April 1941. It involved the German 2nd Army, with elements of the 12th Army and a panzer group combined with overwhelming Luftwaffe (German Air Force) support. The eighteen German divisions included five panzer divisions, two motorised infantry divisions and two mountain divisions. The German force also included two well-equipped independent motorised regiments and was supported by over 800 aircraft. The Italian 2nd Army and 9th Army committed a total of twenty-two divisions, and the Royal Italian Air Force (Italian: Regia Aeronautica) had over 650 aircraft available to support the invasion. The Hungarian 3rd Army also participated, with support from the Royal Hungarian Air Force (Hungarian: Magyar Királyi Honvéd Légierő, MKHL)."
  • List of National Natural Landmarks in Illinois (nominated by Dana boomer) "The National Natural Landmarks (NNLs) in Illinois include 18 of the almost 600 such landmarks in the United States. They cover areas of geological, biological and historical importance, and include lakes, bogs, canyons and forests. Several of the sites provide habitat for rare or endangered plant and animal species. The landmarks are located in 13 of the state's 102 counties. Five counties each contain all or part of two or more NNLs, while one landmark is split between two counties. The first designation, Forest of the Wabash, was made in 1965, while the most recent designation, Markham Prairie, was made in 1987. Natural Landmarks in Illinois range from 53 to 6,500 acres (21.4 to 2,630.5 ha; 0.1 to 10.2 sq mi) in size. Owners include private individuals and several county, state and federal agencies."
  • List of SpongeBob SquarePants guest stars (nominated by Mediran) "In addition to the show's regular cast of voice actors, guest stars have been featured on SpongeBob SquarePants, an American animated television series created by marine biologist and animator Stephen Hillenburg for Nickelodeon. SpongeBob SquarePants chronicles the adventures and endeavors of the title character and his various friends in the fictional underwater city of Bikini Bottom. Many of the ideas for the show originated in an unpublished, educational comic book titled The Intertidal Zone, which Hillenburg created in the mid-1980s. He began developing SpongeBob SquarePants into a television series in 1996 upon the cancellation of Rocko's Modern Life, which Hillenburg directed."

Four featured pictures were promoted this week.

A Refunding Certificate, an early attempt by the United States Treasury to inspire confidence in paper money by having the notes gain interest over time, increasing in value.
  • Charlie Murder screenshot (created by Ska Studios, nominated by Sven Manguard) "Charlie Murder is an action role-playing beat 'em up video game developed by Ska Studios and published by Microsoft Game Studios. First revealed in January 2010 as an Xbox Live Indie Games title, the studio announced in May 2010 that the game would undergo a "complete overhaul" and be published in 2012 through Xbox Live Arcade for the Xbox 360. Charlie Murder was eventually released on 14 August 2013 to positive reviews, with critics praising the game's soundtrack and hand-illustrated visuals."
  • The Last of the Mohicans (created by Frank T. Merrill and restored by Crisco 1492, nominated by Crisco 1492) "The Last of the Mohicans: A Narrative of 1757 (1826) is an historical novel by James Fenimore Cooper. It is the second book of the Leatherstocking Tales pentalogy... The Last of the Mohicans is set in 1757, during the French and Indian War (the Seven Years' War), when France and Great Britain battled for control of North America. During this war, the French depended on its Native American allies to help fight the more numerous British colonists in the Northeast frontier areas."
  • Memorial tower of the Netherlands American Cemetery (created by Godot13, nominated by Godot13) "The World War II Netherlands American Cemetery and Memorial is a war cemetery which lies in the village of Margraten six miles (10 km) east of Maastricht, in the most southern part of the Netherlands. It is administered by the American Battle Monuments Commission... The walls on either side of the Court of Honor contain the Tablets of the Missing on which are recorded the names of 1,722 American missing who gave their lives in the service of their country and who rest in unknown graves. Beyond the chapel and tower is the burial area which is divided into sixteen plots. Here rest 8,301 American dead, most of whom lost their lives nearby. Their headstones are set in long curves. A wide tree-lined mall leads to the flag staff which crowns the crest."
  • Refunding Certificate (created by The Bureau of Engraving and Printing and the Smithsonian National Museum of American History, nominated by Godot13) "The Refunding Certificate, issued only in the $10 denomination depicting Benjamin Franklin, was a type of interest-bearing banknote issued by the United States Treasury. Their issuance reflects the end of a coin-hoarding period which began during the American Civil War, and represented a return to public confidence in paper money. In 1879, when the bonds were issued, silver coins were in wide circulation and coins minted in gold were just beginning to make their appearances at banks nationwide. Notes totaling $40,012,750 were paid out, including the majority, some $39,398,110 in the fourth quarter of 1879, as long lines of people gathered at Post Office branches and Treasury offices. The Refunding Certificate originally promised to pay 4% annual interest in perpetuity."



2014-05-28

Get fitted for flipflops and floppy hats

In the US, Memorial Day marks the unofficial beginning of summer, and summer is definitely on people's minds this week, with summer films Godzilla and X-Men: Days of Future Past, the apparently designated summer song "Fancy" by Iggy Azalea, and summer TV show, Game of Thrones. The Indian general election is only fading slowly, understandably as its effects have yet to be fully felt.

For the full top 25 list, see WP:TOP25. See this section for an explanation for any exclusions.


For the week of May 18 to 24, the 10 most popular articles on Wikipedia, as determined from the report of the 5,000 most viewed pages, were:

Rank Article Class Views Image Notes
1 Mary Anning Featured Article 1,750,729
She sells seashells by the seashore. She was Mary Anning, who not only found and sold fossil seashells but also identified the first ichthyosaur (at age 12!) and the first plesiosaur. A victim of the gender and class prejudice of her time, she didn't get the recognition she deserved until after her death; an oversight Wikipedia viewers have gone some way to correcting thanks to a birthday Google Doodle on 21 May.
2 Godzilla (2014 film) B-Class 797,814
It seems that Hollywood's trust in Gareth Edwards, director of the microbudget scifi flick Monsters, was well placed, as his take on the Godzilla mythos has emulated its eponymous hero, stomping the box office to dust with $93 million in three days. Critics seem to like the movie too; its RT rating is currently 73%. Personally, I had issues with it, but then, what do I know?
3 Rubik's Cube B-Class 766,499
Nothing is more likely to generate Wikipedia views than an interactive Google Doodle, and to celebrate the 40th anniversary of the ingenious puzzle, Google effectively rendered it irrelevant by constructing a fully solvable virtual version and releasing it online for everyone to try.
4 Steve Wozniak B-Class 733,241
The co-founder of Apple Computer got a massive one-day spike on 18 May, the same day he published an open letter to the FCC demanding they retain net neutrality in the US. I'm usually suspicious of 1-day spikes with no tail-off, but this instance is at least explicable.
5 Amazon.com B-Class 656,462
This article suddenly reappeared in the top 25 after a long absence, but at least it has a reason: Amazon Fire TV, a digital streaming device for watching online content on an HDTV. How it distinguishes itself from the three or four other such devices currently on the market is a matter of some dispute.
6 Narendra Modi B-Class 544,033
Thanks to an effective ad campaign and a sound economic record as Chief Minister of the state of Gujarat, Modi led his Hindu nationalist BJP to victory with a stomping 282 seats (52%). A member of the RSS, Modi is considered a controversial politician, and debate still surrounds the extent of his role in the 2002 Gujarat riots during his tenure as Chief Minister. The Indian National Congress, the party that has mostly led India since its independence, came in second with 44 seats, its worst showing in any election in India's history.
7 Game of Thrones B-Class 507,708
New seasons of this immensely popular show always draw people to Wikipedia.
8 Memorial Day C-Class 456,537
The last Monday in May (that's May 26 this year), the day that the United States chose to honour its war dead, is perhaps better known as the traditional beginning of US summer vacation, and is thus eagerly anticipated by millions of people too young to serve but old enough to stand in line for action movies.
9 Game of Thrones (season 4) C-Class 455,733
This is the page with the plot synopses for each episode.
10 Iggy Azalea C-Class 432,512
The Australian/American rapper released her debut album, The New Classic, on 21 April, but probably re-entered the top list due to an earpiece malfunction during a performance of her single "Fancy" on Dancing With The Stars.



2014-05-28

Overview of research on Wikipedia's readers; predicting which article you will edit next

A monthly overview of recent academic research about Wikipedia and other Wikimedia projects, also published as the Wikimedia Research Newsletter.

"Wikipedia in the eyes of its beholders: A systematic review of scholarly research on Wikipedia readers and readership"

This paper[1] is another major literature review of the field of Wikipedia studies, brought forward by the authors whose prior work on this topic, titled "The People’s Encyclopedia Under the Gaze of the Sages"[supp 1] was reviewed in this research report in 2012 ("A systematic review of the Wikipedia literature").

This time the authors focus on a fragment of the larger body of works about Wikipedia, analyzing 99 works published up to June 2011 on the theme of "Wikipedia readership" – in other words, "What do we know about people who read Wikipedia". The overview focuses less on demographic analysis (since little research has been done in that area), and more on perceptions of Wikipedia by surveyed groups of readers. Their findings include, among other things, a conclusion that "Studies have found that articles generally related to entertainment and sexuality top the list, covering over 40% of visits", and that, among more serious topics, Wikipedia is a common source for health and legal information. They also find that "a very large number of academic in fact have quite positive, if nuanced, perceptions of Wikipedia’s value", and observe that the most commonly studied group has been that of students, who offer a convenience sample. The authors finish by identifying a number of contradictory findings and topics in need of further research, and conclude that existing studies have likely overestimated the extent to which Wikipedia's readers are cautious about the site's credibility. Finally, the authors offer valuable thoughts in the "implications for the Wikipedia community" section, such as suggesting "incorporating one or more of the algorithms for computational estimation of the reliability of Wikipedia articles that have been developed to help address credibility concerns", similar to the WikiTrust tool.

The authors also published a similar literature review paper summarizing research about the content of Wikipedia, which we hope to cover in the next issue of this research report.

Chinese-language time-zones favor Asian pop and IT topics on Wikipedia

Map of the Chinese-speaking world

A paper[2] presented at the WWW 2014 Companion Conference analyzes the readership patterns of the English and Chinese Wikipedias, with a focus on which types of articles are most popular in the English- or Chinese-language time zones. The authors used all Wikipedia pages which existed under the same name in both languages in the period from 1 June 2012 to 14 October 2012 for their study, coding them through the OpenCalais semantic analysis service with an estimated 2.6% error rate.

The authors find that readers of the English and Chinese Wikipedias from time zones of high Chinese activity browse different categories of pages. Chinese readers visit English Wikipedia articles about Asian culture (in particular, Japanese and Korean pop culture) more often, as well as articles about mobile communications and networking technologies. The authors also find that pages in English are almost ten times as popular as those in Chinese (though their results do not identify users by nationality directly, relying instead on time zone analysis).

In this reviewer's opinion, the study suffers from methodological problems serious enough to cast all the findings in doubt. Apparently because the authors were unaware of interlanguage links, they considered only articles that have the same name (URL) in both the English and Chinese Wikipedias, and so found only 7,603 pages eligible for analysis (as having both an English and a Chinese version). However, the Chinese Wikipedia had approximately half a million articles in the studied period; and while many do not have English equivalents yet, to expect that fewer than 2% did seems rather dubious. Similarly, our own WikiProject China estimates that English Wikipedia has almost 50,000 China-related articles. Given that WikiProject assessments often underestimate the number of relevant topics, and usually do not cover many core topics, this suggests that the study missed the vast majority of articles that exist in both languages. It is further unclear how English- and Chinese-language time zones were operationalized. The authors do not reveal how, if at all, they controlled for the fact that readers of English Wikipedia can also come from countries where English is not a native language, and that there are hundreds of millions of people outside China who live in the five time zones that span China, which overlap with India, half of Russia, Korea and major parts of Southeast Asia. As such, the findings of that study can be more broadly interpreted as "readership patterns of English and Chinese Wikipedia in Asia and the world, regarding a small subset of pages that exist on both English and Chinese Wikipedia."

"Bipartite editing prediction in Wikipedia"

Reviewed by Maximilianklein (talk)

Bipartite Editing Prediction in Wikipedia[3] is a paper wherein the authors aim to solve what they call the "link prediction problem". Essentially they aim to answer "which editors will edit which articles in the future." They claim the social utility of this is to suggest to users articles to edit. So in some ways this is a similar function to SuggestBot, but using different techniques.

Their approach here is to use bipartite network modelling. A bipartite network is a network with two node types, here editors and articles. Bipartite network modelling is becoming increasingly trendy, as in Jesus (2009)[supp 2] and Klein (2014).[supp 3]

Explaining their method, the researchers outline their two approaches: "supervised learning" and "community awareness". In the supervised learning approach the machine learning features used are Association Rule, K-nearest neighbor, and graph partitions. All these features, they state, can be inferred directly from the bipartite network. In the community awareness approach, the Stanford Network Analysis Project tool is used to cut the network into co-editor sets; the authors then go on to inspect what they call indirect features, which are sum of neighbors, Jaccard coefficient, preferential attachment, and Adamic–Adar score.
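To make the indirect features concrete, here is a minimal sketch of how such scores could be computed for one candidate editor–article pair. It is not the paper's implementation: the toy edit history, the function names, and the choice to measure neighbourhoods on the editor side (co-editors) are all assumptions made for illustration, and the textbook "common neighbours" count stands in for the paper's "sum of neighbors", whose exact definition is not given here.

```python
import math

# Hypothetical toy edit history: editor -> set of articles edited.
EDITS = {
    "alice": {"Lion", "Vampire", "Diplodocus"},
    "bob": {"Lion", "Diplodocus"},
    "carol": {"Vampire", "Betelgeuse"},
}


def articles_to_editors(edits):
    """Invert the editor -> articles map into article -> editors."""
    inverse = {}
    for editor, articles in edits.items():
        for article in articles:
            inverse.setdefault(article, set()).add(editor)
    return inverse


def pair_features(edits, editor, article):
    """Score a candidate (editor, article) link with standard indirect features.

    Assumption: both neighbourhoods are sets of editors. The editor's
    neighbourhood is everyone who co-edited at least one article with them;
    the article's neighbourhood is everyone else who has edited it.
    """
    by_article = articles_to_editors(edits)

    co_editors = set()
    for edited in edits[editor]:
        co_editors |= by_article[edited]
    co_editors.discard(editor)

    article_editors = by_article.get(article, set()) - {editor}
    common = co_editors & article_editors
    union = co_editors | article_editors

    return {
        "common_neighbors": len(common),
        "jaccard": len(common) / len(union) if union else 0.0,
        "preferential_attachment": len(co_editors) * len(article_editors),
        # Adamic-Adar down-weights prolific shared neighbours; here a
        # neighbour's degree is the number of articles they have edited.
        "adamic_adar": sum(
            1.0 / math.log(len(edits[z])) for z in common if len(edits[z]) > 1
        ),
    }
```

Calling `pair_features(EDITS, "bob", "Vampire")` scores whether bob might edit Vampire next; a classifier would then be trained on such feature vectors for pairs whose outcome is already known.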

The authors proceed to give a table of their results and highlight their best precision and recall statistics, which are moderate and fall in the interval [.6, .8]. Thereafter, a short, non-interpretive one-paragraph discussion concludes the paper by saying that these results might be useful. Unfortunately they are not of much use: while the authors declare their sample size of 460,000 editor–article pairs from a category in a Wikipedia dump, they don't specify which category, or even which Wikipedia they are working on.
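For readers unfamiliar with how precision and recall combine into a single F-measure, the standard formula is sketched below. The input values are hypothetical points chosen from the interval mentioned above, not figures taken from the paper's table.

```python
def f_measure(precision: float, recall: float, beta: float = 1.0) -> float:
    """Weighted harmonic mean of precision and recall (F1 when beta == 1)."""
    if precision == 0 and recall == 0:
        return 0.0
    b2 = beta * beta
    return (1 + b2) * precision * recall / (b2 * precision + recall)


# Hypothetical values from the ends of the [.6, .8] interval.
print(round(f_measure(0.6, 0.8), 3))
print(round(f_measure(0.8, 0.8), 3))
```

Because F1 is a harmonic mean, it always lies between the precision and the recall, so an F-measure near 80% requires both statistics to sit near the top of that interval.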

This machine learning paper lacks sufficient context or interpretation to be immediately valuable, even though the authors may be able to predict with close to 80% F-measure which article you might edit next. The paper is therefore a good example of the extent to which Wikipedia is used for research without even a feigned attempt to make the research useful to the Wikipedia community, or to frame it in that way.

Briefly

A reading room in the University of Pittsburgh's Hillman Library
  • "Increasing the discoverability of digital collections using Wikipedia: the Pitt experience": In this paper,[4] a librarian at the University of Pittsburgh discusses how two undergraduate interns added over 100 links to library collections to Wikipedia articles, which led to increased use of the library's digitized collections. An experienced Wikipedian, Sage Ross, provided help with this project. The two undergraduates expanded or created approximately 100 articles, mainly related to the History of Pittsburgh (such as Pittsburgh Courier or Pittsburgh Playhouse), using resources hosted by the university's libraries as sources or external links. The paper also provides a valuable overview of similar initiatives in the past (some of which have also been covered in this research report; see e.g. "Using Wikipedia to drive traffic to library collections"). The majority of the reviewed examples suggest that linking to library resources from Wikipedia pages increases their visibility, and this study reached the same conclusion with regard to the authors' own project, which led both to the improvement of Wikipedia content and to more traffic to the digital resources hosted by the library. This reviewer applauds this project as a model one, though it would benefit from a list of all articles edited by the students (which were not tagged on their talk pages with any expected template, such as {{educational assignment}}).
  • Korean survey on "Key Factors for Success" of Wikipedia and Q&A site: This paper[5][predatory publisher] compares aspects of Wikipedia and the "Naver Knowledge" service of South Korea's Naver (see Knowledge Search), which is similar to Google Questions and Answers. This is a topic of some interest, as South Korea is praised for being one of the most Internet-integrated societies in the world, while at the same time the Korean Wikipedia, currently the 23rd largest, is less developed than those of a number of smaller countries less commonly seen as Internet powers (consider List of Wikipedias by size). The researchers surveyed 132 Korean Internet users of those services, though they do not make it clear whether all members of the sample were in fact registered contributors to both services, instead describing them as "relative active users of the CI [collective intelligence] system". Unfortunately, parts of the paper, including the survey questions, appear to have been translated using machine translation, and are thus difficult to interpret correctly. Overall, the authors find that there were no significant differences with regard to the respondents' views of the Naver Knowledge and Wikipedia services. One of the statistically significant results suggests that Korean contributors to collective intelligence services find the Naver Knowledge service easier to use than Wikipedia, though the differences do not appear to be major (73.5% and 60.9% of Korean contributors found Naver Knowledge and Wikipedia easy to work with, respectively). One of the conclusions of the paper is the importance of making user interfaces as easy as possible, and of making it easier for users to add and edit audiovisual content (though the authors seem unaware of, and do not discuss, the Visual Editor).
  • "Citation filtered": This glossy and infographic-laden report dissects the 963 Persian Wikipedia articles that are blocked in Iran.[6] The technique used was to programmatically iterate over Wikipedia to see which articles could not be loaded. The report categorizes the articles into 10 topics and uses them to explore the Iranian government's sensitivities. From the blog of the Annenberg School of Communication, University of Pennsylvania. (Maximilianklein (talk))
  • "Georeferencing Wikipedia documents using data from social media sources": This paper[7] describes several methods to automatically assign geocoordinates to articles on the English Wikipedia: by matching the article text to hashtags of georeferenced tweets, to tags of georeferenced photos on Flickr, and to the text of other Wikipedia articles that are already georeferenced. The authors report that "using a language model trained using 376K Wikipedia documents, we obtain a median error of 4.17 km, while a model trained using 32M Flickr photos yields a median error of 2.5 km. When combining both models, the median error is further reduced to 2.16 km. Repeating the same experiment with 16M tweets as the only training data results in a median error of 35.81 km". As one possible application, the authors suggest automatic correction of coordinates for Wikipedia articles where their method predicts a differing location with high confidence. Among their test dataset of 21,839 articles with a geocoordinate located in the United Kingdom, the authors found three such errors, one of which was still uncorrected at the time of their preprint publication (an educational institution in Brussels which had been placed in Cornwall due to a sign error in the longitudinal coordinate). Another interesting byproduct is a visual comparison (figure 5) of the density of geolocated entries from Wikipedia, Twitter and Flickr in Africa (per the datasets used).
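The "median error" reported in the georeferencing item above is a distance between predicted and true coordinates. The sketch below shows one plausible way to compute it, using the haversine great-circle formula; the paper's exact distance measure is not stated in this summary, so that choice is an assumption. The Brussels coordinates are approximate city-centre values used only to illustrate how far a flipped longitude sign moves a point, not the coordinates of the institution in question.

```python
import math
from statistics import median

EARTH_RADIUS_KM = 6371.0  # mean Earth radius


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two (lat, lon) points in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def median_error_km(predicted, actual):
    """Median distance between paired predicted and actual (lat, lon) tuples."""
    return median(haversine_km(*p, *a) for p, a in zip(predicted, actual))


# Illustrative only: approximate Brussels city centre, with and without
# the longitude sign flipped.
BRUSSELS = (50.85, 4.35)
flipped = (BRUSSELS[0], -BRUSSELS[1])
sign_error_km = haversine_km(*BRUSSELS, *flipped)
print(f"A flipped longitude sign moves the point about {sign_error_km:.0f} km west")
```

A displacement of several hundred kilometres is far larger than the median errors of a few kilometres quoted above, which is why such mistakes stand out when a model predicts a different location with high confidence.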

Other recent publications

A list of other recent publications that could not be covered in time for this issue – contributions are always welcome for reviewing or summarizing newly published research.

  • "Snuggle: Designing for efficient socialization and ideological critique"[8]
  • "Preferences in Wikipedia abstracts: Empirical findings and implications for automatic entity summarization"[9]
  • "Cluster approach to the efficient use of multimedia resources in information warfare in wikimedia"[10] (from the abstract: "A new approach to uploading files in Wikimedia is proposed with the aim to enhance the impact of multimedia resources used for information warfare in Wikimedia.")
  • "From open-source software to Wikipedia: ‘Backgrounding’ trust by collective monitoring and reputation tracking"[11] (from the abstract: "It is shown that communities of open-source software—continue to—rely mainly on hierarchy (reserving write-access for higher echelons), which substitutes (the need for) trust. Encyclopedic communities, though, largely avoid this solution. In the particular case of Wikipedia, which is confronted with persistent vandalism, another arrangement has been pioneered instead. Trust (i.e. full write-access) is ‘backgrounded’ by means of a permanent mobilization of Wikipedians to monitor incoming edits. ... Finally it is argued that the Wikipedian monitoring of new edits, especially by its heavy reliance on computational tools, raises a number of moral questions that need to be answered urgently.")

References

  1. ^ Okoli, Chitu and Mehdi, Mohamad and Mesgari, Mostafa and Nielsen, Finn Årup and Lanamäki, Arto (2014): Wikipedia in the eyes of its beholders: A systematic review of scholarly research on Wikipedia readers and readership. Journal of the American Society for Information Science and Technology. ISSN 1532-2882 (In Press) PDF
  2. ^ Tinati, Ramine; Paul Gaskell; Thanassis Tiropanis; Olivier Phillipe; Wendy Hall (2014). "Examining Wikipedia across linguistic and temporal borders". Proceedings of the Companion Publication of the 23rd International Conference on World Wide Web Companion. WWW Companion '14. International World Wide Web Conferences Steering Committee. pp. 445–450.
  3. ^ Chang, Yang-Jui; Yu-Chuan Tsai; Hung-Yu Kao (May 2014). "Bipartite editing prediction in Wikipedia". Journal of Information Science and Engineering. 30 (3): 587–603.
  4. ^ Galloway, Ed; Cassandra DellaCorte (2014-05-02). "Increasing the discoverability of digital collections using Wikipedia: the Pitt experience". Pennsylvania Libraries: Research & Practice. 2 (1): 84–96. doi:10.5195/palrap.2014.60. ISSN 2324-7878.
  5. ^ Seo-Young Lee, Sang-Ho Lee, "A Comparison Study on the Key Factors for Success of Social Authoring Systems – focusing on Naver KiN and Wikipedia", AISS: Advances in Information Sciences and Service Sciences, Vol. 5, No. 15, pp. 137–144, 2013, PDF
  6. ^ cgcsblog. "Citation-Filtered". Retrieved 31 May 2014.
  7. ^ Olivier Van Laere, Steven Schockaert, Vlad Tanasescu, Bart Dhoedt, Christopher B. Jones: Georeferencing Wikipedia documents using data from social media sources. Preprint, accepted for publication in: ACM Transactions on Information Systems, Volume 32 Issue 3 PDF
  8. ^ Halfaker, Aaron; R. Stuart Geiger; Loren Terveen (2014-04-28). "Snuggle: designing for efficient socialization and ideological critique" (PDF). CHI: Conference on Human Factors in Computing Systems. doi:10.1145/2556288.2557313.
  9. ^ Xu, Danyun; Gong Cheng; Yuzhong Qu (March 2014). "Preferences in Wikipedia abstracts: empirical findings and implications for automatic entity summarization". Information Processing & Management. 50 (2): 284–296. doi:10.1016/j.ipm.2013.12.001. ISSN 0306-4573.
  10. ^ Alguliev, R. M.; R. M. Aliguliyev; I. Ya Alekperova (2014-03-01). "Cluster approach to the efficient use of multimedia resources in information warfare in wikimedia". Automatic Control and Computer Sciences. 48 (2): 97–108. doi:10.3103/S0146411614020023. ISSN 0146-4116.
  11. ^ de Laat, Paul B. (2014-04-22). "From open-source software to Wikipedia: 'backgrounding' trust by collective monitoring and reputation tracking". Ethics and Information Technology: 1–13. doi:10.1007/s10676-014-9342-9. ISSN 1388-1957.
Supplementary references:
  1. ^ Okoli, C., Mehdi, M., Mesgari, M., Nielsen, F., & Lanamäki, A. (2012, October 24). The People’s Encyclopedia Under the Gaze of the Sages: A Systematic Review of Scholarly Research on Wikipedia. SSRN Scholarly Paper, Montreal. http://papers.ssrn.com/abstract=2021326
  2. ^ Rut Jesus; Martin Schwartz; Sune Lehmann (2009). "Bipartite networks of Wikipedia's articles and authors: a meso-level approach" (PDF).
  3. ^ Klein. "Measuring Editor Collaborativeness With Economic Modelling".


Wikipedia:Wikipedia Signpost/Single/2014-05-28