Orphans

Large portion of articles are orphans

Almost 30% of Wikipedia articles are "orphans", with few or no incoming links from other articles, according to WikiProject Orphanage. An analysis by JaGa dated January 24, 2009 found 133,515 articles with zero links from other articles and another 92,031 linked only from lists or chronology pages. A further 533,411 articles have links from only one or two articles (excluding lists and chronology pages); WikiProject Orphanage classifies these as orphans too. Only 42,936 articles have been tagged with the {{orphan}} template. By JaGa's count there are 2,575,308 articles when disambiguation pages are excluded (compared to 2,700,000+ counted by Special:Statistics).
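
The "almost 30%" figure can be checked against the counts quoted above. The short sketch below assumes the three groups (zero links, links only from lists or chronology pages, and links from only one or two articles) do not overlap and together make up the orphan total; the variable names are illustrative and are not taken from JaGa's analysis.

```python
# Figures quoted in the story from JaGa's January 24, 2009 analysis.
zero_links = 133_515        # no links from other articles
list_only = 92_031          # linked only from lists or chronology pages
one_or_two = 533_411        # linked from only one or two articles
total_articles = 2_575_308  # JaGa's count, excluding disambiguation pages

# Assumption: the three groups are disjoint, so they can simply be summed.
orphans = zero_links + list_only + one_or_two
orphan_share = orphans / total_articles

print(f"{orphans:,} orphans, {orphan_share:.1%} of articles")
# prints "758,957 orphans, 29.5% of articles"
```

On the same assumption, the 42,936 articles tagged with {{orphan}} are only a small fraction of that total.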

Distribution of incoming links per article, including links from lists and chronology pages but excluding links from redirects and disambiguation pages. There are 521,323 more articles with 50 or more incoming links (not shown).

The distribution of links per article is a characteristic long-tail distribution that approximately follows the Pareto principle: articles with 50 or more links make up 20% of all articles but account for 84% of all links. JaGa's list of the top 5000 articles by link count shows that many of the very top articles are ones commonly linked from templates, such as biography, geographic coordinate system, list of sovereign states, and music genre. Major nations are also among the most-linked articles; United States holds the top spot, with 16% of all articles linking to it.
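
As a rough illustration of how such a split can be measured, the sketch below counts incoming links per article from a list of (source, target) pairs and reports what share of articles, and of links, sit at or above a threshold. The function names and the five-article toy data are hypothetical; this is not JaGa's method or data, only a minimal example of the calculation.

```python
from collections import Counter


def inlink_counts(links, articles):
    """Count distinct incoming links for every article, including zero-link ones."""
    counts = Counter({article: 0 for article in articles})
    for source, target in set(links):  # set() drops duplicate links
        if source != target and target in counts:  # ignore self-links
            counts[target] += 1
    return counts


def share_above(counts, threshold):
    """Return (fraction of articles, fraction of links) at or above threshold."""
    top = [n for n in counts.values() if n >= threshold]
    total_links = sum(counts.values())
    link_share = sum(top) / total_links if total_links else 0.0
    return len(top) / len(counts), link_share


# Hypothetical toy data: article A attracts most of the links.
articles = ["A", "B", "C", "D", "E"]
links = [("B", "A"), ("C", "A"), ("D", "A"), ("E", "A"), ("A", "B")]

counts = inlink_counts(links, articles)
article_share, link_share = share_above(counts, threshold=4)
print(article_share, link_share)  # prints "0.2 0.8"
```

In the toy data one article in five holds four of the five links, a small-scale version of the 20% and 84% split described above.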

The long-tail distribution of links is consistent with a 2008 academic study of the network structure of Wikipedia, which showed that, like networks of scientific publications, Wikipedia linkage demonstrates preferential attachment and appears to be a scale-free network (see earlier story). That study focused on red links and the creation of new articles, and follow-up work showed a troubling trend that may also help explain the scale of the orphan problem revealed by JaGa's data. Computer scientist Diomidis Spinellis showed that while Wikipedia was growing exponentially from 2003 to 2006, there was a stable average rate of 1.8 links to "incomplete" articles (red links and stubs) per non-stub article, but that rate had declined to 1.4 by early 2008. This indicates that linkage patterns became more "top-heavy" and that articles were relatively less likely to point to undeveloped articles. Orphaned articles tend to be stubs, and because they have few related articles linking to them, they are likely to remain underdeveloped for longer than well-linked stubs.

Partly to blame may be a pernicious trend noted by User:Raul654, James F. and others: contrary to the red links guideline, red links are frequently being removed for aesthetic reasons. The 2008 linkage study showed that new articles tend to be created soon after the first link pointing to them. Red links thus drive growth and allow new articles to avoid orphan status right from the start.





    Discuss this story

    ==massive anti-red link campaign in many quarters==

    Hello. I just read your article on Orphan articles, and wasn't sure whether there was a dedicated place to comment, so I came here in the meantime. Mainly just to thank you for noting that there seems to be a massive anti-red link campaign in many quarters: I've noticed it myself, even to the point of people editing out links I'd deliberately left for bizarrely overlooked important articles-to-be-written (there are still some glaring omissions on many of the topics I'm interested in). I suspect the trend is symbiotically linked, however, to practices of over-linking, e.g. the contrary tendency to link every other word in an article whether or not it has any bearing on the subject to hand. (In fact, frequently this seems to become "particularly if not".) There have been specific drives to remove the linking of dates (with some validity: they all too often become trivia magnets of dubious relevance) and related 'unnecessary' links, which has further leaked over into removing very necessary links because they look similar to those elsewhere deemed unnecessary. I think that the removal of red links, or a drive to stem their creation, can be seen to be hand-in-hand with those types of push. Sometimes. Similarly, on the same/other hand, mass-creation of red links is another common "problem": it can either (some say) cast doubt on the notability of a subject by stating/implying that there are no obvious references to it anywhere here, or else suggest that the editor is over-zealous in their own interpretation of what might be eventually considered sufficiently notable (i.e. assuming that every "best boy" and "grip" in a film's cast & crew list will ultimately warrant their own separate page). After which slight rambling, all I really wanted to say was "Thank You" for trying to reassert the significant benefits and usefulness of red links, and for highlighting why they are important, necessary and worthwhile. ntnon (talk) 00:32, 2 February 2009 (UTC)

    Why "portion" rather than "proportion"?

    Would it be possible to have a "random orphan article" link in the navigation column? Jackiespeel (talk) 15:10, 2 February 2009 (UTC)

    I wonder if there is an interesting relationship between orphan status/number of links to the article and frequency of access (correcting for stub/etc. status). Zodon (talk) 06:47, 4 February 2009 (UTC)

















    Wikipedia:Wikipedia Signpost/2009-01-31/Orphans