The Signpost


Community view

A Deep Dive Into Wikimedia

The original article can be found on my blog.

This will be presented as a multi-part series of columns in this space over the next few issues. – Signpost Editors

Introduction


In a single sentence, Wikimedia is an online movement dedicated to making access to knowledge equitable. Because Wikimedia is a grassroots movement, almost all of the information comes from random users who generously volunteer their time to develop projects, not paid experts. To make sure that the projects are used for the benefit of the public instead of a corporation's bottom line, the content is given under a free license while the code is all free and open source. You're almost certainly familiar with Wikimedia's most popular project, Wikipedia (how did you end up here if you aren't?), but how familiar? At first, this was supposed to be a short article about Wikipedia and its policies, but as I dug deeper, I discovered that Wikipedia was merely a single part of an unfathomably complex online ecosystem. This blog post will go over everything that I found in broad strokes, but I strongly encourage everybody to click on the links I provide to get a deep understanding of the subject matter.

Part 0: The Foundation and the Movement


The Wikimedia Foundation is a non-profit founded by Jimmy Wales after he founded Wikipedia. Its goal is to provide infrastructure for the Wikimedia projects, offer legal services for Wikimedia, provide critical technical support where necessary, and provide funding for people working on tasks that are important for Wikimedia's health (more on all of that later). The projects and the communities that sprang up around the foundation or were otherwise inspired by the foundation are collectively called the Wikimedia movement. By necessity, these two groups constantly interact with one another to promote their agenda, sometimes blurring the lines between the two. This can be confusing, so throughout this blog post, I'll explicitly say whether I'm referring to the foundation or the movement.

Part 1: The Wikimedia Projects


The Wikimedia projects are the core of the Wikimedia Foundation, where information can be freely disseminated. There are 12 projects in total, and they each have their own unique mission. While they're all built with the same MediaWiki software and are hosted by the Wikimedia Foundation, they mostly run independently from each other, with a few major exceptions. These projects vary widely in usage, history, and quality.

Besides being dedicated to the free distribution of knowledge, a common unifying feature of the Wikimedia projects is that they're entirely run by the community. That doesn't just mean that the content in the projects is created by users; it means that the policies that guide the projects are as well. Despite hosting the projects, the foundation does virtually nothing for them except maintain the software. While the Wikimedia movement isn't explicitly political, the emphasis on communal effort over an ingrained hierarchy can be seen as a natural extension of the libertarianism of both Jimmy Wales and the broader free-culture movement that Wikimedia sprouted from.

As you can imagine, there are quite a few similarities between the projects, despite them being independently run and having different purposes. To avoid being redundant, I'm not going to repeatedly mention shared features like how anybody is allowed to edit. Instead, I'm only going to mention things that I found interesting while I researched the projects, and then explain the features shared by all the projects in parts 2 and 3.

Wikipedia


You already know what Wikipedia is. This is the flagship project of Wikimedia and easily the largest online encyclopedia. It introduced 3 major improvements over traditional encyclopedias: it's free to use, it leverages the collective knowledge of the userbase by letting everybody edit at any moment, and it freely uses citations whenever making a claim, something most encyclopedias didn't do because it took up too much space. These are all things that we take for granted now, but these changes were so revolutionary that Wikipedia essentially set a new standard for encyclopedias, which has almost completely killed traditional print encyclopedias. Wikipedia has grown so large that most people think that it's the only thing that the Wikimedia Foundation does, but not many people know exactly how it works behind the scenes.

Wikipedia As A Community


The Wikipedia namespace is the namespace for Wikipedia pages that deal with the internal workings of Wikipedia, but not the Wikipedia articles about Wikipedia. Pages in this namespace tend to fall into 3 different categories:

1. Pages for communication between Wikipedians. The Village Pump page that's linked to on the front page of Wikipedia contains other links to these sorts of pages, where people ask questions about how Wikipedia works, suggest policies, or get help with references, among other things.

2. Essays about Wikipedia. People have LOTS of thoughts about Wikipedia, and occasionally they take to Wikipedia to write an essay about it. Essays that authors don't want others to edit are written in the user namespace, but essays in the Wikipedia namespace are designed to be collaboratively written. There are over 2,000 of them, so you're sure to find at least 1 essay that you find interesting.

3. Pages clarifying policies. Wikipedia is big, and that means that you need to have policies to handle day-to-day activities. These pages tell you how to properly select sources, what belongs on Wikipedia, and how conflicts are handled, among many, MANY other things. There are over 300 policies that you are expected to abide by, bringing me to one of the most contentious parts of the Wikipedia community.

The Bureaucracy


While you're obviously going to need at least some policies to run a website as large as Wikipedia, and despite official Wikipedia policy that Wikipedia is not a bureaucracy, it's pretty clear that the number of rules that you're expected to know if you want to edit is ridiculous. This creates a hierarchy between the average user and the powerusers who actually take the time to learn these policies and track them as they get updated. As anybody who's ever done a fair bit of editing on Wikipedia can tell you, some of these powerusers treat the pages that they've worked on as a fiefdom. If you dare try to correct anything wrong that you see on a page that they think they own, they will often use their superior knowledge of Wikipedia to shut you down by quoting obscure policies, a practice known as Wikilawyering. Besides that, powerusers tend to have more power than casual users because of their willingness and capacity to become admins or other important people in the Wikipedia community.

The actual process to become an admin on the English language Wikipedia is bizarre, to say the least. Rather than a straightforward democratic election, admins are elected based on consensus. People give arguments for or against the nominee becoming an admin, and then a bureaucrat (which is like an admin but with the authority to appoint admins or other bureaucrats) weighs the arguments based on quality. Not only is there no rubric that the bureaucrat has to use to weigh the arguments, there isn't even a set amount of consensus that the nominee needs to have. All that's mentioned is that 75% support means that the nomination is likely to succeed and 65% support means that it's unlikely (nominees for the bureaucrat role are said to require around 85%). While anybody can be a nominee, it's very rare for somebody to actually get the role with an edit count of less than 10,000.

When Wikipedia does use democracy, it does so in a way that disenfranchises the majority of the community. The main election in the Wikipedia community is the one for the Arbitration Committee, which handles disputes. To simply cast a vote, you need to have made 150 edits in mainspace. For context, having 100 edits puts you in the top 1% of all users. To actually run as a nominee, you need to have made 500 edits, which puts you in the top 0.25%. Either way, there's no way for the average user to actually impact a committee that might make a judgment affecting them. At the same time, Wikipedia does need some sort of barrier to keep out vandals. It's not a simple problem, although I personally think that the barriers are too high.

Bots


Having a small minority of users handle most of the management of such a large website isn't very easy. To make the process easier, certain users have created bots designed to automate some of the work that would have otherwise been impossible for people to do at scale. A prominent example of this on the English Wikipedia is ClueBot NG, an automoderator designed to detect and revert vandalism. From what I understand, this is a major reason why vandalism, which used to be a common sight on Wikipedia, is now very rare. As you can imagine, the community doesn't want just anybody to create a bot that can modify pages, since it's way easier to create a bot that vandalizes than it is to revert the vandalism. To have your bot accepted for usage, you have to make a request, where your bot will be evaluated to make sure that it follows established policy.

Edit-a-thons


As an encyclopedia, Wikipedia obviously wants to have an article on everything important. Unfortunately, Wikipedia tends to have gaps in certain areas, like women's history and art. To cover these gaps and improve the diversity of the overall Wikipedia community, some people organize something called an edit-a-thon, where people get together to collectively edit Wikipedia while learning about how to contribute to the website.

WikiProjects


Not everybody has the same interests, but since there are so many people on Wikipedia, people can form tinier communities within the larger Wikipedia community. These are the WikiProjects, which help maintain and create articles within their sphere of interest. There are tons of them for just about anything that you can think of, and they all have so many different cultures that it's impossible to write about them all. It's interesting stuff, so I recommend diving in head first and seeing all the WikiProjects out there.

Wikipedians In Residence


While the Wikimedia movement is mostly represented by the Wikimedia Foundation, traditional establishments often also want to help out and contribute. One of the ways they do this is by hiring what is known as a Wikipedian in residence. This is a job position where somebody works to make Wikipedia contributions related to an institution's mission (e.g. an art museum hiring a Wikipedian in residence to write articles about art history). Besides their contributions to Wikipedia on behalf of their employers, Wikipedians in residence can also represent Wikipedia's interests by promoting outreach and helping to establish the website as a legitimate source of information.

The Newsletters


A lot of things are happening on Wikipedia, and you probably don't want to go looking for all the news by yourself. Instead, the people who do often create newsletters for the community to read. The biggest newsletter by far, actually a newspaper, is The Signpost, which gives an exhaustive overview of the state of the website every two or three weeks. There are many newsletters across the Wikimedia movement, with several active newsletters being maintained by WikiProjects to keep enthusiasts up to date about the WikiProject and the subject matter.

The Wikipedia Library


Making edits requires citations, and a lot of good sources are hidden behind paywalls. The solution is the Wikipedia Library, which partners with universities around the world to give Wikipedia editors free access to academic articles. This is technically part of the Meta Wikimedia platform instead of Wikipedia, but its purpose is to be used by Wikipedia editors, and only Wikipedia editors. To access the database, you need to have made 500 edits in total and at least 10 edits in the last 30 days. On one hand, this is an incredible effort to improve Wikipedia and democratize research, but on the other, the high barrier to entry increases the disparity between average users and powerusers. Despite what the description may have you believe, the edits don't actually need to be on Wikipedia. I myself got access to the library primarily for edits that I made on other projects. It also uses a somewhat loose definition of "library". While there are many academic papers, there are also un-paywalled newspapers and access to genealogy services like Ancestry.com. You can learn more about the library in its newsletter, Books & Bytes.

Philosophy Of Editing


There's a big debate on what the role of an editor should be. On one side, there's an ideology called deletionism, which believes that articles with very low views should be deleted. On the other side, there's an ideology called inclusionism, which believes that articles should be kept whenever possible. In my eyes, both of them have some pretty good points. The inclusionists argue that Wikipedia isn't paper, so it doesn't make sense to prune articles the way that paper encyclopedias used to. While rarely viewed articles only get a few views a day each, they collectively get a large number, so removing them would degrade the overall user experience. The deletionists counter that while Wikipedia isn't paper, storage space is still finite. An individual article is pretty negligible, but there are millions of articles on Wikipedia, which adds up. In addition, every article not deleted is an article that has to be maintained, which takes up energy that could be directed elsewhere. What the community wants is the best of both: countless articles about any niche topic that you can think of, but with countless maintainers who can quickly reverse any vandalism and write new articles. Unfortunately, a trade-off has to be made, but nobody can agree on what it should be.

Mascot


While Wikipedia doesn't have an official mascot, the unofficial mascot is widely recognized to be an anime girl called Wikipe-tan. She occasionally shows up in certain Wikipedia articles (particularly articles about anime culture) and is occasionally cosplayed at Wikimedia meetups. She also serves as the official mascot for WikiProject Anime And Manga. Wikiquote and Wikimedia Commons also have their own anime girls, but they aren't really featured that much outside of this incredible image.

Wikivoyage


This is my favourite of the sister projects. As the name implies, Wikivoyage compiles information about travelling, containing information about different locations, guides, and itineraries for you to use. Wikivoyage wasn't directly created by the Wikimedia Foundation. Instead, it's an offshoot of a different website called Wikitravel, which has never been affiliated with the Wikimedia Foundation. Also unlike Wikipedia, which expects citations to back up the information you add, Wikivoyage doesn't have citations at all. Instead, you're expected to use your own background knowledge when writing articles. Another neat thing about Wikivoyage is that unlike any other project, Wikivoyage doesn't let you directly create pages. Instead, you're expected to link to an empty page and then edit the page from there. The idea is that every page should be connected to another, creating what the community calls a breadcrumb trail between all of the articles. Like every other smaller Wikimedia project, Wikivoyage also has WikiProjects and policies that are similar to Wikipedia's, but fewer of both because of its smaller size. However, its WikiProjects are called expeditions instead.

Wikisource


Okay, so you can't access the Wikipedia Library, but you still want to find a source for something. Enter another one of the Wikimedia projects, Wikisource. This is a huge repository of freely licensed or public domain texts that can be used as a source, whether it's a book, legal proceeding, or poetry. This project requires a lot more effort than may first meet the eye. A lot of sources are obscure articles that only exist in print, so Wikisource is often the first place where they're digitized. That requires a lot of proofreading, which is done in the page namespace. To make sure that digitized texts are properly validated, they have to be proofread by at least 2 different people before the text is moved to the main namespace. To drum up support for what's a very intensive task, the Wikisource community has monthly challenges to finish proofreading key texts. Wikisource sorts texts by subject and author, which makes it easy to find what you're looking for. Wikisource users also translate certain texts and transcribe films, but this is much rarer because of the high level of effort needed to do that.

Wikiquote


This is a repository for quotes from famous people, TV shows, books, and more. This can be thought of as the intersection between Wikipedia and Wikisource. All quotes have to be verified, famous, and have endured the test of time. However, quotes that cannot be attributed to a person are exempt from the verification requirement. While the main purpose of Wikiquote is to record the quotes that a person has said or written, it also gives information about quotes that a person is widely but incorrectly assumed to have invented.

Wikinews


What would happen if there was a project where people around the world could write news articles that anybody can edit as major events evolve? Turns out, not much. This is the graveyard of the Wikimedia projects. Even though there's so much news coming out all the time, Wikinews is lucky to get more than 3 articles a day. Even on the English Wikinews, a lot of attention is given to Russia, with most of the rest being given to America. Even parts of the 1st world like Australia get very little coverage, let alone 3rd world countries. Embarrassingly, Wikipedia has totally outshone Wikinews by having an infobox on the front page that gives more information about the news than Wikinews itself, a fact that it mentions in the somewhat gloaty article for the project. Despite its overall irrelevance, I think that there are still a few interesting things about it that are worth mentioning.

Sourcing


Wikinews blends the sourcing requirements of Wikipedia and Wikivoyage by allowing both articles that draw their information from other news reports and actual original reporting. Both of these are pretty interesting. Blending news reports may make it seem like you're just rehashing news reports made by other outlets, but many news reports have information that's missing in others. A blended news report could be more objective than one published by traditional agencies. Also, the collaborative writing process could allow for conservatives to challenge any perceived left-wing bias, potentially leading to more bipartisan and neutral reporting. The original reporting could have also helped foster citizen journalism and provide more information on niche events that happen in the author's city. It's not hard to imagine a different future where Wikinews took off and citizen journalists made WikiProjects for their city, with an accompanying newsletter to rival traditional local news.

Accreditation


To get access to certain events, a journalist needs a press pass. To help citizen journalists, Wikinews will accredit high-quality contributors so that they can get press passes to access restricted areas.

Wikimedia Commons


Wikimedia projects tend to make heavy use of images, audio, and video in articles. If you have a duplicate of a piece of media that's already used in a different project, then you waste storage space. In a totally unrelated issue, people often need to find a piece of media to use, but can't because of copyright issues. The Wikimedia Commons is the solution to both problems. It's a repository of media files that are freely licensed or public domain, which other Wikimedia projects use to add media to articles instead of locally uploading. While Wikimedia projects sometimes have to use non-free media in their articles, the Wikimedia Commons has done a very good job at making sure that there's almost always a free media file on Wikimedia Commons that editors can use instead. At the time of writing, there are over 100,000,000 files that have been uploaded.

Wikibooks


Despite the name, this isn't a repository for published books that are freely licensed or public domain. Instead, this is a place where people can collaboratively write textbooks for a variety of subjects. If you use Lichess, you've probably used Wikibooks without even realizing: whenever you use the opening explorer, Lichess fetches information about the opening that you're looking at from the Wikibook Chess Opening Theory. It's a cool idea, but unfortunately, writing a textbook is a lot of effort, so Wikibooks runs into the same problem that Wikinews has, where very few people are actually willing to put real work into contributing.

Wikibooks includes an excellent sub-site, The Cookbook. If I had to take a guess, this is because it leverages how shallow books on Wikibooks tend to be. Since a cookbook only really requires you to write a short recipe, the barrier to entry is a lot lower than contributing to something like a math textbook, which requires specialized knowledge and a deep explanation of the subject matter. However, I don't think that's a full explanation of what's going on here. The Cookbook is far too broad and detailed to say that it only took off because it's low-effort. Instead, I think that The Cookbook has spawned an entire sub-community on Wikibooks, having its own namespace and several categories within the namespace that deal with different cuisines, ingredients, and even more abstract ideas like seasonality. While there are a few troll recipes, I would overall say that this is perhaps the best cookbook on the internet.

Wikiversity


This is a place where people can collaboratively create courses to teach people about a wide variety of topics. As with most Wikimedia projects, the overwhelming majority of learning material is in text, not video or audio. That creates a huge level of overlap with Wikibooks, but without The Cookbook to drive traffic. However, unlike Wikibooks, Wikiversity encourages active participation from learners by promoting a philosophy of "learning by doing". Besides courses, a major part of Wikiversity is learning projects, where users get together to discuss certain subjects. Some of the courses can be good, but most courses are sparse on details or focused on more fringe ideas.

Wiktionary


This is one that you've probably used before. Wiktionary was originally a dictionary that anybody can contribute to, but it's grown to be so much more than that. Wiktionary is now also a thesaurus and gives the etymology of every word. Despite the name, a better way to think about it is as the Wikipedia of language. A cool aspect of Wiktionary is that by letting people add whatever word they want, you can also get information on new and slang words, allowing the dictionary to rapidly evolve alongside the language itself.

Wikidata


If you've been looking at the different Wikimedia projects while reading this blog post, you might have noticed a link on the sidebar called "Wikidata item". This takes you to Wikidata, where information is stripped of all unnecessary details and reduced to structured data. This can be thought of as Wikimedia Commons for facts instead of media. While you can browse this project the same way that you can browse the other projects, it's better used to scrape data for machine learning or as the backend for some sort of Wiki viewer. The primary use of data hosted here is to be used by the Wikimedia projects, where they can all receive up-to-date information by a single change to the linked item on Wikidata. This also helps to ease the problem of maintainability, which is a lifesaver for smaller projects.

Query


Let's say that you need to query Wikidata. Instead of needing to write your own scraper, Wikidata has a built-in way to access the data using SPARQL. While this tool isn't necessarily the most intuitive to use, Wikidata has material to help you learn the language. Because you can directly submit your query to this url, it's easy to write a script that accesses Wikidata instead of using the GUI.
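To give a rough idea of what such a script could look like, here's a minimal Python sketch that only uses the standard library. It assumes that the query URL is the public Wikidata Query Service endpoint at https://query.wikidata.org/sparql and that it accepts the query and format parameters in a GET request. The example query (items that are an instance of "house cat"), the helper names, and the User-Agent string are all my own illustrative choices, so treat this as a starting point rather than the official way to do it.

```python
import json
import urllib.parse
import urllib.request

# Assumed endpoint: the public Wikidata Query Service SPARQL URL.
WDQS_ENDPOINT = "https://query.wikidata.org/sparql"

# Illustrative query: five items that are an instance of (P31) house cat (Q146),
# with English labels filled in by the label service.
EXAMPLE_QUERY = """
SELECT ?item ?itemLabel WHERE {
  ?item wdt:P31 wd:Q146 .
  SERVICE wikibase:label { bd:serviceParam wikibase:language "en" . }
}
LIMIT 5
"""


def build_query_url(sparql: str) -> str:
    """Encode a SPARQL query into a GET URL that asks for JSON results."""
    params = urllib.parse.urlencode({"query": sparql.strip(), "format": "json"})
    return f"{WDQS_ENDPOINT}?{params}"


def run_query(sparql: str, user_agent: str = "signpost-example/0.1 (hypothetical)") -> list:
    """Send the query and return the list of result rows (needs network access)."""
    request = urllib.request.Request(
        build_query_url(sparql),
        # A descriptive User-Agent is sent here; the value itself is a placeholder.
        headers={"User-Agent": user_agent},
    )
    with urllib.request.urlopen(request, timeout=60) as response:
        payload = json.load(response)
    # SPARQL JSON results keep their rows under results -> bindings.
    return payload["results"]["bindings"]
```

Calling run_query(EXAMPLE_QUERY) should give back a list of rows, where each row is a dictionary keyed by the variable names from the SELECT line (here, item and itemLabel). Keeping the URL building separate from the network call makes it easy to check what will be sent before actually hitting the servers.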

WikiCite


As you might have guessed, getting citations for academic resources is important for contributing to Wikimedia projects, especially Wikisource. To that end, an initiative called WikiCite has started to add citation data to Wikidata so that there can be a centralized database for users to draw from. At the time of writing this, over 41 million items are instances of "scholarly article", and most of them have at least some citation data such as "author" or "DOI" added. Part of the WikiCite initiative is Scholia, a tool that lets you search for academics or academic articles to see their citation data. It also does some other interesting stuff, like automatically generating a citation graph for each academic article (if applicable) and listing the number of citations the article received every year, as well as how many of those citations were made by one of the authors.

The Limits Of Wikidata


You start to run into problems when your database gets too many queries, and it's begun to seriously affect the Wikidata project. At the time of writing this, the Wikimedia Foundation has begun to separate the WikiCite dataset from the main Wikidata dataset because the strain on Wikidata servers has become too much. What that means is that you have to specify whether or not you want to search the WikiCite dataset when you use the query service from now on. However, this only scales the Wikidata dataset back to 2018 levels. It's not clear what Wikidata will do to make sure that the project can handle the increased load as more and more data is added. In the meantime, there's a WikiProject to quantify and estimate the various limitations on Wikidata's scope.

Wikispecies


This is a project designed for biologists needing to look up information about species and other taxa. Specifically, information about animals stripped of all unnecessary details and reduced to a database of species. Hmm, where have I heard that before? As you might have guessed, this is pretty much Wikidata but for biologists and without an easy way to scrape information. The main reason it exists, it seems, is that it was created before Wikidata was conceived of. Wikidata doesn't seem to have quite enough information to totally replace Wikispecies yet, but I feel like Wikispecies is the project that's most at risk of getting deprecated.

Wikifunctions


This is the newest Wikimedia project. This is a repository of computer functions, which are written in Python and JavaScript. Despite first appearances, this isn't meant to be some sort of FOSS replacement for GitHub. Instead, it's meant to be used for an upcoming project called Abstract Wikipedia.

Abstract Wikipedia


It's easy to take it for granted if you speak English, but some of the Wikipedias for other languages can be pretty lacking in information. Also, smaller Wikipedias are at risk of being taken over by bad actors who want to push an agenda or pretend to be Scottish. Something clearly needs to be done, and what the Wikimedia Foundation thinks should be that something is Abstract Wikipedia, which is meant to be language-independent. This project is still in its infancy, but the idea is that the functions on Wikifunctions could be used with the data on Wikidata to create an abstraction of an article, which is then made readable by using a program called a renderer. This should provide more information than can be provided by normal integration with Wikidata. This isn't on the table yet, but there's no reason to think that this couldn't be deployed for other projects if it proves to be successful.


Next month, Part 2: The Technology Behind Wikimedia.

