The Signpost

Opinion essay

The copyright crisis, and why we should care

Moonriddengirl has been a Wikipedian since the first half of 2007, becoming an administrator for the English Wikipedia later that year. In that capacity, she dedicates much of her volunteer time to dealing with copyright concerns at the English Wikipedia's copyright problems board and contributor copyright cleanup, attempting to implement Wikimedia's zero-tolerance policy on copyright infringements. In addition, she works for the Wikimedia Foundation in community liaison. Below, Moonriddengirl outlines her view that all contributors need to pull together to manage copyright concerns on the English Wikipedia.

The views expressed are those of the author only. Other editors will often leave opposing views and potential corrections in the comments section. The Signpost welcomes proposals for op-eds. If you have one in mind, please leave a message at the opinion desk.



We have a copyright crisis. Wikipedia is full of copyright problems. How full, I don't know.

I do know that CorenSearchBot (before it became inoperable due to a catastrophic change in Yahoo's terms) routinely found several dozen new articles every day built on content copied from other websites. I know that every day more articles and images are tagged by human contributors for speedy deletion for copyright concerns or listed for the slower processes of the copyright problems board or possibly unfree files. I know that there are more tens of thousands of articles and images awaiting copyright review at WP:CCI than I want to tally; this is content placed by people we know have repeatedly violated copyright. Odds are good that a substantial portion of this content is a problem, too. In spite of policies prohibiting it—and in spite of prominent reminders of those policies on every edit page—more copyrighted content finds its way into our project every day.

Why it happens

People place copyrighted content on Wikipedia because they can, because it's easier to copy somebody else's words than to write your own, because it's hard to resist using somebody else's picture when the only alternative is an article with no pictures at all. Some people do it accidentally, attempting to change content but not changing it enough. Some people do it defiantly, using Wikipedia as part of their own statements against copyright laws.

Most people do it with good intentions, I believe. I've talked to hundreds of people about this over the last few years. Few of them seem to be out to deliberately cause trouble, even the ones who wind up being blocked because we can't get them to stop. The fact is that many of them just don't see the harm, and some have trouble even understanding what the issue is.

In some cultures, copyright is no big deal—even reputable sources copy without obvious concern. (No kidding: I've seen books by evidently respected academicians that have baldly copied from Wikipedia without credit and government websites that have done the same.) In a way, it's not much of a deal to the international Internet culture we all share. People paste news articles into their blogs or appropriate copyrighted cartoon characters as their avatars all the time, without a thought as to whether the content is copyrighted and what that might mean.

Why we should care

This may be why even some of the contributors who don't cause the problems and who plainly do understand the concept of copyright simply don't think about whether or not it's happening here. Blatant violations may pass right in front of them, and they don't notice. They simply don't seem keyed in to the issue. It happens everywhere, and, after all, if a copyright holder objects, all we have to do is take it down.

While technically true, this is an attitude Wikipedia can't afford. For whatever reasons people place the content, and however we ourselves may feel about copyright, keeping it is not only potentially damaging to copyright holders, it's bad for us. It's bad for our reusers; it's bad for Wikipedia; it's bad for our volunteers.

I'm not going to discuss the question of whether intellectual property laws are a good thing or a bad thing. (Although as a published writer who receives small royalty checks every year, I have a certain interest in the question.) It's a passionately debated subject, and, in my opinion, it's not necessary to go into it to settle the important point. It's a simple matter of fact that we are subject to intellectual property laws, and we need to recognize how working within that reality is in our best interests. While we have the option to swiftly address copyright concerns by simply pulling material from publication—indeed, we have a legal obligation to have a designated agent to answer takedown notices sent to us by copyright holders and their representatives—our content reusers may not have the option of responding so simply. If a video documentarian uses images that were hosted on Wikipedia under the mistaken belief that the free license label on them is accurate, he may have to recut his documentary to remove them or replace them with something else. If a publisher places some of our featured articles on animals in a textbook, she may have to pull it from distribution.

A propaganda cartoon explaining why multilicensing benefits reusers, part of the push to accommodate reusers.

This is a major problem. We like content reusers (if not all of them). We really do. We encourage them to do it—to use our material online, in books, newspapers, video documentaries; to use it and modify it whenever and however they like, so long as they follow the licensing terms. Indeed, the Wikimedia Foundation's mission is "to empower and engage people around the world to collect and develop educational content under a free license or in the public domain, and to disseminate it effectively and globally." We've made it as easy for them as we can. But how many times would a reuser encounter the trouble or expense of withdrawing problematic content before deciding to avoid our work? If the content we bill as "free" is not, we risk damage to our reputation and discouraging the global dissemination of our work.

Beyond that, I have personally observed the inconvenience and expense (at least in terms of time) to our volunteers when copyright problems created by others are encountered too late. "Too late" in this context would be after they have themselves engaged with the content. Too often, somebody creates an article or expands it with copyrighted content placed without permission of the copyright holder. Others come behind to improve the article, sometimes putting a great amount of time into polishing prose, locating sources, adding text. Their work is tainted, too. The time they've spent polishing copyrighted content is lost when that content must be removed. The hours they've put in could have been better spent building usable content or creating an article we can retain. Then there is the cost to their motivation. I've spoken multiple times to people in this situation who are heartsick and discouraged by the experience. I hate the thought that we've wasted their time, that we might lose them, because of a problem that was not promptly detected or resolved.

There's also a cost to the volunteers who create the problems in the first place. As I said, I believe most of these people are working in good faith. Those who have trouble grasping the issue may require more guidance than those who simply didn't think it mattered, but copyright problems can be corrected. If the issue is discovered early in a Wikipedian's career, we may be able to more easily clean up any outstanding issues and help them avoid creating more, enabling them to move forward as constructive and valuable contributors. If problems linger, more articles may be tainted, and the fallout may be greater in terms of both collateral damage to others and loss of the contributors themselves.

We need to care; we need to take action.

What we can do

Handout derived from "Let's get serious about plagiarism" in The Signpost

While copyright cleanup can use all the active contributors it can get, you can help with the problem simply by being conscious of the potential so that you recognize copyright issues when they appear. Does an image look unlikely to be original to the uploader? Is text too polished or disjointed in tone? Even if you don't feel that you can help with cleanup, you can tag a suspicious text or image as a copyright concern for others to evaluate. You can save reusers potential time and expense, save your fellow volunteers wasted effort, and perhaps keep a reparable contributor issue from devolving into an unsalvageable one. The simple act of identifying the problem is the first, crucial step to resolving it. Swift handling is the best service we can provide to our reusers, to the project and to our contributors (as well as, in my opinion, to the copyright holders). By recognizing the problem and resolving it when it first appears, we can keep it contained.

Wikipedia:Wikipedia Signpost/2011-09-05/Opinion_essay