Wikimedia’s mission is to provide educational content and to effectively disseminate it. Doing so requires understanding the needs and motivations of the people who read Wikipedia. In this blog post, we discuss what we learned about Wikipedia reader motivations and needs across 14 languages from a recent research study.
“لماذا تقرأ هذه المقالة اليوم؟", "আপনি কেন এ নিবন্ধটি আজ পড়ছেন?", "為什麼你今天會讀這篇條目?", "Waarom lees je dit artikel vandaag?", "Why are you reading this article today?", "Warum lesen Sie diesen Artikel gerade?", "?למה אתה קורא את הערך הזה היום", "यह लेख आज आप क्यों पढ़ रहे हैं?", "Miért olvasod most ezt a szócikket?", "あなたは今日何のためにこの項目を読んでいますか?", "De ce citiți acest articol anume astăzi?", "Почему вы читаете эту статью сегодня?", "Por qué estas leyendo este artículo hoy?", "Чому Ви читаєте цю статтю сьогодні?”
This is the question we posed to a sample of Wikipedia readers across 14 languages (Arabic, Bengali, Chinese, Dutch, English, German, Hebrew, Hindi, Hungarian, Japanese, Romanian, Russian, Spanish, Ukrainian) in June 2017[1] with two goals in mind: to gain a deeper understanding of our readers’ needs, motivations, and characteristics across Wikipedia languages, and to verify the robustness of the results we observed in English Wikipedia in 2016. With the help of Wikipedia volunteers, we collected more than 215,000 responses during this follow-up study, and in this blog post, we will share with you what we learned through the first phase of data analysis.
Every second, 6,000 people view Wikipedia pages from across the globe. Wikipedia serves a broad range of the daily information needs of these readers. Despite this, we know very little about the motivations and needs of this diverse user group: why they come to Wikipedia, how they consume its content, and how they learn. Knowing more about how this group uses the site allows us to ensure that we’re meeting their needs and developing products and services that help support our mission.
Collecting this kind of data at scale is hard, and over the past several years we have built the capacity to do it. Starting in 2015, the Wikimedia Analytics team made the storage and analysis of webrequest logs possible. These logs, which are retained for 90 days, provide an opportunity for deeper analyses of reader behavior. However, analyzing actions is difficult on a site at Wikipedia’s scale: readers can easily generate 150,000 requests per second as their pages load. Without knowing what questions we want to answer or which reader characteristics we are interested in, analyzing webrequest logs resembles searching for a needle in a haystack. The key to our puzzle came in 2015 with QuickSurveys, the Wikimedia Foundation’s microsurvey tool, which gives us a framework for interacting with people using Wikipedia. For this study, we combined qualitative user surveys (via QuickSurveys) with quantitative data analysis (via webrequest logs) to make sense of our readers’ needs and characteristics.
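As a rough illustration of how the two sources can be combined, the Python sketch below joins survey responses to a sample of request logs and derives a simple behavioral feature. It is only a sketch: the file names, the session_token join key, and the column names are assumptions made for illustration, not the actual QuickSurveys or webrequest schema.

```python
import pandas as pd

# Hypothetical inputs: QuickSurveys responses and a sampled slice of
# webrequest logs covering the survey period. File names and columns
# are illustrative, not the production schema.
surveys = pd.read_json("survey_responses.json", lines=True)   # one row per respondent
logs = pd.read_parquet("webrequest_sample.parquet")           # one row per pageview

# Join each survey response to the pageviews made in the same (anonymized)
# reading session, so self-reported answers can be compared with behavior.
joined = surveys.merge(logs, on="session_token", how="left")

# Example behavioral feature: distinct articles viewed in the session,
# broken down by the respondent's answer to the survey question.
articles_per_session = (
    joined.groupby(["session_token", "answer"])["page_title"]
          .nunique()
          .rename("articles_viewed")
          .reset_index()
)
print(articles_per_session.groupby("answer")["articles_viewed"].median())
```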
In 2016, we built the first taxonomy of Wikipedia readers, quantified the prevalence of various use cases for English Wikipedia, and gained a deeper understanding of the behavioral patterns associated with each use case. (The details of the methodology are described in our peer-reviewed publication on this topic.) A year later, we replicated this study and extended it to other language editions, putting the same survey questions from 2016 in front of readers across 14 languages. More specifically, we asked readers about their information need, their prior familiarity with the topic, and their motivation for reading the article.
Below is what we have learned so far. (Note that all of the results below have been debiased using the method described in the appendix of our earlier research, which corrects for various forms of possible representation bias in the pool of survey respondents.)
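For readers curious what such debiasing can look like in practice, here is a minimal sketch of one standard approach, inverse propensity weighting: estimate each reader’s probability of responding to the survey from behavioral features, then weight respondents by the inverse of that probability so that over-represented groups count less. The exact correction we applied is the one described in the paper’s appendix; the function and variable names below are hypothetical.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

def propensity_weights(features, responded):
    """Fit a response-propensity model on all readers and return
    inverse-propensity weights for the readers who responded.
    `responded` is a NumPy array with 1 for readers who answered, 0 otherwise."""
    model = LogisticRegression(max_iter=1000)
    model.fit(features, responded)
    p = model.predict_proba(features)[:, 1]      # estimated probability of responding
    weights = 1.0 / np.clip(p, 1e-3, None)       # clip to avoid extreme weights
    return weights[responded == 1]               # keep weights for respondents only

def weighted_share(answers, weights, category):
    """Debiased share of respondents who picked `category`."""
    mask = answers == category
    return weights[mask].sum() / weights.sum()
```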
"I am reading this article to (pick one) [look up a specific fact or get a quick answer, get an overview of the topic, or get an in-depth understanding of the topic]
The charts below summarize readers’ information needs across the 14 languages we studied.[2]
From these graphs, we see that, on average, around 35 percent of Wikipedia readers across these languages come to Wikipedia to look up a specific fact, 33 percent come for an overview or summary of a topic, and 32 percent come to read about a topic in depth. There are important exceptions to this general observation that require further investigation: Hindi has the lowest rates of fact lookup and overview reading among all languages (20 percent and 10 percent, respectively), while its rate of in-depth reading is the highest (almost 70 percent). It is also interesting to note that Hebrew Wikipedia has the highest rate of overview readers (almost 50 percent).
"Prior to visiting this article (pick one) [I was already familiar with the topic, I was not familiar with the topic and I am learning about it for the first time]"
We repeat the same kind of plot as above, now for the question that asked respondents how familiar they were with the topic of the article on which the survey appeared.
Across all languages, an average of 55 percent of readers report prior familiarity with the topic of the article in question. Bengali and Chinese Wikipedia readers report much lower familiarity (almost 40 percent), while Dutch, Hungarian, and Ukrainian readers report very high familiarity (over 65 percent). Further research is needed to understand whether these are fundamental differences in reader behavior across these languages or the result of cultural differences in self-reporting.
"I am reading this article because (select all that apply) [I have a work or school-related assignment, I need to make a personal decision based on this topic (e.g. to buy a book, choose a travel destination), I want to know more about a current event (e.g. a soccer game, a recent earthquake, somebody’s death), the topic was referenced in a piece of media (e.g. TV, radio, article, film, book), the topic came up in a conversation, I am bored or randomly exploring Wikipedia for fun, this topic is important to me and I want to learn more about it. (e.g., to learn about a culture), Other.]
Finally, we look at the sources of motivation leading users to view articles on Wikipedia.
These are the results:
Among the seven motivations readers could choose from, intrinsic learning is the most frequently reported, followed by wanting to know more about a topic encountered in media (books, movies, radio programs, etc.) or in conversations. There are some exceptions: in Spanish, intrinsic learning is followed by reading an article because of a work or school assignment; in Bengali, by conversations and current events. Hindi has the lowest share of readers motivated by media (10 percent), while Bengali has the highest share motivated by intrinsic learning.
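Because this question is select-all-that-apply, the per-motivation percentages are computed over respondents and can sum to more than 100 percent within a language. Below is a small sketch of how such shares might be tallied from long-format responses; the data frame and its values are made up purely for illustration, and these raw shares would still need the debiasing weights described above.

```python
import pandas as pd

# Made-up long-format responses: one row per (respondent, selected motivation).
responses = pd.DataFrame({
    "respondent_id": [1, 1, 2, 3, 3, 3],
    "language":      ["es", "es", "es", "bn", "bn", "bn"],
    "motivation":    ["intrinsic learning", "work/school", "media",
                      "conversation", "current event", "intrinsic learning"],
})

# Share of respondents in each language who selected each motivation.
# Readers may pick several motivations, so shares can sum to more than 100%.
respondents = responses.groupby("language")["respondent_id"].nunique()
selections = responses.groupby(["language", "motivation"])["respondent_id"].nunique()
shares = selections.div(respondents, level="language").round(3)
print(shares)
```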
We still need time to analyze the data further and to understand and compare reader behavior based on the responses above. We encourage careful examination of these results and caution against conclusions the analysis may not support. Based on the results so far, we can confidently say a few things:
We have started the second phase of analysis for some of the languages. If you observe interesting patterns in the data in this blog post that you think we should be aware of and look into, please call it out. If you have hypotheses for some of the patterns we see, please call them out. While we may not be able to test every hypothesis or make sense of every pattern observed, the more eyes we have on the data, the easier it is for us to make sense of it. We hope to be able to write to you about this second phase of analysis in the near future. In the meantime, keep calm and read on!
———
This research is a result of an enormous effort by Wikipedia volunteers, researchers, and engineers to translate, verify, collect, and analyze data that can help us understand the people behind Wikipedia pageviews and their needs. We would like to especially thank the Wikipedia volunteers who acted as our points of contact for this project, helped us translate the survey into their languages, went through the verification steps with us, and kept their communities informed about this research.