The Signpost

Recent research

"Newcomer Homepage" feature mostly fails to boost new editors


A monthly overview of recent academic research about Wikipedia and other Wikimedia projects, also published as the Wikimedia Research Newsletter.

Largest newbie support features experiment to date finds mostly null results

How to better support new editors has long been a conundrum for Wikipedians. In 2018, the Wikimedia Foundation launched its Growth team, which tackles this issue by working on "features to encourage newcomers to make edits." A paper[1] by four Wikimedia Foundation staff members reports the results of a long-running, systematic study evaluating their impact:

"We propose the Newcomer Homepage, a central place where newcomers can learn how peer production works and find opportunities to contribute, as a solution for attracting and retaining newcomers. The homepage was built upon existing research and designed in collaboration with partner communities. Through a large-scale controlled experiment spanning 27 non-English Wikipedia wikis, we evaluate the homepage and find modest gains, and that having a positive effect on the newcomer experience depends on the newcomer’s context."

The newcomer homepage is summarized as "a central place to learn how Wikipedia works and that they can participate by editing". It offers a set of "Newcomer Tasks" to work on articles that the community has flagged as needing improvement, "with some tasks categorized as 'Easy' (e.g. copy editing, adding links), 'Medium' (e.g. adding references), and 'Hard' (e.g. expanding short articles)."

One version of the newcomer homepage on Czech Wikipedia, suggesting an article for copyediting (lower left)

More specifically, the team conducted randomized controlled experiments in which newly registered accounts were either shown a "Get started here!" notification inviting them to visit their "Newcomer Homepage", or received the standard interface. Outcomes were tracked using four different metrics (all based on edits made to articles and article talk pages). Two different methods were used to evaluate impact: 1) an "'Intent-to-Treat' (ITT) approach, where we learn whether an invitation to the homepage results in significant differences" (combined with hierarchical regression to aggregate the results from the different wikis), and 2) a two-stage least squares approach to obtain an "estimate of the causal effect of making suggested edits conditional on being invited" (see the illustrative sketch after the findings below). The overall findings are:

  • Activation: A small but significant increase in overall activation [specifically, a 1% increase in the odds of making an edit within 24 hours of signing up], with the outcome depending on newcomer context. Our intervention appears to distract newcomers who were already in the process of contributing, but seems to support those who were not, and in particular those who did not create an account with an intention to contribute.
  • Retention: No difference in the retention rate; we find a strong correlation with the activity level on a newcomer's first day.
  • Productivity: No difference in the overall number of constructive contributions [measured as the number of edits made within 15 days].
  • Revert rate: No difference in the proportion of contributions rejected by the community.

(The null results on retention and productivity contrast with the positive results that the team had earlier found in a smaller-scale experiment confined to four language Wikipedias; see our brief earlier coverage.)
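To make the two evaluation methods more concrete, here is a minimal sketch in Python. This is not the authors' code: the variable names (assigned, used_tasks, activated) and all numbers are hypothetical, and the paper's hierarchical aggregation across the 27 wikis is omitted.

```python
# A minimal sketch (not the authors' code) of the two estimation strategies,
# run on simulated data with hypothetical variable names:
#   assigned   - randomly shown the "Get started here!" invitation
#   used_tasks - actually made suggested edits (only some invitees do)
#   activated  - made an edit within 24 hours of signing up
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(42)
n = 100_000

assigned = rng.integers(0, 2, n)                 # random assignment (the instrument)
used_tasks = assigned * rng.binomial(1, 0.3, n)  # imperfect compliance with the invite
activated = rng.binomial(1, 0.05 + 0.02 * used_tasks)  # simulated small true effect

# 1) Intent-to-Treat: compare outcomes by *assignment*, ignoring compliance.
itt = sm.Logit(activated, sm.add_constant(assigned)).fit(disp=0)
print("ITT odds ratio:", np.exp(itt.params[1]))

# 2) Two-stage least squares: assignment instruments for making suggested
#    edits, estimating the causal effect of the edits themselves.
#    (Done manually for clarity; stage-2 standard errors are not valid as-is.)
stage1 = sm.OLS(used_tasks, sm.add_constant(assigned)).fit()
predicted = sm.add_constant(stage1.predict(sm.add_constant(assigned)))
stage2 = sm.OLS(activated, predicted).fit()
print("2SLS effect estimate:", stage2.params[1])
```

The two estimates answer different questions: the ITT reflects the effect of merely offering the homepage, while the 2SLS estimate scales that up to the effect on those who actually engage with the suggested edits.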

The framing of these mostly null results as "modest gains" in the abstract appears a bit generous, especially considering that the only metric with a significant increase (activation) seems less directly related to furthering Wikipedia's mission than some of the others. Similar A/B tests have been used successfully across the internet to greatly increase new user retention and activity on many websites, quite a few of which may be competing with Wikipedia for people's free time. However, the growth teams of commercial sites often have vastly more resources at their disposal (fueled by advertising revenues), enabling them to try out many more features until hitting on one that has a significant impact.

In any case, in this reviewer's opinion these Wikipedia experiments should be considered a success in that they represent a major advance in helping us better understand new editors. As the authors highlight, there is a scarcity of existing research about what works specifically on sites like Wikipedia: "It is unclear what solutions work when it comes to attracting and retaining newcomers at scale in peer production communities." They note that in previous research (apart from an experiment that successfully used barnstar-like awards to increase long-term retention of new editors on German Wikipedia), "proposed solutions have only been available in a single community (English Wikipedia), and only two have been evaluated in controlled experiments". These are The Wikipedia Adventure (cf. our coverage: "The Wikipedia Adventure: Beloved but ineffective") and the Teahouse, which the authors call (to their knowledge) the only "controlled experiment that has shown a significant impact on newcomer retention". (However, non-Wikimedia researchers have pointed out that "The Teahouse study might also have been a false positive" because of a statistical problem involving multiple comparisons.)

Briefly

  • The Wikimedia Foundation's research department invites proposals (deadline: April 29) for the "Wiki Workshop Hall", a new feature of the annual Wiki Workshop online conference consisting of two 30-minute sessions "for Wikimedia researchers and Wikimedia movement members to connect with each other."
  • See the page of the monthly Wikimedia Research Showcase for videos and slides of past presentations.

Other recent publications

Other recent publications that could not be covered in time for this issue include the items listed below. Contributions, whether reviewing or summarizing newly published research, are always welcome.

"Prominent users in the ECC article. A) Top 10 editors, based on edit count. B) User activity timeline of the top 20 users. In green are years of activity for each user. On the bottom are counts of active users per year (out of these 20)." (Figure 4 from the paper)

"High Impact: Wikipedia sources and edit history document two decades of the climate change field"

From the abstract:[2]

[...] to understand how [climate change] was represented on English Wikipedia, we deployed a mixed-method approach on the article for “Effects of climate change” (ECC), its edit history and references, as well as hundreds of associated articles dealing with climate change in different ways. Using automated tools to scrape data from Wikipedia, we saw new articles were created as climatology-related knowledge grew and permeated into other fields, reflecting a growing body of climate research and growing public interest. Our qualitative textual analysis shows how specific descriptions of climatic phenomena became less hypothetical, reflecting the real-world public debate. The Intergovernmental Panel on Climate Change (IPCC) had a big impact on content and structure, we found using a bibliometric analysis, and what made this possible, we also discovered through a historical analysis, was the impactful work of just a few editors. This research suggests Wikipedia’s articles documented the real-world events around climate change and its wider acceptance - initially a hypothesis that soon became a regretful reality. Overall, our findings highlight the unique role IPCC reports play in making scientific knowledge about climate change actionable to the public, and underscore Wikipedia’s ability to facilitate access to research. [...]


"From causes to consequences, from chat to crisis. The different climate changes of science and Wikipedia"

From the abstract:[3]

"Understanding how society reacts to climate change means understanding how different societal subsystems approach the challenge. With the help of a heuristic of systems theory two subsystems of society – science and mass media – are compared with respect to communications about climate change over the last 20 years. With text mining methods metadata of documents from two databases – OpenAlex and Wikipedia – are generated, analyzed, and visualized. We find substantial differences as well as similarities in the social, factual, and temporal dimensions. [...] This demonstrates for science a discursive shift from causes to consequences and for mass media a shift from chat to crisis. Science shows an ongoing growth process, while the attention of mass media appears cyclical."

"Authors of climate change pages in [English] Wikipedia per year"
"New and edited climate change pages in [English] Wikipedia and proportion of all edited pages per year (index: 1 =2001)"

From the abstract:[4]

"[...] The aim of this paper is to [... analyze] whether the research topics of greatest academic interest align with those that attract the most social attention. To this end, the OpenAlex concepts are explored by comparing their works count with the page views of their respective Wikipedia articles. As a result, a correlation analysis between the two metrics reveals a lack of connection between the two realms.

See also a presentation at the November 2023 Wikimedia Research Showcase, and earlier coverage of related publications involving the first author.
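
To illustrate the kind of comparison the abstract describes, here is a minimal sketch using made-up numbers (not the paper's data): it rank-correlates a topic's scholarly works count against its Wikipedia article pageviews, analogous to the paper's correlation analysis of OpenAlex concepts.

```python
# Minimal sketch with invented numbers, illustrating the paper's comparison of
# academic output (OpenAlex works counts) with public attention (pageviews).
from scipy.stats import spearmanr

topics = ["Topic A", "Topic B", "Topic C", "Topic D", "Topic E"]
works_count = [12_000, 8_500, 30_000, 1_200, 560]     # scholarly works per topic
pageviews = [90_000, 400_000, 15_000, 60_000, 5_000]  # Wikipedia article views

rho, p_value = spearmanr(works_count, pageviews)
print(f"Spearman rho = {rho:.2f}, p = {p_value:.2f}")
# A rho near zero, in line with the paper's finding, would suggest that
# academic and public attention are largely decoupled.
```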


"Collaborating in Public: How Openess Shapes Global Warming Articles in Wikipedia"

From the abstract:[5]

[...] I trace how the global warming-related articles in Wikipedia changed over time, particularly in the wake of the publication of the 2007 International Panel on Climate Change Fourth Assessment Report. [...] I trace how Wikipedians enact genre in an unstable environment by analyzing how arguments unfold in Wikipedia talk pages, how the article text and citations change, as well as the larger network of global warming-related articles. [...] In chapter 2, I find that Wikipedians’ arguments create boundaries around the discursive spheres that can be cited within different articles, which suggests the significance of arguments not only about the topic but about genre as a deliberative resource in networked discourse. In chapter 3, I find that editors’ work in enacting genre results in facts becoming more at issue, or destabilized, within articles through the course of 2007. This analysis suggests that arguments about genre, and the easy availability of circulating texts online, may challenge consensus about controversial issues. In chapter 4, I use argument and network analysis to trace both Article for Deletion discussions and also the larger ecosystem of articles about global warming. This analysis shows how the talk page and article editing practices that I trace in earlier chapters become sedimented within the site’s information architecture, shaping what Internet users may learn about the issue. [...]

Higher-quality environmental articles "have more editors and edits, are longer, and contain more references, as well as a higher ratio of references to words"

From the abstract:[6]

"Wikipedia articles are categorized into different levels of quality, so we analyzed all 7,048 environmental articles in the Environment Assessment project on English-language Wikipedia. Based on a review of literature, we selected indicators of information quality (number of editors, number of edits, article length, number of references, and the ratio of references to words) and tested the correlation between these indicators and quality perception in the Wikipedia Assessment project. We found that articles perceived as higher quality typically have more editors and edits, are longer, and contain more references, as well as a higher ratio of references to words"


"Using Wikipedia Pageview Data to Investigate Public Interest in Climate Change at a Global Scale"

From the abstract:[7]

"[...] This study examines global engagement with climate change and related concepts through an analysis of around 517 Million Wikipedia pageviews of 3965 items from WikiProject Climate Change across 213 countries in the years 2017 to 2022. We take advantage of Wikimedia Foundation's differentially-private daily pageview dataset, which makes it possible to study Wikipedia viewing behavior in a language edition agnostic way and on a per-country basis. Temporal analysis reveals a stagnant engagement with climate change articles, contrary to societal trends, possibly due to the attitude-behavior gap. We also found substantial regional differences, with countries from the global north displaying greater traffic compared to the global south. Specific events, notably Greta Thunberg's speech at the UN climate summit in 2019, drive peaks in climate change engagement [...]. However, causal time series analyses show that events like these do not lead to long-lasting increased traffic."


References

  1. ^ Warncke-Wang, Morten; Ho, Rita; Miller, Marshall; Johnson, Isaac (2023-09-28). "Increasing Participation in Peer Production Communities with the Newcomer Homepage". Proceedings of the ACM on Human-Computer Interaction. 7 (CSCW2): 1–26. doi:10.1145/3610071. ISSN 2573-0142.
  2. ^ Benjakob, Omer; Jouveshomme, Louise; Collet, Matthieu; Augustoni, Ariane; Aviram, Rona (2023-12-01). "High Impact: Wikipedia sources and edit history document two decades of the climate change field". bioRxiv. doi:10.1101/2023.11.30.569362.
  3. ^ Korte, Jasper W.; Bartsch, Sabine; Beckmann, Rasmus; El Baff, Roxanne; Hamm, Andreas; Hecking, Tobias (2023-10-01). "From causes to consequences, from chat to crisis. The different climate changes of science and Wikipedia". Environmental Science & Policy. 148: 103553. doi:10.1016/j.envsci.2023.103553. ISSN 1462-9011.
  4. ^ Arroyo-Machado, Wenceslao; Costas, Rodrigo (2023-04-21). Do popular research topics attract the most social attention? A first proposal based on OpenAlex and Wikipedia. 27th International Conference on Science, Technology and Innovation Indicators (STI 2023). doi:10.55835/6442bb04903ef57acd6dab9e.
  5. ^ Cooke, Ana (2018-05-01). Collaborating in Public: How Openness Shapes Global Warming Articles in Wikipedia (PhD thesis). Carnegie Mellon University. (published online 2023)
  6. ^ Petiška, Eduard; Kuběna, Aleš; Dressler, Michal (2024-01-15). What does the data analysis of 7,048 environmental articles tell us about the quality of Wikipedia?. In Review.
  7. ^ Meier, Florian Maximilian (2024). "Using Wikipedia Pageview Data to Investigate Public Interest in Climate Change at a Global Scale". ACM Web Science Conference (Websci'24).



Discuss this story

These comments are automatically transcluded from this article's talk page. To follow comments, add the page to your watchlist. If your comment has not appeared here, you can try purging the cache.
  • Looking at just the first 15 days for a new editor seems short. Many people dip their toes in the water and then return to editing more intensely months or even years later. —Ganesha811 (talk) 23:37, 29 March 2024 (UTC)[reply]
    Hi @Ganesha811, thanks for reading and commenting! I’m the first author on that paper, and you’re right that 15 days appears a bit short. In the paper we describe how we chose two weeks rather than four weeks due to the marginal gains you get from extending the window. We also investigate two longer-term retention windows that were used in the Teahouse paper (1–2 months and 2–6 months), and again find no significant impact. One methodological note about retention is that we control for first-day activity in our models. This means that we can have a “no impact” result even if the Newcomer Homepage increases first-day activity, because the method’s comparing a more active newcomer in the treatment group to a similarly active one in the control group. Understanding or increasing retention is a challenging area, it’s often a “needle in a haystack” kind of problem because the retention rates are very low. Hope this helps explain things, thanks again! Cheers, Nettrom (talk) 19:58, 30 March 2024 (UTC)[reply]
    Interesting, thanks for the reply! —Ganesha811 (talk) 20:25, 30 March 2024 (UTC)[reply]
  • Just want to comment that I was interested in what was going on with The Wikipedia Adventure and that "blog post" where the whole thing was discussed at length (I believe) is linked to a parked domain now unfortunately. Don't know if the expiration of communitydata.cc came up before... Reconrabbit 04:05, 31 March 2024 (UTC)[reply]
  • There's an interesting conclusion to the paper on Newcomer tasks that I think the Signpost write-up missed: context matters. The strange results from the beginning of the paper (lots of nulls, weak and contradictory positive results) make more sense if you consider the contexts in which the users are making accounts. In Section 4.4 they cover this in detail, but to summarize: editors make accounts because (1) they want to make an edit or (2) they want to read. The authors replicate a finding from this 2014 working paper that when an editor receives an intervention after opening the edit window they are less likely to make the edit they were planning to make as an unregistered user---the homepage and newcomer tasks distract the would-be-editor from actually making the edit they started. For the second group, who make an account to read, the authors find the opposite effect: readers who make an account to read are more likely to make an edit (activate) if they are given the homepage and newcomer tasks.
    This is a cool finding! It suggests that the features need a more complicated roll-out system than "give to everyone" and should instead take into account what the user was doing at the time they created an account. It seems like an effective first pass would be just providing the newcomer features based on whether an edit window was open or not when the "create an account" link was clicked, but further research might show better ways to decide what kind of welcome would be most effective for a given account creation context. Wug·a·po·des 04:34, 31 March 2024 (UTC)[reply]
    There are surely lots of interesting remarks in the paper that had to be left out in the review (I do encourage folks to read the whole thing). But this particular conclusion was actually already summarized in the overall findings as quoted in the review: "Our intervention appears to distract newcomers who were already in the process of contributing, but seems to support those who were not, and in particular those who did not create an account with an intention to contribute."
    Either way, I agree that this is an interesting takeaway. I guess from a UX design perspective it's a good reminder that one should always be aware of the possible downsides and opportunity costs of a particular design feature (here: distracting users from their intended task).
    It's interesting that the team appears to have previously worked from somewhat different assumptions, if I read this 2021 post correctly:

    The Newcomer homepage was the second project built as a result of the usability testing of our earlier work, where we noticed a lot of test participants expected a dashboard or some homepage to orient themselves.
    [...]
    Based on foundational research and cumulative feedback from previous experiments, we knew that many newcomers either arrive with a very challenging edit in mind (e.g., write a new article from scratch) from which failure becomes a demoralising dead-end, or else they don’t have any edits in mind at all, and never find their footing as contributors. The “Suggested edits” module offers tasks ranging from easy tasks (like copyediting or adding links), to harder tasks (like adding citations or expanding articles) that newcomers can filter to based on their specific interests (e.g., copyedit articles about “Food and drink”). This feature is intended to cater to both groups – a clear way to start editing for those who don’t know what they want to do, and for those who do have a difficult edit in mind, it provides a learning pathway to build up their skills first.

    I.e. in 2021 they assumed that there basically exist two relevant groups of newbies: those who need to be encouraged to edit, and those who already want to make an edit but need to be discouraged from overly ambitious tasks and steered to easier edits first, both of which would be helped by the Newcomer Homepage. Per earlier parts of the same post, the foundational research that this had been based on was qualitative in nature, focusing on user interviews and constructing user personas. The quantitative results in the paper show that this was not the full picture, and vindicate the notion that such assumptions must be tested with the actual user base. The 2021 post also highlighted, commendably, that "[a] very important work principle we try to keep in mind is that experiments fail. When we see something is not working, we try to make a clean stop and take learnings from that failure to make the next experiment better."
    PS: It's also interesting that the 2021 post reported quantitative conclusions that are very different from those of the later paper:

    Since its launch, and as we added more editing guidance and improvements to the homepage design over multiple iterations and variant tests (a subject for another post), we’ve seen steady increases in edits and number of editors using it. In other words, the release of this experimental feature ["Suggested edits"] saw us succeed in increasing editor activation and retention. 🙌 🙌

    HaeB (talk) 06:06, 31 March 2024 (UTC)[reply]
  • If you are curious about the Homepage, you can test it at special:Homepage (which might require activating it in your Preferences). The coordination around Growth features at English Wikipedia is at Wikipedia:Growth Team features. Trizek_(WMF) (talk) 07:58, 2 April 2024 (UTC)[reply]
  • Thanks, @HaeB, for summarizing the Increasing Participation in Peer Production Communities with the Newcomer Homepage paper for the Signpost. That paper was published in 2023, but from experiments run in early 2022. In the last couple of years the Growth team has completed several other projects and research experiments, many of which have had more promising results: Add a Link, Add an Image, and Leveling Up.
    I think it's worth highlighting that the Growth team almost always makes improvements to features once we have initial experiment findings.  Sometimes we run a secondary experiment after making those improvements, but sometimes we don’t have the time and resources to run a complete follow-up experiment. For example, in this experiment we saw that Growth features were clearly helping certain user groups, but seemed to distract people who were already mid-edit when they created an account.  We then implemented a change so these users no longer receive a Welcome survey or prompt to visit their Newcomer Homepage while mid-edit (T310320).  So, as Wugapodes suggests, we now take into account what the user was doing at the time they created an account. We’ve always hoped to eventually personalize onboarding further, for example, the onboarding process for someone who signed up to “create a new article” should likely look different than for someone who signed up to “fix a typo or error in a Wikipedia article.” That work isn’t on our immediate roadmap, but we have many ideas shared in Phabricator (T353301 & T229865). We welcome further ideas about new editor onboarding and retention on the Growth team’s Talk page or join the English Wikipedia conversation at Wikipedia:Growth_Team_features. Thanks, - KStoller-WMF (talk) 17:51, 2 April 2024 (UTC)[reply]
