The Signpost

Tips and tricks

XTools: Data analytics for your list of created articles

Experienced Wikipedians often have a long list of articles they've created. But what do they know about those articles? How can they get metrics or analytics to keep track of their collection?

XTools provides some insights into the list of articles a user has created.[1] The Pageviews API gives the number of pageviews for each of those articles.[2] But what about the gender distribution of the biographies I've created? What are the main occupations of the people I've written about? Where are the places I've created articles about? And as for the content of the articles: which is the longest? Which has the most references?

By using the XTools pages-created API, I've developed a set of new tools to answer all those questions.[3]
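As an illustration of that first step, here is a minimal sketch of how the list of created articles might be requested. It is not the code behind the tools themselves. The base URL, the route layout (`user/pages/{project}/{username}/{namespace}`) and the `pages` / `page_title` field names are assumptions about the XTools API that should be checked against its documentation, and "Jane Doe" is a hypothetical username.

```javascript
// Assumed base URL of the XTools API.
const XTOOLS_API = "https://xtools.wmcloud.org/api";

// Build the request URL for the pages created by `username` on `project`
// (for example "fr.wikipedia.org"). Namespace 0 is the article namespace.
// The route layout is an assumption, not confirmed by the article.
function pagesCreatedUrl(project, username, namespace = 0) {
  const parts = [project, username, String(namespace)];
  return `${XTOOLS_API}/user/pages/` + parts.map(encodeURIComponent).join("/");
}

// Pull readable titles out of a response. The shape is assumed: `pages` may be
// a flat array or an object keyed by namespace, each entry with `page_title`.
function extractTitles(response) {
  const pages = response.pages ?? [];
  const list = Array.isArray(pages) ? pages : Object.values(pages).flat();
  return list.map((page) => page.page_title.replace(/_/g, " "));
}

// Fetch and parse in one go (needs a runtime with a global `fetch`).
async function fetchCreatedTitles(project, username) {
  const res = await fetch(pagesCreatedUrl(project, username));
  if (!res.ok) throw new Error(`XTools request failed: ${res.status}`);
  return extractTitles(await res.json());
}
```

Calling `fetchCreatedTitles("fr.wikipedia.org", "Jane Doe")` would then resolve to an array of titles that the later steps can work from.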

Screenshot of "User-level gender statistics for Wikipedia": Gender distribution of articles I've created in Wikipedia in French[4]

At first, I was very curious about the gender distribution of the people whose biographies I've created. So I used the Wikidata API to get the value of the property sex or gender (P21) for every item corresponding to an article a user has created. This first tool is named "User-level gender statistics for Wikipedia".[5]
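The lookup described above can be sketched roughly as follows. This is an illustration under stated assumptions, not the tool's actual code. It uses the Wikidata `wbgetentities` action to resolve article titles to items and read their P21 claims. The batch size of 50 reflects the usual per-request limit for that action, and the helper names (`chunk`, `entitiesUrl`, `genderOf`, `countGenders`) are made up for this example.

```javascript
const WIKIDATA_API = "https://www.wikidata.org/w/api.php";

// wbgetentities only accepts a limited number of titles per request,
// so the title list is split into batches first.
function chunk(items, size = 50) {
  const batches = [];
  for (let i = 0; i < items.length; i += size) batches.push(items.slice(i, i + size));
  return batches;
}

// URL asking Wikidata for the claims of the items linked to `titles`
// on a given wiki (`site` is a site ID such as "frwiki" or "enwiki").
function entitiesUrl(site, titles) {
  const params = new URLSearchParams({
    action: "wbgetentities",
    sites: site,
    titles: titles.join("|"),
    props: "claims",
    format: "json",
    origin: "*", // allows anonymous cross-origin requests from a browser
  });
  return `${WIKIDATA_API}?${params}`;
}

// Item ID of the first "sex or gender" (P21) value, or null if there is none.
function genderOf(entity) {
  return entity.claims?.P21?.[0]?.mainsnak?.datavalue?.value?.id ?? null;
}

// Tally P21 values over the `entities` object of a wbgetentities response.
function countGenders(entities) {
  const counts = {};
  for (const entity of Object.values(entities)) {
    const key = genderOf(entity) ?? "no P21";
    counts[key] = (counts[key] ?? 0) + 1;
  }
  return counts;
}
```

The keys of the result are Wikidata item IDs, for example Q6581072 (female) and Q6581097 (male). Items without a P21 claim, which includes every article that is not a biography, end up under "no P21".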

Screenshot of "Look at your list of created articles through Wikidata": Distribution of articles I've created in Wikipedia in French by instance of (P31).[6]

This tool can easily be extended to other Wikidata properties such as instance of (P31) and country (P17) (and for humans, country of citizenship (P27) and occupation (P106)). This led to another tool named "Look at your list of created articles through Wikidata".[7]
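Generalizing from one property to any item-valued property is mostly a matter of parameterizing the property ID. The sketch below assumes the entities have already been fetched with their claims (the claim layout is the standard Wikidata JSON one). `valuesOf` and `countByProperty` are hypothetical helper names for this example. Unlike gender, properties such as occupation (P106) often hold several values per item, so every value is counted.

```javascript
// All item IDs stated for `property` on one entity (empty array if none).
function valuesOf(entity, property) {
  return (entity.claims?.[property] ?? [])
    .map((claim) => claim.mainsnak?.datavalue?.value?.id)
    .filter(Boolean);
}

// Count how many entities carry each value of `property`,
// returned as [itemId, count] pairs, most frequent first.
function countByProperty(entities, property) {
  const counts = new Map();
  for (const entity of entities) {
    // A Set keeps an entity from being counted twice for a repeated value.
    for (const id of new Set(valuesOf(entity, property))) {
      counts.set(id, (counts.get(id) ?? 0) + 1);
    }
  }
  return [...counts].sort((a, b) => b[1] - a[1]);
}
```

With this, `countByProperty(entities, "P31")` gives a distribution by instance of, and `countByProperty(entities, "P106")` one by occupation. The item IDs still need to be turned into labels before display.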

Another tool provides a map of your articles related to geolocated Wikidata items, using property coordinate location (P625).[8]
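For the map, the coordinate values have to be turned into something a mapping library can draw. The following is a minimal sketch assuming entities fetched with their claims. It converts P625 values into a GeoJSON FeatureCollection, which most web mapping libraries accept. `coordinatesOf` and `toGeoJson` are illustrative names, and only the first coordinate claim of each item is used.

```javascript
// [longitude, latitude] from the first coordinate location (P625) claim,
// or null when the entity has no usable coordinates.
function coordinatesOf(entity) {
  const value = entity.claims?.P625?.[0]?.mainsnak?.datavalue?.value;
  if (!value || typeof value.latitude !== "number" || typeof value.longitude !== "number") {
    return null;
  }
  return [value.longitude, value.latitude]; // GeoJSON order is longitude, latitude
}

// Build a GeoJSON FeatureCollection with one point per geolocated entity.
function toGeoJson(entities) {
  const features = [];
  for (const entity of entities) {
    const coordinates = coordinatesOf(entity);
    if (coordinates) {
      features.push({
        type: "Feature",
        geometry: { type: "Point", coordinates },
        properties: { id: entity.id },
      });
    }
  }
  return { type: "FeatureCollection", features };
}
```

Entities without a P625 claim are simply left off the map.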

Screenshot of "Look at your list of created articles with the XTools Page Prose API": List of articles I've created in Wikipedia in English sorted by number of words and by number of references[9]

We can also gain insights about the content of our articles. The XTools page-prose API gives the number of words, references, unique references and sections in each article. So I've developed a notebook which computes this for all the articles created by a user.[10]
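A rough sketch of that computation, under stated assumptions: the route `page/prose/{project}/{title}` and the response fields `words` and `references` are assumptions about the XTools API, and `proseUrl` and `rankBy` are names invented for this example rather than taken from the notebook.

```javascript
// Assumed route of the XTools page-prose endpoint for one article.
function proseUrl(project, title) {
  const page = encodeURIComponent(title.replace(/ /g, "_"));
  return `https://xtools.wmcloud.org/api/page/prose/${encodeURIComponent(project)}/${page}`;
}

// Sort a copy of `rows` in descending order of a numeric field,
// treating a missing value as 0. The input array is left untouched.
function rankBy(rows, field) {
  return [...rows].sort((a, b) => (b[field] ?? 0) - (a[field] ?? 0));
}

// Fetch the prose statistics of every title, one request after another.
async function fetchProse(project, titles) {
  const rows = [];
  for (const title of titles) {
    const res = await fetch(proseUrl(project, title));
    if (!res.ok) throw new Error(`XTools request failed for ${title}: ${res.status}`);
    rows.push({ title, ...(await res.json()) });
  }
  return rows;
}
```

Once the rows are collected, `rankBy(rows, "words")[0]` would be the longest article and `rankBy(rows, "references")[0]` the one with the most references. The same helper works for any other numeric field, such as revision or editor counts.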

Screenshot of "Look at your list of created articles with the XTools Page ArticleInfo API": List of articles I've created in Wikipedia in English sorted by number of revisions and by number of editors.[11]

My last tool collects data about the number of revisions, the number of editors, the number of pageviews and the number of watchers for all of a user's articles, using the XTools articleinfo API.[12]

All my tools are written in JavaScript on Observable (a data visualization platform created by Melody Meckfessel and Mike Bostock), which makes it very easy to design interactive tools. One shortcoming is that you may run into timeout errors, since the tools rely on a large number of API calls; I can imagine that if you've created more than 2,000 articles, you'll see quite a few of them. All my work is open source, so feel free to improve it and suggest better solutions. And, of course, all feedback is greatly appreciated.
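For anyone who wants to experiment with the timeout problem, one common mitigation is to cap the number of requests in flight and retry failures with a growing delay. The sketch below is a generic pattern, not something the tools are known to do. `mapWithLimit` and its options (`limit`, `retries`, `delayMs`) are hypothetical names with arbitrary default values.

```javascript
// Run `task(item)` for every item with at most `limit` calls in flight.
// A failed call is retried up to `retries` times, waiting delayMs, then
// 2 * delayMs, then 4 * delayMs, and so on. Results keep the input order.
async function mapWithLimit(items, task, { limit = 5, retries = 2, delayMs = 500 } = {}) {
  const results = new Array(items.length);
  let next = 0; // index of the next item nobody has picked up yet

  async function attempt(item) {
    for (let tries = 0; ; tries++) {
      try {
        return await task(item);
      } catch (error) {
        if (tries >= retries) throw error; // give up after the last retry
        await new Promise((resolve) => setTimeout(resolve, delayMs * 2 ** tries));
      }
    }
  }

  // Each worker keeps taking the next unprocessed item until none are left.
  async function worker() {
    while (next < items.length) {
      const index = next++;
      results[index] = await attempt(items[index]);
    }
  }

  const workers = Array.from({ length: Math.min(limit, items.length) }, () => worker());
  await Promise.all(workers);
  return results;
}
```

A call such as `mapWithLimit(titles, (title) => fetch(urlFor(title)).then((r) => r.json()), { limit: 4 })`, where `urlFor` stands for whichever URL builder is in use, would then replace a plain `Promise.all` over all titles.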

References




Tips and Tricks is a general editing advice column written by experienced editors. If you have suggestions for a topic, or want to submit your own advice, follow these links and let us know (or comment below)!

Wikipedia:Wikipedia Signpost/2023-02-04/Tips_and_tricks