Better than Average: The Strengths and Weaknesses of the Arithmetic Mean

The average, or arithmetic mean, enjoys a prominent position in most analysis toolkits, and for good reason. It’s a simple and elegant way to compress a large data set into a single, descriptive number. It’s ever-present in the alphabet soup of “key performance indicators” that highlight most gaming dashboards: ARPU (average revenue per user), ARPPU (average revenue per paying user), and average session length, to name a few. But the average is not without its limitations. If your data set is not evenly or predictably distributed, an average might not give you a full picture, leading to misunderstandings of how players engage with your game. If you understand these limitations, however, you’ll be well-equipped to both avoid analysis pitfalls and address them with complementary data visualization techniques.

Distribution of Game Data

The chart below displays the number of daily active users (DAU) in a game over the past 30 days. Despite the minor fluctuations in game activity, this distribution is fairly uniform between the days. It’s close to an ideal candidate for an average: very little information is lost by compressing thirty days into a single number (the horizontal orange line).

Chart of daily active users

However, many data distributions in gaming are not uniform. The attritive nature of gaming ensures that most players won’t see end-game content, and that the most engaged players will use game systems orders of magnitude more than the dabblers. This often results in data distributions that appear to follow a power law, as seen in the chart below.

Chart showing battles fought distribution

This chart displays an ordered list of the number of battles fought by players who were active in one of our published games on May 28th. 65% of the active players played 20 or fewer battles on this day. A histogram provides a different look at the same distribution:

Chart showing battles fought histogram

Although the data is heavily clustered around a low number of battles, there is also a long tail of extremely active players, some of whom fought as many as 600 battles that day.
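Both views of the distribution described above can be sketched in a few lines of Python. This is only an illustration: the `battles` list is made up for the example (it is not the game's data), and `share_at_or_below` and `histogram` are hypothetical helper names.

```python
from collections import Counter

# Hypothetical battles fought per active player on one day (invented values,
# chosen only to show a cluster of low counts plus a long tail).
battles = [1, 2, 3, 3, 5, 8, 8, 12, 15, 18, 19, 20, 20,
           24, 31, 47, 60, 95, 210, 600]


def share_at_or_below(values, threshold):
    """Fraction of values that are less than or equal to the threshold."""
    return sum(1 for v in values if v <= threshold) / len(values)


def histogram(values, bin_width):
    """Count values per bin, keyed by each bin's lower edge."""
    counts = Counter((v // bin_width) * bin_width for v in values)
    return dict(sorted(counts.items()))


low_share = share_at_or_below(battles, 20)
bins = histogram(battles, 50)

print(f"Players with 20 or fewer battles: {low_share:.0%}")
for lower_edge, count in bins.items():
    print(f"{lower_edge}-{lower_edge + 49} battles: {count} players")
```

With this sample, most players land in the first bin while a handful of bins far to the right hold one player each, which is the long-tail shape the histogram shows.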

A similar distribution can be seen in the size of guilds which participated in a particular game event:

Chart showing guild size distribution

Another Dimension

Distributions of this sort are commonplace in customer lifecycle data, and tend to be particularly pronounced in free-to-play gaming, where the difference in behavior from the least engaged users to the most engaged users is quite dramatic. The data for the primary monetization model for the industry is no exception, with typically fewer than 5% of players choosing to spend money in a given game. Even within this subset of spenders, we see a power-law-style distribution:

Chart showing lifetime spending distribution

This chart displays a ranked list of the lifetime value of all spenders in this game, from largest to smallest. It’s common in the industry to compress a list like this into a single average -- often called the average revenue per paying user (ARPPU) -- which in this case is $48.90. But flattening such a dramatically non-uniform set of data into a single figure removes an entire dimension of information from our data. That’s not necessarily a bad thing, since an average is often a conscious tradeoff of information richness in exchange for simplicity and elegance. But it is important to first understand the way that the data is distributed. Taking a blind average risks fundamentally misunderstanding your data.

Mean Friends

Visualizations are a great way to acquire this sort of understanding. Unfortunately, they’re not the most digestible thing to put on a business intelligence dashboard. Being exhaustive can be exhausting, and an infinitely descriptive yet unread chart may as well not exist. Don’t despair, though! Here are a few tricks for succinctly supplementing averages of non-uniform data sets.

Median

There’s a reason that we all are familiar with the concept of a median. It’s simple, efficient, and pairs very well with an average. Think of it as a keyhole view into a comprehensive distribution chart; rather than looking at every ordered data point, the median just grabs the middle value. Let’s grab the median for our set of lifetime spending data:

Median: $15
Mean (ARPPU): $48.90

Since our ARPPU is substantially greater than our median, we know that our distribution of spending data is concentrated around smaller lifetime purchase amounts, with a long tail (or perhaps a couple of outliers) that is pulling up the average.
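As a rough sketch of this comparison, the snippet below computes both figures with Python's standard `statistics` module. The `lifetime_spend` values are invented for illustration and are not the game's actual spending data; they were simply picked so that one large spender pulls the mean well above the median.

```python
from statistics import mean, median

# Hypothetical lifetime spending per paying player, in dollars (invented values).
lifetime_spend = [2, 5, 5, 10, 10, 15, 15, 20, 25, 40, 60, 380]

arppu = mean(lifetime_spend)    # the average is sensitive to the $380 spender
mid = median(lifetime_spend)    # the middle of the ranked list is not
ratio = arppu / mid

print(f"Mean (ARPPU): ${arppu:.2f}")
print(f"Median: ${mid:.2f}")
print(f"Mean / median: {ratio:.1f}")
```

A mean-to-median ratio well above 1 is a quick hint that the distribution has a long right tail and that the average alone may be misleading.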

Percentile Distribution

If you’d like to be more descriptive than the median, a quartile or decile distribution can be thought of as an abridged version of a full distribution. It involves segmenting a ranked distribution of the data into a number of equally sized buckets. Let’s go with four buckets. From here, we can take both the minimum and average of each of these buckets:

Lifetime spending, quartile minimum

Lifetime spending, quartile average
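One possible way to build these bucket summaries is sketched below. It assumes the simplest bucketing rule (rank the values, then cut the ranked list into equal-sized slices); `bucket_summary` is a hypothetical helper, and the spending values are invented for the example.

```python
from statistics import mean


def bucket_summary(values, buckets=4):
    """Rank the values, split them into equal-sized buckets, and return
    a (minimum, average) pair for each non-empty bucket."""
    ranked = sorted(values)
    n = len(ranked)
    summary = []
    for i in range(buckets):
        # Integer arithmetic keeps bucket sizes within one item of each other.
        chunk = ranked[i * n // buckets:(i + 1) * n // buckets]
        if chunk:
            summary.append((chunk[0], mean(chunk)))
    return summary


# Hypothetical lifetime spending per paying player, in dollars (invented values).
lifetime_spend = [2, 5, 5, 10, 10, 15, 15, 20, 25, 40, 60, 380]

quartiles = bucket_summary(lifetime_spend, buckets=4)
for rank, (low, avg) in enumerate(quartiles, start=1):
    print(f"Quartile {rank}: minimum ${low}, average ${avg:.2f}")
```

Passing `buckets=10` gives a decile view instead. In this sample the top quartile's average sits far above its minimum, which shows that the long tail lives inside that last bucket.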

Other suggestions

When dealing with long-tailed distributions, it can be helpful to remove outliers, or at least to be aware of their presence.
Another way to inure yourself to outliers is to use binary statistics. For example, in addition to calculating the average number of sessions per day, you might also consider measuring the portion of active users who record a second daily session. The second measure will not be overly influenced by a player who records hundreds of daily logins.
A bounded average can help you zero in on the specific band of activity that interests you. For example, if you’re curious about the difficulty tuning of a limited time event in your game, you might consider only measuring the average progress of users who reach a particular milestone.
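The suggestions above can be sketched together on one small sample. The session counts are invented, `bounded_mean` is a hypothetical helper, and the 1 to 10 band is an arbitrary choice for the example rather than a recommended cutoff.

```python
from statistics import mean

# Hypothetical sessions per active user for one day (invented values);
# the last player is an outlier with hundreds of logins.
sessions = [1, 1, 1, 1, 2, 2, 3, 3, 4, 250]

# Plain average: pulled far upward by the single outlier.
avg_sessions = mean(sessions)

# Binary statistic: share of active users who recorded a second session.
# The outlier counts once, exactly like any other returning player.
share_with_second_session = sum(1 for s in sessions if s >= 2) / len(sessions)


def bounded_mean(values, low, high):
    """Average only the values inside [low, high]; None if the band is empty."""
    in_band = [v for v in values if low <= v <= high]
    return mean(in_band) if in_band else None


# Bounded average: restrict the mean to the band of activity of interest.
typical_sessions = bounded_mean(sessions, 1, 10)

print(f"Average sessions per user: {avg_sessions:.1f}")
print(f"Users with a second session: {share_with_second_session:.0%}")
print(f"Average sessions (1 to 10 band): {typical_sessions:.1f}")
```

Here the plain average suggests a few dozen sessions per user, while the binary and bounded measures describe what most players actually did.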