Five Things You Need to Know: Lies, Damn Lies and Graphs That Don't Use the Logarithmic Scale
And other tricks they use to make you believe their phony graphs
During a stint at Minyanville, a financial news and education media company, I was fortunate to work under the tutelage of a talented editor named Kevin Depew. Depew is now the deputy chief economist at consulting firm RSM, and wrote prolifically during the Great Financial Crisis, brilliantly capturing the mood of America during that trying time. His "Five Things You Need to Know" was required reading every week, and covered topics from high finance to macroeconomics to social trends. This series is an homage to "Pep," who once quipped "I majored in philosophy, but since none of the big philosophy firms were hiring, I went into finance."
I'm going to take a break from pure real estate stuff this week to explain a simple trick that analysts, journalists and economists can use to twist data into doing their bidding.
In their defense, many don't even know they’re doing anything wrong.
But what is at best statistical ignorance and at worst willful disinformation has a tremendous impact on the public discourse.
The statistics that are suitable for public consumption, and therefore shape public opinion and impact policy, are in visual form: Graphs.
So if you want to be one of the few who isn’t wowed by a snazzy graph meant to convince you of a story that might not be there, this breakdown is for you.
First some definitions.
1) Sorry, what's wrong with my graph?
Dust off that middle school math and recall that graphing data in two dimensions requires plotting a set of observations on two axes, the X (horizontal) and Y (vertical).
(I still sometimes forget which is which, and of the many tricks the easiest I've found is that Y is a tall letter, so Y is vertical. You’re welcome.)
You've got to put a scale on both the X and Y axes, primarily so all your data fits on the graph.
So if one afternoon you're casually graphing total outstanding US government debt over time, you'll want the X axis to measure the passage of time in years, and the Y axis to run upwards a few trillion dollars every tick mark.
Plow this data into Excel and we get something that looks like this.
Scary right?
The data shown like this conforms to the narrative that US government spending is spiraling out of control. I mean look at that graph!
But what if you wanted to know whether spending actually was spiraling out of control, not just whether it looked that way?
2) Wait, there's another option?
Since graphs are intended to help us, the viewers, interpret a given set of data, the way a graph looks is perhaps as important as the data themselves.
Usually we space out the numbers along the side of the axis, or our scale, at even intervals because it looks nice and the math is easy. In the graph above, the X axis uses two-year intervals, and up the Y axis I use intervals of $5 trillion.
This is called the "linear scale" and is the way the vast majority of the graphs that we see are shown.
This is innocent enough and often the correct way to display a given data set. But as we'll see below, simple changes to a linear scale can have a dramatic impact on what the graph visually suggests.
Grokking the alternative to the linear scale unfortunately requires a bit of abstract mathematical thinking, so bear with me.
A logarithmic scale (as opposed to a linear one) is a set of numbers that does not increase by the same amount at each interval. Instead, each interval multiplies the previous number by the same factor, so the gap between successive numbers keeps growing. We'll see why this matters a bit further down.
As we saw above, the scale 5, 10, 15, 20, etc is linear.
But a scale of 1, 2, 4, 8, 16, 32, etc would be logarithmic, since each number is double the one before it.
A commonly cited logarithmic scale is the Richter Scale, used to measure the strength of earthquakes.
As those of us from California should remember, an earthquake measured at 5.0 on the Richter Scale produces shaking with 10 times the amplitude of one measured at 4.0 (rather than 25% more, as it would if the scale were linear).
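For readers who would rather see the two kinds of scale built than described, here is a minimal Python sketch. The function names are my own, made up for illustration, and the Richter helper assumes the textbook definition in which each whole magnitude step means a tenfold change in measured amplitude.

```python
def linear_ticks(start, step, count):
    """Tick marks that ADD the same amount at every interval."""
    return [start + step * i for i in range(count)]


def log_ticks(start, factor, count):
    """Tick marks that MULTIPLY by the same factor at every interval."""
    return [start * factor ** i for i in range(count)]


def amplitude_ratio(mag_a, mag_b):
    """How many times larger the amplitude is at mag_a than at mag_b,
    assuming a base-10 magnitude scale (hypothetical helper)."""
    return 10 ** (mag_a - mag_b)


linear = linear_ticks(5, 5, 4)    # 5, 10, 15, 20
doubling = log_ticks(1, 2, 6)     # 1, 2, 4, 8, 16, 32

print("linear scale:     ", linear)
print("logarithmic scale:", doubling)
print("5.0 vs 4.0 quake: ", amplitude_ratio(5.0, 4.0), "times the amplitude")
```

On the linear scale the gap between neighbors is always 5. On the doubling scale the gap itself doubles each time, but the ratio between neighbors never changes.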
Now let’s look at our data above in two different ways.
Here is the same graph with two simple changes: First, I stopped the scale of the Y axis at $35 trillion instead of $50 trillion to give the impression that US government debt had reached some sort of imaginary limit.
Second, I squished the graph to be more of a square, making the slope of the curve look more extreme.
Even scarier right?
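To put a rough number on how much those two cosmetic changes matter, the sketch below works out the on-screen angle of one stretch of the line before and after. Only the $50 trillion and $35 trillion axis limits come from the graphs above. The pixel dimensions, the 25-year span and the "$10 trillion in 5 years" segment are hypothetical values picked purely for illustration.

```python
import math


def apparent_angle(rise_value, run_years, y_max, x_span_years,
                   height_px, width_px):
    """Angle in degrees that a line segment makes on screen, given the
    axis limits and the plot's pixel dimensions (Y axis starts at zero)."""
    rise_px = rise_value / y_max * height_px
    run_px = run_years / x_span_years * width_px
    return math.degrees(math.atan2(rise_px, run_px))


# Hypothetical segment: debt rises $10 trillion over 5 years
# on a chart that covers 25 years in total.
original = apparent_angle(10, 5, y_max=50, x_span_years=25,
                          height_px=400, width_px=800)
squished = apparent_angle(10, 5, y_max=35, x_span_years=25,
                          height_px=400, width_px=400)

print(f"wide chart, $50T axis:   {original:.0f} degrees")
print(f"square chart, $35T axis: {squished:.0f} degrees")
```

Same data, same segment, yet under these assumed dimensions the line goes from a slope of roughly 27 degrees to roughly 55 degrees. Nothing about the debt changed, only the picture frame.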
Now let's look at the original linear graph, but shown with the Y (Vertical) axis as a logarithmic scale.
Excel does a nifty job of doing it for me, here:
The logarithmic scale on the Y axis rises not linearly but exponentially, so that it measures the relative change over time.
Tells a very different story, right? Not exactly a pretty picture but much less frightening.
Logarithmic scales set their intervals by exponential change, rather than evenly distributed intervals, to better match data as they move through time. They are particularly useful for measuring changes over long periods of time, thanks in part to that powerful law of compounding.
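Here is a small sketch of why compounding and logarithmic scales get along so well. The series is synthetic (a made-up starting value of 100 growing at a made-up 7% a year), not government data. The point is that a constant growth rate produces ever-larger steps in dollars but identical steps once you take the logarithm, which is exactly the spacing a logarithmic axis uses.

```python
import math


def compound(start, rate, years):
    """Value at the end of each year under a constant annual growth rate."""
    return [start * (1 + rate) ** t for t in range(years + 1)]


values = compound(100.0, 0.07, 30)  # synthetic series, for illustration only

# Year-over-year change in raw units: what a linear axis shows.
linear_steps = [b - a for a, b in zip(values, values[1:])]

# Year-over-year change in log10 units: what a logarithmic axis shows.
log_steps = [math.log10(b) - math.log10(a) for a, b in zip(values, values[1:])]

print(f"first linear step: {linear_steps[0]:.2f}")
print(f"last linear step:  {linear_steps[-1]:.2f}")
print(f"first log step:    {log_steps[0]:.4f}")
print(f"last log step:     {log_steps[-1]:.4f}")
```

The linear steps balloon over 30 years while the log steps stay put, so steady compounding draws a hockey stick on a linear axis and a straight line on a logarithmic one. If you plot in Python rather than Excel, matplotlib's `ax.set_yscale("log")` applies the same transformation to the axis for you.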
3) Well crap, that sounds complicated.
Sure, but it doesn't have to be.
Let’s examine why a logarithmic scale presents a more accurate picture of the growth in US government debt than the linear one we are used to seeing.
The linear graph paints a picture of government debt that is spiraling out of control while the logarithmic one looks like a steady march upward.
Which is more accurate?
Consider how long the public debt took to reach each milestone since the year 2000:
$5 trillion - $10 trillion: (2000 - 2009, or 9 years)
$10 trillion - $20 trillion: (2009 - 2016, or 7 years)
$20 trillion - $30 trillion: (2016 - 2022, or 6 years, and a 50% increase rather than a full doubling)
Certainly growing quicker in dollar terms in the past decade or so, but looking at the data in this way - which is what the logarithmic scale does - paints a picture that is more of a giant swell rolling across the ocean than a tsunami racing towards shore.
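The milestones above can be turned into annual growth rates with one line of arithmetic. The sketch below uses the rounded dollar figures and year counts exactly as listed, so treat the results as ballpark numbers rather than official statistics. The function name is my own.

```python
def annual_growth(start, end, years):
    """Compound annual growth rate implied by moving from start to end."""
    return (end / start) ** (1 / years) - 1


# (start in $ trillions, end in $ trillions, years), as listed above
periods = [(5, 10, 9), (10, 20, 7), (20, 30, 6)]

rates = [annual_growth(start, end, years) for start, end, years in periods]

for (start, end, years), rate in zip(periods, rates):
    print(f"${start}T to ${end}T over {years} years: {rate:.1%} a year")
```

On these rounded figures the three stretches work out to roughly 8%, 10% and 7% a year. Note that the last stretch adds $10 trillion in the shortest time yet has the lowest growth rate of the three, because $10 trillion is a smaller share of $20 trillion than of $10 trillion. That is the relative view a logarithmic scale gives you and a linear one hides.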
No doubt, since 2020 the national debt has been accelerating at the fastest pace in recent memory, but a logarithmic scale with a trend line shows that this acceleration is a recent phenomenon - which is hard to gauge with the linear scale.
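For the curious, a trend line on a logarithmic scale is just an ordinary straight-line fit to the logarithm of the data. The sketch below is one minimal way to do it, not necessarily what Excel does under the hood. The series is synthetic (a made-up index growing 5% a year from 2000 to 2010), and the function names are hypothetical.

```python
import math


def log_trend(years, values):
    """Least-squares straight line through (year, ln(value)).
    Returns (slope, intercept) in natural-log units."""
    logs = [math.log(v) for v in values]
    n = len(years)
    mean_x = sum(years) / n
    mean_y = sum(logs) / n
    sxx = sum((x - mean_x) ** 2 for x in years)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(years, logs))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def ratio_to_trend(year, value, slope, intercept):
    """Actual value divided by the trend value: 1.0 means on trend."""
    return value / math.exp(intercept + slope * year)


# Synthetic series for illustration only: 5% growth a year from 100.
years = list(range(2000, 2011))
values = [100.0 * 1.05 ** (y - 2000) for y in years]

slope, intercept = log_trend(years, values)
trend_growth = math.exp(slope) - 1  # slope converted back to an annual rate

print(f"trend growth rate: {trend_growth:.1%}")
print(f"2010 vs trend:     {ratio_to_trend(2010, values[-1], slope, intercept):.2f}")
```

With real data the ratio drifts above or below 1.0, and that drift is what tells you whether a recent run-up is a break from the long-term trend or just a return to it.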
4) So, why even bother with linear then?
Linear scales are often the correct way to look at a data series. And since this is purportedly a newsletter about real estate, let's look at some real estate data.
Consider the following data set which graphs apartment vacancy, courtesy of ApartmentList.com.
Here we have a measure of vacancy which oscillates over time and does not compound, so linear is the correct scale to use and logarithmic simply squishes the data.
5) Fine, but does it really matter that much?
Quite a lot actually.
The title of this post is clearly a play on the oft-quoted “there are lies, damn lies and statistics.”
And the statistics that are suitable for public consumption, and therefore shape public opinion and impact policy, come in visual form: Graphs.
Consider the following presentation on a linear scale of the change in median home price from 2009 to 2019.
From the graph, we can see that the Federal Reserve’s accommodative monetary policy in the wake of the Great Financial Crisis spurred strong home price appreciation. So it’s reasonable to assume that ratcheting up interest rates could cool an overheated market.
But what if we pull back and look at the data with more historic context?
Within the framework of the past 30 years, the rise in home prices from 2009 - 2019 looks a lot more like 1995 - 2001, when rates were generally above 7.5% (vs sub-5% from 2009 - 2019).
Which changes our interpretation of the impact Fed policy may have on home prices.
Pull back even further, switch to a logarithmic scale and add a trend line, and we can see that it took 10 years during that housing “boom” of 2009 - 2019 just for home prices to get back to the long term trend.
Not exactly what I would call a boom.
And in fact, home prices rose during that time at a slower pace than the average from 1975 - 2023. (hat tip @EstateBets1135)
Which makes you wonder a lot less why jacking up interest rates hasn’t cooled home price appreciation much over the past couple years.
Long-term movements in prices, whether home prices or stocks, are most vulnerable to the distortion of using a linear rather than logarithmic scale.
Layer on top of that how easy linear scales are to manipulate to exaggerate or even distort the takeaway from a given dataset, and we start to wonder if any of the graphical interpretations of data we see are unbiased representations of the true mathematical trends.