Growing up in New York City, I attended one of the well-known “science and math” public high schools. I find that the core science skills I learned as a teenager are often more useful to my career than the years of expensive business education which followed. There’s one nuance in particular that I see my MBA colleagues trip over time and again, that of significant digits.
The concept of significant digits is simple: your calculation result is only as accurate as the least accurate input to the model. It makes sense intuitively, and it is a fundamental part of scientific measurement. But in the business world there is a strong bias toward certainty — no one ever got a raise by saying “I don’t know.”
Let’s see how this concept applies in practice. Suppose you’re a biologist trying to estimate the number of cells in a sample of liquid. Using precision instruments, you measure the volume of the sample as 1.23 milliliters. You then look through a microscope to count the number of cells in a 0.01-milliliter drop. You see about 5 cells. Scanning through other drops, you consistently see about 5 cells. How many cells are there in the total sample?
If you said 615 (5 * 1.23/0.01) try again. The answer is 600.
Let me explain. 1.23 milliliters is a measurement with three significant digits. It is distinct from 1.2 and from 1.3 because we have more significant information about this datum, which can be expressed in an additional decimal place. If the lab owned a more accurate tool, you might have 1.22932 milliliters or 1.234684 milliliters in the sample, but it doesn’t, so you don’t. You then take your three-digit datum and commingle it with a less accurate measurement of 5 cells per 0.01-milliliter drop. “5” has one significant digit. Presumably, if you could measure more accurately, you could determine that the average drop had 4.9 or 5.1 cells, but once again, that’s not the data we’ve been given. So what happens when you extrapolate an estimate from two measurements of differing certainty? The result can only be as certain as the less certain input, thus the answer “600”. That answer is more accurate than “615”, which carries the illusion of greater precision.
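The arithmetic above is easy to sketch in Python. This is my own illustration, not anything from a real lab workflow; `round_sig` is a small helper I’m introducing here for rounding to a given number of significant digits:

```python
from math import floor, log10

def round_sig(x, sig=1):
    """Round x to the given number of significant digits."""
    if x == 0:
        return 0.0
    return round(x, sig - 1 - floor(log10(abs(x))))

volume_ml = 1.23        # three significant digits
cells_per_drop = 5      # one significant digit
drop_ml = 0.01          # volume of one drop

raw = cells_per_drop * volume_ml / drop_ml   # ~615, falsely precise
reported = round_sig(raw, sig=1)             # 600.0, honest to one digit
print(raw, reported)
```

The result keeps only as many significant digits as the least precise input — one, from the cell count — which is exactly the “600 not 615” rule in the text.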
This is the part that really baffles people. If we’ve got an additional measurement point, why is it more accurate to discard it? Isn’t “600” just an estimate, while “615” is closer to the truth? The answer is that they’re both estimates, based on our best ability to measure the inputs. The difference is that “600” includes an embedded acknowledgment that it is an estimate, while “615” attempts to show accuracy where none exists.
Here’s another way to look at it. Suppose that the “true” value of cells/drop was actually 4.61. This would be consistent with a single-digit estimate of 5, but would give us a three-digit estimate of cells in the sample of 567 — quite a ways from our 615 “more accurate” estimate!
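That sensitivity is easy to check numerically. A quick sketch (my own illustration, extending the example above): any true density between roughly 4.5 and 5.5 cells per drop would have been recorded as “5”, and that one-digit uncertainty swings the total by more than a hundred cells.

```python
volume_ml = 1.23
drop_ml = 0.01

# Any true density in [4.5, 5.5) would have been counted as "5 cells".
low = 4.5 * volume_ml / drop_ml    # ~553.5
high = 5.5 * volume_ml / drop_ml   # ~676.5
print(low, high)
```

The “precise” answer of 615 sits somewhere inside that wide band, which is why reporting it to three digits overstates what we actually know.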
So after reading this far, anyone with a science background is cringing at my armchair-labcoat analogies (I know there are more than 5 cells in a 0.01 ml drop; I was making a point), and anyone with a business background is waiting for me to tell them something useful. OK, then, on to Excel.
Excel is a many-splendored thing. I personally use Excel for almost everything I do at work, including tasks better suited to databases or Word, simply because Microsoft’s spreadsheet program stands so far above their other products. New York real estate is over-priced because of 23-year-olds who can use Excel to make toast in the morning. Excel might be the most perfect software program ever written.
But Excel doesn’t have a concept of significant digits. It just gives an answer.
Let’s use another example, this time from business. You’re in M&A at a white shoe I-Bank (ha!) and have been working on a valuation model all night for the MD. The target company has the following stats:
- Last year’s revenue: $123.5 million
- Projected revenue CAGR, next 5 years: 14.5%
- EBITDA margin: 26%
- Cost of capital: 11.2%
- Perpetual growth rate after 5 years: 4%
The model spits out a $945 million valuation. Great, but wrong.
The real valuation your model should have given you is $900 million. The 4% growth rate has only one digit of certainty (and if you’ve studied valuation you know it’s totally made up anyway). When you show the model to your boss and say “$945,” you’re making a fundamental mistake and over-valuing the company by $45 million. But since $945 million is almost halfway to $1 billion, isn’t it arbitrary that we’re rounding down instead of up? No! If you can make a case that the perpetual growth rate is actually 4.1% or 4.2%, go ahead and do that. If you can’t, then you’re stuck with the model’s output.
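The same one-significant-digit rounding, sketched in Python. This is my illustration only: the post doesn’t show the DCF model itself, so I’m just rounding its stated outputs with a helper function of my own (`round_sig`), not reproducing the valuation.

```python
from math import floor, log10

def round_sig(x, sig=1):
    """Round x to the given number of significant digits."""
    if x == 0:
        return 0.0
    return round(x, sig - 1 - floor(log10(abs(x))))

print(round_sig(945))     # $945 million rounds to 900.0
print(round_sig(14.75))   # $14.75 billion rounds to 10.0
```

One line of rounding at the end of a model is cheap insurance against presenting three digits of precision that the 4% assumption never supported.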
It may be that all the smart managers automatically bias their interpretation of data to include inaccuracy. When the Managing Director studies your model and sees the $945 result, he may actually say to himself “the right number is somewhere north of $900.” But what about when the number is $14.75 billion? Do the decision-makers really see that number for what it is — $10 billion?
I used a financial model because it illustrates most concretely the risk of over-relying on the accuracy of your estimates. But this same principle applies to all measurements in business. And not once in my 10 years of work experience have I heard anyone make note of the concept of significant digits.
Comments from the Old Blog
Very interesting article. But, I think you glossed over an important point about significant digits. It’s not that the numbers you cited (600 organisms & $900M) are more “correct” than the fully expanded numbers, it’s just that the last significant digit is all you can predict with accuracy. The correct number can only be found by better numbers. Also, I think that we in business often use shorthand. For example your Perpetual Growth Rate might actually be 4.0%. . .
Also, I’ve seen many a revenue model that’s a shot in the dark, but what’s the remedy: we’re always going to be using estimates. Maybe we should just give our managers an order of magnitude when she asks for next year’s numbers =)
Posted by: Mark Johnson on April 4, 2006 01:40 AM
Your explanation of sig figs (as we used to call them) has a significant mistake. The right way to express your answer is 6*10^3. That indicates that the answer is six hundred, and that that answer is significant to only the first figure. You don’t know how many sig figs there are in the number when you express it as 600–you might have two (which would be properly represented as 6.0*10^3) or three (6.00*10^3). Technically, the representation 600 implies 3 sig figs, which is the opposite of what you wanted.
It’s also worth mentioning that you don’t do that rounding until you’re done calculating, or your ultimate answer is less accurate/precise than it would be otherwise. In other words, if you’re going to multiply that 615 by something else, you don’t round the 615; you round the final number that you’re going to report.
Posted by: Dan on May 14, 2006 02:15 PM
Sig figs, hear hear. Signed, Indiana Academy for Science, Mathematics, & Humanities alumna.
Posted by: ANP on May 22, 2006 09:59 PM
“The difference is that “600” includes an embedded acknowledgment that it is an estimate, while “615” attempts to show accuracy where none exists.”
Isn’t this anthropomorphizing the numbers a little? How would the consumer of this information, let’s say a manager who is only looking at the final answer, know that “600” is such an honest little number whereas “615” is so devious?
If accuracy is important why not handle it by giving a margin of error, e.g. if 5 could be anywhere from 4.6 to 5.4 then calculate using your most accurate raw numbers (the 1.23 number which went to 3 sig figs) then use the accuracy range on “5” to figure out a margin of error? Seems better than dumbing down the whole equation and then slapping a magical “embedded acknowledgment” label on it.
Excel the most perfect software in the world? I’m sure it’s good for people who make toast with it all day long, but it’s got wretched discoverability. Think of the poor user who tries to perform the simple task of inserting a line break in text within a cell. They must somehow figure out to hit not “Return” not even “ctrl-Return” but “ALT-Return”.
Posted by: Roy Zornow on August 30, 2006 10:22 PM
Readers might be interested to know of my own blog (sciencetext.com) which discusses the problem of sig figs and how they and units and conversions are abused in the media.
Posted by: David Bradley on February 28, 2007 01:30 PM