Ben Kuhn: Be less scared of overconfidence
2022 Nov 30
When I was deciding whether to work for Wave, I got very hung up on
the fact that my "total compensation" would be "lower."
The scare quotes are there because Wave and my previous employer, Theorem, were both early-stage
startups that were paying me mostly in fake startup bucks, a.k.a.
equity. To figure out the total compensation, I tried to guess how much
money the equity in each company was worth, with a thought process
something like:
- Both of these companies have been invested in by reputable, top-tier
venture capitalists.
- The market for for-profit investments is pretty efficient,
and most people who think they can do better are being
overconfident.
- Who am I, a lowly 22-year-old programmer, to disagree with reputable
top-tier venture capitalists? I should defer to them about the
valuations.
So I valued the equity by taking the valuation each company's VCs had
invested at, and multiplied it by the fraction of the company my shares
represented. That number was higher for Theorem than for Wave.
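(To make the naive arithmetic concrete, here's a minimal sketch of that
valuation method. All of the numbers are hypothetical, not the actual
Wave or Theorem figures.)

```python
# Naive "defer to the VCs" valuation: the company's last-round valuation
# times your ownership fraction. All numbers here are hypothetical.
def naive_equity_value(post_money_valuation: float,
                       your_shares: int, total_shares: int) -> float:
    return post_money_valuation * (your_shares / total_shares)

# e.g. a hypothetical $40M post-money round and 0.5% ownership:
print(naive_equity_value(40_000_000, 50_000, 10_000_000))  # 200000.0
```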
Seven years on, the Wave equity turned out to be... a
lot more valuable. That raises the question: how dumb was my take?
Was the actual outcome predictable if I'd thought about it in the right
way?
I don't think it was perfectly predictable, but I do think I
shouldn't have been that anchored to the market-efficiency reasoning.
Those reputable, top-tier VCs had YOLOed those valuations after a
couple one-hour meetings, because that's how early-stage VC works.
Meanwhile, I had worked at Theorem for a year and my then-partner had
worked at Wave for nine months. Heck, I had gotten more founder time
than those VCs had just during my interview process. I had way
more information than "the market."
If I'd had the confidence to use that information, I might have
thought something like:
- After its funding round, Wave continued to add users at one of the
fastest paces their investors had ever seen, whereas Theorem is
struggling to grow.
- Theorem is constrained by its ability to do sales, and the founders
don't seem to be acting with enough focus or urgency to unblock that
constraint. Instead, they're distracting themselves with things like
hiring machine learning interns (i.e. me).
- The founders of Wave seem much smarter, more relentlessly
resourceful, and more trustworthy.
- Given the above, I should value the Wave equity way more even though
its naive expected
value is less than the Theorem equity.
Fortunately, I chose Wave for other reasons. But this thought
pattern—throwing away most information in fear of using it to make
overconfident judgments—shows up all the time. I'm here to tell you why
I hate it.
In January 2020, my entire Twitter timeline was freaking out about a
novel-seeming respiratory disease spreading in Wuhan.
Part of me thought:
- All the reputable, top-tier technocrats are ridiculing the freaked-out
people.
- Usually, when a ragtag band of Internet weirdos thinks they know
better than a large group of reputable, top-tier technocrats, the
Internet weirdos are being overconfident.
- So the technocrats are probably right on this one.
Another part of me thought:
- Huh, the simple model of "this thing has a fast exponential growth
rate and spreads when people are asymptomatic so it's very hard to stop"
seems like a compelling reason to think things will be quite bad (see
the sketch after this list).
- When reputable, top-tier technocrats say not to freak out, they
don't usually address the best arguments in favor of freaking out, and
they often seem like they don't understand how exponential growth
works.
- Maybe I'll buy a lot of beans in case everything goes to shit.
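Here's a toy back-of-the-envelope version of that simple model. The
parameters (1,000 cases doubling every five days) are illustrative
assumptions, not actual epidemiological data:

```python
# Toy exponential growth: cases doubling every 5 days. Parameters are
# illustrative assumptions, not fitted epidemiological data.
initial_cases = 1_000
doubling_time_days = 5

for day in range(0, 91, 15):
    cases = initial_cases * 2 ** (day / doubling_time_days)
    print(f"day {day:2d}: ~{cases:,.0f} cases")

# day 90: ~262,144,000 -- under the naive model, "don't freak out"
# needs a strong counterargument, not ridicule.
```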
(I also contemplated the fact that the stock market didn't seem to be
freaking out, but I decided that since most
people can't beat the stock market, I probably wouldn't either. Some
braver souls than I bought
puts on the S&P 500 and made a killing.)
In both of these situations, I had some mental model of what was
going on ("this epidemic is growing exponentially," "this startup seems
good") based on the particulars of the situation, but instead of using
my internal model to make a prediction, I threw away all my knowledge of
the particulars and instead used a simple, easy-to-apply heuristic
("experts are usually right," "markets are efficient").
I frequently see people leaning heavily on this type
of low-information heuristic to make important decisions for
themselves, or to smack down overconfident-sounding ideas from other
people.
- "This startup is growing incredibly fast and the founders are some of
the most effective people I've ever met, but at their current VC
valuation, the total comp is lower than my Big Tech job, so I can't
justify the move."
- "I think I could have a big impact as an academic researcher, but most
grad students end up depressed and don't land a tenure-track position,
so it's not worth trying."
- "You're going to start a company? Are you aware that 90% of startups
fail? What makes you think you and your ragtag band of weirdos are the
chosen ones?"
- "Who are you to be sounding the alarm about a pandemic when every past
alarm has been false and all the reputable, top-tier experts say not to
worry?"
These all place way too much weight on the low-info
heuristic.
A heuristic like that can make a good starting point when you're not
an expert in an area and don't have very much time to think about it or
dig in. This is useful in theory, but in practice, people don't limit
these heuristics to that regime—they fall back on the same ones even in
high-context, high-investment situations, where it's silly to throw away
so much detailed context about the particulars.
What's worse, these low-info heuristics almost always push in the
direction of being less ambitious, because the low-info view of
any ambitious project is that it will fail (most projects run behind
schedule, most startups fail, most investors underperform the market,
etc.).
The problem is that the bad consequences of underconfidence and
under-ambition are severe but subtle, whereas the bad consequences of
overconfidence and wishful thinking are milder but more obvious. If
you're overconfident, you'll try things that fail, and people will laugh
at you. If you're underconfident, you'll avoid making risky bets, and
miss out on the potential upside, but nobody will know for sure what you
missed.
That means it's always tempting to do what the low-info heuristic
tells you and be less ambitious—but ultimately, that ends up being worse
for the world.
Why do people find low-info heuristics so compelling? A few potential
reasons:
- Many (most?) attempts to reason via specific details are wrong. Most
people who think "I'm going to beat the market" don't; most people who
think "I know better than all the experts" are less Balaji Srinivasan
and more Time Cube guy.
- The reasoning and evidence backing up low-info heuristics is
(relatively) legible and easily verifiable. If I claim "90% of startups
fail," I can often cite a study for support, whereas if I claim "the
markets aren't freaking out enough about COVID," I'd need to make a much
more complicated argument to explain my reasoning.
- It's relatively straightforward to reason with low-info heuristics
even when you're not an expert in the domain. For something like a
forecasting challenge, where forecasters need to make predictions across
a wide range of topics and can't possibly be an expert in all of them,
this is very important.
- Because it's much more objective, reasoning via low-info heuristics
gives you many fewer opportunities to fall prey to biases like optimism
bias, motivated reasoning, the planning fallacy, etc.
Those are all real advantages! Low-info heuristics are a great way to
be more-or-less right most of the time as a non-expert, and to limit
your vulnerability to overconfidence and wishful thinking.
The problem is that there are lots of ways that low-info heuristics
fail or can be improved on.
For example, the efficient
market hypothesis ("asset prices incorporate all available
information, so it's hard to beat the market" used in the above example
to infer that "venture capitalists value companies correctly") is
justified by economic theory that relies on a few assumptions:
- Low transaction costs: The cost of doing a trade in the market (in
this case, an investment) must be near-zero so that people can use any
mispricings to get rich.
- Enough smart money: The well-informed and rational players in the
market need to have enough capital to take advantage of any pricing
inefficiencies that they notice.
- No secrets: The "available information" must be available to enough of
the smart money that it can be used to correct mispricings.
- Ability to profit: There must be a way for a smart market participant
to make money from a mispriced asset.
In the case of venture capital, many of these assumptions
are super false. Fundraising takes a lot of time and money:
transaction costs are high. Venture capitalists YOLO their valuations
after a few meetings: they frequently miss important information. And
it's impossible to short-sell startups,
so there's no market mechanism to correct an overpriced company. You can
see the outcome of this in the fact that there are venture capitalists
that consistently beat "the market's" returns.
But it's not just venture capital: almost no markets fully
satisfy the conditions of the EMH, and many important markets—like
housing or prediction markets—strongly violate them.
Or consider the heuristic that "if internet weirdos disagree with
experts, the experts are right." What community of Internet weirdos and
what community of experts? Some communities of experts are clearly
bonkers, like the victims of the Sokal hoax. In
other cases, a community with expertise in one narrow area might not
have the context in adjacent areas or the ability to do the
first-principles thinking necessary to apply their expertise correctly
in the real world. For example, doctors are experts in medicine, and
thus are often expected to make medical diagnoses, but only 21%
of doctors are capable of doing the elementary statistical
calculations necessary to turn a medical test result into the
probability of having a disease.
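The calculation in question is a one-line application of Bayes' rule.
Here's a sketch with illustrative numbers close to the classic
mammography screening problem (1% base rate, 80% sensitivity, 9.6%
false-positive rate):

```python
# Posterior probability of disease given a positive test, via Bayes' rule.
# Illustrative numbers, close to the classic mammography problem.
base_rate = 0.01        # P(disease)
sensitivity = 0.80      # P(positive | disease)
false_positive = 0.096  # P(positive | no disease)

p_positive = sensitivity * base_rate + false_positive * (1 - base_rate)
p_disease_given_positive = sensitivity * base_rate / p_positive
print(f"{p_disease_given_positive:.1%}")  # ~7.8%, far lower than most
                                          # people's intuitive guess
```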
Or consider the heuristic of the
outside view: "the outcome of this situation will probably be
similar to the outcome of similar past situations." Suppose you're using
this to judge how likely a startup is to succeed. Sure, you could
predict it based on the distribution of outcomes across all startups at
a similar stage and valuation. But that would throw away almost all
information you have about the particular startup at hand. It
ignores tons of important questions, like:
- How fast has this startup been growing, compared to others at its
stage?
- How determined and resourceful are the founders?
You could imagine trying to incorporate info like this into your
outside-view analysis, by, e.g., looking at outcomes specifically of all
startups that have grown by 10x in a single year. But that kind of
information is so private and closely guarded that you probably can't do
that analysis. For some of the other traits, e.g. "how determined are
the founders," we don't even have a good enough way of measuring that
trait that you could do the analysis even in principle.
Sometimes I see people use the low-info heuristic as a "baseline" and
then apply some sort of "fudge factor" for the illegible information
that isn't incorporated into the baseline—something like "the baseline
probability of this startup succeeding is 10%, but the founders seem
really determined so I'll guesstimate that gives them a 50% higher
probability of success." In principle I could imagine this working
reasonably well, but in practice most people who do this aren't willing
to apply as large of a fudge factor as appropriate. Strong
evidence is common:
One time, someone asked me what my name was. I said, "Mark Xu."
Afterward, they probably believed my name was "Mark Xu." I'm guessing
they would have happily accepted a bet at 20:1 odds that my driver's
license would say "Mark Xu" on it.
The prior odds that someone's name is "Mark Xu" are generously
1:1,000,000. Posterior odds of 20:1 implies that the odds ratio of me
saying "Mark Xu" is 20,000,000:1, or roughly 24 bits of evidence. That's
a lot of evidence.
... One implication of the Efficient Market Hypothesis (EMH) is that it
is difficult to make money on the stock market. Generously, maybe only
the top 1% of traders will be profitable. How difficult is it to get
into the top 1% of traders? To be 50% sure you're in the top 1%, you
only need 200:1 evidence. This seemingly large odds ratio might be easy
to get.
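Both of Xu's numbers fall out of odds-form Bayes: posterior odds equal
prior odds times the likelihood ratio, and evidence in bits is the
base-2 log of that ratio. A quick sketch to check the arithmetic:

```python
import math

# Odds-form Bayes: posterior odds = prior odds * likelihood ratio.
prior_odds = 1 / 1_000_000   # generous prior odds of being named "Mark Xu"
posterior_odds = 20          # 20:1 after hearing him say it
likelihood_ratio = posterior_odds / prior_odds

print(f"{likelihood_ratio:,.0f}:1")               # 20,000,000:1
print(f"{math.log2(likelihood_ratio):.1f} bits")  # 24.3 bits

# By the same math, a ~200:1 likelihood ratio (under 8 bits) moves a
# 1-in-100 prior past even odds -- which is why a timid "fudge factor"
# on a baseline usually understates how far evidence should move you.
```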
In fact, outperforming low-info heuristics isn't just possible; it's
practically mandatory if you want to have an outsized impact on the
world. That's because leaning too heavily on low-info heuristics pushes
people away from being ambitious or trying to search for outliers.
Most important things in life—jobs, hires, companies, ideas,
partners, etc.—have a distribution of outcomes where the best possible
choices are outliers that are dramatically better than the
typical ones. In my case, for example, choosing to work at Wave was
probably 10x better than staying at my previous employer: I learned
more, gained responsibility faster, had a bigger impact on the world,
etc.
Unfortunately, low-info heuristics tell you that outliers can't
exist. By definition, most members of any group are not outliers,
so any generalized heuristic will predict that whatever you're looking
at isn't an outlier either. If you index too heavily on what the average
outcome is, you're deliberately blinding yourself to the possibility of
finding an outlier.
This is especially bad when someone uses this kind of reasoning to
smack down other people's ambition, because the payoffs are asymmetric.
If you incorrectly tell someone that their ambitious idea is likely to
succeed, then they'll waste their time on a failed idea, which is not
great, but ultimately fine. But if you smack them down with low-info
heuristics and convince them their idea is likely to fail, you rob the
world of an awesome idea that would have existed otherwise. Shame on
you! (Too bad you'll never know about it.)
OK, so what should you do instead of relying on low-info heuristics?
Here are my suggestions:
- Build gears-level models of the decision you're trying to make. If
you're deciding, e.g., where to work, try to understand what makes
different jobs awesome or terrible for you.
- Think really hard about the problem. Most inside views are wrong—to
stand a fighting chance of beating the outside view, you'll need to put
a lot of effort in.
- Don't fool yourself with motivated reasoning. Stress-test your ideas;
ask yourself what the best arguments against your inside view are and
see if you can rebut them.
- To the extent that you do use low-info heuristics, use them as a
stress test rather than a default belief. "90% of startups fail" is
useful to know as a warning to try to mitigate failure modes. It's
dangerous when you hear it and stop thinking there.
- Don't be afraid to try ambitious things where the downside of failing
is low and the upside of succeeding is high!
Thanks to draft readers Irene
Chen, Milan Cvitkovic,
and Sam
Zimmerman.