Ben Kuhn: Be less scared of overconfidence

2022 Nov 30
Author: Ben Kuhn

When I was deciding whether to work for Wave, I got very hung up on the fact that my "total compensation" would be "lower."

The scare quotes are there because Wave and my previous employer, Theorem, were both early-stage startups that were paying me mostly in "fake startup bucks" (equity). To figure out the total compensation, I tried to guess how much money the equity in each company was worth, with a thought process something like:

So I valued the equity by taking the valuation each company's VCs had invested at, and multiplied it by the fraction of the company my shares represented. That number was higher for Theorem than for Wave.

Seven years on, the Wave equity turned out to be... a lot more valuable. That raises the question: how dumb was my take? Was the actual outcome predictable if I'd thought about it in the right way?

I don't think it was perfectly predictable, but I do think I shouldn't have been that anchored to the market-efficiency reasoning. Those respectable, top-tier VCs had YOLOed those valuations after a couple one-hour meetings, because that's how early-stage VC works. Meanwhile, I had worked at Theorem for a year and my then-partner had worked at Wave for nine months. Heck, I had gotten more founder time than those VCs had just during my interview process. I had way more information than "the market."

If I'd had the confidence to use that information, I might have thought something like:

Fortunately, I chose Wave for other reasons. But this thought pattern—throwing away most information in fear of using it to make overconfident judgments—shows up all the time. I'm here to tell you why I hate it.




In January 2020, my entire Twitter timeline was freaking out about a novel-seeming respiratory disease spreading in Wuhan.

Part of me thought:

Another part of me thought:

(I also contemplated the fact that the stock market didn't seem to be freaking out, but I decided that since most people can't beat the stock market, I probably wouldn't either. Some braver souls than I bought puts on the S&P 500 and made a killing.)




In both of these situations, I had some mental model of what was going on ("this epidemic is growing exponentially," "this startup seems good") based on the particulars of the situation, but instead of using my internal model to make a prediction, I threw away all my knowledge of the particulars and instead used a simple, easy-to-apply heuristic ("experts are usually right," "markets are efficient").

I frequently see people leaning heavily on this type of low-information heuristic to make important decisions for themselves, or to smack down overconfident-sounding ideas from other people.

These all place way too much weight on the low-info heuristic.

A heuristic like that can make a good starting point when you're not an expert in an area and don't have very much time to think about it or dig in. This is useful in theory, but in practice, people don't limit these heuristics to that regime—they fall back on the same ones even in high-context, high-investment situations, where it's silly to throw away so much detailed context about the particulars.

What's worse, these low-info heuristics almost always push in the direction of being less ambitious, because the low-info view of any ambitious project is that it will fail (most projects run behind schedule, most startups fail, most investors underperform the market, etc.).

The problem is that the bad consequences of underconfidence and under-ambition are severe but subtle, whereas the bad consequences of overconfidence and wishful thinking are milder but more obvious. If you're overconfident, you'll try things that fail, and people will laugh at you. If you're underconfident, you'll avoid making risky bets, and miss out on the potential upside, but nobody will know for sure what you missed.

That means it's always tempting to do what the low-info heuristic tells you and be less ambitious—but ultimately, that ends up being worse for the world.




Why do people find low-info heuristics so compelling? A few potential reasons:

Those are all real advantages! Low-info heuristics are a great way to be more-or-less right most of the time as a non-expert, and to limit your vulnerability to overconfidence and wishful thinking.




The problem is that there are lots of ways that low-info heuristics fail or can be improved on.

For example, the efficient market hypothesis ("asset prices incorporate all available information, so it's hard to beat the market"), which I used in the example above to infer that "venture capitalists value companies correctly," is justified by economic theory that relies on a few assumptions:

In the case of venture capital, many of these assumptions are super false. Fundraising takes a lot of time and money: transaction costs are high. Venture capitalists YOLO their valuations after a few meetings: they frequently miss important information. And it's impossible to short-sell startups, so there's no market mechanism to correct an overpriced company. You can see the outcome of this in the fact that there are venture capitalists that consistently beat "the market's" returns.

But it's not just venture capital: almost no markets fully satisfy the conditions of the EMH, and many important markets—like housing or prediction markets—strongly violate them.

Or consider the heuristic that "if internet weirdos disagree with experts, the experts are right." What community of Internet weirdos and what community of experts? Some communities of experts are clearly bonkers, like the victims of the Sokal hoax. In other cases, a community with expertise in one narrow area might not have the context in adjacent areas or the ability to do the first-principles thinking necessary to apply their expertise correctly in the real world. For example, doctors are experts in medicine, and thus are often expected to make medical diagnoses, but only 21% of doctors are capable of doing the elementary statistical calculations necessary to turn a medical test result into the probability of having a disease.
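To make that "elementary statistical calculation" concrete, here is a minimal sketch of Bayes' rule applied to a positive test result. The prevalence, sensitivity, and false-positive rate are hypothetical numbers chosen for illustration, not figures from the study mentioned above, and `posterior_given_positive` is just an illustrative name.

```python
def posterior_given_positive(prevalence, sensitivity, false_positive_rate):
    """Return P(disease | positive test) using Bayes' rule."""
    # Probability of a sick person testing positive.
    true_positives = prevalence * sensitivity
    # Probability of a healthy person testing positive.
    false_positives = (1 - prevalence) * false_positive_rate
    return true_positives / (true_positives + false_positives)


# Hypothetical inputs: 1% prevalence, 90% sensitivity, 9% false-positive rate.
p = posterior_given_positive(0.01, 0.90, 0.09)
print(f"P(disease | positive test) = {p:.1%}")
```

With these made-up inputs the answer is about 9%, far lower than the "90% accurate" intuition suggests, because healthy people vastly outnumber sick ones.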

Or consider the heuristic of the outside view: "the outcome of this situation will probably be similar to the outcome of similar past situations." Suppose you're using this to judge how likely a startup is to succeed. Sure, you could predict it based on the distribution of outcomes across all startups at a similar stage and valuation. But that would throw away almost all information you have about the particular startup at hand. It ignores tons of important questions, like:

You could imagine trying to incorporate info like this into your outside-view analysis, by, e.g., looking at outcomes specifically of all startups that have grown by 10x in a single year. But that kind of information is so private and closely guarded that you probably can't do that analysis. For some of the other traits, e.g. "how determined are the founders," we don't even have a good enough way of measuring the trait to do that analysis in principle.

Sometimes I see people use the low-info heuristic as a "baseline" and then apply some sort of "fudge factor" for the illegible information that isn't incorporated into the baseline, something like "the baseline probability of this startup succeeding is 10%, but the founders seem really determined so I'll guesstimate that gives them a 50% higher probability of success." In principle I could imagine this working reasonably well, but in practice most people who do this aren't willing to apply as large a fudge factor as is appropriate. Strong evidence is common, as Mark Xu puts it:

One time, someone asked me what my name was. I said, "Mark Xu." Afterward, they probably believed my name was "Mark Xu." I'm guessing they would have happily accepted a bet at 20:1 odds that my driver's license would say "Mark Xu" on it.

The prior odds that someone's name is "Mark Xu" are generously 1:1,000,000. Posterior odds of 20:1 implies that the odds ratio of me saying "Mark Xu" is 20,000,000:1, or roughly 24 bits of evidence. That's a lot of evidence.

... One implication of the Efficient Market Hypothesis (EMH) is that it is difficult to make money on the stock market. Generously, maybe only the top 1% of traders will be profitable. How difficult is it to get into the top 1% of traders? To be 50% sure you're in the top 1%, you only need 200:1 evidence. This seemingly large odds ratio might be easy to get.
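The odds arithmetic in the quoted passage, and in the "fudge factor" example before it, can be checked with a short sketch. The function names here are illustrative, and the only inputs are the numbers already given in the text: the 1:1,000,000 prior, the 20:1 posterior, the 1% prior, and the 10% baseline bumped by 50%.

```python
import math


def odds(prob):
    """Convert a probability into odds in favor."""
    return prob / (1 - prob)


def likelihood_ratio_needed(prior_odds, posterior_odds):
    """Strength of evidence needed to move prior odds to posterior odds."""
    return posterior_odds / prior_odds


def update(prior_prob, likelihood_ratio):
    """Posterior probability after multiplying the prior odds by the evidence."""
    posterior_odds = odds(prior_prob) * likelihood_ratio
    return posterior_odds / (1 + posterior_odds)


# Name example: prior odds 1:1,000,000, posterior odds 20:1.
lr_name = likelihood_ratio_needed(1 / 1_000_000, 20)
bits = math.log2(lr_name)
print(f"name: {lr_name:,.0f}:1, about {bits:.1f} bits")

# Top-1% trader example: start from a 1% prior.
lr_even = likelihood_ratio_needed(odds(0.01), 1.0)
p_top = update(0.01, 200)
print(f"even odds needs about {lr_even:.0f}:1; 200:1 gives {p_top:.1%}")

# Fudge-factor example: a 10% baseline raised by 50% (to 15%).
lr_fudge = likelihood_ratio_needed(odds(0.10), odds(0.15))
print(f"fudge factor: {lr_fudge:.2f}:1, about {math.log2(lr_fudge):.2f} bits")
```

On these numbers, the 200:1 figure is conservative: roughly 99:1 already reaches even odds from a 1% prior, and 200:1 lands near 67%. The "50% higher" fudge factor, by contrast, amounts to only about 1.6:1 evidence, well under one bit, which is tiny next to the roughly 24 bits in the name example.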




In fact, outperforming low-info heuristics isn't just possible; it's practically mandatory if you want to have an outsized impact on the world. That's because leaning too heavily on low-info heuristics pushes people away from being ambitious or trying to search for outliers.

Most important things in life—jobs, hires, companies, ideas, partners, etc.—have a distribution of outcomes where the best possible choices are outliers that are dramatically better than the typical ones. In my case, for example, choosing to work at Wave was probably 10x better than staying at my previous employer: I learned more, gained responsibility faster, had a bigger impact on the world, etc.

Unfortunately, low-info heuristics tell you that outliers can't exist. By definition, most members of any group are not outliers, so any generalized heuristic will predict that whatever you're looking at isn't an outlier either. If you index too heavily on what the average outcome is, you're deliberately blinding yourself to the possibility of finding an outlier.

This is especially bad when someone uses this kind of reasoning to smack down other people's ambition, because the payoffs are asymmetric. If you incorrectly tell someone that their ambitious idea is likely to succeed, then they'll waste their time on a failed idea, which is not great, but ultimately fine. But if you smack them down with low-info heuristics and convince them their idea is likely to fail, you rob the world of an awesome idea that would have existed otherwise. Shame on you! (Too bad you'll never know about it.)




OK, so what should you do instead of relying on low-info heuristics? Here are my suggestions:


Thanks to draft readers Irene Chen, Milan Cvitkovic, and Sam Zimmerman.