What we talk about when we talk about AI is hypotheticals. Safety, capabilities, prospects; paths, dangers, strategies; “bigger than the internet,” “the race to God-like AI,” “might destroy the world” — almost every discussion about AI is, I cannot stress this enough, entirely about hypotheticals. I include substantially all of my own! We’re all constructing made-up futures and trying to convince ourselves, and/or others, that our made-up future coincides reasonably well with the real one.
Of course you know this, but I think it’s worth reiterating. And of course you object: (almost) no one is just randomly hallucinating hypothetical futures. We’re all thinking carefully, constructing rigorous mental models, assessing base rates. We’re not just prophesying; we’re forecasting; we’re extrapolating.
All true. Trouble is, we’re really bad at it. In some ways maybe almost as bad as LLMs.
No shame in that. There are many things we humans are very bad at. Famous among them is the Fundamental Attribution Error: attributing any/every person’s actions to the way they are, their inherent and irrevocable traits, rather than to what happened to them, the context to which they’re reacting. That isn’t to say nature is nothing and nurture is all-powerful … but it is to say it’s well-established that we grossly overemphasize nature, and underemphasize nurture, when we try to explain what people do and are. This is kind of a big deal. (Most of humanity’s bigoted/discriminatory -isms can be attributed to it.)
I propose that when we talk about AI we are, similarly, prone to a Fundamental Extrapolation Error, in that we extrapolate assuming modern AI is on a path to intelligence like ours. Of course we know intellectually it may not be. It may not be on a path to intelligence at all; LLMs may be a dead end, encoding information in the form of next-token prediction without ever achieving anything like reasoning. Or, more interestingly, they may be a path to intelligence very unlike ours. Either way, though, it feels like we’re on the road to intelligence like ours, but supercharged … so much so that we can see that road … we think.
But what if that road is a mirage? Or, not a mirage exactly, but what if it’s not a superhighway but a ragged trail ascending into the daunting Himalaya? When a person passes a bar exam, or a medical board exam, we think this is meaningful because we know there is a strong correlation between those results and being able to do the work of a lawyer or a doctor. When an LLM passes such an exam, it’s interesting, it’s remarkable, but there is no such correlation. Instead it can do a few fragmentary parts of those jobs very well, and the rest of them not at all.
We know this! We know it’s senseless to extrapolate from “LLM passes bar” to “LLM is lawyer.” And yet we commit the Fundamental Extrapolation Error and assume that future LLMs will get good enough to be lawyers, because that eventual intelligence is the only kind of intelligence we really grok. We assume future generations will develop reasoning, and goals, beyond those of a cron job, because whenever we look at intelligence we cannot help but anthropomorphize, since we’re the only base-rate intelligent species we know (sorry, chimps and dolphins) … and this is a huge mistake.
https://twitter.com/NeelNanda5/status/1705995593657762199
Similarly, on the tech/coding side of things, which is my bailiwick, I’ve been playing around a lot with GPT-4 for the last several months, for various pet side projects, and talking to many other people doing the same. The rough consensus seems to be:
GPT-4 is capable of moments of absolute and utter magic, and one can use it to construct demos that are legitimately mindblowing and/or terrifying.
and also, at the very same time
GPT-4 is a ridiculously terrible engineering tool, in that getting it to do anything consistently & reliably amid the real world’s vagaries & flux & edge cases is just incredibly frustrating, like trying to wield a hammer that sometimes turns mid-swing into a screwdriver or a lasso or a hose.
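To make that second point concrete, here is a minimal, hypothetical sketch of the kind of defensive plumbing GPT-4 forces on anyone trying to ship with it. The `call_gpt4` helper and the invoice schema below are stand-ins I invented, not anyone’s real API; what matters is the retry-and-validate loop, which exists only because the same prompt can return clean JSON on one call and chatty prose on the next.

```python
import json

def call_gpt4(prompt: str) -> str:
    """Placeholder for a real GPT-4 API call; wire up your LLM client of choice."""
    raise NotImplementedError

def extract_invoice(text: str, max_attempts: int = 3) -> dict:
    """Ask the model for structured data, then verify it actually complied."""
    prompt = (
        "Return ONLY a JSON object with keys 'vendor' (string), "
        "'total' (number), and 'due_date' (YYYY-MM-DD) for this invoice:\n" + text
    )
    for _ in range(max_attempts):
        raw = call_gpt4(prompt)
        try:
            data = json.loads(raw)
        except json.JSONDecodeError:
            continue  # model ignored the "ONLY JSON" instruction; try again
        if (
            isinstance(data, dict)
            and {"vendor", "total", "due_date"} <= data.keys()
            and isinstance(data["total"], (int, float))
        ):
            return data
        # keys missing or mistyped: another flavor of silent non-compliance
    raise RuntimeError(f"no valid response after {max_attempts} attempts")
```

None of this is hard, exactly; it’s that every single call has to be wrapped in this kind of distrust, which is what a hammer that turns into a screwdriver mid-swing feels like in practice.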
Lots of AI safety people seem to be extrapolating from the former fact … and ignoring the latter. Maybe they don’t really understand it. Maybe they assume it’s nothing worth paying attention to, just messy early-days stuff which will obviously be cleaned up in future generations. (Careful observers will note the question-begging begins here, before we even start to talk about intelligence.) Or maybe … I hesitate to suggest it, but …
OK I guess I’m not that hesitant. One way AI is a very unusual technological breakthrough is that, for once, it came out of academic labs. Contrast the birthplaces of earlier breakthroughs: Bell Labs, Intel and the Traitorous Eight, Xerox PARC, early Apple, early Google. Some were home to great scientists, yes, but basically these were hackerspaces focused more on building products than writing papers. (True, the Internet was originally promulgated by universities, but by the late 90s they were no longer relevant to it.)
The entire field of computer security — of which AI safety is at least arguably a very weird branch — has been guided at least as much by extremely disreputable hackers, and uncredentialed tinkerers, as by well-respected academics. To the credit of both cohorts, they ultimately managed to join their perspectives and work together. Well … ish.
Today, though, the AI tinkerers who are trying to actually build production products, and as a result will contemptuously tell you (correctly) that GPT-4 is a godawful mess of a tool, and that if someone wants to build something world-endangering with it or any of its descendants then watching the attempt would be extremely hilarious and they’re welcome to try their luck … don’t seem to be interacting much with the AI safety academics who argue that GPT-4 already has “sparks of AGI” and that we’re only two generations / three years from LLMs being able to independently plan and execute what will certainly look like troublingly volitional goals.
I think both sides are making the Fundamental Extrapolation Error. (Although I also think academics have a lot more to learn from hackers than hackers from academics. Heresy, I know.) The tinkerer who knows GPT-4 to be an incredibly frustrating real-world tool assumes it’s therefore “dumb.” The academic who sees that GPT-4 is capable of some mindblowing feats assumes therefore even more mindblowing and scary feats, including reasoning / goals / consistency / focus, are just around the corner.
There is very little reason to think either of these beliefs is at all correct! True, we have some visibility into the future in the form of the Chinchilla scaling laws, which seem to suggest that we have the data and compute to make LLMs 10x as powerful as they are today, but maybe not much headroom beyond that. But what does “10x as good” even mean? What would a 10x better LLM with the same context window be able to do better? Would it necessarily be more reliable? Would it necessarily be fundamentally just as limited? Are you sure? Are you … extrapolating?
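For what it’s worth, here is the back-of-the-envelope version of that visibility, a rough sketch using the parametric loss fit reported in the Chinchilla paper (Hoffmann et al., 2022). The constants are the paper’s published fit, and the 20-tokens-per-parameter and 6·N·D rules of thumb come from the same work; everything else is illustration, not forecast.

```python
# Chinchilla-style loss estimate: L(N, D) ~ E + A / N**alpha + B / D**beta,
# with constants as reported in Hoffmann et al. (2022).
E, A, B = 1.69, 406.4, 410.7
ALPHA, BETA = 0.34, 0.28

def loss(n_params: float, n_tokens: float) -> float:
    """Predicted training loss for N parameters trained on D tokens."""
    return E + A / n_params**ALPHA + B / n_tokens**BETA

# Compute-optimal training uses roughly 20 tokens per parameter,
# and training cost scales as roughly 6 * N * D FLOPs.
for n in (70e9, 700e9, 7e12):   # 70B, 700B, 7T parameters
    d = 20 * n                  # approximately compute-optimal token count
    print(f"N={n:.0e}  D={d:.0e}  FLOPs~{6 * n * d:.1e}  loss~{loss(n, d):.2f}")
```

Run it and the reducible part of the loss shrinks quickly toward the fitted floor while the data and compute bills balloon; and nothing in that curve tells you whether the improvement comes with reliability, or reasoning, or goals.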
To be clear, my own sympathies, as a tinkerer, are very much with the first view; I write fiction for a living, but I struggle to imagine any engineer working at length with GPT-4 and still regarding it, or any near-term successor, as seriously AI-x-risk dangerous, even while thinking very hard about exponential curves. But intellectually I’m compelled to admit that both camps of extrapolation seem to be failing to wrap their minds around the fact that we may be — and, to my mind, probably are — building something orthogonal to how we think about intelligence.
This is cheering, in that the “AI doomers” are likely misextrapolating; but also disconcerting, in that the “effective accelerationism” people probably are too. And of course the further you extrapolate the more you miss. My sense is that we can make pretty good guesses about the next generation, reasonably opine about the subsequent one, and after that our already clouded crystal ball grows fully opaque. As such, the path forward seems to be: keep building, and keep a fascinated/watchful eye on the empirical realities — not the hypotheticals — of what the things we have built can actually do. Thus far, that empirical reality is: don’t panic, and don’t expect to panic anytime soon.