Anthropological Intelligence
The dirty secret of most tech startups is that they are not actually tech startups.
We call them “tech startups,” and they use tech … but they rarely actually push the technological frontier. Most of the time their technology is well understood and unlikely to fail. It might be inefficient, to begin with, especially if you’re doing something NP-hard like routing Uber drivers. But Uber is a good example: when it launched, its essential technological elements—smartphones, GPS, online maps—were already more or less solved problems. It was just offering them to users in a new combination.
Which is to say that, like most tech startups, Uber was, really, primarily an anthropology startup. The question it answered was not “will this technology work?” but rather “will people use this?” This may now be hard to believe, but at the time many people were certain the answer was no, that average people would never want to ride in a stranger’s car if that car was not a licensed taxi.
Most so-called tech startups are primarily answering these kinds of anthropological questions. AirBnB is an even better example. No one doubted that it would technically succeed. Skeptics just thought the idea of people flocking en masse to stay in strangers’ homes, rather than hotels, was clearly nuts. To this day it has thrived without ever developing any meaningful technology. Uber, by contrast, did eventually try to build its own self-driving cars … and that did not go well.
AI, so far, is different. We don’t know what the tech is capable of: partly because it evolves so quickly, partly because models are opaque and a vast capability overhang exists within them. What is possible depends enormously on:
what context / data you provide
the prompts you use
how you decompose your problem into a graph of multiple LLM calls, and/or consolidate multiple calls into a single request
how well you evaluate the results of your calls, and respond to those evaluations (by re-running with a different approach, taking a different path through the call graph, escalating for human intervention, etc.); see the sketch after this list
how/whether LLMs themselves decide all the above—selecting data, composing prompts, directing the call graph, evaluating results—leading to something not unlike genetic algorithms, or, less formally, “turtles all the way down.”
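To make the “call graph” idea concrete, here is a minimal sketch in Python. Everything in it is hypothetical: `call_llm`, `Step`, and `run_graph` are stand-ins I’ve invented for illustration, not any real library’s API, and the evaluation checks are deliberately naive. The point is only the shape: decompose the problem into steps, evaluate each output, retry with a nudged prompt, and escalate to a human when retries run out.

```python
# Minimal sketch of a multi-step LLM call graph with evaluation and retry.
# Hypothetical throughout: call_llm() is a stand-in for whatever model
# provider you actually use.

from dataclasses import dataclass
from typing import Callable


def call_llm(prompt: str) -> str:
    """Stand-in for a real model call (OpenAI, Anthropic, local, etc.)."""
    raise NotImplementedError("wire this to your model provider")


@dataclass
class Step:
    name: str
    build_prompt: Callable[[dict], str]  # composes a prompt from prior outputs
    evaluate: Callable[[str], bool]      # crude check: is this output usable?
    max_retries: int = 2


def run_graph(steps: list[Step], context: dict) -> dict:
    """Run steps in sequence; retry with an adjusted prompt on a failed
    evaluation, and escalate to a human if retries are exhausted."""
    for step in steps:
        for attempt in range(step.max_retries + 1):
            prompt = step.build_prompt(context)
            if attempt > 0:
                # Respond to the failed evaluation by changing the approach.
                prompt += ("\n\nThe previous answer failed validation. "
                           "Be more concise and show your reasoning.")
            output = call_llm(prompt)
            if step.evaluate(output):
                context[step.name] = output  # downstream steps can use this
                break
        else:
            raise RuntimeError(f"step {step.name!r} failed; escalate to a human")
    return context


# Example wiring: summarize a document, then draft a reply from the summary.
steps = [
    Step("summary",
         build_prompt=lambda ctx: f"Summarize in 3 bullets:\n{ctx['document']}",
         evaluate=lambda out: out.count("\n") >= 2),
    Step("reply",
         build_prompt=lambda ctx: f"Draft a short reply based on:\n{ctx['summary']}",
         evaluate=lambda out: len(out) > 0),
]
# result = run_graph(steps, {"document": "..."})
```

Even this toy version illustrates the combinatorial explosion in the list above: every choice of decomposition, prompt, evaluator, and retry policy is a different system, which is why nobody yet knows where the capability ceiling is.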
At the same time, though, AI startups are even more anthropological than others. Most of the time you are inherently testing not just the technology, but users’ relationship to the technology. We’re in the early days of experimenting with how and when people actually want to use AI. ChatGPT itself was something of an oh-why-not launch that many OpenAI employees didn’t expect to be popular. We can expect many more Uber- and AirBnB-scale anthropological surprises.
Ultimately this makes AI startups harder than previous waves of tech. Suppose you haven’t found fabled product-market fit. Does that mean your “LLM engineering”—the system you have built around your LLM calls, which today can seem a relatively thin and disposable layer around the models, not unlike the early days of software itself, when code seemed a thin and disposable layer around microprocessors—isn’t good enough and must improve? Or is the problem not with your technology, but with the fact that people just don’t want what you’re giving them, no matter how fine the quality of your outputs? Or, worst of all, some combination of the two?
If Devin doesn’t succeed, will it be because its tech isn’t good enough? Or because it doesn’t fit into existing software workflows, yet isn’t revolutionary enough to entirely reshape those workflows? Or because engineers and CTOs decide it’s just anthropologically easier, and less cognitive load, to write some quick-and-dirty code that assembles context to copy-and-paste into o3? Or all of the above?
It gets more complicated yet. AI, unlike almost all previous technologies, tends to generate outputs with which users may have parasocial relationships. There is a very real possibility that AI startups of the future will fail not because they don’t work, nor because users don’t want them, but because competing alternatives have more charm. In the same way that UX became a differentiator for many companies, charm and (faux) personality will become AI differentiators.
(I wrote ten years ago about how important the use of different flavors of written language can be. The last several years of politics worldwide have left me feeling, sadly, bitterly vindicated.)
Add up all the above, and it seems reasonable to conclude that the risk/reward for AI startups—by which I mean all kinds of ‘AI-first’ organizations, not just for-profit ones—will be even more skewed than for tech startups as a whole. The pot of gold (or meaningful change, if that’s what drives you) will be at least as large as it was for smartphones or the Internet. But there will also be many more ways to fail.