On the Eve of Superintelligence

These days, I have the privilege of a courtside seat to the radical transformation that artificial intelligence is bringing to our society. It has been awe-inspiring, and I’m writing this mostly because, outside of the Silicon Valley tech bubble, it seems as though no one in a position to prepare our society for this is remotely aware of what’s coming. Worse, the trusted arbiters of information in the broader media appear to be writing headlines that erroneously imply a slowdown in its progress.

This is not true. AI progress is accelerating, and the world is not preparing.

Before I talk about how to prepare, I want to talk a bit about how we got here.

For the past decade, the world has enjoyed the fruits of several technical breakthroughs in deep learning — the most widely used training method for artificial intelligence models — that have enabled widely adopted but largely innocuous tools. From 2015 to 2022, when I was an early AI researcher, we saw advances in methods like image recognition to label our friends in pictures, language translation to make it easier than ever to travel abroad, and protein folding to assist with biological discovery. And, of course, ad-ranking methods to fund all of the above.

Starting in 2022, teams of researchers at AI-focused research labs discovered that training ever-larger versions of the Transformer model on ever-larger datasets enabled conversational AI systems, notably ChatGPT. These were a commercial hit, and many people now benefit from widely available chat tools from OpenAI, Anthropic, Meta, and many others. Over the past few years, these tools have gotten more accurate, faster, and more useful, enabling people to be more productive.

Notably, this series of advances (going from OpenAI’s GPT-3 → 3.5 → 4o, Meta’s Llama 1 → 2 → 3, and Anthropic’s Claude 1 → 2 → 3 → 3.5) has largely occurred due to a set of competitors doing increasingly large pre-training. This refers to the process of extracting text from the internet and training models to predict it, token by token. From 2021 to 2024, the large hyperscalers got skilled at pre-training by generating exponentially larger datasets and then training exponentially larger models on them.
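To make “pre-training” concrete, here is a deliberately tiny sketch of the next-token-prediction objective these models optimize. It is illustrative only: real pipelines shard data and models across thousands of GPUs, and the toy model and hyperparameters below are placeholders rather than anything a lab actually uses.

```python
# Minimal sketch of the next-token-prediction loss behind pre-training.
# The "model" here is a toy stand-in for a Transformer; only the objective
# (cross-entropy on the next token) matches what the large labs optimize.
import torch
import torch.nn as nn

vocab_size, context_len, d_model = 50_000, 1024, 512

model = nn.Sequential(
    nn.Embedding(vocab_size, d_model),   # token ids -> vectors
    nn.Linear(d_model, vocab_size),      # vectors -> next-token logits
)
optimizer = torch.optim.AdamW(model.parameters(), lr=3e-4)

def pretraining_step(token_batch: torch.Tensor) -> float:
    """token_batch: (batch, context_len) tensor of token ids from web text."""
    inputs, targets = token_batch[:, :-1], token_batch[:, 1:]
    logits = model(inputs)
    loss = nn.functional.cross_entropy(
        logits.reshape(-1, vocab_size), targets.reshape(-1)
    )
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()

# Scaling this up means more tokens, more parameters, and more such steps.
fake_batch = torch.randint(0, vocab_size, (8, context_len))
print(pretraining_step(fake_batch))
```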

These investments were driven by an empirical scaling law showing that model trainers could expect predictable quality improvements as they scaled up data, parameters, and compute. The challenge was that by 2024 the costs were growing exponentially, and the labs were already training on essentially all of the text on the internet, making it seem plausible that they would eventually hit a wall.
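For the curious, the best-known published version of such a law is the Chinchilla fit from Hoffmann et al. (2022), which predicts loss as a simple function of parameter count and training tokens. The snippet below uses roughly those published constants purely to illustrate what “predictable improvement” means; it is not the internal curve of any frontier lab.

```python
# Chinchilla-style scaling law: loss(N, D) ~= E + A / N**alpha + B / D**beta,
# where N is parameter count and D is training tokens. Constants are roughly
# the published Hoffmann et al. (2022) fit, used here only for illustration.
E, A, B, alpha, beta = 1.69, 406.4, 410.7, 0.34, 0.28

def predicted_loss(n_params: float, n_tokens: float) -> float:
    return E + A / n_params**alpha + B / n_tokens**beta

for n_params, n_tokens in [(1e9, 2e10), (7e10, 1.4e12), (4e11, 8e12)]:
    print(f"{n_params:.0e} params, {n_tokens:.0e} tokens -> "
          f"predicted loss ~ {predicted_loss(n_params, n_tokens):.2f}")
```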

In September 2024, the course changed when OpenAI released their o1 model. OpenAI broke from the existing paradigm of pre-training and added a new scaling paradigm around test-time compute. The idea is simple. At inference time, rather than just producing an answer, o1 produces a chain of thought, meaning essentially a series of outputs that resemble human multi-step reasoning. This was valuable because OpenAI then further trained their models to be increasingly good at producing chains of thought using a method called reinforcement learning (RL). Notably, this is the class of methods that OpenAI and DeepMind used to build AI systems that beat the best human players at games like Chess, Go, and Dota ~8 years ago.

To train this model, they start with a pre-trained model like GPT-4o. At every step of RL training, they pass the model an input query and it produces a chain of thought, meaning a series of intermediate outputs that resemble reasoning. Then, they evaluate whether that chain arrived at a correct answer. If it did, they update the model to make chains like it more likely. It turns out that one can scale up this process as well, and the resulting model produces chains of thought capable of complicated multi-step reasoning. This was the genesis of OpenAI’s o1 model, and it rapidly beat the benchmarks set by the previous GPT-4o model.
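OpenAI has not published o1’s training recipe, so the sketch below is a generic, hypothetical version of the loop described above: sample chains of thought, score them against a known answer, and reinforce the good ones. The `policy_model` interface (`sample`, `reinforce`) is invented here purely for illustration.

```python
# Hypothetical sketch of RL on chains of thought with a verifiable reward.
# This is not OpenAI's (unpublished) recipe; it just makes the loop concrete.
# `policy_model` is assumed to be a pre-trained LLM wrapper exposing:
#   .sample(query) -> a chain-of-thought string ending in "Answer: ..."
#   .reinforce(query, chains, rewards) -> a policy-gradient style update

def extract_final_answer(chain_of_thought: str) -> str:
    """Assume the chain ends with a line like 'Answer: 42'."""
    return chain_of_thought.rsplit("Answer:", 1)[-1].strip()

def rl_on_reasoning(policy_model, problems, num_steps=10_000, samples=8):
    for step in range(num_steps):
        problem = problems[step % len(problems)]  # {"query": ..., "answer": ...}
        chains, rewards = [], []
        for _ in range(samples):
            chain = policy_model.sample(problem["query"])          # 1. sample
            correct = extract_final_answer(chain) == problem["answer"]
            chains.append(chain)
            rewards.append(1.0 if correct else 0.0)                # 2. score
        policy_model.reinforce(problem["query"], chains, rewards)  # 3. update
```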

Fast-forward a mere three months to December 2024, and OpenAI announced the o3 series of models. OpenAI demonstrated that scaling up their RL training generated extremely high quality models. They were able to do this relatively fast because training these models is not as time-consuming as pre-training, and they can rapidly generate real and synthetic training datasets. They showed examples solving nearly all of the competition math problems in a challenging benchmark, and programming scores that implied the model is one of the best competitive programmers in the world. These are hard benchmarks, and it is worth taking a look at them to see what is possible today. These models can reason for thousands or even hundreds of thousands of steps to arrive at solutions, and they appear to do so quite effectively. On January 21, 2025, OpenAI announced a private partnership to invest $500B to create new data centers in the United States, likely to train and serve their future reasoning models.

At the same time, there has been a nascent but rapidly advancing developer community around “computer use”, meaning AI systems that can control a computer and autonomously take actions on it. Tools like Anthropic’s computer-use API, OpenAI’s Operator, and open source agents like browser-use let anyone use AI to control a computer. Computer use is highly accessible: you simply pass a request for what you want, like “book me a flight from Boston to San Francisco on United on January 1”, and the model will figure out how to do that, control your browser or other applications, and take the action for you. This was impossible 9 months ago. Now, it mostly works. Very soon, it will fully work — that is a near certainty. If you don’t believe me, check out the demos from Anthropic, OpenAI, and browser-use.
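To give a sense of how accessible this is, here is roughly what using the open source browser-use agent looks like, based on its documented Agent interface at the time of writing; exact parameters and model names may differ by version, and an OpenAI API key is assumed to be configured.

```python
# Rough usage sketch of the open source browser-use agent; details may vary
# by version. The agent drives a real browser to complete the stated task.
import asyncio
from langchain_openai import ChatOpenAI
from browser_use import Agent

async def main():
    agent = Agent(
        task="Book me a flight from Boston to San Francisco on United on January 1",
        llm=ChatOpenAI(model="gpt-4o"),   # the LLM that plans and reasons
    )
    result = await agent.run()            # the agent clicks, types, navigates
    print(result)

asyncio.run(main())
```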

These activities are notable in part because they are happening in the open, and there has been an explosion of open source activity in the AI world. For the last several years, Meta has been spending hundreds of millions to billions of dollars on their own models called Llama, with slightly worse but largely comparable performance to OpenAI/Anthropic’s closed models, and then releasing them completely for free. The result is that many AI applications can be run entirely on open source, freely available models.

Until recently, nearly all of these advancements were led by American companies, in part due to effective export controls that prevented Chinese labs from acquiring enough GPUs (the specialized computer hardware used to train models) to seriously compete with American firms.

On Christmas Eve in 2024, this rapidly changed. Alibaba’s AI team, called Qwen, released an open source model that rivaled o1 in accuracy on some benchmarks. On December 26, DeepSeek, another Chinese AI lab, open sourced DeepSeek-V3, rivaling GPT-4o. No doubt these followers benefitted from seeing what worked for the hyperscalers in the US, but nevertheless DeepSeek was able to deliver a high quality model for ~$5M in compute, likely less than 10% of what the American companies paid. On January 20, 2025, they released DeepSeek-R1, their first reasoning model, which has o1-level reasoning abilities on several benchmarks, and it’s open source. Notably, they also released a paper explaining how they trained this model, and it highlighted that they used a purely RL-driven approach that they invented, reminiscent of how DeepMind built AlphaZero many years earlier to master Chess and Go purely through self-play. As of today, I can sit in my apartment in the US and readily access outputs from multiple reasoning models, and the cheapest one by a factor of ~25x is the one trained by a Chinese lab.

This situation is awe-inspiring and alarming.

First, I want to make clear that the reality on the ground is that the pace of AI advancement is increasing, not decreasing. The newest era of RL-based scaling appears to be very far from any kind of data limitation, and we can already see incredible human-level performance on very hard tasks. There is no compelling reason to believe this will slow down anytime soon, and it is very reasonable to start planning for a reality where we have human or superhuman reasoning widely available in many domains.

Second, these models are already superhuman at information retrieval from the internet, and the newest reasoning models will be at least human-level at tasks that look like retrieval plus multi-step reasoning over the retrieved information.

Third, we should assume that within the next 6 months, we will have AI systems that can accurately control browsers, take actions on the web, and reason about how to do so at the level of a human.

Fourth, it is increasingly likely that open source models will be highly capable and trail the closed models by only a few months. Even if these don’t resemble true superintelligence, they will likely be openly available and largely unrestricted in practice to basically everyone on Earth. If prices remain anywhere near where they are now, it will cost a trivial amount to access them.

So how should society think about this?

We should begin by accepting that it is real. This is not some sci-fi, cyberpunk fantasy. We are about to enter an era of abundant intelligence, widely available in a democratized manner to nearly everyone. I’d encourage you to read the excellent Machines of Loving Grace by Anthropic’s CEO Dario Amodei for an optimistic but measured vision of how we can create an incredible world with this technology. There may be incredible impacts on education, medicine, mental health, and wealth creation enabled by AI. In The Intelligence Age, OpenAI’s CEO Sam Altman also describes a vision of a future of abundance, one in which we will look back on today and wonder how anyone did the jobs we are doing. While these authors have a clear conflict of interest, it is valuable to see the vision they are building towards.

Collectively, the near-term advances I outlined will begin to resemble the “powerful AI” that Amodei outlines, where each AI model is smarter than Nobel Prize winners, can work autonomously for many hours or days, can interface with tools and control physical objects, and millions of copies of them can act independently — forming a “country of geniuses in a data center”.

While this may enable incredible opportunity for humanity, both Amodei and Altman acknowledge that the societal-level benefits of these technologies may take time to emerge, and there may be a bumpy road along the way.

It is likely that in the next 12-24 months we will begin to see the first signs of job displacement at a pace without historical parallel. Companies that are either using or selling AI products are already replacing the labor of tens of thousands of humans. For example, conversational voice AI systems can replace receptionists and customer service agents, and these are significant sources of employment. The old experience of clicking through phone menus and waiting on hold for customer service is rapidly being replaced with instant, human-like conversation. There are companies selling AI software that entirely automates the back office of small businesses. There are even AI products replacing high-end labor like financial analysis and software engineering. This is not a future hypothetical — all of these examples already exist.

For the 12-24 month horizon, I would adopt a simple heuristic: any job whose primary function involves either (1) repetitive non-physical labor or (2) information retrieval from the internet followed by straightforward multi-step reasoning that a generalist could be trained to do is heavily at risk of being replaced. AI systems today can generate human-like voice and video in real time, respond intelligently to open-ended questions, and instantly retrieve nearly all information that is contained on the internet. This framework implies that a large number of jobs could be impacted, including accounting, financial operations, management consulting, junior-level software programming, back-office work, investment banking analysis, human resources, etc. This is a wide aperture of impact. I’m not stating that these jobs will or should be eliminated overnight, but that there will be effective models that can do large fractions of jobs like these, which will create pressure to reduce headcount. If buying software means a company can do the work of 5 people with 2, most will buy it. Most white-collar professions are going to be threatened by AI, and these represent roughly 2/3 of the US labor force.

I’m not sure there exists any historical parallel for this level of labor disruption at this pace. Though I’m not a historian, my sense is that while the agricultural and industrial revolutions brought about changes to economies and job losses, these losses occurred over time scales that allowed society to re-organize around the evolving reality. The British agricultural revolution occurred over 200 years starting in the 1600s, and the industrial revolution occurred over nearly 100 years starting in the 1700s. Even more recently, job losses due to automation and globalization in American industrial towns have caused significant displacement, but even so all three of Pennsylvania, Wisconsin, and Michigan — former industrial powerhouses now called the “rust belt” — have unemployment rates below 5% today. So, while there were job losses and significant pain felt by people, my suspicion is that it happened gradually enough to allow most people to retire or adjust to a new reality.

I don’t see how that automatically happens this time. I’m not aware of any period of history in which 60% of the workforce was legitimately under threat of being replaced by technology in under 10 years. I do believe that these technologies could create economic productivity and growth, and this may create an age of abundance in wealth and resources. However, left unaddressed, these upsides will almost certainly flow to the shareholders of corporations, and are unlikely to be distributed across society in a manner that reflects a collective sense of abundance.

A few categories of jobs will be relatively safe for some time. Jobs that involve physical labor have a good amount of longevity because, for now, we are pretty bad at general-purpose robotics. Jobs in regulated professions, like piloting or medicine, will have some legal and emotional protections that offer greater longevity. Careers that involve in-person interaction, especially with significant empathy, will remain hard to replace.

So how should we organize society to act on this?

First, I’d be remiss if I did not mention the geopolitical elephant in the room. If societies develop digital superintelligence, it would be an absolute travesty if it were first or solely controlled by an autocratic government like China’s. Preventing this should be a key component of the industrial policy of the United States. In the current state of affairs, some of the best models are open source models from Chinese companies, which I’m at least cautiously optimistic about, in that they aren’t actively controlled by an adversary. That said, as these models advance, they could become closed and controlled by the Chinese government. This reality is a failure of American policy. Specifically, a major factor has been the restrictive skilled-immigration policies, advanced by both parties, that stopped American firms from recruiting the best talent from China (and India). The world’s best people want to come to America — we should actively recruit them and brain-drain our adversaries.

As these models move closer to superintelligence, it is critical that they be developed in democratic nation-states, with a measured framework of oversight by democratically elected bodies. For the latter to be effective and not just onerous, democratically elected governments like the US need competent technology leaders across every function, not an endless army of octogenarians. Having technologically inept individuals in government roles is a threat to our national security. The impacts of AI will be felt during the current presidential administration, so we must demand better-trained people in elected office and in existing regulatory agencies before expanding the regulatory state.

Second, we should invest in safety research outside of the large corporations. On balance, I am a fan of open source as a means of limiting power concentration, but there is no doubt that open model availability expands the surface of nefarious use cases. So far, the open source labs have made efforts to limit the potential for harm, and it’s not clear that any serious incidents have occurred. However, as these models get connected to the internet and take actions autonomously, the number of ways this can go wrong increases. I don’t know all the answers here, but there is a robust community of safety researchers that has been largely ignored in popular discourse about AI (other than hyperbolic headlines about AGI killing us all), and it’s time to engage seriously with this community. I would like to see journalists building expertise in reporting on safety issues, regulatory bodies staffed by PhD-level AI researchers who can understand the science, and government funding to expand academic AI safety research.

Third, we should begin planning for job retraining, but the nature of this retraining program will have to be very different from past ones. My optimistic hope would be that the accessibility of AI tools means that anyone in our society can become an entrepreneur and use technology to solve problems. Perhaps the ideal job retraining program will look less like “let’s teach people how to code” and more like a startup school for people to act on their innate entrepreneurial instincts with technology. We should welcome creative thinking here.

Fourth, as a society we must debate how we intend to support those whose jobs will be displaced in the near term. Realistically, there will not be roles for ~60% of the economy to fill in a short period of time. While there may be incredible new jobs that no one can conceive of in this future world of abundant intelligence, if those take a decade or more to come about, we must handle the disruption that occurs in the meantime. There are a variety of potential solutions here, from expanding the safety net to guaranteeing basic incomes to changing tax policy. I’m not sure what the right answer is, but this is a topic that every elected official should be publicly commenting on, and you should call your representatives and push them to have an opinion about it.

It may be that my predictions are wrong. The pace of model improvements may plateau unexpectedly, or it may be that, after the low-hanging fruit of job disruption is picked, most jobs turn out to be harder to replace. It may also be that politics — whether geopolitics or board politics — meaningfully slows down the progress of one or more of these companies. In that case, our society may never have to grapple with the potential impact I outlined. I don’t mind being wrong here. In fact, were there a way to more gradually introduce artificial superintelligence into society, I would be relieved.

But just in case I’m right, we should take this seriously and be ready.


Thanks to Lucy Nam, Jen Hao, and Jonah Kallenbach for reading drafts of this.