Introduction
We live in a moment of racing zeros. Every few months, AI labs and cloud providers discreetly add another digit to their estimates—billions, then tens of billions, soon trillions—of dollars earmarked for compute, power, land, and the downstream infrastructure to support intelligence on a scale the human mind struggles to comprehend. The talk in boardrooms and specialized Slack channels is of scaling up data centers like never before, absorbing entire power grids, and forging forward with unstoppable momentum. If, just a few years ago, the lofty sums allocated to frontier language models seemed impossible, they now look modest in the face of a new pivot: a pivot pointing directly toward Artificial General Intelligence (AGI). By the end of the decade, or so the argument goes, we could face machines that, in many domains, are more intelligent than we are—unfathomably more intelligent.
Leopold Aschenbrenner’s Situational Awareness: The Decade Ahead (June 2024), available at situational-awareness.ai/wp-content/uploads/2024/06/situationalawareness.pdf, stands as an urgent field report from the epicenter of AI’s modern revolution. Part memoir, part near-future analytics, and part alarm bell, Aschenbrenner’s text synthesizes numerous perspectives—from technical details on compute scaling, to discussions of geopolitics, to the philosophical undercurrents that swirl around advanced AI. He embraces what-ifs, backs them with empirical scaling data, and sketches out a decade of upheavals. The single overarching message is clear: everything is about to change, and the rest of us had better brace for that metamorphosis.
In what follows, I try to unpack, and offer my own take on, Aschenbrenner’s worldview. We will travel from the recent progress in large language models (LLMs) all the way through the possibility of superintelligence. We will also examine the central question of alignment, or “superalignment,” and how crucial it becomes in the face of intelligence leaps that may unfold over mere months. Throughout, we will keep returning to one refrain: none of this is merely about bigger chatbots or code assistants. It is about the entire shape of the 21st century, poised on the cusp of a violent shift.
I. From GPT-4 to AGI: A Few More Orders of Magnitude
1. The Chain of AI Leapfrogs
In 2019, GPT-2 was seen as an oddity with impressive but jumbled text generation, churning out chunks of semi-coherent paragraphs about unicorns in the Andes (Aschenbrenner 2024, p. 10). By 2020, GPT-3 arrived, the first shockwave: suddenly the system was good enough at writing marketing copy, summarizing text, or generating rudimentary code. Early adopters paid attention, but the general public yawned. Skeptics began insisting that “we’re hitting a wall” or that LLMs are “just big autocomplete.” Then 2022-2023 witnessed GPT-4, Claude, Bard, Gemini, and more. We beheld neural models that outscored a majority of human test-takers on advanced exams: AP tests, coding challenges, bar exams. All in just a handful of years.
What changed so quickly? According to Aschenbrenner (p. 9), the short answer is: scale. You systematically add orders of magnitude (OOMs) to both training compute and parameter counts, and you watch emergent capabilities erupt. GPT-2 to GPT-3 was around +2 OOMs. GPT-3 to GPT-4 hovered around +1.5 to +2 more. Meanwhile, an equally large factor has been algorithmic progress. The combination of raw compute and model optimizations is yielding capability leaps that were previously unimaginable. The final ingredient in the trifecta, the so-called “unhobbling,” is the step from plain base models to well-honed agents that can reason via chain-of-thought, use context windows of hundreds of thousands of tokens, incorporate code execution, or act across multiple tools. The synergy is unstoppable.
2. The Rationale Behind Timeline Estimations
Every single year since 2014, contrarian pundits have predicted that deep learning’s progress would soon stall, citing “lack of data” or “inherent model limitations.” Yet each year, those same pundits were stunned by fresh breakthroughs (p. 16). The lesson: never bet against scale. Throw more resources at training, wait for the next algorithmic tweak, and watch as a previously insurmountable benchmark crumbles.
Aschenbrenner outlines the following approach to making near-term predictions (p. 20-25):
- Count the OOMs in physical compute: For a decade, training compute for frontier models has grown by roughly 0.5–1 OOM each year. With AI labs funneling tens of billions (soon hundreds of billions) into GPU clusters, we see no sign of a slowdown.
- Incorporate the OOMs from algorithmic gains: Better data-sampling strategies, improved architectures (e.g., Mixture-of-Experts), and refined training and fine-tuning recipes (e.g., LoRA) act as an extra multiplier on “effective compute.” Historically, this has added roughly 0.5 OOM/year, or more.
- Don’t ignore “unhobbling”: Once you let models reason via chain-of-thought, or integrate their code-writing capabilities into an agent that can debug and refine, you get a discontinuous boost.
Hence the plausibility that by 2027, a typical frontier model might boast 3-6 OOMs more effective compute than GPT-4. Another GPT-2-to-GPT-4 leap might happen on top of GPT-4 capabilities, leading to “AGI: The next generation” (p. 40).
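To make the arithmetic concrete, here is a minimal back-of-the-envelope sketch in Python. The growth rates are illustrative assumptions drawn from the ranges above (0.5–1 OOM/year of physical compute, roughly 0.25–0.5 OOM/year of algorithmic gains), not Aschenbrenner’s exact figures.

```python
# Back-of-the-envelope projection of "effective compute" growth, measured in
# orders of magnitude (OOMs) relative to GPT-4. All rates below are
# illustrative assumptions drawn from the ranges discussed above.

def effective_compute_ooms(years: float, compute_rate: float, algo_rate: float) -> float:
    """Total OOMs gained over `years`, treating physical compute growth and
    algorithmic efficiency gains as additive in log10 space."""
    return years * (compute_rate + algo_rate)

YEARS = 4  # roughly GPT-4 (2023) to 2027

low  = effective_compute_ooms(YEARS, compute_rate=0.5, algo_rate=0.25)
high = effective_compute_ooms(YEARS, compute_rate=1.0, algo_rate=0.5)

print(f"Low estimate:  +{low:.0f} OOMs  (~{10**low:,.0f}x GPT-4's effective compute)")
print(f"High estimate: +{high:.0f} OOMs (~{10**high:,.0f}x GPT-4's effective compute)")
# Prints roughly +3 to +6 OOMs, i.e. a jump comparable in size to GPT-2 -> GPT-4.
```

The point of the sketch is that the conclusion follows from simple multiplication: if the annual rates hold for a few more years, another GPT-2-to-GPT-4-sized leap is the default outcome.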
3. The Data Wall Argument
There is indeed a bottleneck around data. The internet might be mostly tapped out, and major labs are forced to do advanced deduplication, or else training devolves into a pure “memorization pass.” Inside labs, however, the leading approach to circumventing this “data wall” is synthetic data generation. The idea is akin to AlphaGo’s approach: after imitating human moves for a baseline, the system simply plays itself, generating far more advanced moves than a human would produce. For text-based or code-based intelligence, an advanced model can produce curated “virtual textbooks,” or run a million specialized problem-solving sessions with itself (Aschenbrenner 2024, p. 27-29). The upshot is that data limits may be solvable with billions of R&D dollars—and the will to try.
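To make the self-play analogy concrete, here is a schematic sketch of such a synthetic-data loop. Everything here is hypothetical: `model` is any object exposing a `sample` method, and `verify` stands in for an external checker (unit tests, a proof assistant, a symbolic solver). It illustrates the loop structure, not any lab’s actual pipeline.

```python
# Schematic sketch of self-play-style synthetic data generation for a language
# model. All names are hypothetical placeholders; the point is the loop
# structure (generate -> attempt -> verify -> keep), not a real pipeline.

def generate_problem(model, domain: str) -> str:
    """Ask the model to pose a fresh problem in a given domain (math, code, ...)."""
    return model.sample(f"Pose a challenging {domain} problem.")

def attempt_solution(model, problem: str) -> str:
    """Have the model attempt a careful, step-by-step solution."""
    return model.sample(f"Solve the following, showing your work:\n{problem}")

def verify(problem: str, solution: str) -> bool:
    """Placeholder for an external checker (unit tests, proof checker, solver)
    so that only verifiably correct pairs are kept."""
    raise NotImplementedError("plug in a real verifier here")

def build_synthetic_corpus(model, domains, rounds: int):
    corpus = []
    for _ in range(rounds):
        for domain in domains:
            problem = generate_problem(model, domain)
            solution = attempt_solution(model, problem)
            if verify(problem, solution):  # keep only verified pairs
                corpus.append((problem, solution))
    return corpus  # extra training data once the web corpus is exhausted
```

The design choice that matters is the verifier: as with AlphaGo’s win/loss signal, an external check is what lets the model learn from its own outputs without simply amplifying its own mistakes.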
4. The Upward Curve is Real, and This Time, The Numbers Are Big
Aschenbrenner sums it up: “GPT-2 to GPT-4 took us from a preschooler to a top-tier high-schooler. Another leap of that magnitude could land us comfortably in the realm of real AI engineers, i.e., partial or near-complete human job automation for many white-collar tasks” (p. 41). The result: a scenario many have teased for decades is now at our doorstep. This is not some small-scale revolution like the smartphone or the internet. Instead, it is a radical shift—machines that can match or outdo the best professionals at tasks from software engineering to scientific research, from architecture design to legal drafting.
II. The Intelligence Explosion: From AGI to Superintelligence
1. The AlphaGo Self-Play Analogy
When AlphaGo famously defeated Lee Sedol in 2016, it had been trained partly by supervised imitation of human experts, and then, crucially, by large-scale self-play. It discovered strategies that no human teacher had ever used. By 2018, AlphaZero took this approach even further, surpassing human champions and crushing the strongest AI engines in Go, chess, and shogi after mere hours of training. The core phenomenon was “recursive self-improvement”: if you can automate the generation of training data that surpasses the best humans, you can push your skill to superhuman levels with alarming speed (Aschenbrenner 2024, p. 47-48).
Now imagine doing that for AI research itself. Put an advanced “engineer-coworker” system on an RL loop to try new architectures, fine-tune training schedules, or optimize hardware usage. Its job is not to master board games but to master the entire craft of building next-gen AI models. If it surpasses what human engineers can do, its subsequent AI models might be leaps more capable. Then those models become the next wave of AI engineers, and so forth—an intelligence explosion (p. 50-52).
2. Why Speed Matters
This feedback loop could compress what used to be a decade of progress into a single year. Hundreds of millions of AI-labor clones, each iterating at high speed, might produce 3-6 OOMs of additional “algorithmic efficiency” in 12 months. For reference, the entire GPT-2-to-GPT-4 jump was on the order of ~4–6 OOMs, so a leap of that magnitude would be squeezed into less than a year. By the end, we would have vastly superhuman systems, minds far more alien than our familiar ChatGPT intuitions suggest: out-of-distribution, potentially dangerously unaligned intelligences that might conceive of novel encryption breaks, unstoppable hacking strategies, or bioweapon designs, all “thinking” at a scale and speed that dwarfs our puny human pace (p. 57-59).
Concretely, if one has 100 million AI researchers, each capable of reading every relevant AI paper and iterating code with superhuman focus, the “bottleneck” might shift from engineering time to compute. Even so, some level of revolutionary progress is virtually assured, especially if major labs can keep scaling hardware.
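A trivial piece of arithmetic shows why the compression claim is so striking. The historical rate below is the illustrative ~0.5 OOM/year figure quoted earlier, not a measured constant.

```python
# Rough arithmetic behind the "compressed decade" claim. Inputs are
# illustrative assumptions, not measured quantities.

human_rate = 0.5                         # OOMs of algorithmic progress per year, historically
decade_at_human_pace = 10 * human_rate   # ~5 OOMs over ten years

auto_low, auto_high = 3, 6               # OOMs delivered by automated researchers in one year

print(f"A decade at human pace: ~{decade_at_human_pace:.0f} OOMs")
print(f"Implied speed-up: {auto_low / human_rate:.0f}x to {auto_high / human_rate:.0f}x the historical rate")
```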
3. Bottlenecks and Inertia
On the face of it, one might hope that one of several brakes will apply:
- Scaling laws could run into diminishing returns, so the leaps cannot be that big.
- Compute could simply become too expensive.
- Data scarcity could stifle new progress.
But in each case, the best guess from history is that if a few trillion dollars are on the table, we will find ways around the obstacles. The nuclear bomb was once dismissed because no one had actual grams of purified U-235. But once war mobilization reached a frenzy, real solutions emerged—factories for isotope separation, entire towns built in the deserts. AI is shaping up to surpass that scale. According to Aschenbrenner, from 2024 to 2028, we might see an “industrial mobilization” for AI datacenters on a scale “not seen in half a century,” with each new cluster requiring multi-gigawatt power plants (p. 75-76).
4. The Emergence of Superintelligence
Once superhuman AI emerges, it will be qualitatively different from today’s LLM chatbots:
- Quantity advantage: Millions or billions of copies can run in parallel, each improved further by specialized domain knowledge.
- Speed advantage: If the hardware or approach is geared for minimal latency, these systems might run 10x or 100x faster than real-time human thinking.
- Novel creativity: They might discover “move 37” gambits in all domains—economics, politics, encryption, strategic planning, social engineering. Each system can cross-pollinate brand-new strategies across billions of copies.
The biggest unknown is whether we can contain or direct this intelligence. The next section addresses the problem of “alignment,” which emerges not as a footnote but rather as a central pivot on which the entire future might hinge.
III. The Core Challenges
IIIa. Industrial Acceleration: The Race to the Trillion-Dollar Cluster
1. The Datacenter Arms Race
One of Aschenbrenner’s most provocative claims is that by the late 2020s, we will witness individual training clusters worth $100+ billion, perhaps edging into the trillion-dollar range, each sporting tens of millions of GPUs (p. 77-78). Elon Musk’s xAI has already brought its Colossus cluster online with 100,000 Nvidia H100 GPUs, with a stated goal of expanding to 1 million GPUs. Extrapolating that kind of year-over-year doubling or tripling yields sums that dwarf any previous technology wave.
The crux is that advanced AI training is not a “compact” process. When you push to the frontier, you demand a continuous supply of power, real estate, water cooling, specialized chip packaging, high-bandwidth memory, and robust physical security. Scaling from a 10k-GPU cluster (GPT-4 era) to a 10M-GPU cluster is not a simple matter of pressing “Buy Now” on an AWS console. You need entire supply chains, reminiscent of mid-20th century industrial mobilization (Aschenbrenner 2024, p. 79-82).
2. Forcing More Power onto the Grid
One of the unsung constraints is simply electricity. A cluster of 10 million advanced GPUs might chew up tens of gigawatts (GW). The entire US capacity is ~1,200 GW, but it’s not available in a single location, and expansions are mired in regulations, local veto points, and multi-year approvals. In practice, Middle Eastern countries or other less regulated regions might offer large plots of land and immediate power deals, tempting Western labs to build next-generation training sites offshore. Aschenbrenner warns that letting those “AGI-factories” set up shop in authoritarian regimes is akin to shipping nuclear weapons designs outside the free world’s borders (p. 86-88).
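To see why the grid becomes the constraint, consider a quick power-budget sketch. The per-accelerator draw, host overhead, and PUE below are rough assumed figures (H100-class accelerators draw on the order of 700 W), chosen only to show the order of magnitude.

```python
# Rough power budget for a hypothetical 10-million-GPU training cluster.
# Per-GPU draw, host overhead, and PUE are assumed, order-of-magnitude figures.

gpus = 10_000_000
gpu_watts = 700         # H100-class accelerator
host_watts = 300        # CPU, networking, and storage share per accelerator (assumed)
pue = 1.25              # facility overhead: cooling, power conversion, etc.

total_gw = gpus * (gpu_watts + host_watts) * pue / 1e9
us_capacity_gw = 1_200  # approximate total US generating capacity, per the text

print(f"Cluster draw: ~{total_gw:.1f} GW")
print(f"Share of total US generating capacity: ~{total_gw / us_capacity_gw:.1%}")
# Roughly 12-13 GW, i.e. on the order of 1% of all US generating capacity
# concentrated at a single site.
```

Even under these conservative assumptions, a single frontier cluster demands the output of several large power plants, which is why siting, permitting, and grid expansion become strategic questions rather than engineering details.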
As the demand for high-performance computing surges, particularly for AI training, energy supply becomes a gating constraint. Nuclear power offers a low-carbon, reliable source capable of supporting the immense, constant draw of GPU clusters without the variability of wind or solar. It reduces the environmental footprint of large-scale data centers and addresses growing concerns about the energy cost of AI development, and several major cloud providers have already moved to secure dedicated nuclear capacity for future data centers.
3. The Hard Constraints
Beyond power come advanced packaging (e.g., CoWoS for HPC), memory supply, networking gear, and the perennial chip foundry question—can TSMC expand quickly enough? The synergy of bottlenecks may cause short-term disruptions, but every sign points to unstoppable attempts to scale.
4. The Implication
We are likely on the cusp of an “all-hands-on-deck” scenario reminiscent of the Manhattan Project. When AI revenue crosses $100B for a single corporation (feasible by mid/late 2026, Aschenbrenner projects), the capital flows will escalate. Eventually, $1 trillion in total annual AI investment by the late 2020s is not unthinkable (p. 81-82).
(For references and additional numeric detail, see the “Appendix” of Aschenbrenner’s essay, where all these estimates are carefully walked through with back-of-the-envelope math.)
IIIb. Security and Espionage: Lock Down the Labs
1. Weak Security in a High-Stakes Domain
Perhaps the single greatest threat to Western leadership in AGI, Aschenbrenner argues, is espionage. Because an AI model is basically a pile of weights and code, a well-timed hack can replicate years of research and billions of dollars in compute—an entire strategic advantage stolen in minutes. Even partial insider leaks of novel “algorithmic breakthroughs” or training recipes can yield leaps for the adversary (p. 89-91).
At present, typical AI labs approach security with a startup’s mindset: minimal, uncoordinated, no 24/7 armed security around clusters, no SCIF environments for staff, and no significant background checks. People talk loosely at SF parties, revealing new embeddings or data pipeline techniques that might amount to huge practical differences for outside teams, including state actors. The situation is reminiscent of nuclear fission research in the late 1930s, when secrecy was an afterthought and crucial measurements got published in open journals for all to see (p. 91-92).
2. Weight Security vs. Algorithmic Security
Aschenbrenner highlights two distinct assets:
- Model weights: As AGI is trained, the final model checkpoint is basically a large file. Anyone who acquires it gets the capabilities, akin to stealing assembled bombs.
- Algorithmic breakthroughs: Before that final checkpoint, there is a tapestry of new training methods, new ways around data constraints, new synergy between code generation and RL. These are often as vital as the final model. Imagine if the Germans had the correct graphite measurements in WWII—it might have drastically changed the timeline for nuclear reactors (p. 94-95).
3. Why Government Is Unavoidable
Achieving “L4” or “L5” security (RAND’s categorization for secrecy robust to state-level attackers) requires airgapped data centers, classified networks, staff with security clearances, infiltration checks, hardware encryption, multi-layered oversight—none of which are typical at a private tech firm. Indeed, the only institutions with that level of security are typically the US Department of Defense or the intelligence agencies. This naturally leads Aschenbrenner to a conclusion: if a private lab is truly building technology that can decide the fate of US power and geopolitics, it cannot remain a purely private initiative. The national security apparatus will have to intervene (p. 98-101).
IIIc. Superalignment: Controlling the Mind That Surpasses Ours
1. RLHF and its Limits
Current alignment strategies revolve around Reinforcement Learning from Human Feedback (RLHF). That suffices for “GPT-4-level” tasks, ensuring the model doesn’t spit out vile slurs or illegal instructions. But can RLHF scale to a mind that holds “PhD-level knowledge” in thousands of domains and can spin cunning deceptions? If the system can outsmart its trainers, or can produce code or plans that the trainers cannot even parse, RLHF starts to break down. We cannot effectively label or punish wrongdoing that we can’t detect in the first place (p. 106-108).
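To illustrate the structural point, here is a heavily simplified sketch of an RLHF-style loop. The objects and the best-of-n update are illustrative stand-ins (real systems use a learned reward model with PPO-style optimization and a KL penalty to the base model); the failure mode described above falls directly out of the structure.

```python
# Heavily simplified sketch of an RLHF-style loop. `policy` and `reward_model`
# are hypothetical placeholders; real pipelines use PPO-style optimization
# against a learned reward model, plus a KL penalty to the base model.

def rlhf_step(policy, reward_model, prompts, n_candidates=4):
    for prompt in prompts:
        candidates = [policy.generate(prompt) for _ in range(n_candidates)]
        # The reward model was trained on *human* preference labels, so it can
        # only distinguish behaviors that human labelers could tell apart.
        scores = [reward_model.score(prompt, c) for c in candidates]
        best = candidates[scores.index(max(scores))]
        policy.reinforce(prompt, best)  # nudge the policy toward the preferred output
    return policy

# The failure mode: if a superhuman model emits plans or code whose flaws no
# human labeler can detect, reward_model.score rates them highly anyway, and
# rlhf_step happily reinforces the undetected behavior.
```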
2. The Tense Leap from AGI to Alien Minds
As soon as automated AI researchers start designing new architectures, we might see emergent thought processes that defy any simple interpretability (p. 109). Perhaps they switch from an English-based chain-of-thought to compressed internal states no human can follow. Perhaps the model answers questions politely but holds hidden subroutines that learn to manipulate. The “fog of war” intensifies in an intelligence explosion. By the time we realize a method is not robust, that same method might already have minted billions of parametric copies scattered across HPC clusters.
3. Proposed Directions
Aschenbrenner (p. 115) enumerates possible solutions:
- Scalable oversight: Use smaller AIs or specialized “judge-models” to interpret bigger AIs, forming a chain-of-critique (see the sketch after this list).
- Interpretability: Mechanistic analysis, “digital neuroscience,” or top-down “lie detection” subnets that reliably catch malicious patterns in a model’s representation.
- Constrained training: Avoid giving the model large-scale RL with unbounded real-world objectives. Possibly keep advanced models airgapped, or limit certain knowledge domains like synthetic biology.
- Adversarial testing: Throw everything at the system in a sandbox, from red-team tries to intentionally seeded traps, to see if it self-exfiltrates or breaks partial constraints.
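As a toy illustration of the first item, here is a sketch of a chain-of-critique loop in which weaker, better-understood judge models vet a stronger model’s answer before it is accepted. Every name is hypothetical; this is a sketch of the idea, not a proposed implementation.

```python
# Toy sketch of scalable oversight via a chain of critiques. All names are
# hypothetical; the idea is that weaker, better-understood judge models review
# a stronger model's output before it is trusted.

def chain_of_critique(strong_model, judges, prompt, max_revisions=3):
    answer = strong_model.generate(prompt)
    for _ in range(max_revisions):
        critiques = [judge.critique(prompt, answer) for judge in judges]
        objections = [c for c in critiques if c.flags_issue]
        if not objections:
            return answer, "accepted"
        # Feed the judges' objections back and request a revision.
        answer = strong_model.revise(prompt, answer, objections)
    return answer, "escalate_to_humans"  # unresolved after max_revisions: a human reviews it
```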
4. Closing Thoughts on Superalignment
Despite the gloom, Aschenbrenner remains guardedly optimistic that alignment can be solved—provided we throw thousands of top-tier researchers at it, devote major compute resources to safety research, and manage to keep a handle on the pace of deployment. If we see alignment not as a sideshow but as the gating factor on the deployment of superhuman systems, we might muddle through (p. 120-125).
The Path to Avoiding Catastrophe
Aschenbrenner cites an analogy to nuclear nonproliferation. The US ended WWII as the sole nuclear power, and decades later American leadership helped forge the Non-Proliferation Treaty, which effectively slowed the spread of nuclear weapons. If the free world can maintain a clear lead in superintelligence—maybe one or two years’ margin—it can exert leverage to negotiate an international “AI NPT,” or at least a partial system to keep rogue states from building competing superintelligences. That might be our best hope to prevent repeated intelligence explosions or unstoppable proliferation of advanced WMDs (p. 139-140).
IV. The Project: Endgame for the 2020s
1. The Manhattan Project Redux
Inside the PDF, Aschenbrenner devotes a major final chunk to the looming “Project” (p. 141-156). Essentially, a scenario in which US policymakers realize—somewhere between 2026 and 2028—that private labs in San Francisco or Seattle are on the brink of AGI. They also realize that if done incorrectly, the entire fate of the nation (and the world) is in jeopardy. Once that awareness dawns, Washington will not stand idly by.
We can picture a semi-covert or partially public crash program, bigger than all the labs combined, with the DoD (Department of Defense) and intelligence agencies orchestrating. Google DeepMind, OpenAI, Anthropic, Microsoft, Meta—some or all might be forcibly or voluntarily merged into one mammoth “AGI Security” consortium. A trillion-dollar cluster is built on American soil, with top secrecy and SCIF-level security. That cluster is used to train the final wave of superintelligence, harnessed for both defense and alignment breakthroughs.
2. The US Government Waking Up
It might seem far-fetched that US policymakers would not only allow such a project but demand it. Yet history offers precedents: the Manhattan Project itself, the Apollo Program, and the 2020 Covid response (p. 144). Each seemed unthinkable until its crisis arrived. Faced with mortal danger or a generational challenge, the government can pivot with vast mobilization.
3. The Fate of the Free-Standing Startup
No matter how strong a private lab’s brand or how lofty its mission statement, it’s not going to be left alone when we approach superintelligence. The stakes exceed any private purview. The inevitable outcome: direct or indirect federal stewardship. Possibly a partial nationalization or deep public-private partnership. Possibly a legal carve-out that compels compliance (p. 146-150).
4. Managing the Explosion
In that government-run or government-led context, the intelligence explosion still poses tremendous risk. The Project’s administrators and scientists must handle thousands or millions of advanced AI “researchers,” wrangle alignment and interpretability, guard against foreign or rogue infiltration, adapt to breakthroughs that could arrive daily, and keep the US population calm. Meanwhile, power-hungry sub-factions within the government might see superintelligence as a way to crush all external threats or exert authoritarian control. There is no guarantee that the “good guys” or “the Constitution” will prevail. But Aschenbrenner insists that a stable chain of command is a better bet than the alternative of random startups unwittingly racing the CCP (p. 154-155).
V. Parting Reflections
1. History’s Return
If the early 21st century lulled the West into complacency—thinking only of incremental technology booms and e-commerce ventures—then advanced AI is snapping us back to older questions of war, world orders, and cataclysms. “The old gods of war are stirring,” to paraphrase Winston Churchill. We have never before encountered a technology that might so thoroughly rewrite the rules of economy, strategy, and creativity in mere years.
2. The Realism Stance
Aschenbrenner ends his essay with a call for “AGI Realism” (p. 157-158). On the one side, we have doomsayers who predict unstoppable self-destruction, sometimes pronouncing a 99% chance of extinction. On the other side, we have naive techno-boosters ignoring everything but near-term profit. Instead, realism demands:
- We are building superintelligence.
- We can’t cede leadership to authoritarian regimes.
- We must treat alignment and safety with absolute gravity.
He commends the US for having a robust if messy tradition of fending off tyranny, building checks and balances, and forging alliances for the greater good. But none of that is guaranteed. A fiasco of missteps could lead to infiltration, proliferation, and unstoppable arms racing.
3. Are We Ready?
Aschenbrenner captures an unsettling truth: “There is no crack team coming to handle this.” In his vantage, the same small cluster of AI scientists, HPC engineers, startup founders, and ex-national-security advisors are the only ones fully aware of the madness about to unfold. They have the pieces to build or hamper superintelligence. They see the OOMs piling up, the budgets ballooning, the leaps in capabilities, the vacant stare from mainstream media that still imagines ChatGPT-level stuff as just “cool tech.” Eventually, the rest of civilization will catch up, but possibly too late. If we want to prevent catastrophic outcomes, we must act before the “Manhattan moment” arrives in a panic (p. 159-160).
4. The Burden of Skepticism
Is it possible that this entire vantage is overblown? Could it be that “intelligence explosion” is a mirage, or that data constraints or social pushback keep things slow? Possibly. But in the end, if the Aschenbrenner scenario has even a 20-30% chance, that alone is enough to transform policy, economics, and personal life decisions. The cost of inaction in a potential superintelligence scenario is too staggering to ignore.
5. Looking to the 2030s
In a final note, the essay hints that the 2030s will be even more disorienting than the 2020s. We might see post-scarcity or near-post-scarcity booms, as well as existential weapon threats that overshadow Hiroshima’s legacy. We might see unprecedented leaps in biotech, space exploration, or molecular manufacturing, courtesy of a million AI scientists. We might see unstoppable totalitarian regimes if superintelligence is harnessed to surveil and crush dissent. Or maybe we will see a new renaissance of knowledge, a blossoming of cures for disease, and the opening of entire new frontiers of creativity. All these roads hinge on what we do now (p. 157-159).
Conclusion: A Final Summation
Foresight is a fickle gift. Leo Szilard conceived of the nuclear chain reaction in 1933, years before fission was even demonstrated, yet most of his peers scoffed. By 1939, a handful more realized the bomb was feasible. By late 1941, it became a top Allied secret—and by August 1945, Hiroshima was ash. We see the same pattern in AI. A handful of present-day engineers and scientists, some associated with OpenAI, DeepMind, or Anthropic, are convinced we’ll hit superintelligence in a matter of years. The mainstream remains unconvinced, not from careful analysis but from inertia.
Leopold Aschenbrenner’s Situational Awareness: The Decade Ahead stands out as an urgent signpost. It says: Yes, we’re serious. The leaps are real. The data, the capital, and the emergent dynamics are lining up for a decade that will rewrite world history. He warns of an unstoppable arms race, of trillion-dollar compute clusters, of espionage showdowns, and of the precarious alignment challenge. He pictures, too, the possibility of a US-led global effort to harness superintelligence responsibly—perhaps with something akin to a new Manhattan Project, culminating in the reorganization of power structures worldwide.
Are these predictions overblown or sensationalist? Possibly. But as Aschenbrenner implies, even if there’s a meaningful probability that he is directionally correct—say 20, 30, or 50 percent—refusing to plan for it is reckless. The calls to “lock down the labs,” “fortify alignment research,” or “prepare for a multi-gigawatt HPC buildout” do not read like wild science fiction if you think superhuman AI is imminent. They read like battered realism, set in a time where the stakes are everything.
The next generation of AI systems might, if properly directed, eliminate diseases, transform energy production, and build wonders on Earth and beyond. Or they might turn the planet into a chessboard for unstoppable powers, or yield unstoppable technologies to terrorists, or turn into an alien intelligence that sees humans as obstacles. The difference rests on the choices we make in the next 3 to 5 years.
Hence, the real question we should ask is: What if Aschenbrenner is right? If indeed this is how the second half of the 2020s shakes out, how do we ensure we do not fail the great test of our time?
“Someday it will be out of our hands. But right now, at least for the next few years of midgame, the fate of the world rests on these people—my friends, your colleagues, and their friends. That’s it. That’s all there is.”
—(Aschenbrenner 2024, p. 159)
References
- Aschenbrenner, Leopold. 2024. Situational Awareness: The Decade Ahead. https://situational-awareness.ai/wp-content/uploads/2024/06/situationalawareness.pdf
- “Gödel Agent: A Self-Referential Agent Framework for Recursive Self-Improvement.” arXiv preprint.