No one seems to know whether world-bending AGI is just three years away. Or rather, everyone seems to know, but they all have conflicting opinions. How can there be such profound uncertainty on such a short time horizon?
Or so I thought to myself, last month, while organizing my thoughts for a post about AI timelines. The ensuing month has brought a flood of new information, and some people now appear to believe that transformational AGI is just two years away. With additional data, the range of expectations is actually diverging.
Here’s my attempt to shed some light.
The Future Has Never Been So Unclear
Have we entered into what will in hindsight be not even the early stages, but actually the middle stage, of the mad tumbling rush into singularity? Or are we just witnessing the exciting early period of a new technology, full of discovery and opportunity, akin to the boom years of the personal computer and the web?
There was already a vibe that things were starting to speed up (after what some viewed as a slow period between GPT-4 and o1), and then OpenAI’s recent announcement of their “o3” model blew the doors off everyone’s expectations. Thus:
In 10 years, chatbots have gone from "this can persuade 33% of users it's a 13yr old Ukrainian boy speaking broken English" to "the community of mathematicians, taken collectively, is still smarter than this”1.
AI is approaching elite skill at programming, possibly barrelling into superhuman status at advanced mathematics, and only picking up speed. Or so the framing goes. And yet, most of the reasons for skepticism are still present. We still evaluate AI only on neatly encapsulated, objective tasks, because those are the easiest to evaluate. (As Arvind Narayanan says, “The actually hard problems for AI are the things that don't tend to be measured by benchmarks”.) There’s been no obvious progress on long-term memory. o1 and o3, the primary source of the recent “we are so back” vibe, mostly don’t seem better than previous models at problems that don’t have black-and-white answers2. As Timothy Lee notes, “LLMs are much worse than humans at learning from experience”, “large language models struggle with long contexts”, and “[LLMs] can easily become fixated on spurious correlations in their training data”.
Perhaps most jarringly, LLMs3 still haven’t really done anything of major impact in the real world. There are good reasons for this – it takes time to find productive applications for a new technology, people are slow to take advantage, etc. – but still, it’s dissatisfying. One almost expects that in a year or two, we will have chatbots that plow through open questions in mathematics with the same dismissive ease as Matt Damon’s character in Good Will Hunting, code circles around the goofy genius main character in Silicon Valley, make Sheldon from The Big Bang Theory look like a physics neophyte, and the primary impact will still be kids using it to do their homework. The state of the discourse is nicely encapsulated by this tweet (lightly edited for clarity):
OpenAI announced a proto-AGI which all but confirms a near-term singularity as part of a gimmicky xmas product rollout, named to avoid a copyright clash with a phone company, to ~no fanfare in the mainstream media, the day before WSJ published the headline “The Next Great Leap in AI Is Behind Schedule and Crazy Expensive”.
I recently attempted to enumerate the fundamental questions that lie underneath most disagreements about AI policy, and number one on the list was how soon AGI will arrive. Radical uncertainty about the timeline makes it extremely difficult to know what to do about almost any important question. (I define AGI as AI that can cost-effectively replace humans at more than 95% of economic activity, including any new jobs that are created in the future. This is roughly the point at which seriously world-changing impacts, both good and bad, might start to emerge. For details, see here.)
In an attempt to shed a bit of light on the situation, I’m going to articulate two timelines for AGI – one slow, one fast. In the process, I’ll highlight some leading indicators that can tell us which path we’re on.
The Slow Scenario
This is the slowest timeline I can make a good argument for, excluding catastrophes (including war) or a deliberate pause. Think of it as a lower bound on AI progress.
In this scenario, the recent flurry of articles suggesting that AI has “hit a wall” is correct, insofar as the simple scaling of training data and model size – which drove progress from 2018 to 2023 – sputters out. It won’t come to a complete halt; in 2025 or 2026, we’ll see a new generation of models that are larger than recent trends would have indicated. That will allow the models to incorporate more world knowledge and “system 1 smarts” / “raw intelligence” (whatever that means) than GPT-44. But this won’t be a leap like GPT-3 to GPT-4, perhaps not even GPT-3.5 to GPT-4. It is becoming too hard to find more quality training data and to justify the cost of larger models. Further progress on this axis remains slow.
Progress on “reasoning models” like o1, o3, and DeepSeek-R1 continues, turning out ever-more-impressive results on benchmarks such as FrontierMath and RE-Bench (which measures the ability of AIs to perform AI R&D). However, the gains are limited to neatly encapsulated tasks with black-and-white answers – exactly the sorts of capabilities that are easiest to measure.
This turns out to have less impact than anticipated. The models are useful for mathematicians, scientists, and engineers (including software engineers), especially as people become adept at identifying encapsulated problems that they can extract from the messy complexity of their work and hand to an AI. But because these neatly encapsulated problems only encompass part of the job, Amdahl's Law kicks in and the overall impact on productivity is limited5. Meanwhile, AI is generally not opening the door to radically new ways of getting things done. There are some exceptions, for instance in biology, but the incredible scientific and regulatory complexity of biology means that substantial real-world impact will take years.
Furthermore, progress on reasoning models is not as rapid as the vibes at the end of 2024 suggested. o3’s remarkable benchmark results turn out to have been a bit of a mirage, and even for neatly encapsulated problems, o1 and o3’s capabilities are found to be hit-and-miss6. Moving forward, the training approach struggles to generalize beyond problems with easily evaluated answers. Progress on problems that take humans more than a few hours to solve turns out to be especially difficult, for two reasons: navigating the vast range of possible steps requires higher-level cognitive strategies and taste that we don’t yet know how to train into an AI, and we haven’t figured out how to give LLMs fine-grained access to knowledge in the world.
There are widespread efforts to create “agents” – tools that can be trusted to [semi-]independently pursue a goal across an extended period of time. 2025 is dubbed the Year of the Agent, but the results are mostly poor. Agents struggle to go out into the world and find the information needed to handle a task. They do a poor job of navigating between subtasks and deciding whether and how to revise the master plan. Models continue to be distracted by extraneous information, and resistance to trickery and scams (“adversarial robustness”) remains a challenge. Much as the “Year of the LAN” was proclaimed across most of the 80s and early 90s, pundits will still be saying “this is finally the Year of the Agent” well past 2030.
Overcoming these limitations in reasoning and agentic behavior turns out to require further breakthroughs, on the scale of transformers and reasoning models, and we only get one of those breakthroughs every few years7.
Working around these limitations, individuals and organizations are finding more and more ways to encapsulate pieces of their work and hand them to an AI. This yields efficiency gains across many areas of the economy, but the speed of adoption is limited for all the usual reasons – inertia, regulatory friction, entrenched interests, and so forth. Fortunes are made, but adoption is uneven – just as in the early years of the Internet.
The major AI labs are doing everything they can to use AI to accelerate their own work. Internally, there are few barriers to adoption of AI tools, but the impact is limited by the tasks where AI isn’t much help (Amdahl’s Law again). AI is not generating the conceptual breakthroughs that are needed for further progress. It does accelerate the work of the humans who are seeking those breakthroughs, but by only a factor of two. The process of training new AIs becomes ever more complex, making further progress difficult despite continued increases in R&D budgets. There may be a slowdown in investment – not a full-blown “AI winter”, but a temporary pullback, and an end to the era of exponentially increasing budgets, as a less breathless pace starts to look more cost-effective.
Another drag on impact comes from the fact that the world knowledge a model is trained on is out of date by the time the model is available for use. As of the end of 2024, ChatGPT8 reports a “knowledge cutoff date” of October 2023, indicating that its models do not have innate understanding of anything published after that date – including the latest in AI R&D techniques9. Until a new approach is found, this will interfere with the pace at which AI can self-improve.
Eventually, 2035 rolls around – 10 years from now, which is as far as I’m going to project – and AI has not had any Earth-shaking impact, for good or ill. The economy has experienced significant change, AI is embedded in our everyday lives to at least the same extent as the smartphone, some major companies and job markets have been disrupted, we have capabilities that seemed almost unimaginable in 2020 and may still seem so today – but the overall order of things is not drastically altered. Importantly, we have not missed the window of opportunity to ensure that AI leads to a positive future.
The Fast Scenario
I’ll now present the fastest scenario for AI progress that I can articulate with a straight face. It addresses the potential challenges that figured into my slow scenario.
In recent years, AI progress has been a function of training data, computing capacity, and talent (“algorithmic improvements”). Traditional training data – textbooks, high-quality web pages, and so forth – is becoming harder to come by, but not impossible; video data, commissioned human work, and other sources can still be tapped. The days of rapid order-of-magnitude increases in data size are behind us, but it’s possible to scrounge up enough high-quality tokens to fill in domains where AI capabilities have been lacking, increasing reliability and somewhat smoothing “jagged” capabilities.
More importantly, synthetic data – generated by machines, rather than people – turns out to work well for training ever-more-capable models. Early attempts to use synthetic data suffered from difficulties such as “model collapse”, but these have been overcome (as highlighted by the success of o1 and o3). Given enough computing capacity, we can create all the data we need. And AI tools are rapidly increasing the productivity of the researchers and engineers who are building the data-generation and AI training systems. These tasks are some of the easiest for AI to tackle, so productivity gains begin compounding rapidly. Computing capacity can now substitute for both data and talent, meaning that compute is the only necessary input to progress. Ever-increasing training budgets, continuing improvements in chip design, and (especially) AI-driven improvements in algorithmic efficiency drive rapid progress; as the lion’s share of innovation starts to be derived from AI rather than human effort, we enter the realm of recursive self-improvement, and progress accelerates.
Because we are no longer training ever-larger models, there’s no need to build a single monster (multi-gigawatt) data center. The primary drivers of progress – synthetic data, and experiments running in parallel – need lots of computing capacity, but don’t need that capacity to be centralized. Data centers can be built in whatever size and location is convenient to electricity sources; this makes it easier to keep scaling rapidly.
There is an awkward intermediate period where AI is becoming aggressively superhuman at encapsulated math and coding problems10, but is still limited in other problem domains, including many areas relevant to AI development (such as setting research agendas). During this period, the leading AI labs are fumbling around in search of ways to push through these limitations, but this fumbling takes place at breakneck speed. AI-driven algorithmic improvements allow a huge number of experiments to be run in a short period; AI tools handle most of the work of designing, executing, and evaluating each experiment; AI assistants help brainstorm new ideas, and help manage what would otherwise be the overwhelming problem of coordinating all this work and bringing improvements into production without destabilizing the system. Thus, human creativity is still a bottleneck on progress, but the AI tools are enabling us to run an unprecedented number of experiments, which yield serendipitous discoveries.
Overall, capabilities are not driven primarily by throwing ever-more data into ever-larger models (as in the 2018-2023 period); instead, advances in data generation and curation, model architecture, and training techniques allow increasing capabilities to fit into models of static or even declining size (as we’ve seen in 2024)11. This helps keep inference costs down, enabling the increased pace of experimentation and increased use of AIs in AI R&D. And the rapid progress maintains competitive pressure to motivate ongoing investment in data center buildout and AI training; this eventually extends to the international realm (especially US vs. China), bringing national budgets into play.
The recent trend toward use of “inference-time compute” – achieving better results by allowing an AI to think for an extended period of time – continues. However, today’s clumsy techniques (such as performing a task 1000 times and keeping the best result) are largely superseded. The focus shifts to training systems that can think productively for an extended period, just as people do when working on a difficult problem. The simple brute-force techniques retain a role, but only on occasions when a problem is so important that it’s worth spending a lot of extra money to get a slightly better solution.
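To make “performing a task 1000 times and keeping the best result” concrete, here is a minimal best-of-N sketch. The “generate” and “score” functions are hypothetical placeholders for a model call and a verifier (unit tests, a reward model, etc.); this illustrates the brute-force technique, not any lab’s actual system.

```python
import random

def best_of_n(generate, score, prompt, n=1000):
    """Naive inference-time compute: sample n candidate answers and keep
    the one the scorer rates highest. Cost scales linearly with n."""
    best_answer, best_score = None, float("-inf")
    for _ in range(n):
        candidate = generate(prompt)      # one sampled completion
        s = score(prompt, candidate)      # e.g. pass/fail tests, verifier score
        if s > best_score:
            best_answer, best_score = candidate, s
    return best_answer

# Toy stand-ins so the sketch runs; a real system would call a model API
# and a domain-specific verifier here.
toy_generate = lambda prompt: random.gauss(0, 1)
toy_score = lambda prompt, answer: -abs(answer)   # "best" = closest to zero
print(best_of_n(toy_generate, toy_score, "solve x", n=100))
```

The point of the sketch is the cost structure: spending 100x or 1000x more compute buys only a modestly better answer, which is why the scenario treats this as a niche tool rather than the main engine of progress.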
A few major breakthroughs (and many intermediate breakthroughs) emerge to help things along. One of these probably involves giving AIs access to “knowledge in the world”, including the ability to create and revise notes, to-do lists, and other data structures to support them in complex tasks. Another probably involves continuous learning, at both coarse scale (getting better at selling a particular product over the course of 500 sales pitches) and fine scale (figuring out how to make progress on a tricky problem after grinding away at it for a few days). Among other things, this alleviates the knowledge cutoff problem that would otherwise interfere with rapid AI self-improvement.
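As a toy illustration of the sort of “knowledge in the world” scaffolding described above (and emphatically not any lab’s actual mechanism), imagine an external scratchpad the model can create and revise between steps, rather than holding everything in its context window:

```python
class Scratchpad:
    """Hypothetical external memory an agent reads and revises between steps."""
    def __init__(self):
        self.notes = []   # durable observations the agent wants to remember
        self.todo = []    # open subtasks, revised as the plan changes

    def render(self):
        """Serialize the scratchpad so it can be prepended to the next prompt."""
        notes = "\n".join("- " + n for n in self.notes)
        todo = "\n".join("[ ] " + t for t in self.todo)
        return "NOTES:\n" + notes + "\nTODO:\n" + todo

pad = Scratchpad()
pad.todo.append("draft the sales pitch outline")
pad.notes.append("customer cares more about price than features")
print(pad.render())   # fed back to the model on its next step
```

Presumably the hard part is not the data structure itself, which is trivial, but training models to create, maintain, and rely on such structures over long horizons.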
Other breakthroughs are found that allow us to apply LLMs to messy problems that can’t be decoupled from their real-world context. I have no clear idea how this might be accomplished on a fast timeline, but I think it is a necessary assumption for the scenario to hold.
As a result of all these advances, AI agents become truly useful. Success in 2025 is mixed, but 2026 really is the Year of the Agent, with adoption across a wide range of consumer and workplace applications. Subsequent years see rapid increases in the breadth and depth of AI applications – including use in the development of AI itself.
How quickly might this lead to AGI – again, defined as AI that can cost-effectively replace humans at more than 95% of economic activity? I struggle to put a number on this. But it has taken us roughly two years to go from GPT-4 to o312, and in that time we’ve arguably seen just one major breakthrough: RL training on synthetically generated chains of thought. I’ve argued that several further major breakthroughs are needed, at a minimum, to reach AGI. If breakthroughs keep arriving at that pace, reaching AGI should take at least twice as long as the roughly two years from GPT-4 to o3.
We might expect progress to speed up, due to increased budgets and AI assistance. But we might also expect progress to be more difficult, as we exhaust easily tapped resources (off-the-shelf data; previously existing GPUs and scientific / engineering talent that could be repurposed for AI), systems become more complex, and we push farther into poorly-understood territory.
Put all of this together, and I have a hard time imagining that transformational AGI could appear before the end of 2028, even in this “fast” scenario, unless more or less all of the following also occur:
We get “lucky” with breakthroughs – multiple major, unanticipated advances occur within the next, say, two years. New approaches at least as impactful as the one that led to o1. Even this might not be sufficient unless the breakthroughs specifically address key limitations such as continuous learning, messy real-world tasks, and long-horizon planning for problems with no clear right and wrong answers.
Threshold effects emerge, such that incremental advances in model training turn out to cause major advances in long-horizon planning, adversarial robustness, and other key areas.
We sustain extremely rapid improvements in algorithmic efficiency, allowing a massive deployment of advanced AI despite the physical limits on how much chip production can be increased in a few short years.
That’s my fast scenario. How can we tell whether we’re in it?
Identifying The Requirements for a Short Timeline
My chief motivation for articulating these two scenarios was to review the differences between them. These differences might serve as leading indicators that we can watch in the coming months to see which course we’re on.
The most important question is probably the extent to which AI is accelerating AI R&D. However, I don’t know that this will be visible to anyone outside of the frontier AI labs. What follows are some key leading indicators that the general public will be able to observe if we are on a fast path to AGI.
Progress on reasoning is real, sustained, and broadly applicable. If o3 is released to the public and consistently wows people (in a way that I believe o1 has not consistently done), if its capabilities on math and coding tasks seem consistent with its amazing scores on FrontierMath and Codeforces, and if there’s at least one more major step forward in reasoning models in 2025 (possibly leading to unambiguously superhuman scores on very difficult benchmarks like FrontierMath and Humanity’s Last Exam), that supports a fast timeline13. If people report mixed experiences with o3, if its performance is still very hit-and-miss, if benefits outside of math/science/coding are still limited, if the FrontierMath results look less impressive once details emerge, and if that doesn’t change in a significant way over the course of 2025, that will suggest we’re on a slower timeline. It would mean that we really haven’t made a lot of progress in fundamental capabilities since the release of GPT-4 in March 2023.
In the rapid-progress scenario, the techniques used to train reasoning models on math / science / programming tasks are successfully extended to tasks that don’t have clear right and wrong answers. And these models must become more reliable for math / science / programming tasks.
Breaking out of the chatbox: AIs start showing more ability at tasks that can’t be encapsulated in a tidy chatbox session. For instance, “draft our next marketing campaign”, where the AI would need to sift through various corporate-internal sources to locate information about the product, target audience, brand guidelines, past campaigns (and their success metrics), etc.
AI naturally becomes more robust as it gets better at reasoning, at handling fuzzy problems, and at incorporating real-world context. Systems are less likely to make silly mistakes, and more resistant to “jailbreaking”, “prompt injection”, and other attempts to deliberately fool them into unintended behavior. (This may be supplemented by new forms of anti-trickery training data, mostly synthetic.)
Widespread adoption of AI agents, [semi-]independently pursuing goals across an extended period of time, operating in “open” environments such as the public Internet (or at least a company intranet). These agents must be able to maintain coherent and adaptive planning over time horizons that gradually increase to multiple hours (and seem likely to progress to months), completing tasks and subtasks that don’t have black-and-white answers. No particular barrier emerges as we push reasoning and agentic models into larger-scale problems that require ever-longer reasoning traces; models are able to develop whatever “taste” or high-level strategies are needed. They must be resistant enough to trickery and scams that this does not impede their adoption.
Real-world use for long-duration tasks. Users are actually making use of AI systems (agentic and otherwise) to carry out tasks that take progressively longer. They are finding the wait and cost to be worthwhile.
Beyond early adopters: AI becomes more flexible and robust, achieving adoption beyond early adopters who find ways of incorporating AI into their workflow. It is able to step in and adapt itself to the task, just as a new hire would. AI’s increasing flexibility flows over and around barriers to adoption. This greatly increases the pace at which AI can drive productivity gains across the economy – including the development of AI itself14.
Scaling doesn’t entirely stall. We see the release of a “larger” model that appears to incorporate more forms of training data, and constitutes an impressive advance along many fronts at once – like GPT-3.5 → GPT-4, or even GPT-3 → GPT-4 (and unlike GPT-4o → o1). Preferably before the end of 2025. We aren’t looking for a model that is larger than GPT-4, but for one that is larger than its contemporaries, accepting the extra size in exchange for broader and deeper knowledge and capabilities.
Capital spending on data centers for AI training and operation continues to increase geometrically. This is a useful indicator for both the level of resources available for developing and operating AIs, and the internal confidence of the big players.
Unexpected breakthroughs emerge. To get transformational AGI within three or four years, I expect that we’ll need at least one breakthrough per year on a par with the emergence of “reasoning models” (o1)15. I suspect we’ll specifically need breakthroughs that enable continuous learning and access to knowledge-in-the-world.
How To Recognize The Express Train to AGI
If we are on the road to transformative AGI in the next few years, we should expect to see major progress on many of these factors in 2025, and more or less all of them in 2026. This should include at least one major breakthrough per year – not just an impressive new model, but a fundamentally new technique, preferably one that enables continuous learning, access to knowledge-in-the-world, or robust operation over multi-hour tasks.
Even in this scenario, I have trouble imagining AGI in less than four years. Some people have shorter timelines than this; if you’re one of them, I would love to talk and exchange ideas (see below).
If models continue to fall short in one or two respects, AI’s increasing array of superhuman strengths – in speed, breadth of knowledge, ability to take 1000 attempts at a problem, and so forth – may be able to compensate. But if progress on multiple indicators is slow and unreliable, that will constitute strong evidence that AGI is not around the corner.
We may see nontechnical barriers to AI adoption: inertia, regulatory friction, and entrenched interests. This would not necessarily be evidence of slow progress toward AGI, so long as these barriers are not posing a significant obstacle to the ongoing development of AI itself. In this scenario, AI adoption in the broader economy might lag until AI capabilities start to become radically superhuman, at which point there would be strong incentives to circumvent the barriers. (Though if inertia specifically is a major barrier to adoption, this might constitute evidence that AI is still not very flexible, which would suggest slow progress toward AGI.)
I am always interested in feedback on my writing, but especially for this post. I would love to refine both the slow and fast scenarios, as well as the scorecard for evaluating progress toward AGI. If you have thoughts, disagreements, questions, or any sort of feedback, please comment on this post or drop me a line at amistrongeryet@substack.com.
Thanks to Andrew Miller, Clara Collier, Hunter Jay, Jaime Sevilla, Julius Simonelli, Nathan Lambert, and Timothy Lee.
In other words, it was not so very long ago that language models couldn’t reliably string together a grammatical sentence, and now we’re preparing to measure whether a single AI has become as powerful as an entire team of research mathematicians. The quote is a reference to this tweet:
I’m excited to announce the development of Tier 4, a new suite of math problems that go beyond the hardest problems in FrontierMath. o3 is remarkable, but there’s still a ways to go before any single AI system nears the collective prowess of the math community.
When reviewing a draft of this post, Julius Simonelli asked an excellent question: how do we know o1 and o3 don’t improve on tasks that don’t have black-and-white answers, when by definition it’s difficult to measure performance on those tasks?
For example, poetry doesn't have black-and-white answers, but I don't see why we should say it's “bad” at poetry.
I’m basing this statement on a few things:
Vibes – lots of people saying that o1 doesn't seem better than 4o at, for instance, writing.
OpenAI explicitly stated that o1 primarily represents progress on math, science, and coding tasks.
I vaguely recall seeing non-math/science/coding benchmarks at which o1 does not beat 4o. But I could be misremembering this.
There are sporadic reports of o1 doing much better than other models on non-math/science/coding tasks. For instance, here’s Dean Ball being impressed by o1-pro’s answer to “nearly a pure humanities question” about Beethoven’s music and progress in piano construction; he also says that “o1-preview performs better than any non-specialized model on advanced and creative legal reasoning”. But you can find anecdotes in favor of almost any possible statement one might make about AI. My best guess is that Dean has identified something real, but that o1’s gains over 4o are mostly limited to black-and-white questions.
For another counterpoint, see this tweet from Miles Brundage.
Large Language Models, the technology underlying ChatGPT and the basis of most claims that AGI is coming soon.
Note that over the course of 2024, released models have been relentlessly shrinking in parameter count (size), squeezing ~equivalent knowledge and improved capabilities into fewer and fewer parameters. Here I am envisioning that there will be a bump in this downward progression – there will be some new models in the mix that use more parameters than that recent trend would suggest, in order to incorporate more knowledge. Even these models may then continue to shrink, if there is room to continue the trend of model compression.
Suppose 50% of my time is spent on tasks that can be handed to an AI, and AI makes me 10x more productive at those tasks. My overall productivity will increase by less than 2x: I’m limited by the other half of the work, the half that AI isn’t helping with. Even if AI makes me 1000x more productive at the first half of the job, my overall productivity still increases by less than 2x.
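For the formula-minded, here is the same arithmetic written out as Amdahl’s Law; the 50% and 10x figures are just the illustrative numbers above.

```python
def overall_speedup(fraction_accelerated, task_speedup):
    """Amdahl's Law: overall speedup when only a fraction of the work
    is accelerated by the given factor."""
    return 1 / ((1 - fraction_accelerated) + fraction_accelerated / task_speedup)

print(overall_speedup(0.5, 10))    # ~1.82x, not 10x
print(overall_speedup(0.5, 1000))  # ~2.00x -- capped below 2x by the unaided half
```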
For example, from someone I know:
One example from yesterday: I wanted to set up a pipeline in Colab to download random files from Common Crawl, and pass them through OpenAI’s API to tag whether they are licensed.
This should be an easy task for someone with encyclopedic knowledge of common crawl and the OA API, yet the models I tried (o1, Gemini) failed miserably.
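For concreteness, here is a minimal sketch of the kind of pipeline described (not the person’s actual code), assuming the public Common Crawl CDX index and the OpenAI chat-completions client; the crawl snapshot, model name, and prompt wording are illustrative placeholders.

```python
import gzip, json, random, requests
from openai import OpenAI

CRAWL = "CC-MAIN-2024-33"   # placeholder crawl snapshot
client = OpenAI()           # assumes OPENAI_API_KEY is set

def sample_record(url_pattern="*.org"):
    """Pick one record from the Common Crawl index and fetch its WARC bytes."""
    idx = requests.get(
        f"https://index.commoncrawl.org/{CRAWL}-index",
        params={"url": url_pattern, "output": "json", "limit": "50"},
    )
    rec = json.loads(random.choice(idx.text.strip().splitlines()))
    start = int(rec["offset"])
    end = start + int(rec["length"]) - 1
    warc = requests.get(
        "https://data.commoncrawl.org/" + rec["filename"],
        headers={"Range": f"bytes={start}-{end}"},
    )
    return gzip.decompress(warc.content).decode("utf-8", errors="replace")

def tag_license(page_text):
    """Ask the model whether the page appears to state an open license."""
    resp = client.chat.completions.create(
        model="gpt-4o-mini",   # placeholder model name
        messages=[{
            "role": "user",
            "content": "Does this page state an open license (CC, MIT, etc.)? "
                       "Answer yes, no, or unclear.\n\n" + page_text[:8000],
        }],
    )
    return resp.choices[0].message.content

print(tag_license(sample_record()))
```

Nothing here is conceptually hard; the observation in the quote is that models with encyclopedic knowledge of both APIs still failed to produce something equivalent.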
A recent tweet from Dan Hendrycks expresses this succinctly.
Both the 4o and o1 variants.
Models that can perform web search can be aware of developments after their cutoff date. But they will not have deeply internalized that knowledge. For instance, if a new training algorithm has been released after the cutoff date, I might expect a model to be able to answer explicit questions about that algorithm (it can download and summarize the paper). But I'd expect it to struggle to write code using the algorithm (it won't have been trained on a large number of examples of such code).
It’s possible that “reasoning” models with strong chain-of-thought capabilities will outgrow this problem. But barring a substantial breakthrough that allows models to learn on the fly (the way people do), I’d expect it to continue to be a handicap.
People have pointed out that advanced math bears little resemblance to the tasks required for survival in prehistoric times, and so there’s no reason to believe that human beings are very good at it on an absolute scale. It’s possible that AI will blow straight past us on many tasks relevant to AI research, just as it has done for multiplying thousand-digit numbers or playing chess. As Jack Morris puts it, “strange how AI may solve the Riemann hypothesis before it can reliably plan me a weekend trip to Boston”.
I can imagine that keeping model sizes down might involve creating multiple versions of the model, each fine-tuned with a lot of domain knowledge in some specific area. The alternative, training a single model with deep knowledge in all domains, might require the model to be large and thus expensive to operate. But perhaps this will turn out to be unnecessary (mumble mumble Mixture of Experts mumble mumble).
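For readers who don’t speak “mumble mumble”: the Mixture-of-Experts idea is that total parameters can be large (many domain “experts”) while per-token compute stays small, because a router activates only a few experts per input. A toy numpy sketch, with illustrative shapes and routing rather than any particular architecture:

```python
import numpy as np

def moe_forward(x, experts, router_w, k=2):
    """Route input x to the top-k experts and mix their outputs."""
    logits = router_w @ x                      # one routing score per expert
    top = np.argsort(logits)[-k:]              # indices of the k highest-scoring experts
    gates = np.exp(logits[top]) / np.exp(logits[top]).sum()  # softmax over selected experts
    return sum(g * (experts[i] @ x) for g, i in zip(gates, top))

rng = np.random.default_rng(0)
d, n_experts = 8, 16
x = rng.normal(size=d)
experts = [rng.normal(size=(d, d)) for _ in range(n_experts)]  # 16 experts' worth of parameters...
router_w = rng.normal(size=(n_experts, d))
print(moe_forward(x, experts, router_w))       # ...but only 2 of them ran for this input
```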
GPT-4 was released on 3/14/23. I believe o3 is rumored to have a release date in January, so 22 months later. OpenAI is understood to have additional unreleased capabilities, such as the “Orion” model, but it is not obvious to me that the level of unreleased capability at OpenAI as of a hypothetical January o3 release is likely to be substantially more than whatever they had in the wings in March 2023. So I’ll say that progress from March 2023 to January 2025 is roughly equal to the delta from GPT-4 to o3.
Here, I mean performance that is, on average, superior to the score you’d get if you assigned each problem to an elite specialist in the technical domain of that specific problem.
The tech industry, and AI labs in particular, will be heavily populated with early adopters. But the ability of AI to move beyond early adopters will still be a good indicator that it is becoming sufficiently flexible and robust to broadly accelerate AI R&D.
It’s possible that we’ll see “breakthroughs” that don’t come from a radical new technique, but simply emerge from threshold effects. That is, we might have incremental progress that crosses some important threshold, resulting in a dramatic change in capabilities. Quite possibly the threshold won’t have been apparent until it was reached.