No, LLMs aren't AGI, but that's not going to stop the robots.
The Large Language Model has already changed the world, but the holy grail of AI seems just out of reach. Here's why the bots are still coming.
"Artificial General Intelligence", or AGI if you’re on acronym terms, is the AI holy grail. The dream: build a true thinking machine that can learn, interpret and reason like a human, but better. LLMs have brought us closer to that reality than ever before, with models seemingly capable of understanding language, solving complex problems, writing code, and even showing signs of personality. Much of what Asimov envisioned in his “Robot” series, with its “three laws safe” robots and their “positronic brains”, feels closer than ever to being a reality.
But here’s the comedown: much of it is smoke and mirrors. Really sophisticated smoke. Really shiny mirrors. LLMs don’t think, they predict. Their impressive party tricks are built on prediction, not conscious comprehension: a kind of super-advanced autocomplete, trained on vast oceans of data. There isn’t a conscious mind inside an LLM. The bigger the models get, the fancier the autocomplete becomes, but AGI still remains out of reach. Don’t just take my word for it; listen to folks like François Chollet, who have been thinking and speaking about the problems LLMs face when it comes to AGI. Link to a recent talk below.
There’s no ghost in the shell, just math. Math that can't yet reason abstractly like a person. But that's going to be just fine. How many tasks will really require abstract reasoning of the kind LLMs struggle with? Fewer than you might expect, I'd wager, and LLMs won't be doing the heavy lifting all on their own.
At the core of the humanoid robot will be (and already is) a powerful multimodal AI agent architecture. Complex-sounding, I know, but the fundamental idea behind agent architecture is very simple. What if one model could talk to another model with text or images, and that model could then reply, with each reply feeding back in as the next input? A circle of logic. A crude kind of thought in a crude digital stream of consciousness.
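To make that circle of logic concrete, here is a minimal sketch in Python. It is a toy under loud assumptions: the two "models" (`planner` and `worker`) are hypothetical stand-in functions that just map text to text, not real LLM or vision calls, and the `DONE` stop signal and `max_turns` cap are conventions invented for this example. The only point is the shape of the loop: one model's output becomes the other's input until somebody says stop.

```python
from typing import Callable

# A "model" here is any function from text to text. In a real system this
# would be a call to an LLM or a vision model; these stand-ins are hypothetical.
Model = Callable[[str], str]


def planner(message: str) -> str:
    """Toy planner: proposes the next step, or declares the job finished."""
    if "cup is on the table" in message:
        return "DONE"
    return "pick up the cup and place it on the table"


def worker(instruction: str) -> str:
    """Toy worker: pretends to carry out the instruction and reports back."""
    return f"Did: {instruction}. The cup is on the table."


def run_loop(goal: str, planner: Model, worker: Model, max_turns: int = 5) -> list[str]:
    """Pass messages back and forth until the planner says DONE or turns run out."""
    transcript = [f"goal: {goal}"]
    message = goal
    for _ in range(max_turns):
        instruction = planner(message)
        transcript.append(f"planner: {instruction}")
        if instruction == "DONE":
            break
        # The worker's report is fed straight back to the planner next turn.
        message = worker(instruction)
        transcript.append(f"worker: {message}")
    return transcript


if __name__ == "__main__":
    for line in run_loop("tidy the kitchen", planner, worker):
        print(line)
```

The turn cap matters more than it looks: two models feeding each other have no natural reason to stop, so any practical version of this loop needs some kind of limit or exit condition.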
This isn’t theoretical. It’s already happening. Agents are handling emails, scheduling meetings, taking phone calls, even video calls, and answering questions intelligently using the data of your digital life. Google Notebook is already happy to sell you this feature today as part of its pro AI subscription.
All that remains is to bring the models and agent systems together inside the package of a humanoid robot. Dozens of companies, big and small, are racing to be the first to solve this problem. Judging by the progress shared on social media, they aren't far off either.
And just to remind the doubters out there: the first bot can be as dumb as the proverbial box of rocks and as slow as the average 80-year-old man. As long as it's minimally competent at its core tasks, can safely navigate a work area shared with humans, and costs less per hour than the equivalent human, that's it! That's the threshold where the world changes!
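For the doubters who like arithmetic, that threshold can be sketched in a few lines of Python. Everything here is an assumption made for illustration: the function names, the idea of spreading the purchase price over a service life, and above all the numbers in the example, which are invented and not real prices or wages. The one idea worth noticing is `relative_speed`: a robot that works at half human pace has to be compared on the cost of a human-hour of work, not on the cost of an hour of the robot's time.

```python
def robot_cost_per_human_hour(
    purchase_price: float,
    service_life_hours: float,
    running_cost_per_hour: float,
    relative_speed: float,
) -> float:
    """Cost of getting one human-hour's worth of work out of the robot.

    relative_speed is the robot's output compared to a human's
    (0.5 means it takes two hours to do what a person does in one).
    """
    if service_life_hours <= 0 or relative_speed <= 0:
        raise ValueError("service life and relative speed must be positive")
    # Spread the purchase price over the working life, then add running costs.
    cost_per_robot_hour = purchase_price / service_life_hours + running_cost_per_hour
    # A slower robot needs more of its own hours to match one human hour.
    return cost_per_robot_hour / relative_speed


def crosses_threshold(robot_cost: float, human_cost_per_hour: float) -> bool:
    """True when the robot is cheaper than the equivalent human."""
    return robot_cost < human_cost_per_hour


if __name__ == "__main__":
    # Made-up illustrative figures, not real market data.
    cost = robot_cost_per_human_hour(
        purchase_price=100_000,
        service_life_hours=20_000,
        running_cost_per_hour=3.0,
        relative_speed=0.5,
    )
    print(cost, crosses_threshold(cost, human_cost_per_hour=20.0))
```

With those invented inputs the half-speed robot still comes in under the human, which is the whole argument: it doesn't need to be fast or clever, only cheaper per unit of work done.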