Are Large Language Models (LLMs) the Holy Grail of AI? Almost, but not quite yet!

27.11.2023

By Pasi Karhu, CTO at Ai4Value Oy

In the fall of 2021, OpenAI opened its GPT-3 large language model API to all developers. Until then, it had been developed and tested behind closed doors, with only selected outside testers. Once it was released, it soon became clear to everyone following the AI field that it was something revolutionary. The model’s general knowledge was vast, and it could be programmed to chat with people as if it were a person itself. When ChatGPT was released on November 30, 2022, for everybody to use, the whole world came to realize that something special had happened.

What makes LLMs so different from any AI system before them? It is their mastery of generic world knowledge! That was the unobtainable Holy Grail of AI for over half a century, ever since the idea of “Artificial Intelligence” emerged and the term was coined in the 1950s. Every AI researcher knew that a generic AI system needed very broad knowledge of how the world works and how people behave, but no one knew how to achieve that. Over the past decades, there were many futile attempts with symbolic, logic-based approaches, and no one seriously believed that neural networks could solve the problem. Some very specific “coincidences” needed to happen before the solution popped up almost out of nowhere.

Who would have thought that social media and game playing played (no pun intended) crucial roles in the realization of the world knowledge in AI? The former contributed to the vast knowledge bases that are today freely available on the Internet’s many platforms, where people discuss every area of human life and record facts of the world. The latter has demanded ever-increasing computing power for ever more realistic computer games, which could only be achieved by faster and faster graphics computing, which in turn utilizes much of the same matrix mathematics as neural networks in machine learning. Huge knowledge bases and increased computing power, together with some small but fundamental advances in neural network algorithms, enabled the breakthrough technology that we now call Large Language Models or LLMs.

You can think of an LLM as a compressed database of everything it has been fed during its training, with a compression ratio of something on the order of 100x. As with image or video compression, some detail is lost in the compression, but in many cases that does not matter. Conversely, the unexpected benefit of this compression is the generalization of common sentence structures and longer ideation patterns in human text. This gives the models very human-like traits, so much so that we are tempted to think that the models actually think the way we humans think. They will readily spit out memorized patterns of thinking, generalized from similar training data examples, much as we humans instantly spit out the memorized answer to “how much is 2 x 7.”

Thinking, however, is more than memorization. When we come across a previously unknown challenge during a task, we stop to analyze it and form strategies to overcome it. LLMs do not do that; they are non-stop, one-pass machines that just predict the next words from the previous words, without the capability of contemplation. On reasoning tasks for which there is no commonly known, memorized solution, they simply start hallucinating solutions. A human-like reasoning loop can be emulated to some extent by running the problem through the LLM many times: splitting it into smaller chunks, prompting the model to rethink its answer, looking at the problem from different angles with agent roles, and so on.
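To make the idea of an emulated reasoning loop a little more concrete, here is a minimal Python sketch. It is only an illustration under stated assumptions: `llm` stands for any function that takes a prompt and returns text (no real API is used), and the prompt wordings, the `roles` list, and the name `solve_with_loop` are all hypothetical. The point is that the "loop" lives outside the model, which itself is still called one pass at a time.

```python
from typing import Callable, List


def solve_with_loop(
    problem: str,
    llm: Callable[[str], str],  # hypothetical: any prompt -> text function
    roles: List[str],           # hypothetical agent roles, e.g. ["critic"]
    max_rounds: int = 3,
) -> str:
    """Emulate a reasoning loop by calling a one-pass model several times."""
    # Step 1: ask the model to split the problem into smaller chunks.
    plan = llm(f"Split this problem into numbered sub-tasks:\n{problem}")
    subtasks = [line.strip() for line in plan.splitlines() if line.strip()]

    # Step 2: solve each chunk in its own pass, feeding earlier results forward.
    notes: List[str] = []
    for task in subtasks:
        context = "\n".join(notes)
        notes.append(llm(f"Known so far:\n{context}\nSolve this sub-task:\n{task}"))

    # Step 3: draft an answer, then let each role critique it and revise.
    answer = llm("Combine these partial results into one answer:\n" + "\n".join(notes))
    for _ in range(max_rounds):
        critiques = [
            llm(f"As a {role}, list flaws in this answer, or reply OK:\n{answer}")
            for role in roles
        ]
        if all(c.strip() == "OK" for c in critiques):
            break  # every role is satisfied, stop re-thinking
        answer = llm(
            "Revise the answer to address these critiques:\n"
            + "\n".join(critiques)
            + f"\nAnswer:\n{answer}"
        )
    return answer
```

Every step here is still plain next-word prediction; the surrounding program only decides what to ask next and when to stop, which is why this emulates contemplation rather than providing it.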

Current LLMs, together with the add-on technologies accumulated over the past few years, do help to automate many common real-world private and business tasks. But the hardest real-world challenges do not yield even to the most sophisticated current LLM utilization techniques. The awaited (and maybe also feared) AGI is still not available. Some fundamentally new technique is still needed to achieve that.

There has been a lot of buzz around OpenAI’s Q* (Q-star) project in recent days (end of November 2023). It is rumored to have been the ultimate reason behind the drama around the firing of OpenAI CEO Sam Altman. The wildest suggestions even say that with Q*, OpenAI has already taken the final step to AGI, Artificial General Intelligence. Speculators are guessing that Q* might be a combination of the well-known A* pathfinding algorithm and Q-learning, both of which are suited to complex problem solving. If OpenAI has indeed succeeded in creating a seamless combination of these and current LLM technologies, then the so-far missing contemplation part of thinking may have been solved, and the Holy Grail of AI finally achieved.
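For readers unfamiliar with the two algorithms named in the speculation, here is a minimal Python sketch of each in its textbook form. This is not Q*: nothing public describes how OpenAI's project works, and the sketch makes no attempt to combine the two with an LLM. The grid world, the function names `a_star` and `q_update`, and the default values of `alpha` and `gamma` are assumptions chosen purely for illustration.

```python
import heapq


def a_star(grid, start, goal):
    """Length of the shortest 4-connected path on a grid of 0 (free) / 1 (wall).

    Returns None if the goal cannot be reached.
    """
    def h(cell):
        # Manhattan distance never overestimates on a unit-cost grid.
        return abs(cell[0] - goal[0]) + abs(cell[1] - goal[1])

    frontier = [(h(start), 0, start)]  # (estimated total, cost so far, cell)
    best = {start: 0}
    while frontier:
        _, cost, cell = heapq.heappop(frontier)
        if cell == goal:
            return cost
        if cost > best.get(cell, float("inf")):
            continue  # stale queue entry, a cheaper route was already found
        r, c = cell
        for nr, nc in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            inside = 0 <= nr < len(grid) and 0 <= nc < len(grid[0])
            if inside and grid[nr][nc] == 0:
                new_cost = cost + 1
                if new_cost < best.get((nr, nc), float("inf")):
                    best[(nr, nc)] = new_cost
                    heapq.heappush(
                        frontier, (new_cost + h((nr, nc)), new_cost, (nr, nc))
                    )
    return None


def q_update(q, state, action, reward, next_state, actions, alpha=0.5, gamma=0.9):
    """One tabular Q-learning step.

    Q(s, a) += alpha * (reward + gamma * max over a' of Q(s', a') - Q(s, a))
    """
    best_next = max(q.get((next_state, a), 0.0) for a in actions)
    old = q.get((state, action), 0.0)
    q[(state, action)] = old + alpha * (reward + gamma * best_next - old)
    return q[(state, action)]
```

A* plans ahead by searching through possible future states with a heuristic as its guide, while Q-learning gradually learns from experience how valuable each action is in each state. Both involve a kind of looking ahead that a one-pass word predictor lacks, which may be why the speculation finds the combination plausible.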