What’s inside GPT-4’s “brain” anyways?

Dark figure sits in front of a computer screen. Photo credit: Emily Rice

Shortly after ChatGPT ballooned in popularity, Microsoft released a paper: “Sparks of Artificial General Intelligence: Early experiments with GPT-4.”

Artificial Intelligence? Sure. Most people are familiar with day-to-day AI tools like Siri or Google Maps, which use algorithms and extensive data to make predictions from their input. Getting computers to replicate human tasks has been both a goal and a reality since the 1940s. As early as 1966, a computer program named ELIZA could interpret and output human language, not unlike today’s ChatGPT.

This early conversation is taken from Joseph Weizenbaum’s 1966 academic paper about his chatbot:

“Men are all alike,” the human wrote to the program.

“IN WHAT WAY,” ELIZA replied.

“They’re always bugging us about something or other,” the human wrote.

“CAN YOU THINK OF A SPECIFIC EXAMPLE,” ELIZA replied.

“Well, my boyfriend made me come here,” the human wrote.

“YOUR BOYFRIEND MADE YOU COME HERE,” ELIZA retorted.

ELIZA was so convincing that some users couldn’t believe they were talking to a computer. To some, ELIZA may have seemed like artificial general intelligence (AGI), the theoretical concept of a machine that can learn and think like a human, implying self-awareness and consciousness.

The misconception that one is talking to a human when it is actually a computer is a basic test for AGI, and computer scientists achieved a few “sparks” of this in the mid-20th century with ELIZA, among other programs.

In Microsoft’s recent paper, researchers put GPT-4, the most advanced publicly available large language model (LLM), through a wide array of tasks probing the model’s creativity and curiosity. The paper demonstrates the power of a computer that can interpret complex, emotional, open-ended situations.

These are only “sparks” of AGI, though. The possibility of achieving full-fledged AGI remains hotly debated. OpenAI, the company behind ChatGPT and GPT-4, has gone so far as to fund, research and report extensively on safety precautions in case it achieves AGI.

Sam Altman, the CEO of OpenAI, has equated AGI with the “median human,” a term he uses for a computer that could replace a co-worker. Since GPT-4 emerged and ChatGPT grew in popularity, OpenAI’s rhetoric around AGI has suggested that it is feasible.

Dr. Ubbo Visser, the graduate director of the University of Miami’s computer science department and a member of the AI field since 1986, remains dubious about current prospects of AGI.

“I don’t think right now we can get there because we would have to have social intelligence, to have emotional intelligence. You have to have a deep understanding of how humans work and how they feel and how is this possible? How would you represent that? Right now, we’re just putting a bunch of words in there and telling them to learn the connections between words,” Visser said.

Current LLMs work by predicting the next token, such as a word, based on patterns learned from an extremely large dataset. The model stores what it has learned in “weights,” numerical parameters that determine how much influence each piece of input has on a prediction. When a prediction is wrong, a process called “backpropagation” adjusts those weights so the same mistake becomes less likely in the future. Over the past few decades, these fundamental techniques, the same ones that allow GPT-4 to perform like a human, have remained largely unchanged.

“That has not changed, really. What has changed is the sheer dimensions I mentioned. Instead of 1000s of weights, now you have trillions of weights,” Visser said.
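To make those ideas concrete, here is a minimal, hand-rolled sketch in Python. It is not OpenAI’s code or architecture; the toy corpus, the single layer of weights and every variable name are invented for illustration. It trains a tiny next-word predictor by nudging its weights whenever a prediction misses, the same error-correction loop that, scaled up to trillions of weights, sits underneath models like GPT-4.

```python
# Toy illustration (not OpenAI's code): a model predicts the next token,
# "weights" store what it has learned, and errors push the weights to improve.
import numpy as np

corpus = "the cat sat on the mat the cat ate the food".split()
vocab = sorted(set(corpus))
idx = {w: i for i, w in enumerate(vocab)}
V = len(vocab)

rng = np.random.default_rng(0)
W = rng.normal(scale=0.1, size=(V, V))    # one weight per (previous word, next word) pair

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

lr = 0.5
for epoch in range(200):
    for prev, nxt in zip(corpus[:-1], corpus[1:]):
        probs = softmax(W[idx[prev]])     # predicted distribution over the next token
        target = np.zeros(V)
        target[idx[nxt]] = 1.0
        error = probs - target            # how wrong the prediction was
        W[idx[prev]] -= lr * error        # adjust weights to make the mistake less likely

probs = softmax(W[idx["the"]])
print(vocab[int(probs.argmax())])         # prints "cat", the word that most often follows "the"
```

Run on this toy corpus, the model simply learns that “cat” is the likeliest word to follow “the.” GPT-4 does nothing conceptually different; it just does it with vastly more data, many stacked layers and, as Visser notes, trillions of weights.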

The scaling is incredible. As AI models have improved over the past decade, the advances have come from more computational power and a growing ability to process new kinds of input, especially visual input. Rather than trying to copy what the brain does outright, AI models rely on computer-specific methods that take inspiration from how the brain works.

For example, in visual processing, researchers studied how the brain receives an image in order to understand how a machine could also “see.”

“The brain processes inputs hierarchically, so when we see a cat we process first the orientation and edges of the image and then textures and parts of shapes and objects until we can identify the cat,” said Odelia Schwartz, an associate professor of computer science whose research is at the intersection of the brain and AI.

Computer scientists applied this principle to AI models to help them make sense of images. Still, experimenting with these models reveals shortcomings indicative of their non-human nature. In one study, models with this visual processing ability were shown an image of a cat rendered with the skin texture of an elephant. While humans identified the animal as a cat, the machine identified an elephant, revealing a bias toward texture over shape.
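As a rough illustration of the hierarchical processing Schwartz describes, here is a short numpy sketch. The toy image, the hand-written edge filters and the function names are all invented for this example and are not taken from any study mentioned in the article: an early “layer” responds to oriented edges, and a later step combines those responses into a rough outline, the kind of intermediate feature that deeper layers would assemble into textures, parts and whole objects.

```python
# Illustrative sketch only: edges first, then a combined outline.
import numpy as np

def apply_filter(image, kernel):
    """Slide a 3x3 filter over the image (no padding) and sum the products."""
    h, w = image.shape
    out = np.zeros((h - 2, w - 2))
    for i in range(h - 2):
        for j in range(w - 2):
            out[i, j] = np.sum(image[i:i + 3, j:j + 3] * kernel)
    return out

# Toy 8x8 "image": a bright square on a dark background.
image = np.zeros((8, 8))
image[2:6, 2:6] = 1.0

# First "layer": oriented edge detectors, loosely analogous to early visual processing.
vertical_kernel = np.array([[-1., 0., 1.],
                            [-2., 0., 2.],
                            [-1., 0., 1.]])
horizontal_kernel = vertical_kernel.T

edges_v = np.abs(apply_filter(image, vertical_kernel))
edges_h = np.abs(apply_filter(image, horizontal_kernel))

# Second "layer": combine the edge responses into a rough outline of the square,
# the kind of intermediate feature deeper layers build shapes and objects from.
outline = np.maximum(edges_v, edges_h)
print(np.round(outline, 1))
```

A texture-versus-shape confusion like the cat-and-elephant result can arise when a model’s later stages lean more heavily on texture-like features than on an outline like the one above, whereas humans tend to go by shape.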

Ultimately, AI is still only complicated algorithms making sense of a massive conglomeration of data. Despite acting remarkably human at times, the machines lack the precision humans take for granted.

“Not only that you’re being able to speak in a certain language and choose your words wisely. You are also being able to mix this with emotional intelligence,” Visser said. A human can evaluate, “Is it a safe environment? Is your discussion partner friendly or out to harm? All this information we have as humans. We feel it. We don’t think about it.”

But what if these AI models, chatbots for example, could feel as humans do through embodiment in a robot? Visser works in both fields, with an extensive history in robotics. Part of his work at UM focuses on designing service robots that use LLMs for communication and other AI techniques for additional human tasks, such as object identification.

“Robotics have an important piece which is the interaction and embodiment with the physical world. So they interact with us, they can interact with other objects and then sense, like we do, in real time,” Schwartz said.

Between robotics and deep learning, the field that allows AI to extract patterns from vast, detailed data, computers are getting better at processing the inputs that humans understand passively.

Combined with quantum computing, a field that could theoretically make computers exponentially faster at processing very large datasets, these advances suggest there is potential for AGI within the next several decades. The field changes so rapidly that speculation is difficult, but perhaps in a world where computer hardware can process a tremendous range and volume of inputs, AGI could exist.

This is where Visser raises perhaps the most important objection: the data. GPT-4 is trained on content from the millions of users who have contributed to websites such as Stack Overflow, Reddit and Wikipedia.

“Who can guarantee that that information that is out there is actually true?” Visser said.

The falsehoods and inaccuracies within these datasets contribute to GPT-4’s errors, which were still present in Microsoft’s experiments.

Not only do computers need to accurately process incredible amounts of data, but the data itself needs to be very strong. The subtleties and details that a person learns over 30 years of lived experience are summarized bluntly on the Internet. Even OpenAI, with its undoubtedly massive datasets, architectural strengths and thorough training systems, still lacks a human’s extreme attention to detail. The question will remain: can humans create a dataset that holds the same information as the human brain?