UCLA Tests Whether ChatGPT Is Smarter Than College Students

Researchers at the University of California, Los Angeles fed GPT-3 a battery of tests, and it solved about 80 percent of the problems correctly, well above the average score of just below 60 percent posted by the 40 undergrads who participated.

(TNS) — Is GPT-3 smarter than a college kid?

According to a new study by researchers at UCLA, the answer is yes. At least when it comes to a few common intelligence tests, and even some SAT-style questions.

"It's fair to say that we were extremely surprised by the results that we got," said Keith Holyoak, a psychology professor who was one of the authors of the study, along with UCLA Professor Hongjing Lu and postgraduate student Taylor Webb.

Holyoak said the team asked OpenAI's GPT-3 to solve a battery of tests specifically designed so that the program would not have "seen" them before. Chatbots like GPT-3 are fed huge amounts of information from across the Internet, which they use to formulate answers to questions, and Holyoak said the researchers wanted to make sure the bot couldn't "cheat" by finding the answers to the problems already stored in its vast set of training data.

So Webb took a visual pattern problem from Raven's Progressive Matrices, a widely used psychological intelligence test, and broke it down into numbers so the bot could process it. Another test required the program to figure out which letter would come next in a sequence, while another had it solve verbal analogies like "'Love' is to 'hate' as 'rich' is to which word?"
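The article doesn't show what those number-based prompts looked like. As a rough, purely hypothetical sketch of the general idea (the format and puzzle below are invented for illustration, not the UCLA team's actual materials), a visual matrix problem might be rendered as plain text like this:

```python
# Hypothetical sketch: turning a matrix-reasoning puzzle into plain text
# so a text-only model can read it. Illustrative only; this is not the
# researchers' actual prompt format.

def format_matrix_problem(rows):
    """Render a 3x3 grid of numbers with the final cell left blank."""
    lines = []
    for i, row in enumerate(rows):
        cells = [str(c) for c in row]
        if i == len(rows) - 1:
            cells[-1] = "?"  # the cell the model is asked to fill in
        lines.append("  ".join(cells))
    return "\n".join(lines)

# Each row increases by one, so the missing cell should be 9.
problem = format_matrix_problem([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
prompt = "Complete the pattern:\n" + problem + "\nAnswer:"
print(prompt)
```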

To the researchers' surprise, GPT-3 solved about 80 percent of the problems correctly, well above the average score of just below 60 percent posted by the 40 undergrads in the study.

So what does that mean? Is the program "thinking" like a human does?

"Is it the case that somehow they've developed a human-like mental representation in their neural networks?" Holyoak said. "That's possible."

But the broader implications are somewhat open-ended, especially since OpenAI keeps its underlying code a closely guarded secret, unlike other AI models such as Meta's recently released Llama 2.

The authors wrote that "we must also entertain the possibility that this type of machine intelligence is fundamentally different from the human variety," given its sheer computational power, which allows it to solve "complex problems in a holistic and massively parallel manner, without the need to segment them into more manageable components" the way a human brain does.

OpenAI did not immediately respond to an emailed request for comment.

The company's chatbots have already been shown to handle the law school entrance exam with ease, and GPT-3, the version used by the UCLA researchers, is not even the latest model the company has released.

Another report, released by researchers in May, found that even the most advanced GPT-4 model mostly failed when presented with puzzle and pattern-prediction problems. Part of that struggle points to how the programs are tested, which is what led Holyoak and his team to break down visual tests into numbers the program could more easily digest.

But before humans get too hysterical about the coming of robot overlords, Holyoak sounded a note of calm: there was one test the bot pretty much flunked.

When the program was asked to read short stories and identify the ones that conveyed the same message in a different way, it was outperformed by the students.

"No matter how impressive our results, it's important to emphasize that this system has major limitations," Webb said in a statement.

There were other obvious limitations as well, according to Webb.

"It can do analogical reasoning, but it can't do things that are very easy for people, such as using tools to solve a physical task. When we gave it those sorts of problems — some of which children can solve quickly — the things it suggested were nonsensical."

©2023 the San Francisco Chronicle. Distributed by Tribune Content Agency, LLC.