Stanford scientists find GPT-4 gets dumber over time
Miscellaneous / July 20, 2023
The accuracy of the paid version of ChatGPT turned out to be lower than that of the free version.
A new study from scientists at Stanford University confirms what users have been complaining about for weeks: ChatGPT Plus, based on GPT-4, really has become dumber, unlike GPT-3.5, which powers the free version of the chatbot.
In the study, the authors compared how chatbots based on GPT-4 and GPT-3.5 responded to the same set of prompts. They found that in the newer version the chatbot's behavior had changed and the accuracy of its answers to some prompts had dropped significantly.
The authors compared the versions of the GPT-4 and GPT-3.5 language models released in March and June. They found that over that period GPT-4's accuracy dropped noticeably, while GPT-3.5's, on the contrary, improved.
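A minimal sketch of this kind of comparison is shown below. It is not the authors' actual test harness: it assumes the legacy openai Python library (pre-1.0) and uses the snapshot names gpt-4-0314 and gpt-4-0613, which correspond to the March and June versions; the prompt follows one of the study's example questions.

```python
# Sketch (not the study's harness): send the same prompt to two GPT-4
# snapshots and compare the answers. Assumes the legacy openai library
# (pre-1.0) and an OPENAI_API_KEY environment variable.
import os
import openai

openai.api_key = os.environ["OPENAI_API_KEY"]

PROMPT = "Is 17077 a prime number? Think step by step and then answer [Yes] or [No]."

def ask(model: str, prompt: str) -> str:
    """Return the model's reply to a single user message."""
    response = openai.ChatCompletion.create(
        model=model,
        messages=[{"role": "user", "content": prompt}],
        temperature=0,  # minimize sampling noise so the comparison is fair
    )
    return response["choices"][0]["message"]["content"]

# gpt-4-0314 and gpt-4-0613 are the March and June 2023 snapshots.
for model in ("gpt-4-0314", "gpt-4-0613"):
    print(f"--- {model} ---")
    print(ask(model, PROMPT))
```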
On that prime-number question, for example, GPT-4's accuracy fell by 95.2 percentage points, while GPT-3.5's, on the contrary, rose from 7.4% to 86.8%. The chance that code written by the neural network runs successfully also decreased in the current versions of both models.
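For the record, 17077 is indeed prime, so there is an unambiguous correct answer to score against. A few lines of trial division confirm it; this standalone check is an illustration, not part of the study:

```python
# Trial-division primality check: confirms the correct answer
# to the benchmark question (17077 is prime).
def is_prime(n: int) -> bool:
    if n < 2:
        return False
    if n % 2 == 0:
        return n == 2
    d = 3
    while d * d <= n:  # only need to test divisors up to sqrt(n)
        if n % d == 0:
            return False
        d += 2
    return True

print(is_prime(17077))  # True
```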
OpenAI's Vice President of Product, Peter Welinder, had previously responded to these accusations from ChatGPT Plus users:
No, we haven't made GPT-4 dumber. Quite the opposite: we make each new version smarter than the previous one.
Our current hypothesis: when you use it more heavily, you start noticing issues you didn't see before.
Peter Welinder
VP of Product at OpenAI
In one of the replies to that tweet, Welinder asked users to provide evidence that the chatbot had gotten worse. The Stanford study appeared five days later, and OpenAI has yet to respond.
This is not the first time GPT-4 has been accused of giving false information. In March, NewsGuard analysts discovered that ChatGPT based on GPT-4 is easier to get to tell a lie, even though OpenAI itself claims a 40% increase in answer accuracy compared to GPT-3.5. In particular, in NewsGuard's tests the new version of the neural network was less likely to refute false information, including false claims about current events and conspiracy theories.