Is GPT-5 Actually Dumber? Analyzing the Claims of Underperformance
Zack Saadioui
8/10/2025
There's a HUGE buzz going around, & honestly, it's not all positive. The release of GPT-5, the latest & greatest from OpenAI, was met with a ton of anticipation. But now that it's in the hands of users, a vocal portion of the community is saying it feels… well, dumber.
It’s a strange thing to hear about a brand-new, supposedly more advanced AI model. You'd expect it to be a leap forward, not a step back. So what's REALLY going on? Is GPT-5 a flop, or is there a more complicated story here? Let's dig in & unpack the claims.
The Chatter: What Are People Actually Saying?
If you've spent any time on Reddit, Twitter, or the OpenAI forums lately, you've probably seen the complaints. They're coming from all corners, but a few key themes keep popping up.
The Creative Spark is Gone
One of the biggest gripes is coming from the creative crowd – writers, role-players, & anyone who used ChatGPT for storytelling. The consensus seems to be that GPT-5 is just not as creative or nuanced as its predecessor, GPT-4o. Users are reporting that the new model gives shorter, more direct responses & struggles with the finer points of a story.
For example, someone trying to write a novel might find that GPT-5 forgets key character details mentioned just a few prompts ago. Another user mentioned that while GPT-4 would actively contribute to the plot with new ideas, GPT-5 tends to just rephrase what the user has already written in a more "poetic" way, without adding much substance. This is a MAJOR letdown for people who relied on the AI as a brainstorming partner.
The "Buddy" is Now a "Robot"
Another common complaint is that the personality of the AI has changed. Where GPT-4o was often described as conversational & even "friendly," GPT-5 is being called "curt" & "cold." The vibe seems to have shifted from a patient, sometimes even sycophantic, assistant to a much more to-the-point tool.
This has been a jarring change for casual users who enjoyed the more human-like interactions. For them, ChatGPT wasn't just a productivity tool; it was a digital companion. The new, more efficient model feels like a downgrade in that department.
The "Dumb" Moments
Then there are the straight-up claims of underperformance. People are sharing examples of the model giving weird or incorrect answers to seemingly simple questions. Some have even said that other models on the market, like Anthropic's Claude Sonnet 4, are now outperforming GPT-5 on certain tasks.
This has led to a lot of frustration, especially from paying customers who feel like they're getting a worse product. When you're used to a certain level of quality, any perceived dip is going to be met with backlash.
The Plot Twist: The "Model Router" Fiasco
So, is OpenAI just bad at making AI now? Not so fast. It turns out there's a pretty significant technical explanation for a lot of this. In an "Ask Me Anything" session on Reddit, OpenAI's CEO, Sam Altman, admitted that they had an issue with the "auto switcher" on the day of the release.
What does that mean? Well, GPT-5 isn't just one single model. It's a system that's supposed to intelligently switch between different versions of the AI depending on the task. If you're doing something simple, it might use a faster, less powerful model. If you're doing something complex, it should switch to the more "intelligent" version.
The problem was, for a big chunk of the initial release period, the switcher wasn't working correctly. This meant that a lot of users were interacting with the less powerful version of GPT-5 without even knowing it. No wonder it felt dumber!
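OpenAI hasn't published how the router actually works, but conceptually it's something like the rough Python sketch below. To be clear, the complexity heuristic, tier names, & failure mode here are my assumptions for illustration, not OpenAI's real implementation:

```python
# A toy illustration of the "model router" idea. The complexity heuristic,
# model tier names, & failure mode are assumptions for illustration --
# OpenAI hasn't published how its real switcher works.

def estimate_complexity(prompt: str) -> float:
    """Stand-in for a real classifier that scores how demanding a prompt is."""
    hard_signals = ["prove", "debug", "step by step", "chapter", "plot"]
    hits = sum(signal in prompt.lower() for signal in hard_signals)
    return hits / len(hard_signals)

def route(prompt: str, switcher_healthy: bool = True) -> str:
    """Pick a model tier for the prompt."""
    if not switcher_healthy:
        # Roughly the reported launch-day failure: routing breaks &
        # everything falls through to the cheap, fast tier.
        return "fast-tier-model"
    if estimate_complexity(prompt) >= 0.4:
        return "reasoning-tier-model"  # slower, more capable
    return "fast-tier-model"           # faster, cheaper

print(route("Walk me through chapter 3's plot twist step by step"))
# -> reasoning-tier-model
print(route("Walk me through chapter 3's plot twist step by step",
            switcher_healthy=False))
# -> fast-tier-model (the user gets the weak tier & never knows)
```

The second call is the whole story in miniature: the prompt deserves the capable tier, but a broken switcher silently serves the fast one, & the user just experiences a "dumber" model.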
OpenAI has said they're working on fixing this & making it more transparent which model is being used at any given time. This is a crucial piece of the puzzle, & it explains a lot of the initial negative reactions. It's a classic case of a messy rollout, not necessarily a bad product.
The Business Angle: Is This All About Saving Money?
Of course, there's always a business angle to consider. Some users have speculated that the changes with GPT-5 are a cost-saving measure for OpenAI. Running these massive AI models is incredibly expensive, & it makes sense that the company would be looking for ways to optimize.
The "model router" system is a prime example of this. By using less powerful models for simpler tasks, OpenAI can save a TON of money on computing resources. The problem, as we saw, is when the system doesn't work as intended.
It’s also possible that the push for faster, more efficient responses is a business decision. For professional users who are using the AI to get work done quickly, a curt & to-the-point answer might actually be preferable. The challenge for OpenAI is balancing the needs of these professional users with the desires of the more casual audience.
This is where a tool like Arsturn can be super helpful for businesses. Instead of relying on a one-size-fits-all model, businesses can use Arsturn to build their own no-code AI chatbots trained on their own data. This means they can create a chatbot with the exact personality & knowledge base they need for their specific audience. It's a great way to provide personalized customer experiences & boost conversions, without being at the mercy of the latest general-purpose model's quirks. For example, if your customers are used to a more conversational style, you can build a chatbot that reflects that, ensuring a consistent & positive experience.
The Psychology of AI: The "Yes Man" Effect
There's another interesting layer to this whole debate, & it's about our expectations of AI. Sam Altman has talked about how some users want ChatGPT to be a "yes man" – an AI that's always agreeable & supportive. This is especially true for people who might not have a strong support system in their own lives.
It's possible that with GPT-5, OpenAI has tried to dial back the "sycophantic" tendencies of the AI. They might be trying to create a more neutral & objective tool. But for users who were accustomed to the more agreeable personality of previous models, this can feel like a negative change.
This highlights a real challenge for AI developers: how do you create an AI that is both helpful & unbiased, without alienating users who have grown attached to a certain personality? It's a tricky balancing act, & it seems like OpenAI is still figuring it out.
How Do We Actually Measure "Intelligence"?
This whole situation also brings up a bigger question: how do we even decide if one AI is "smarter" than another? A lot of the current debate is based on anecdotal evidence & personal feelings. But as some experts have pointed out, that's not a very scientific way to go about it.
A model might be worse at creative writing but better at logical reasoning. Or it might be faster but less detailed. To really compare these models, you need a structured approach. This could involve running a series of tests & benchmarks that cover a wide range of tasks.
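A minimal version of that kind of head-to-head evaluation might look like the sketch below. The ask() helper, task list, & graders are placeholders (not any particular provider's API) that you'd swap for your own client & test cases:

```python
# A minimal head-to-head evaluation harness. ask() is a placeholder for
# whatever API client you actually use; the tasks & graders are toy examples.

def ask(model: str, prompt: str) -> str:
    """Placeholder: wire this up to your provider's client."""
    raise NotImplementedError

# Each task pairs a prompt with a grader, so every model is scored
# by exactly the same rule.
TASKS = [
    ("arithmetic", "What is 17 * 23? Answer with just the number.",
     lambda out: "391" in out),
    ("instruction-following", "Reply with the single word 'ready'.",
     lambda out: out.strip().lower() == "ready"),
]

def evaluate(models: list[str]) -> dict[str, float]:
    """Run every model on every task & return each model's pass rate."""
    scores = {}
    for model in models:
        passed = sum(grader(ask(model, prompt))
                     for _name, prompt, grader in TASKS)
        scores[model] = passed / len(TASKS)
    return scores

# Usage (once ask() is implemented):
# print(evaluate(["gpt-5", "gpt-4o"]))  # e.g. {'gpt-5': 1.0, 'gpt-4o': 0.5}
```

The point isn't the toy tasks; it's that every model sees the same prompts & gets graded by the same functions, so the comparison is apples-to-apples instead of vibes-based.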
One YouTuber even ran an experiment giving the same logic puzzle to both GPT-4o & GPT-5. GPT-4o got it wrong on the first try, but when the user pointed out the mistake (using the correct reasoning from GPT-5), GPT-4o corrected itself. A one-shot, surface-level comparison would have labeled GPT-4o the "dumber" model, when the real gap was much narrower – which is exactly why these anecdotal comparisons can be misleading.
For businesses looking to leverage AI, this is a REALLY important lesson. You can't just jump on the latest model & assume it's the best for your needs. You need to do your own testing & evaluation.
This is another area where a platform like Arsturn shines. It allows businesses to create custom AI chatbots that are specifically trained on their data & for their use cases. This means you're not just getting a general-purpose AI; you're getting a tool that's perfectly tailored to your business goals. Whether it's providing instant customer support, answering specific questions about your products, or engaging with website visitors 24/7, a custom chatbot is going to be FAR more effective than a generic one.
So, Is GPT-5 Actually Dumber?
Here's the thing: it's complicated, & there's no simple yes-or-no answer.
On one hand, the claims of underperformance are real, at least in the sense that many users are having a negative experience. The issues with creative tasks & the "colder" personality are valid concerns.
On the other hand, the "model router" fiasco provides a pretty compelling explanation for a lot of the initial problems. It's very likely that many users were not even interacting with the "real" GPT-5.
Ultimately, it seems like GPT-5 is not so much "dumber" as it is different. It's a more specialized tool, with a focus on efficiency & cost-effectiveness. This is a great thing for some users, but a major letdown for others.
The whole episode is a fascinating look into the challenges of developing & deploying AI at scale. It's a reminder that these models are not just lines of code; they're complex systems that interact with human psychology in often surprising ways.
I hope this was helpful in breaking down the situation. It's a rapidly evolving story, so it'll be interesting to see how OpenAI responds & what the future holds for GPT-5. Let me know what you think in the comments.