GPT-5 vs. Claude Sonnet 4: Why the 'Older' Model is Still My Go-To
Zack Saadioui
8/12/2025
Here’s the thing about AI models: the newest, biggest one isn’t always the best one for the job.
We’ve all seen the headlines about GPT-5. It’s here, it’s powerful, & the benchmarks are, frankly, pretty nuts. On paper, it looks like it should blow everything else out of the water. But I've been spending a LOT of time with these models, both for development & for content, & I gotta tell you, I keep coming back to Claude Sonnet 4.
It feels a little weird to say, right? Like arguing a souped-up classic car is better than a brand-new supercar. But the more I use them, the more I’m convinced. Even with all of GPT-5’s updates & power, Sonnet 4 has this staying power that’s hard to ignore. It’s not about being a contrarian; it’s about what actually works best day to day.
Let's get into why the old champ might still have the edge.
The Benchmark Battle: Let's Give GPT-5 Its Due
Okay, let's be real. If you just look at the raw numbers, GPT-5 is a beast. OpenAI has clearly pushed the boundaries of what these models can do in terms of pure, unadulterated intelligence.
You look at the leaderboards for stuff like complex reasoning & knowledge, & GPT-5 is sitting right at the top. On the MMLU-Pro benchmark, which tests massive multitask language understanding, GPT-5 scores around 87% to Sonnet 4's 84%. It’s a similar story with GPQA Diamond, a benchmark for PhD-level scientific reasoning, where GPT-5 hits 85% while Sonnet 4 is at 78%.
And in math? Forget about it. GPT-5 is scoring near-perfect on some advanced math competition benchmarks, something we just haven't seen before. It even has a monstrous 400k token context window, double Sonnet 4's 200k, which sounds like a game-changer for anyone dealing with huge documents or codebases.
So, if you’re trying to solve a theoretical physics problem or find a bug in a 300,000-line code repository, GPT-5 probably has the raw horsepower to get it done. No one’s debating that. But here’s the kicker: most of what we do, day in & day out, isn't a theoretical physics problem.
And that’s where the story starts to change.
Beyond the Benchmarks: The Real-World Case for Sonnet 4
The hype around a new model is always about what it can do. The reality of using it is about how it actually does it. This is where Sonnet 4 starts to pull ahead, not because it’s necessarily "smarter" on every test, but because it’s often more practical, efficient, & just… better to work with.
The "Feel" Factor: It’s About More Than Just the Right Answer
One of the first things you notice when you put these two head-to-head on a real project is the difference in their personality. It sounds weird to talk about an AI having a "personality," but it’s true.
GPT-5 is… verbose. It’s like that brilliant coworker who is technically always right but over-engineers every single solution. On Reddit, you’ll find tons of developers talking about this. You ask it to create a simple UI component, & it might come back with a super complex, heavily abstracted system that, while technically impressive, is a nightmare to maintain. Sonnet 4, on the other hand, is more reserved. It gives you clean, straightforward code that does the job. It feels less like it’s trying to show off & more like it’s trying to be a helpful partner.
This isn’t just about coding. When writing, GPT-5 can sometimes have a slightly more generic, AI-ish tone, whereas Sonnet often feels more natural & less like it needs heavy editing to sound human. It's a subtle difference, but when you're generating dozens of pieces of content, that subtlety saves a TON of time.
Speed & Efficiency: When "Thinking" Time Becomes a Bottleneck
This is probably the single BIGGEST advantage for Sonnet 4. GPT-5 has this feature called "thinking," where for complex queries, it will literally pause for seconds, or even minutes, to process the request. In theory, this leads to better, more reasoned answers.
In practice, it can be a workflow killer.
Imagine you're in a creative flow, bouncing ideas off the AI. You send a prompt, & then… you wait. And wait. Some users have reported waiting 15-20 minutes for a response from GPT-5, only for the output to be unusable. That completely breaks your concentration & makes the whole process frustrating.
Sonnet 4 is just faster. It’s more nimble. For the vast majority of tasks—writing an email, refactoring a function, summarizing a document, brainstorming ideas—the speed difference is night & day. You get a high-quality answer almost instantly, which lets you stay in the zone & iterate quickly. That rapid back-and-forth is often more valuable than a single, "perfect" answer that took ten minutes to generate.
This speed & reliability are CRUCIAL, especially for businesses. Think about customer interactions. If you're a business using AI to power your customer service, you can't have customers waiting minutes for a response. That’s where tools built for this purpose really shine. For instance, a platform like Arsturn helps businesses create custom AI chatbots trained on their own data. The goal is to provide instant, 24/7 support. The underlying AI needs to be fast & reliable, something Sonnet 4 is exceptionally good at. You need answers now, not "thinking about it."
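If you'd rather check the speed difference on your own prompts than take my word for it, a tiny timing harness is enough. This is only a rough sketch: `measure_latency` & `fake_model` are names I made up for illustration, & `fake_model` is a stand-in that just sleeps. You'd swap in whatever function actually calls the model you're testing.

```python
import time
from statistics import median
from typing import Callable, List


def measure_latency(ask: Callable[[str], str], prompts: List[str]) -> List[float]:
    """Return wall-clock seconds taken by `ask` for each prompt, in order."""
    latencies = []
    for prompt in prompts:
        start = time.perf_counter()
        ask(prompt)  # the response is discarded; only the elapsed time is kept
        latencies.append(time.perf_counter() - start)
    return latencies


def fake_model(prompt: str) -> str:
    """Hypothetical stand-in for a real model call; replace with your own client code."""
    time.sleep(0.01)  # pretend the model took about 10 ms to answer
    return f"echo: {prompt}"


prompts = ["Write a short email", "Refactor this function", "Summarize this doc"]
latencies = measure_latency(fake_model, prompts)
print(f"median latency: {median(latencies):.3f}s over {len(latencies)} prompts")
```

Run the same prompt list through each model's wrapper & compare the medians. The median is a better gut check than the average here, because one 15-minute outlier would swamp everything else.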
The Cost-Effectiveness Conundrum
At first glance, OpenAI’s pricing for GPT-5 looks aggressive. The per-token cost is actually lower than Sonnet 4’s in some cases. But this is where the numbers can be deceiving.
Because GPT-5 is so verbose & its "thinking" process consumes a lot of tokens, the total cost of a query can end up being significantly higher. It’s like having a car with great gas mileage that insists on taking the longest, most scenic route to the grocery store. You still burn more gas.
Sonnet 4 is more token-efficient. It gets to the point faster & uses fewer tokens to do it. For businesses running thousands or millions of queries a day, this difference adds up QUICKLY. A lower cost-per-query means you can deploy AI solutions more broadly without breaking the bank. It makes advanced AI accessible for more use cases, from internal knowledge bases to public-facing website assistants.
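The arithmetic is easy to sketch. To be clear, every number below is made up for illustration: the prices & token counts are NOT real GPT-5 or Sonnet 4 figures, & `ModelProfile`, `cost_per_query`, & `daily_cost` are just names I picked. The point is only that a lower per-token price can still lose once you multiply by how many tokens each answer burns.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModelProfile:
    name: str
    usd_per_million_output_tokens: float  # hypothetical price, not a real rate card
    avg_output_tokens_per_query: int      # hypothetical; includes any "thinking" tokens


def cost_per_query(profile: ModelProfile) -> float:
    """Output-token cost in USD for one average query."""
    return (
        profile.usd_per_million_output_tokens
        * profile.avg_output_tokens_per_query
        / 1_000_000
    )


def daily_cost(profile: ModelProfile, queries_per_day: int) -> float:
    """Output-token cost in USD for a full day of traffic."""
    return cost_per_query(profile) * queries_per_day


# Illustrative profiles: one is cheaper per token but far wordier.
cheap_but_verbose = ModelProfile("model_a", 10.0, 3000)
pricier_but_terse = ModelProfile("model_b", 15.0, 800)

for profile in (cheap_but_verbose, pricier_but_terse):
    print(
        f"{profile.name}: ${cost_per_query(profile):.4f} per query, "
        f"${daily_cost(profile, 100_000):,.2f} per day at 100,000 queries"
    )
```

With these made-up inputs, the model with the lower per-token price ends up costing more per query, because it produces almost four times as many tokens. Plug in your own measured token counts & current prices before drawing any conclusions.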
A Different Kind of Intelligence
This all points to a different philosophy of intelligence. GPT-5 seems to be chasing raw, academic-style brainpower. It’s built to ace the test. Sonnet 4 feels like it's been optimized for practical intelligence & usability.
Anthropic has done some fascinating research on what they call "visible extended thinking." Instead of the black box of GPT-5's "thinking," Claude models can show their work, sampling multiple lines of reasoning & picking the best one. This often leads to more robust & transparent problem-solving. Some tests have even shown Sonnet 4 with its thinking features enabled outperforming GPT-5 on complex benchmarks like SWE-bench for coding.
It's not just about getting the right answer, but how you get there. Sonnet’s approach often feels more collaborative & less like a lecture from an all-knowing oracle.
Where Sonnet 4 is the Undisputed King
So, where does this leave us? While GPT-5 is a monumental achievement, there are clear areas where Sonnet 4 isn’t just an alternative, but the SUPERIOR choice.
Customer Service & Engagement: As mentioned, this is a no-brainer. You need speed, reliability, & cost-effectiveness. The ability for a tool like Arsturn to build no-code AI chatbots that can be trained on a company's own data is a game-changer. It allows businesses to provide personalized, instant answers to customer questions, guide users through their website, & even generate leads. This kind of conversational AI platform helps businesses build meaningful connections, & that relies on the snappy, efficient performance that Sonnet 4 delivers.
Rapid Prototyping & Development: When you're in the early stages of a project, speed of iteration is everything. Developers consistently report that Sonnet 4 is better for getting a project off the ground, writing maintainable code, & handling UI/frontend tasks.
Content Creation & Writing: For writers, bloggers, & marketers, Sonnet 4's less-is-more approach is often a blessing. It produces drafts that need less heavy lifting to reach a final, polished state. Its speed also makes it an incredible brainstorming partner.
The Final Takeaway
Look, GPT-5 is an incredible piece of technology. It has pushed the entire field forward, & for certain high-stakes, deeply complex reasoning tasks, it’s probably the best tool out there.
But the narrative that it has made every other model obsolete is just plain wrong.
Claude Sonnet 4 holds its ground because it strikes a brilliant balance between intelligence, speed, cost, & user experience. It's a workhorse model that delivers exceptional results without the friction & overhead that can sometimes come with GPT-5. It understands that in the real world, the "best" answer isn't always the one that took the longest to find or the one that’s the most academically complex. It's the one that helps you get the job done, efficiently & effectively.
So before you jump on the latest hype train, give Sonnet 4 a serious look. You might be surprised to find that the reigning champ still has a few tricks up its sleeve.
Hope this was helpful & gives you a different perspective. Let me know what you think.