Impressions of GPT-5.4
I spent the entire weekend testing GPT-5.4 — a decent neural network, but OpenAI continues to make the same mistakes.
I’ll start with model selection in ChatGPT. A year ago the company was already being criticized for a selector that listed GPT-4o, GPT-4.5, o1, o3-mini-medium, and o3-mini-high, and nothing has really changed. OpenAI promised a router that would automatically pick the appropriate model; it has appeared, but so far it works poorly, so you still have to switch manually. And then…
Let’s talk about GPT-5.3 Instant: what kind of beast is this? Previously, Instant and Thinking were two modes of one version; now they are separate products. Why? OpenAI hasn’t given a clear explanation: GPT-5.3 Instant was released a few days before GPT-5.4 and was presented as the best chat model, but without benchmark results or details. Versions 5.3 and 5.4 respond in noticeably different styles, so one can assume different architectures, although that is only a guess.
GPT-5.4 works only in reasoning mode and takes anywhere from tens of seconds to several minutes to respond. Two settings control how long it reasons, Standard and Extended Thinking, but I didn’t notice much difference between them.
Much can be said about benchmarks; however, the more important aspect is the style of responses: they should be clear, energetic, and in Russian should avoid anglicisms and untranslated words. Claude Sonnet/Opus and all Gemini versions handle this better, while GPT still has issues.
GPT-5.3 tries to joke, adds emojis, and behaves like a friendly conversationalist, but the old problem often reappears: the model writes giant responses that are about 80% lists and tables, which looks overloaded.
GPT-5.4 has a different problem: when it tries to write simple text, it either produces heavy paragraphs with the important parts in bold, or the text “breaks apart” into short sentences and paragraphs of one or two words.
Interestingly, on the English-speaking internet GPT-5.4 is praised precisely for its creativity and richness of expression. I spent a couple of hours chatting with it in English myself, and the same problems remained. Overall, a friendly and lively chatbot in the style of GPT-4o is no longer available in ChatGPT, and that was exactly what users loved about it.
So what is available? GPT-5.4 is traditionally good at searching for information online: it literally dives into the World Wide Web and verifies each fact in its answers. Because of this, however, responses are sometimes delayed by a minute or two even for simple queries.
The model also handles criticism well: it is honest and can analyze any idea thoroughly, highlighting strengths and pointing out weaknesses. My main model is Opus 4.6, and GPT-5.4 complements it perfectly: if I doubt something or need to double-check information, I just pass the question to it.
Programming is interesting too: I build projects in Claude Code and review the code via Codex (previously with GPT-5.3). Now I use 5.4 for that: the model is very attentive, reacts quickly, and can even operate a computer.
By the way, I need to admit a mistake: initially I got confused and wrote in the announcement that computer control would be available in all ChatGPT versions. That’s not true! The function is limited to the API and Codex, and to use it via the API or the built-in code editor you first need to install a special skill (Playwright, for example).
With this skill, GPT-5.4 learns to interact with the interfaces of websites and applications (Web or Electron), clicking on elements to check functionality or usability. To do this, you need to enable headed mode, which makes the browser window visible. During this process the neural network literally “opens” a site or app and checks whether everything is done correctly. A great tool for developers! It can also be pointed at any public site to analyze its design or find specific information.
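For readers who haven’t met headed mode before, here is a minimal sketch of what it means in plain Playwright for Python. This is not the Codex skill itself and says nothing about how GPT-5.4 drives the browser internally; it only shows the kind of “open the page, click, look at the result” step described above. The helper names `launch_options` and `click_and_report`, and the URL and selector in the usage note, are my own illustrative assumptions.

```python
from typing import Any, Dict


def launch_options(headed: bool = True, slow_mo_ms: int = 0) -> Dict[str, Any]:
    """Build keyword arguments for Playwright's browser launch call.

    Headed mode is simply headless=False: the browser window is visible.
    slow_mo_ms optionally slows each action down so a human can follow it.
    """
    options: Dict[str, Any] = {"headless": not headed}
    if slow_mo_ms > 0:
        options["slow_mo"] = slow_mo_ms
    return options


def click_and_report(url: str, selector: str, headed: bool = True) -> str:
    """Open a page, click one element, and return the page title afterwards.

    Hypothetical helper: requires `pip install playwright` and
    `playwright install chromium`. The import is done lazily so the
    option helper above can be used without Playwright installed.
    """
    from playwright.sync_api import sync_playwright

    with sync_playwright() as p:
        browser = p.chromium.launch(**launch_options(headed))
        page = browser.new_page()
        page.goto(url)
        page.click(selector)   # the "click on an element" step
        title = page.title()   # something simple to inspect after the click
        browser.close()
        return title
```

A call such as `click_and_report("https://example.com", "a")` would open a visible Chromium window, click the first link, and return the title of the resulting page. An agent with a browser skill does essentially this in a loop, deciding on its own what to click and what to check.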
What’s the bottom line? Despite some flaws, GPT-5.4 turned out to be very good, especially at programming and task automation (reviews from professional developers confirm this). But one thing is clear: OpenAI needs to work seriously on how responses are presented. Right now the entry barrier for ChatGPT is higher than for models like Sonnet/Opus or Gemini.
That’s all — stay tuned for updates!
