The rumors were true: OpenAI on Thursday announced the release of its new frontier large language model (LLM) family, GPT-5.2.
It comes at a pivotal moment for the AI pioneer, which has faced intensifying pressure since rival Google's Gemini 3 LLM seized the top spot on major third-party performance leaderboards and many key benchmarks last month. OpenAI leaders stressed in a press briefing, however, that the timing of this release had been discussed and worked on well in advance of the release of Gemini 3.
OpenAI describes GPT-5.2 as its "most capable model series yet for professional knowledge work," aiming to reclaim the performance crown with significant gains in reasoning, coding, and agentic workflows.
"It's our most advanced frontier model and the strongest yet in the market for professional use," Fidji Simo, OpenAI's CEO of Applications, said during a press briefing today. "We designed 5.2 to unlock even more economic value for people. It's better at creating spreadsheets, building presentations, writing code, perceiving images, understanding long context, using tools, and handling complex, multi-step projects."
GPT-5.2 features a massive 400,000-token context window, allowing it to ingest hundreds of documents or large code repositories at once, and a 128,000 max output token limit, enabling it to generate extensive reports or full applications in a single go.
The model also features a knowledge cutoff of August 31, 2025, ensuring it's up to date with relatively recent world events and technical documentation. It explicitly includes "Reasoning token support," confirming the underlying architecture uses the chain-of-thought processing popularized by the "o1" series.
The 'Code Red' Reality Check
The release arrives following The Information's report of an emergency "Code Red" directive to OpenAI staff from CEO Sam Altman to improve ChatGPT, a move reportedly designed to mobilize resources following the "quality gap" exposed by Gemini 3. The Verge similarly reported on the timing of GPT-5.2's release ahead of the official announcement.
During the briefing, OpenAI executives acknowledged the directive but pushed back on the narrative that the model was rushed solely to answer Google.
"It is important to note this has been in the works for many, many months," Simo told reporters. She clarified that while the "Code Red" helped focus the company, it wasn't the sole driver of the timeline.
"We announced this Code Red to really signal to the company that we want to marshal resources in one particular area… but that's not the reason it's coming out this week in particular."
Max Schwarzer, lead of OpenAI's post-training team, echoed this sentiment to dispel the idea of a panic launch. "We've been planning for this release since a very long time ago… this specific week we talked about many months ago."
A spokesperson from OpenAI further clarified that the "Code Red" call applied to ChatGPT as a product, not solely underlying model development or the release of new models.
Under the Hood: Instant, Thinking, and Pro
OpenAI is segmenting the GPT-5.2 release into three distinct tiers within ChatGPT, a strategy likely designed to balance the massive compute costs of "reasoning" models with user demand for speed:
- GPT-5.2 Instant: Optimized for speed and daily tasks like writing, translation, and information seeking.
- GPT-5.2 Thinking: Designed for "complex, structured work" and long-running agents, this model leverages deeper reasoning chains to handle coding, math, and multi-step projects.
- GPT-5.2 Pro: The new heavyweight champion. OpenAI describes this as its "smartest and most trustworthy option," delivering the highest accuracy for difficult questions where quality outweighs latency.
For developers, the models are available immediately in the application programming interface (API) as gpt-5.2, gpt-5.2-chat-latest (Instant), and gpt-5.2-pro.
The Numbers: Beating the Benchmarks
The GPT-5.2 release includes leading metrics across most domains, especially those that target the "professional knowledge work" gap where competitors have recently gained ground.
OpenAI highlighted a new benchmark called GDPval, which measures performance on "well-specified knowledge work tasks" across 44 occupations.
"GPT-5.2 Thinking is now state-of-the-art on that benchmark… and beats or ties top industry professionals on 70.9% of well-specified professional tasks like spreadsheets, presentations, and document creation, according to expert human judges," Simo said.
In the critical arena of coding, OpenAI is claiming a decisive lead. Schwarzer noted that on SWE-bench Pro, a rigorous evaluation of real-world software engineering, GPT-5.2 Thinking sets a new state-of-the-art score of 55.6%.
He emphasized that this benchmark is "more contamination resistant, challenging, diverse, and industrially relevant than previous benchmarks like SWE-bench Verified."
Other key benchmark results include:
- GPQA Diamond (Science): GPT-5.2 Pro scored 93.2%, edging out GPT-5.2 Thinking (92.4%) and surpassing GPT-5.1 Thinking (88.1%).
- FrontierMath: On Tier 1-3 problems, GPT-5.2 Thinking solved 40.3%, a significant leap from the 31.0% achieved by its predecessor.
- ARC-AGI-1: GPT-5.2 Pro is reportedly the first model to cross the 90% threshold on this general reasoning benchmark, scoring 90.5%.
The Price of Intelligence
Performance comes at a premium. While ChatGPT subscription pricing remains unchanged for now, the API costs for the new flagship models are steep compared to previous generations, reflecting the high compute demands of "thinking" mode. They are also at the upper end of API costs for the industry.
- GPT-5.2 Thinking: Priced at $1.75 per 1 million input tokens and $14 per 1 million output tokens.
- GPT-5.2 Pro: The costs jump significantly to $21 per 1 million input tokens and $168 per 1 million output tokens.
GPT-5.2 Thinking is priced 40% higher in the API than the standard GPT-5.1 ($1.25/$10), signaling that OpenAI views the new reasoning capabilities as a tangible value-add rather than a mere efficiency update.
The high-end GPT-5.2 Pro follows the same pattern, costing 40% more than the previous GPT-5 Pro ($15/$120). While expensive, it still undercuts OpenAI's most specialized reasoning model, o1-pro, which remains the most expensive offering on the menu at a staggering $150 per million input tokens and $600 per million output tokens.
OpenAI argues that despite the higher per-token cost, the model's "greater token efficiency" and ability to solve tasks in fewer turns make it economically viable for high-value enterprise workflows.
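To make the pricing arithmetic concrete, here is a minimal Python sketch that checks the 40% figure and estimates what a single maximum-size request might cost. The per-million-token prices, the 400,000-token context window, and the 128,000-token output limit come from the figures above. The dictionary keys, the `Price` class, and the helper functions are hypothetical names used only for illustration, and the sketch assumes simple linear per-token billing with no caching or batch discounts.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Price:
    """List price in US dollars per 1 million tokens."""
    input_per_m: float
    output_per_m: float


# Prices quoted in the article; the keys are illustrative labels only.
PRICES = {
    "gpt-5.1": Price(1.25, 10.00),
    "gpt-5.2": Price(1.75, 14.00),
    "gpt-5-pro": Price(15.00, 120.00),
    "gpt-5.2-pro": Price(21.00, 168.00),
}


def pct_increase(old: float, new: float) -> float:
    """Percentage change from old to new."""
    return (new - old) / old * 100


def request_cost(price: Price, input_tokens: int, output_tokens: int) -> float:
    """Dollar cost of one request, assuming linear per-token billing."""
    return (
        input_tokens * price.input_per_m + output_tokens * price.output_per_m
    ) / 1_000_000


if __name__ == "__main__":
    # Generation-over-generation increase on input and output prices.
    for old, new in [("gpt-5.1", "gpt-5.2"), ("gpt-5-pro", "gpt-5.2-pro")]:
        up_in = pct_increase(PRICES[old].input_per_m, PRICES[new].input_per_m)
        up_out = pct_increase(PRICES[old].output_per_m, PRICES[new].output_per_m)
        print(f"{old} -> {new}: input +{up_in:.0f}%, output +{up_out:.0f}%")

    # Worst case: a full context window in, the maximum output out.
    for name in ("gpt-5.2", "gpt-5.2-pro"):
        cost = request_cost(PRICES[name], 400_000, 128_000)
        print(f"{name}: ${cost:.2f} per maximum-size request")
```

Under these assumptions, a request that fills the whole context window and returns the maximum output works out to roughly $2.49 on GPT-5.2 and about $29.90 on GPT-5.2 Pro. Both new models come out 40% above their predecessors on input and output alike.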
Here's how it compares to the current API costs for other competing models across the LLM space:
| Model | Input (/1M) | Output (/1M) | Total Cost |
|---|---|---|---|
| Qwen 3 Turbo | $0.05 | $0.20 | $0.25 |
| Grok 4.1 Fast (reasoning) | $0.20 | $0.50 | $0.70 |
| Grok 4.1 Fast (non-reasoning) | $0.20 | $0.50 | $0.70 |
| deepseek-chat (V3.2-Exp) | $0.28 | $0.42 | $0.70 |
| deepseek-reasoner (V3.2-Exp) | $0.28 | $0.42 | $0.70 |
| Qwen 3 Plus | $0.40 | $1.20 | $1.60 |
| ERNIE 5.0 | $0.85 | $3.40 | $4.25 |
| Claude Haiku 4.5 | $1.00 | $5.00 | $6.00 |
| Qwen-Max | $1.60 | $6.40 | $8.00 |
| Gemini 3 Pro (≤200K) | $2.00 | $12.00 | $14.00 |
| GPT-5.2 | $1.75 | $14.00 | $15.75 |
| Gemini 3 Pro (>200K) | $4.00 | $18.00 | $22.00 |
| Claude Sonnet 4.5 | $3.00 | $15.00 | $18.00 |
| Claude Opus 4.5 | $5.00 | $25.00 | $30.00 |
| GPT-5.2 Pro | $21.00 | $168.00 | $189.00 |
Image Generation: Nothing New Yet…But 'More to Come'
During the briefing, VentureBeat asked the OpenAI participants if the new release included any boost to image generation capabilities, noting the excitement around similar features in recent competitor launches like Google's Gemini 3 Image, aka Nano Banana Pro.
Unfortunately for those seeking to recreate that kind of text-and-information-heavy graphics and image editing capabilities, OpenAI executives clarified that GPT-5.2 comes with no current image improvements over the prior GPT-5.1 and OpenAI's integrated DALL-E 3 and gpt-4o native image generation models.
"On image Gen, nothing to announce today, but more to come," Simo said. She acknowledged the popularity of the feature, adding, "We know this is an important use case that people love, that we introduced [to] the market, and so definitely more to come there."
Aidan Clark, OpenAI's lead of training, also declined to comment on visual generation specifics, stating simply, "I can't really speak to image Gen myself."
The 'Mega-Agent' Era
Beyond raw scores, OpenAI is positioning GPT-5.2 as the engine for a new generation of "long-running agents" capable of executing multi-step workflows without human hand-holding.
"Box found that 5.2 can extract information from long, complex documents about 40% faster, and also saw a 40% boost in reasoning accuracy for Life Sciences and healthcare," Simo said.
She also noted that Notion reported the model "outperforms 5.1 across every dimension… and it excels at the kind of really ambiguous, longer horizon tasks that define real knowledge work."
Schwarzer added that coding startups like Augment Code found the model "delivered substantially stronger deep code capabilities than any prior model," which is why it was selected to power their new code review agent.
Visual capabilities have also seen an upgrade.
OpenAI's launch blog post shows an example where "a traveler reports a delayed flight, a missed connection, an overnight stay in New York, and a medical seating requirement."
The result? "GPT‑5.2 manages the entire chain of tasks (rebooking, special-assistance seating, and compensation), delivering a more complete outcome than GPT‑5.1."
A new evaluation called ScreenSpot-Pro, which tests a model's ability to understand GUI screenshots, shows GPT-5.2 Thinking reaching 86.3% accuracy, compared to just 64.2% for GPT-5.1.
Science and Reliability
OpenAI leaders also stressed the model's utility for scientific research, attempting to move the conversation beyond simple chatbots to research assistants.
Aidan Clark, lead of the training team, shared an example of a senior immunology researcher testing the model.
"They tested it by asking it to generate the most important unanswered questions about the immune system," Clark said. "That immunology researcher reported that GPT-5.2 produced sharper questions and stronger explanations for why those questions… matter compared to any previous pro model."
Reliability was another key focus. Schwarzer claimed the new model "hallucinates substantially less than GPT-5.1," noting that on a set of de-identified queries, "responses contained errors 38% less often."
The 'Vibe' Shift
Interestingly, OpenAI acknowledged that not every user might immediately prefer the new models.
When asked why legacy models like GPT-5.1 would remain available, Schwarzer admitted that "models change a little bit every time."
"Some users may find that they prefer the vibes of the previous model, even though we think the latest one is across the board generally much better," Schwarzer said. He also noted that for some enterprise customers who've "really fine-tuned a prompt for a specific model," there might be "small regressions," necessitating access to the older versions.
Safety, 'Adult Mode,' and Future Roadmap
Addressing safety concerns, Simo confirmed that the company is preparing to roll out an "Adult Mode" in the first quarter of next year, following the implementation of a new age prediction system.
"We're in the process of improving that," Simo said regarding the age prediction technology. "We want to do that ahead of launching adult mode."
Looking further ahead, industry reports suggest OpenAI is working on a more fundamental architectural shift under the codename "Project Garlic," targeting a flagship launch in early 2026.
While executives did not comment on specific future roadmaps during the briefing, Simo remained optimistic about the economics of their current trajectory.
"If you look at historical trends, compute has increased about 3x every year for the last three years," she explained. "Revenue has also increased at the same pace… creating this virtuous cycle."
Clark added that efficiency is improving rapidly: "The model we're releasing today achieves an even better score [on ARC-AGI] with almost 400 times less cost and less compute associated with it" compared to models from a year ago.
GPT-5.2 Instant, Thinking, and Pro begin rolling out in ChatGPT today to paid users (Plus, Pro, Team, and Enterprise). The company notes the rollout will be gradual to maintain stability.