OpenAI Releases GPT-5.2 After “Code Red” Google Threat Alert

In attempting to keep up with (or ahead of) the competition, model releases continue at a steady clip: GPT-5.2 represents OpenAI’s third major model release since August. GPT-5 launched that month with a new routing system that toggles between instant-response and simulated reasoning modes, though users complained about responses that felt cold and clinical. November’s GPT-5.1 update added eight preset “personality” options and focused on making the system more conversational.

Numbers go up

Oddly, even though the GPT-5.2 model release is ostensibly a response to Gemini 3’s performance, OpenAI chose not to list any benchmarks on its promotional website comparing the two models. Instead, the official blog post focuses on GPT-5.2’s improvements over its predecessors and its performance on OpenAI’s new GDPval benchmark, which attempts to measure professional knowledge work tasks across 44 occupations.

During the press briefing, OpenAI did share some competitive comparison benchmarks that included Gemini 3 Pro and Claude Opus 4.5 but pushed back on the narrative that GPT-5.2 was rushed to market in response to Google. “It is important to note this has been in the works for many, many months,” Simo told reporters, although choosing when to release it, we’ll note, is a strategic decision.

According to the shared numbers, GPT-5.2 Thinking scored 55.6 percent on SWE-Bench Pro, a software engineering benchmark, compared to 43.3 percent for Gemini 3 Pro and 52.0 percent for Claude Opus 4.5. On GPQA Diamond, a graduate-level science benchmark, GPT-5.2 scored 92.4 percent versus Gemini 3 Pro’s 91.9 percent.

GPT-5.2 benchmarks that OpenAI shared with the press. Credit: OpenAI / VentureBeat

OpenAI says GPT-5.2 Thinking beats or ties “human professionals” on 70.9 percent of tasks in the GDPval benchmark (compared to 53.3 percent for Gemini 3 Pro). The company also claims the model completes these tasks at more than 11 times the speed and less than 1 percent of the cost of human experts.

GPT-5.2 Thinking also reportedly generates responses with 38 percent fewer confabulations than GPT-5.1, according to Max Schwarzer, OpenAI’s post-training lead, who told VentureBeat that the model “hallucinates considerably less” than its predecessor.

Nonetheless, we always take benchmarks with a grain of salt because it’s easy to present them in a way that flatters a company, especially when the science of measuring AI performance objectively hasn’t quite caught up with corporate sales pitches for humanlike AI capabilities.

Independent benchmark results from researchers outside OpenAI will take time to arrive. In the meantime, if you use ChatGPT for work tasks, expect competent models with incremental improvements and some better coding performance thrown in for good measure.
