OpenAI releases the GPT-4.1 model series, outperforming GPT-4o across the board

OpenAI has released the GPT-4.1 series of models, with performance surpassing the previous generation and opening a new stage for AI applications.
Core content:
1. The GPT-4.1 series surpasses GPT-4o across the board, with significant performance gains
2. Coding, instruction following, and long-context processing capabilities take a major leap forward
3. An optimized pricing strategy greatly improves cost-effectiveness and lowers the cost of large-scale applications
OpenAI has officially announced that the new GPT-4.1 series outperforms the highly regarded GPT-4o "in almost every aspect". The core improvements of this release fall into the following areas:
Greater intelligence with lower latency: overall capability improves while responsiveness is also optimized.
Stronger coding: outstanding results on software engineering benchmarks such as SWE-bench Verified, with significant gains in code editing (Aider's Polyglot benchmark) and front-end development tasks.
More accurate instruction following: clear improvements over GPT-4o in understanding complex instructions, multi-turn dialogue tracking (MultiChallenge), and format compliance (IFEval).
Breakthrough long-context processing: supports a context window of up to 1 million tokens, far beyond GPT-4o's 128k, and retrieves information accurately across the full window in "needle in a haystack"-style tests (see the sketch after this list).
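As a rough illustration of what the 1-million-token window looks like from the API side, here is a minimal sketch using the official openai Python SDK. The file name, prompt, and question are placeholders; only the model identifier gpt-4.1 and the general chat-completions call pattern are taken from OpenAI's published API.

```python
# Minimal sketch: sending a very large document to GPT-4.1 in a single request.
# Assumes the official openai Python SDK (v1+); "contract_corpus.txt" and the
# question are placeholders for illustration only.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

with open("contract_corpus.txt", "r", encoding="utf-8") as f:
    huge_document = f.read()  # may run to hundreds of thousands of tokens

response = client.chat.completions.create(
    model="gpt-4.1",
    messages=[
        {"role": "system", "content": "Answer strictly from the provided document."},
        {"role": "user", "content": huge_document + "\n\nQuestion: Which clauses mention termination fees?"},
    ],
)
print(response.choices[0].message.content)
```

In practice the whole document simply travels as ordinary prompt tokens, so the only difference from a short request is the token count (and therefore the cost).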
1. GPT-4.1 (flagship version):
Positioning: A high-performance flagship model designed for complex tasks and cross-domain problem solving, officially described as “smarter” than GPT-4o.
Features: a context window of 1,047,576 tokens, a maximum output of 32,768 tokens, and a knowledge cutoff updated to June 1, 2024.
Cost-effectiveness: improved by roughly 26% compared with GPT-4o.
2. GPT-4.1 mini (high-efficiency version):
Positioning: a mid-sized, cost-effective model.
Features: performance close to GPT-4o, with cost reduced by 83% and latency cut roughly in half; its multimodal capabilities even exceed GPT-4o on some tasks.
3. GPT-4.1 nano (high-speed version):
Positioning: an extremely lightweight, ultra-fast model.
Features: the fastest and cheapest model in the lineup, particularly suited to latency- and cost-sensitive simple tasks such as classification and autocompletion (see the routing sketch below).
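To make the three tiers concrete, the sketch below routes requests to a different model name depending on task type. The routing table and task labels are invented for illustration; only the model identifiers (gpt-4.1, gpt-4.1-mini, gpt-4.1-nano) come from the release.

```python
# Illustrative sketch: choosing a GPT-4.1 tier per task.
# The routing rule is a made-up heuristic, not an OpenAI recommendation.
from openai import OpenAI

client = OpenAI()

MODEL_BY_TASK = {
    "classification": "gpt-4.1-nano",      # latency- and cost-sensitive
    "autocomplete": "gpt-4.1-nano",
    "chat": "gpt-4.1-mini",                # near-flagship quality at lower cost
    "complex_reasoning": "gpt-4.1",        # flagship for cross-domain problems
}

def ask(task_type: str, prompt: str) -> str:
    model = MODEL_BY_TASK.get(task_type, "gpt-4.1-mini")
    resp = client.chat.completions.create(
        model=model,
        messages=[{"role": "user", "content": prompt}],
    )
    return resp.choices[0].message.content

print(ask("classification", "Label this ticket as 'billing' or 'technical': ..."))
```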
Note: the blended average price is an estimated reference value that factors in a typical input/output ratio and cache hit rate.
In addition, the prompt caching discount has been increased to 75%, and a further 50% discount is available through the Batch API, lowering the cost of large-scale applications even further.
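A back-of-envelope calculation shows how the two discounts compound. In the sketch below, the $2.00 per 1M input tokens list price for GPT-4.1 and the 80% cache-hit rate are assumptions for illustration; only the 75% cached-input discount and the 50% Batch API discount come from the announcement.

```python
# Rough cost arithmetic with illustrative assumptions; only the 75% caching
# discount and 50% Batch API discount are taken from the announcement above.
INPUT_PRICE = 2.00                          # assumed USD per 1M input tokens (GPT-4.1 list price)
CACHED_INPUT_PRICE = INPUT_PRICE * 0.25     # 75% discount on cached input tokens
CACHE_HIT_RATE = 0.80                       # assumed share of input tokens served from cache
BATCH_DISCOUNT = 0.50                       # Batch API halves the price again

blended_input = (CACHE_HIT_RATE * CACHED_INPUT_PRICE
                 + (1 - CACHE_HIT_RATE) * INPUT_PRICE)
batch_input = blended_input * BATCH_DISCOUNT

print(f"Blended input price:     ${blended_input:.2f} per 1M tokens")
print(f"With Batch API discount: ${batch_input:.2f} per 1M tokens")
```

Under these assumptions the blended input price drops from $2.00 to $0.80 per 1M tokens, and to $0.40 when requests also go through the Batch API.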
The release of the GPT-4.1 series is another important milestone in the development of large language models: beyond the leap in raw performance, it brings structural improvements in context length, inference efficiency, and cost-effectiveness.
This suggests that AI can be applied to more complex real-world tasks in a more stable and controllable way.