OpenAI Quietly Releases Codex, Its Most Powerful AI Programming Assistant Yet: A New Generation of Coding Tool Arrives

Written by Jasper Cole
Updated on: June 21, 2025
Recommendation

Late at night, OpenAI released Codex, a new generation of programming tool that marks a revolutionary upgrade for AI programming assistants.

Core content:
1. OpenAI releases a research preview of Codex, a new breakthrough for AI programming assistants
2. Codex is a cloud-based software engineering agent with parallel multi-task processing
3. Efficient code generation and precise instruction following accelerate development work

Yang Fangxian
Founder of 53A, Tencent Cloud Most Valuable Expert (TVP)

Breaking news: I didn't expect OpenAI to drop another late-night bombshell. With the Google I/O conference only days away, it seems many AI companies are releasing products early to compete with it.

Just two days ago, Sam Altman tweeted on X:

A "low-key" research preview, called xx, will be released soon;

The name xx is better than "ChatGPT", in case the product takes off the way ChatGPT did.

Well, now it’s confirmed: OpenAI has launched a research preview of Codex, its most powerful AI programming assistant to date.

How well does it actually work? Users will have to try it and judge for themselves.

When it comes to today’s AI products, don’t believe the exaggerations of AI bloggers online. Most of them have deals with vendors and, to attract traffic, hype products up as amazing; when users actually try them, the products often turn out to be mediocre.

The most typical case is: Manus.

Many people’s first reaction to the name Codex is the programming model behind Microsoft’s Copilot, which became popular all over the world. Isn’t this just old wine in a new bottle?

This time it seems to be different. According to OpenAI, the earlier Codex model was an autocomplete assistant; now the name returns as a "cloud-based software engineering agent".

Okay, let’s see how OpenAI officially introduces Codex.

Codex is a cloud-based software engineering agent that can handle multiple tasks in parallel: writing features, answering questions about a codebase, fixing bugs, and submitting pull requests for review. Each task runs in its own cloud sandbox environment preloaded with the repository.

Sam Altman said:

Today we launched Codex. It’s a software engineering agent that runs in the cloud and does tasks for you, like writing new features or fixing bugs. You can run many tasks in parallel. “Just do it” is one of my favorite sayings; I didn’t expect it to arrive so quickly, or to apply in such an important way to AI itself and its users.

The model behind Codex is codex-1, a version of OpenAI o3 optimized for software engineering.

It was trained with reinforcement learning on real programming tasks across a variety of environments, generating code that closely mirrors human style and PR preferences, follows instructions accurately, and iteratively runs tests until results pass.

We are not just chasing high benchmark scores; we focus on generating code that developers are actually willing to merge into their codebase: respecting existing comments, avoiding unnecessary changes, and matching the project's code style, so that it truly speeds up development work.

In programming evaluations and internal benchmarks, codex-1 performs very well even without further agent-capability optimizations.

So, OpenAI calls it its strongest coding model to date.

The following are its core features:

  • Efficient code generation: Codex generates cleaner code, strictly follows user instructions, and iterates on tests until the code passes all test cases. This makes it excellent at producing production-grade code.
  • Multi-tasking: As an autonomous AI coding agent, Codex is able to perform multiple development tasks simultaneously, such as writing new features, fixing bugs, answering codebase-specific questions, and running tests. Task completion time is typically between 1 and 30 minutes.
  • Secure operating environment: Codex runs in a sandbox virtual computer in the cloud, using an air-gapped environment with no internet or external API access to ensure security. It also rejects requests to write malware, further enhancing its applicability in security-sensitive projects.
  • GitHub integration and customization: Codex can connect to GitHub and preload the user's code base to better understand the development environment. Developers can provide project-level configurations such as code base navigation, testing strategies, and code style standards by adding an "AGENTS.md" file to the repository. This customization feature helps Codex adapt to specific project needs more accurately.
  • Transparency and traceability: Codex records all actions, references test outputs, and summarizes changes made, providing developers with a detailed work log. This not only improves transparency, but also facilitates tracking and auditing.
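The AGENTS.md mechanism described above can be sketched as follows. The file name comes from the announcement, but the specific sections, paths, and commands below are illustrative assumptions on my part, not an official schema:

```markdown
# AGENTS.md (illustrative example; the structure shown is an assumption)

## Codebase navigation
- Application code lives in `src/`; tests live in `tests/`.

## Testing
- Run `pytest tests/` and make sure it passes before proposing any change.

## Code style
- Follow PEP 8; format code with `black` before committing.
```

Because the file lives in the repository itself, guidance like this travels with the code and applies to every task the agent runs against that repo.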

In addition, the Codex CLI has been updated to use the codex-mini-latest model, which is suited to low-latency code editing and question-answering tasks, extending Codex's use in terminal environments.

Codex is currently released as a research preview, limited to ChatGPT Pro, Enterprise, and Team users, with plans to expand to Plus and Edu users later. The usage flow is as follows:

  • Access portal: Users can access Codex through the sidebar of the ChatGPT web version without additional installation.
  • Task Assignment: After entering the prompt, click the "code" button to generate code, or click the "ask" button to ask questions about the code base. Codex will execute tasks in a separate container that mirrors the user's development environment to ensure consistency with the actual development setup.
  • Task monitoring: The task progress will be displayed below the prompt bar, and developers can track it in real time. Tasks are usually completed within 1 to 30 minutes, depending on the complexity of the task.
  • Review and verification: Although Codex is capable of generating high-quality code, OpenAI emphasizes that developers must manually review and verify all AI-generated code to ensure that it meets project requirements and safety standards. This is particularly important because AI models may be biased or generate non-standard code.
  • Usage Limits and Pricing: Codex currently offers generous free access, with usage limits and pricing expected to be introduced later. For the codex-mini model in the API, pricing is $1.50 per million input tokens and $6 per million output tokens, with a 75% caching discount to reduce the cost of repeated requests.
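As a sanity check on the quoted rates, the cost arithmetic can be sketched like this. The helper name `estimate_cost` is hypothetical, and the assumption that the 75% discount applies to the cached portion of input tokens is mine, not stated by OpenAI:

```python
def estimate_cost(input_tokens: int, output_tokens: int,
                  cached_input_tokens: int = 0) -> float:
    """Rough cost estimate in USD for the quoted codex-mini rates.

    Assumes $1.50 per 1M input tokens, $6.00 per 1M output tokens,
    and a 75% discount on the cached share of input tokens.
    """
    INPUT_PRICE = 1.50 / 1_000_000   # USD per input token
    OUTPUT_PRICE = 6.00 / 1_000_000  # USD per output token
    CACHE_DISCOUNT = 0.75            # 75% off cached input (assumption)

    uncached = input_tokens - cached_input_tokens
    return (uncached * INPUT_PRICE
            + cached_input_tokens * INPUT_PRICE * (1 - CACHE_DISCOUNT)
            + output_tokens * OUTPUT_PRICE)
```

Under these assumptions, a request with 1M input tokens, all served from cache, would cost about $0.375 on the input side instead of $1.50.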

For engineers, Codex feels like a "morning to-do list" tool: developers can assign multiple tasks at once and then review draft solutions later. This workflow is particularly suitable for handling repetitive or time-sensitive tasks.

OpenAI is excited about the future of Codex.

Greg Brockman, co-founder of OpenAI, envisions: “What you really want is a remote colleague who has their own computer, but who can also ‘stand behind you’ and look at your screen at any time. You’re writing code, you want to go to lunch, you say to Codex ‘Can you help me with this?’ and it can seamlessly take over and run it in the cloud.”

Future plans include:

  • Functional integration: The local synchronous Codex CLI and the cloud-based asynchronous Codex will be integrated to form a unified workflow.
  • Greater interactivity: Allows developers to provide guidance mid-task, collaborate on implementation strategies, and receive proactive progress updates.
  • Deep integration: Connecting with GitHub is just the beginning. In the future, you will be able to assign tasks directly from the Codex CLI, ChatGPT desktop version, or even Jira issue tracker or CI/CD system. If CI reports an error, Codex may be able to automatically fix it.

Greg sums it up: “It’s an intern you can delegate tasks to, a mentor, and a pair programming partner, all rolled into one. Our goal is to accelerate useful work, get more software engineers in the world, do more useful programming, and move the world forward.”