OpenAI Launches ChatGPT Agent: AI Now Takes Action and Completes Complex Tasks

Moving beyond chat, the new agent autonomously researches and performs complex digital tasks using its own virtual computer.

July 17, 2025

OpenAI has introduced ChatGPT Agent, a new feature designed to autonomously perform complex, multi-step tasks by combining deep research capabilities with the ability to take action on a computer.[1][2][3] This unified agentic system represents a major step towards a more capable and interactive AI, moving beyond conversational responses to actively completing tasks for users. The feature is currently rolling out to paid subscribers on the Pro, Plus, and Team plans, with a release to Enterprise and Education customers planned for later.[1]
The new ChatGPT Agent integrates and enhances what were previously separate experimental features: "Deep Research" and "Operator".[1][4] Deep Research, launched earlier, gave ChatGPT the ability to conduct extensive web research on complex topics and generate detailed, cited reports.[5][6] Operator, on the other hand, was developed to interact with websites by clicking, typing, and scrolling, effectively mimicking human computer use.[1][7] The new unified agent can now seamlessly transition between these abilities, using a suite of tools that includes a visual browser, a text-based browser, a code terminal, and API integration.[1][4] This allows the agent to not only gather and synthesize information but also to act on it, performing tasks such as booking travel itineraries, creating presentations, analyzing data in spreadsheets, and managing calendar appointments.[1][3]
At its core, the ChatGPT Agent operates on its own virtual computer, a sandboxed environment where it can navigate websites, download files, run code, and analyze outputs within a single, continuous session.[1][8] This virtual machine preserves the state of a task across different tools, enabling the agent to handle complex workflows that might involve browsing for information, processing that data, and then creating a document or presentation based on the findings.[1][9] OpenAI has emphasized user control, stating that the agent will request permission before executing significant actions and that users can interrupt or take over the process at any point.[1][3] The system can also connect to external applications like Gmail and GitHub through secure connectors, expanding its range of potential tasks.[1][4]
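The workflow described above (several tools sharing one session, with a permission check before consequential actions) can be illustrated with a small sketch. This is not OpenAI's implementation; the `Session`, `Step`, and `run` names, the three stand-in tools, and the `approve` callback are all hypothetical and exist only to show the pattern.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class Session:
    """Hypothetical shared state that persists across tool calls in one task."""
    files: Dict[str, str] = field(default_factory=dict)
    log: List[str] = field(default_factory=list)


@dataclass
class Step:
    tool: str                    # hypothetical tool name, e.g. "browser"
    argument: str
    consequential: bool = False  # True if the step needs user approval


def browse(session: Session, url: str) -> None:
    # Stand-in for a browser tool: stores placeholder page text in the session.
    session.files["page.txt"] = f"contents of {url}"


def summarize(session: Session, name: str) -> None:
    # Stand-in for a code tool: reads the file that the browser step saved.
    session.files[name] = session.files["page.txt"].upper()


def send_email(session: Session, recipient: str) -> None:
    # Stand-in for a connector action with real-world consequences.
    session.log.append(f"emailed {recipient}")


TOOLS: Dict[str, Callable[[Session, str], None]] = {
    "browser": browse,
    "terminal": summarize,
    "email": send_email,
}


def run(steps: List[Step], session: Session,
        approve: Callable[[Step], bool]) -> Session:
    """Run each step against the same session, asking before consequential ones."""
    for step in steps:
        if step.consequential and not approve(step):
            session.log.append(f"skipped {step.tool}")
            continue
        TOOLS[step.tool](session, step.argument)
    return session


steps = [
    Step("browser", "https://example.com"),
    Step("terminal", "summary.txt"),
    Step("email", "team@example.com", consequential=True),
]

# The user declines the consequential step; earlier work is still preserved.
result = run(steps, Session(), approve=lambda step: False)
print(result.files["summary.txt"])
print(result.log)
```

Because every tool reads from and writes to the same `Session` object, the second step can use what the first one saved, which is the property the article attributes to the agent's virtual machine. The `approve` callback plays the role of the user being asked before a significant action.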
The implications of this new capability are substantial for the AI industry and the future of work. By creating an AI that can not only find information but also act on it, OpenAI is raising expectations of what an AI assistant should do. This development places ChatGPT in more direct competition not only with other AI chatbots but also with productivity software suites from companies like Google and Microsoft.[10] The ability to generate editable spreadsheets and presentations directly within the ChatGPT interface could reduce user reliance on traditional office applications.[10][9] The introduction of such agentic AI is seen as a foundational step towards more advanced, autonomous systems, with some speculating that this technology is a precursor to the capabilities expected in the highly anticipated GPT-5.[11][2]
While the new feature offers a glimpse into a future where AI assistants act as proactive partners in completing digital tasks, OpenAI acknowledges that the technology is still in its early stages.[8][12] The company has noted limitations, including the potential for the AI to "hallucinate" facts or struggle to distinguish authoritative information from rumors.[12] To address safety concerns, especially given the agent's ability to take actions, OpenAI has implemented a robust safety stack and requires the agent to seek permission for consequential tasks.[3] The initial rollout to a limited user base allows the company to study and improve the system in a real-world setting before broader deployment.[8] Usage is also capped: Pro users receive 400 messages per month and Plus and Team users receive 40, with additional usage available through credit-based pricing.[1] As the technology matures, it holds the potential to significantly streamline workflows, automate repetitive digital chores, and redefine how individuals and businesses interact with computers.

Sources