Copilot Studio adds "computer use", an RPA-style skill that lets AI handle tedious computer tasks for you

Written by
Audrey Miles
Updated on: June 30, 2025
Recommendation

Microsoft Copilot Studio's new AI function allows intelligent agents to directly operate computers, breaking through the boundaries of traditional automation.

Core content:
1. Copilot Studio adds a new "computer use" feature that lets AI agents interact directly with graphical interfaces
2. AI agents can simulate operations such as clicking buttons, selecting menu items, and entering text
3. Running in the cloud lowers enterprise costs and supports automation across many desktop and browser applications

Yang Fangxian
Founder of 53AI/Most Valuable Expert of Tencent Cloud (TVP)


If you feel that AI is still far removed from your daily work, or that configuring AI tools is too complicated, this new feature in Microsoft Copilot Studio may change your mind.


The fast track of AI development and the continuous evolution of Copilot Studio



Artificial intelligence is developing at an unprecedented rate, and Microsoft has been at the forefront of this wave. Its Copilot Studio platform is a powerful tool for bringing leading AI technologies together to help companies solve practical business challenges.

Just last month, the platform gave AI agents stronger "deep reasoning" capabilities, added support for the Model Context Protocol (MCP), and officially launched the "Agent flows" feature.

If you are interested, you can watch my previous videos.

Today, Charles Lamanna, CVP of Business & Industry Copilot at Microsoft, announced more exciting news: Copilot Studio is launching a new feature called "computer use", which is currently in the early research preview stage.

What's so great about this feature? Simply put, it allows the AI agent you create to operate a computer directly, just as a human would!


"Computer use": AI can interact directly with any graphical interface



Yes, you read that right. With the "computer use" feature, AI agents are no longer just information carriers or conversation partners. They can directly "see" and "operate" the graphical user interface (GUI) of websites and desktop applications, that is, the on-screen interface we normally work with using a mouse and keyboard.

Imagine your AI assistant could:

Click buttons

Select menu items

Type into input boxes

What does this mean? Even if an application or system does not provide a dedicated application programming interface (API) for programs to call, a Copilot Studio AI agent can now operate it, as long as a person could do the same task manually through the interface. This greatly broadens the boundaries of AI automation.


More than simulated clicks: gains in intelligence and efficiency



The benefits of this new feature are obvious:

Super adaptability


The biggest headache in automation is that software interfaces change often: a button moves, a menu is renamed, and a traditional automation script may "go on strike".

The "computer use" feature, by contrast, adapts in real time. Its built-in reasoning interprets changes on the screen as they happen and adjusts accordingly, which helps automated tasks keep running and the workflow continue smoothly.
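The adaptive behaviour described above can be pictured as a loop that looks at the screen again before every action, rather than replaying a fixed script. The following is a minimal sketch of that idea only, not Microsoft's implementation: `FakeScreen`, `decide`, `act`, and `run` are hypothetical names, and the "screen" is a toy object rather than a real screenshot.

```python
from dataclasses import dataclass, field


@dataclass
class FakeScreen:
    """Toy stand-in for a GUI (assumption: a real agent would read screenshots)."""
    popup_open: bool = True                      # an unexpected dialog covers the form
    form: dict = field(default_factory=dict)     # field name -> entered text
    submitted: bool = False

    def observe(self) -> dict:
        # Return a fresh description of what is currently visible.
        return {
            "popup_open": self.popup_open,
            "filled": sorted(self.form),
            "submitted": self.submitted,
        }


def decide(observation: dict, data: dict) -> tuple:
    """Choose the next action from the current observation only."""
    if observation["popup_open"]:
        return ("click", "Close")                # deal with the surprise first
    for name in data:
        if name not in observation["filled"]:
            return ("type", name)
    if not observation["submitted"]:
        return ("click", "Submit")
    return ("done", None)


def act(screen: FakeScreen, action: tuple, data: dict) -> None:
    """Apply one action to the toy screen."""
    kind, target = action
    if kind == "click" and target == "Close":
        screen.popup_open = False
    elif kind == "type":
        screen.form[target] = data[target]
    elif kind == "click" and target == "Submit":
        screen.submitted = True


def run(screen: FakeScreen, data: dict, max_steps: int = 10) -> list:
    """Observe, decide, act; repeat until done or the step budget runs out."""
    history = []
    for _ in range(max_steps):
        action = decide(screen.observe(), data)
        if action[0] == "done":
            break
        act(screen, action, data)
        history.append(action)
    return history
```

Calling `run(FakeScreen(), {"invoice_no": "INV-001", "amount": "120.00"})` closes the popup first, then types the two fields and clicks Submit. Because the next action is always chosen from a fresh observation, the popup needs no special handling in the task description itself.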

Security and compliance built in


This feature is built on Copilot Studio's mature security measures and governance framework. Enterprise data will remain within the boundaries of the Microsoft cloud and will not be used to train the underlying AI models, helping to ensure data security and meet corporate and industry compliance standards.

Cloud hosting reduces costs and increases efficiency


The "computer use" feature runs on Microsoft-hosted infrastructure. Enterprise users therefore do not need to purchase, deploy, or maintain their own servers and can start using it right away. This speeds up deployment and also reduces maintenance workload and infrastructure costs. It supports automated tasks across a variety of desktop and browser applications (including Edge, Chrome, and Firefox).

Unlock new automation scenarios and say goodbye to tedious repetition


What practical value can this technology bring? Let's look at several typical application scenarios:



Automated data entry


Imagine that an enterprise needs to enter a large amount of data from different sources (such as forms, web pages, and legacy systems) into a new centralized system. This work is usually time-consuming, labor-intensive, and error-prone. The "computer use" feature can carry out the same steps a person would perform by hand, completing data migration and entry accurately and freeing up staff.

Market research information collection


The marketing department needs to regularly collect market data from various online channels (news websites, social media, industry report websites, and so on) for analysis. "Computer use" can automate this process by browsing web pages and copying and pasting information the way a person would, gathering the required data efficiently without human intervention.

Invoice processing automation


Finance departments process a large number of invoices every day. With "computer use", AI agents can automatically open scanned invoice files (or electronic invoice web pages), identify key information (such as invoice number, amount, date, and supplier), and then enter this data into the accounting system, greatly simplifying invoice processing and reducing manual errors.

Redefining RPA (Robotic Process Automation)


If you are familiar with RPA (Robotic Process Automation), this may sound similar. Indeed, the "computer use" feature sets out to reinvent traditional RPA.

A major pain point of traditional RPA is its fragility: it often relies on fixed interface elements (such as button IDs or positions). Once the software interface is modified even slightly, the RPA script may stop working and need professional maintenance. In addition, traditional RPA has limited ability to handle complex, dynamically changing interfaces.
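The fragility described above is easy to reproduce in miniature. The sketch below is illustrative only: the two dictionaries are made-up descriptions of a UI before and after a redesign, and `find_by_id` and `find_by_label` are hypothetical helpers. The label match is a deliberately simple stand-in for "finding the element by what it looks like"; it is not how Copilot Studio locates elements.

```python
# Two versions of the same toy UI, described as {element_id: visible_label}.
UI_V1 = {"btn_17": "Submit", "btn_18": "Cancel"}
# After a redesign: IDs renamed and the label reworded.
UI_V2 = {"action_primary": "Submit order", "action_secondary": "Cancel"}


def find_by_id(ui: dict, element_id: str) -> str:
    """Traditional-style lookup: works only while the ID stays the same."""
    if element_id not in ui:
        raise LookupError(f"element {element_id!r} not found")
    return element_id


def find_by_label(ui: dict, wanted: str) -> str:
    """Looser lookup: match the visible label, ignoring case and extra words."""
    wanted = wanted.lower()
    for element_id, label in ui.items():
        if wanted in label.lower():
            return element_id
    raise LookupError(f"no element labelled like {wanted!r}")
```

With these definitions, `find_by_id(UI_V1, "btn_17")` succeeds but `find_by_id(UI_V2, "btn_17")` raises `LookupError`, while `find_by_label(UI_V2, "Submit")` still returns `"action_primary"`. A script hard-wired to `btn_17` needs maintenance after the redesign; a lookup based on what the user sees does not.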

Copilot Studio's "computer use" addresses these limitations by adding AI:

Smarter and resilient to change


When interface elements change, the AI agent can still find the correct target using its "vision" and "understanding" capabilities, so the automated process is much less likely to break.

Easier to use, lower barrier to entry


Creating automated tasks is easier. You can describe the actions you want the AI to complete in natural language (for example, "open this website, find the latest report, and download it"), then test and refine your instructions with real-time side-by-side videos, where one side shows the AI's reasoning process and the other shows a preview of the actual interface actions, all without writing complex code. This makes it practical for people who are not professional RPA developers to create automated processes.

More capable in complex scenarios


AI agents are able to “see” screen content in real time and make intelligent decisions based on the current situation, working effectively even in complex or changing environments.

The process is transparent and traceable


Developers and managers can review the "computer use" activity history at any time, including screenshots of each action and the AI's reasoning steps, which makes monitoring and debugging easier.
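To make the idea of a traceable activity history concrete, here is a rough sketch of what one entry per step might contain and how it could be searched. The record layout is an assumption for illustration: `StepRecord`, its field names, `to_json`, and `find_steps` are hypothetical and do not reflect the format Copilot Studio actually stores.

```python
import json
from dataclasses import dataclass, asdict


@dataclass
class StepRecord:
    """One entry in a hypothetical activity history."""
    step: int
    reasoning: str    # why the agent chose this action
    action: str       # what it did
    screenshot: str   # file name or ID of the screenshot taken at this step


def to_json(records: list) -> str:
    """Serialize the history so it can be stored or reviewed later."""
    return json.dumps([asdict(r) for r in records], indent=2)


def find_steps(records: list, keyword: str) -> list:
    """Return step numbers whose reasoning or action mentions the keyword."""
    keyword = keyword.lower()
    return [
        r.step
        for r in records
        if keyword in r.reasoning.lower() or keyword in r.action.lower()
    ]


log = [
    StepRecord(1, "A cookie banner hides the form.", "click 'Accept'", "step_001.png"),
    StepRecord(2, "The invoice number field is empty.", "type 'INV-001'", "step_002.png"),
]
```

Here `find_steps(log, "invoice")` returns `[2]`, pointing a reviewer straight to the step, its reasoning, and the matching screenshot. Keeping the reasoning next to the action is what lets someone debug why the agent did something, not just what it did.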



The future of Copilot Studio: enabling innovation and efficiency



In short, Microsoft Copilot Studio is evolving into an end-to-end intelligent agent platform designed to help organizations achieve their AI goals and improve operational efficiency. With features such as "computer use", Microsoft hopes to give users the tools to simplify processes, improve productivity, and ultimately drive business innovation.

Want to try something new?

If this new "computer use" feature interests you and you want to be among the first to try it, you can fill out the form provided by Microsoft to register your interest.

https://aka.ms/mcs-cua-preview

Microsoft also announced that it will share more details about this new feature at the Microsoft Build developer conference in May 2025, so keep an eye out if you are interested.