Less Structure: The inspiration Manus brings to Agent product design

The revolutionary product design of the Manus AI agent points the way for the future of Agent products.
Core content:
1. Manus' core features: a virtual machine, autonomous task planning, and fully unattended operation
2. How it overturns the traditional perception of "wrappers" and achieves broader, deeper search capabilities
3. The far-reaching impact of Manus' design philosophy, "Less structure, More intelligence"
The night before last, a product demo video of an AI agent named Manus quietly swept through AI media circles.
After watching it, I stayed up late writing a blog post criticizing my own "over-reliance on manual design"; rigid engineering thinking and a slow response to the model's growing capabilities are sins. A problem had been bothering me for a long time on a recent project, and Manus's simple demo video suddenly showed me the way to a solution.
When I woke up the next day, Manus had already set the Chinese tech circle on fire. Invitation codes were so hard to come by that some people reportedly paid 50,000 yuan for one on Xianyu, a second-hand marketplace. This general-purpose agent, which gives a large model "an entire computer", overturned people's perception of "wrappers".
Of course, a hit product is bound to be hyped by the media, and to draw sour grapes from those who didn't get an invitation code.
In fact, from my point of view, whether Manus is the first general-purpose agent, or exactly how capable it is, matters little. What matters is the Manus team's agent design philosophy, "Less structure, More intelligence", which I find deeply inspiring.
01 • What Manus did
Manus is named after the Latin phrase "Mens et Manus" ("mind and hand"), the motto of MIT.
Unlike products on the market such as Coze and Dify that require manually arranged workflows, the core of Manus lies in:
Giving the model a "virtual machine": Computer Use and Tool Use are deeply integrated, so the model works independently of humans instead of competing with them for a computer;
Autonomous task planning: the model decomposes, plans, and executes tasks on its own from the stated goal, rather than relying on a hand-built workflow;
Fully unattended operation: from "generating a PPT outline" to "outputting a complete .pptx file", no human intervention is required (in practice it is still occasionally needed, for example when a CAPTCHA appears).
It is the equivalent of equipping the AI with a complete workbench, letting it operate a computer to complete tasks like a real person. It not only performs at the level of an average college graduate, it also does the grunt work without complaint.
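The three design points above can be sketched as a minimal agent loop: plan, execute with tools, hand back the artifact. This is a hypothetical illustration of the pattern, not Manus's actual implementation; the `plan` function and the `TOOLS` registry are stand-ins for the model and its sandboxed tools.

```python
# Minimal sketch of an autonomous agent loop: the model plans the
# steps itself and runs them unattended until the goal is met.
# All names here are hypothetical stand-ins, not Manus internals.

def plan(goal):
    """Stand-in for the model decomposing a goal into steps."""
    return ["research topic", "draft outline", "render .pptx"]

# Stand-ins for sandboxed tools (browser, editor, file export).
TOOLS = {
    "research topic": lambda: "notes on the topic",
    "draft outline": lambda: "1. intro  2. body  3. summary",
    "render .pptx": lambda: "slides.pptx",
}

def run_agent(goal):
    results = []
    for step in plan(goal):            # autonomous task planning
        results.append(TOOLS[step]())  # tool use, no human in the loop
    return results[-1]                 # the final artifact is handed back

print(run_agent("make English teaching slides"))
```

The contrast with Coze- or Dify-style products is that here the step list comes out of `plan`, i.e. the model, rather than out of a workflow a human wired up in advance.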
02 • Why it changed people's perception of "wrappers"
Setting aside Computer Use, MCP, and self-planning, concepts that have been around for a while, one question kept me up all night after I saw the demo video:
Why does the model need a PC at all? Would the output be any different if I just used Function Calling directly?
It wasn't until Manus released invitation codes and I tested a few cases that the answer became clear:
First, the biggest difference is that by operating a browser with vision, Manus obtains information that no API can provide. The breadth and depth of its search are far greater, and in Deep Research work it is no worse than an average campus hire or intern.
The most representative case: Teacher Hai Xin asked Manus to teach her how to shoot a horror film. Manus opened Bilibili and studied for more than 20 minutes, and even reported back after reading a related article on Sohu. (Manus: it was on the way anyway.)
As everyone knows, Xiaohongshu holds a huge volume of high-quality notes, with roughly 600 million searches per day; for nearly a third of users, searching is the first thing they do when they open the app. Unfortunately, there is no open search API through which a model can directly obtain that high-quality content.
Once the model has a PC, the community truly has no walls.
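The argument above can be made concrete with a toy comparison: a Function Calling agent is confined to a closed set of registered APIs, while a browser-driving agent can reach any page a human can open. This is a schematic sketch only; the tool names are invented for illustration.

```python
# Toy contrast between two reach models. The tool names are
# hypothetical; the point is the closed vs. open action space.

FUNCTION_CALL_TOOLS = {"weather_api", "stock_api"}  # closed, hand-registered set

def can_reach(source, has_browser=False):
    """Can the agent get content from `source`?"""
    if has_browser:
        # With a browser it can visit any public page a human can,
        # e.g. Xiaohongshu notes that expose no open API.
        return True
    return source in FUNCTION_CALL_TOOLS  # limited to registered APIs

print(can_reach("xiaohongshu_notes"))                    # no open API
print(can_reach("xiaohongshu_notes", has_browser=True))  # browser gets through
```

The breadth-of-search difference the post describes is exactly this gap: every site without an API is invisible to the first branch and visible to the second.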
Second, the model's use of tools has clearly undergone a qualitative change, evolving from "consultant" to "diligent workhorse".
I believe most people have tried to have AI generate a PPT, only to get an outline and some bullet content. The better products call AIPPT to fill a fixed template, or draw one directly in HTML, and you still have to hunt for "Export" to get the .pptx file.
This does solve some simple PPT needs, but it honestly feels deeply anti-AI.
Manus hands the PC to the model, which gathers information and learns on its own, designs a PPT better suited to the target scenario, and hands you the .pptx file directly.
When models actually start to "use" tools rather than merely "call" them, people do catch a glimpse of a general-purpose agent.
Below are the results of giving the same lesson plan to AIPPT and to Manus and asking each to produce English-teaching courseware. Clearly no teacher would use the former in class; that would be a teaching accident...
Output without a PC (Gamma)
Output with a PC (Manus)
So Manus went viral and won media praise because it genuinely showed people a new agent paradigm and changed their perception of the chemistry a "wrapper" can produce at the product level. It was not just the invitation codes.
Indeed, as the "stay calm" crowd points out, Manus merely wraps Claude, Computer Use, MCP, autonomous planning, and other capabilities already present in various Agent concept products; there is no technical breakthrough.
But insisting on judging this purely as technology misses the point of productization and execution. Many companies could develop a second "TikTok"; I have yet to see one actually make a second "TikTok".
Judging by the results, the Manus team deeply integrated these scattered concepts and shipped first. Going viral also introduced many more people to AI Agents, which is proof enough of its merit.
The problem was never the wrapper itself, but wrappers done badly, or wrappers that should exist and were never built.
People always swing from one extreme to the other. Our attitude should be neither excessive praise nor excessive scorn; plenty of conspiracy theories have already surfaced, and too much is as bad as too little. We should view Manus, and Chinese startup teams in general, with enough tolerance and rationality. The current explosion exceeded even the Manus team's expectations; the invitation codes exist because a Multi-Agent system consumes enormous computing resources, and a startup of this size cannot open the product to all users at once.
"Some Exclusive Information and Slow Thoughts about Manus", Agent Universe
Returning to the opening topic: I was genuinely inspired by Manus's idea of "Less structure, more intelligence": shield the model as much as possible from the limits of the human cognitive framework, explore the boundaries of its capabilities, and let intelligence emerge.
Limited by my own understanding of what models can do, I designed countless prompts and workflows in past projects to handle complex cases. In the end, over-relying on a designed framework to constrain model output produced something "more artificial than intelligent".
In some scenarios, once enough tools are provided, letting the model work freely may yield better results.
03 • A splash of cold water after the hype
To be honest, although Manus introduces a new paradigm for Agent design, it is still far from truly general-purpose. Across the many real test cases I have seen, the failure rate remains quite high. Several problems stand out:
Efficiency: even simple cases, such as comparing prices across e-commerce sites, take 2-4 hours; a sledgehammer to crack a nut;
Resources: if I understand correctly, each task requires its own independent Docker container, which puts enormous pressure on the servers;
Vision: the model's ability to understand UIs is clearly insufficient; beyond context-window limits, most stuck tasks fail on interface comprehension;
Vertical domains: in scenarios that depend heavily on data within a niche field, even a browser cannot collect the useful information.
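The resource point can be made concrete. Assuming, as this post speculates, one isolated container per task, each task invocation reserves real server resources. The sketch below only builds the Docker command (a dry run, so nothing is launched); the image name and resource limits are invented for illustration.

```python
# Sketch of one-sandbox-per-task provisioning, as the post speculates
# Manus does. Dry run: builds the docker invocation without running it.
# The image name and limits are hypothetical.

def sandbox_command(task_id, image="agent-sandbox:latest"):
    """Return the docker command for one task's private sandbox."""
    return [
        "docker", "run", "--rm",          # container is discarded after the task
        "--name", f"task-{task_id}",
        "--memory", "2g", "--cpus", "2",  # every concurrent task pins RAM and CPU
        image,
    ]

cmd = sandbox_command("demo-001")
print(" ".join(cmd))
```

With limits like these, a thousand concurrent tasks would pin on the order of 2 TB of RAM and 2,000 CPU cores, which is why per-task isolation is expensive for a startup and why an invitation-code gate is a plausible throttle.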
As for compute consumption, let's set that aside; the price of compute keeps falling anyway.
04 • Thoughts on the bifurcation of the Agent track
We can see two paths for Agent entrepreneurship:
Generalists: pursuing the general intelligence of "Less Structure, More Intelligence";
Verticalists: focusing on niche areas, building a moat from professional knowledge and closed industry data.
In Manus's closed-door sharing session, one PPT slide read:
"Agent Killer" can be roughly read as: at this stage, the generalist camp represented by Manus has killed off the vertical Agents.
As an agent designer working in the education vertical, I personally disagree.
In the coding field, a powerful open-source community like GitHub helped Devin and MetaGPT become the first mature, commercialized agents. But in fields such as finance and medicine, vast private data sits below the waterline, and data is the biggest moat.
Until large models internalize industry know-how, structures designed by domain experts remain a hard requirement, and expert models trained on private industry data remain indispensable; though this window of opportunity may be shorter than expected.
At the crossroads of silicon-based and carbon-based life, the paradigm of human-machine collaboration is constantly being rewritten. There are no eternal winners in a transformation, only continuous evolvers. Thanks to Manus for the inspiration of its "weak structure, strong intelligence" architecture; it may be the breakthrough point for my current project.