The mainstream industry definition of an agent is a system that perceives its environment through user input, acts by using tools, and achieves cognitive capabilities through foundation models combined with long-term and short-term memory.
After trying the new Quark, we found that it is essentially a super agent built on reasoning and multimodal models. Its architecture is not complicated. Specifically:
1. Using its multimodal capabilities, it perceives the environment through the user's interaction with the input box.
2. Using Alibaba's self-developed large model, it plans: it identifies the user's intent from the input and delegates the request to a specific agent.
3. It calls different tools (Actions) according to the user's needs, so that each instruction gets a targeted, accurate answer.
4. In academia, healthcare, and education, it draws on rich, reliable industry databases and exclusive knowledge bases, which serve as data memory and fill the gaps in vertical-domain knowledge that the large model itself lacks.
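The four steps above can be sketched as a small perceive, plan, act loop. This is only an illustration under stated assumptions, not Quark's actual implementation: the keyword-based `plan` function stands in for the reasoning model, and every name here (`SuperBox`, `TOOLS`, `KNOWLEDGE`, `INTENT_KEYWORDS`, the tool functions) is hypothetical.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Tuple

# Hypothetical vertical knowledge bases acting as "data memory".
KNOWLEDGE: Dict[str, str] = {
    "medical": "Entry from an assumed medical knowledge base.",
    "academic": "Entry from an assumed academic database.",
}


def search_tool(query: str) -> str:
    """Fallback tool: general search."""
    return f"[search] results for: {query}"


def medical_tool(query: str) -> str:
    """Vertical tool that grounds its answer in the medical knowledge base."""
    return f"[medical] {KNOWLEDGE['medical']} Query: {query}"


def academic_tool(query: str) -> str:
    """Vertical tool that grounds its answer in the academic database."""
    return f"[academic] {KNOWLEDGE['academic']} Query: {query}"


# Maps each intent to the tool (Action) that handles it.
TOOLS: Dict[str, Callable[[str], str]] = {
    "medical": medical_tool,
    "academic": academic_tool,
    "search": search_tool,
}

# Assumption: keyword matching replaces model-based intent recognition.
INTENT_KEYWORDS: Dict[str, Tuple[str, ...]] = {
    "medical": ("symptom", "medicine", "doctor"),
    "academic": ("paper", "thesis", "citation"),
}


def plan(user_input: str) -> str:
    """Return the intent for the input, defaulting to general search."""
    text = user_input.lower()
    for intent, keywords in INTENT_KEYWORDS.items():
        if any(word in text for word in keywords):
            return intent
    return "search"


@dataclass
class SuperBox:
    """Single entry point: one input box in front of several tools."""
    history: List[str] = field(default_factory=list)  # short-term memory

    def handle(self, user_input: str) -> str:
        self.history.append(user_input)   # perceive: record the input
        intent = plan(user_input)         # plan: identify the intent
        return TOOLS[intent](user_input)  # act: delegate to the matching tool


box = SuperBox()
print(box.handle("What medicine helps with a headache?"))
print(box.handle("Find a paper on multimodal models"))
print(box.handle("Weather in Hangzhou"))
```

The point of the sketch is the shape rather than the details: the user sees a single `handle` entry point, while intent recognition, tool selection, and knowledge lookup stay behind it.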
During our research, we found that many researchers and developers focus on enriching agent architectures and interactions, for example by building multi-agent systems that communicate and collaborate in the hope of solving complex problems. However, in the article "Building effective agents", published at the end of 2024, Anthropic shared a different lesson from their experience: successful agents come not from building the most complex system but from building one that meets user needs, with more components added only when simple solutions fall short. This echoes Jobs' product design philosophy: "Simple can be harder than complex".
The same is true of this Quark upgrade: it looks like a reduction, but it actually improves the user experience. Various functions are condensed into a single super box, and one entry point can address users' problems in study, work, and life.
The previous Quark was an excellent search engine and toolbox. Users explored the world and obtained information through the "search box", and interacted with tools for various vertical scenarios through the GUI. The new Quark abandons the traditional "search box" and upgrades it to the "AI Super Box", an all-round assistant. The interaction has become simpler, and it also uses agentic capabilities to meet users' deeper needs efficiently.