How AI Agents Are Changing Corporate R&D

AI is reshaping corporate R&D, bringing gains in both efficiency and innovation.
Core content:
1. An R&D case study at a company with a market value of over US$100 billion, where AI tools significantly improved R&D efficiency and innovation output
2. How AI changes the working patterns and organizational structure of R&D departments
3. How technology giants, industry giants, universities, and start-ups are positioning themselves and cooperating in the field of R&D agents
Regarding the application of AI agents in corporate R&D, let me start with a real case study. Judging from the size of its R&D department, the subject is roughly a diversified American industrial and technology group with a market value of more than US$100 billion. The study ran a randomized experiment on 1,018 scientists to compare how the quantity and quality of their output changed when different groups used AI. The company trained its own large model for R&D: a graph neural network system that generates candidate compounds with specified properties. It works a bit like text-to-image generation: scientists tell the system what kind of chemical structure they want, a diffusion model generates a range of candidate structures, and the scientists then evaluate the generated materials and screen out the promising candidates. In the overall R&D pipeline, the R&D department identifies the prototype compound, which is then handed over to the product department.
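To make this generate-then-screen pattern concrete, here is a minimal Python sketch of that kind of pipeline. Everything in it is a hypothetical stand-in: the candidate structures, the scoring function, and the property targets are placeholders for illustration, not the company's actual graph-network and diffusion system.

```python
# Minimal sketch of a "generate, then screen" materials-discovery loop.
# The generative model and property predictor below are hypothetical
# placeholders, not the firm's actual system.
import random
from dataclasses import dataclass

@dataclass
class Candidate:
    structure: str          # e.g. a SMILES-like string describing the compound
    predicted_score: float  # predicted fit to the requested properties

def generate_candidates(target_properties: dict, n: int = 100) -> list[Candidate]:
    """Stand-in for a diffusion model conditioned on desired properties."""
    candidates = []
    for i in range(n):
        structure = f"compound_{i}"      # placeholder structure
        score = random.random()          # placeholder property prediction
        candidates.append(Candidate(structure, score))
    return candidates

def screen(candidates: list[Candidate], top_k: int = 10) -> list[Candidate]:
    """Keep the highest-scoring candidates for human evaluation."""
    return sorted(candidates, key=lambda c: c.predicted_score, reverse=True)[:top_k]

if __name__ == "__main__":
    target = {"band_gap_eV": 1.5, "stability": "high"}   # hypothetical targets
    shortlist = screen(generate_candidates(target))
    for c in shortlist:
        print(c.structure, round(c.predicted_score, 3))  # handed to scientists for review
```

The key design point is the division of labor: the model proposes many candidates cheaply, and human scientists spend their time judging the shortlist rather than generating ideas.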
The study found that using AI tools increased the number of new materials discovered by 44%, the number of patent filings by 39%, and downstream product innovations by 17%. The compounds generated were novel and of higher quality, so many of them could be patented. Overall, AI improved R&D efficiency by roughly 13%-15%.
(Source: Artificial Intelligence, Scientific Discovery and Product Innovation)
This study was published at the end of last year and caused quite a stir in AI, R&D, and industry circles, because it showed that AI has a significant effect on corporate R&D. I had previously seen case studies of AI applied in enterprises, such as research on real scenarios like customer service and call centers, but this is the first time I have seen high-quality research on AI in corporate R&D.
What does the introduction of AI mean for an enterprise's R&D mechanisms, collaboration patterns, and organization? Time spent on the idea-generation stage dropped sharply, while time spent judging the generated candidate materials increased. The introduction of AI made the more senior scientists the winners, letting them fully exploit their abilities and experience, while the bottom third of junior scientists were the losers: they barely improved at all when using the AI tools. After introducing the tools, the company adjusted its R&D department, laying off 3% of staff, mainly junior R&D personnel, and hiring more senior scientists. When an enterprise introduces AI, the way it organizes and collaborates will inevitably change.
So who is building R&D agents? First, the technology giants, such as Microsoft and Google. Second, industry giants in fields such as biomedicine, chemistry, electronics, automobiles, and materials. Third, universities and research institutions, often in cooperation with industry or the technology giants. Fourth, the large-model companies such as OpenAI and Anthropic, whose deep research features, along with data-analysis and coding capabilities, can be classified as general-purpose R&D, essentially a kind of general agent. Fifth, start-ups that go straight into vertical fields or focus on a particular fragment or module of the R&D pipeline; these AI-native start-ups cluster in new materials and biomedicine, which are also high-patent-density fields. To give another example, Johns Hopkins University and AMD jointly built an R&D agent. After reading the paper, my impression is that domain specialists building professional agents do a better job than the technology giants building agents for those same professional fields.
(Source: Johns Hopkins University)
Johns Hopkins is a global leader in biomedicine and led this research. Each of the small figures in this diagram is an agent, including a medical postdoctoral fellow, a software engineer, a medical doctoral student, and a machine learning engineer. The entire experimental process, from literature search, planning, and data preparation through experiment implementation, report writing, and report review, is completed by agents; even the final review is done by AI.
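As a rough illustration of how such a role-based pipeline can be wired together, here is a minimal Python sketch. The role names, stage order, and the call_llm placeholder are assumptions for illustration only; this is not the actual system described in the Johns Hopkins and AMD paper.

```python
# Minimal sketch of a role-based research pipeline in which each stage is
# handled by an LLM "agent". call_llm is a placeholder, not a real API.

def call_llm(role: str, task: str, context: str) -> str:
    """Placeholder for a real LLM call (e.g. a chat-completion request)."""
    return f"[{role}] output for task '{task}' given context of {len(context)} chars"

PIPELINE: list[tuple[str, str]] = [
    ("postdoc",     "literature review"),
    ("postdoc",     "research planning"),
    ("ml_engineer", "data preparation"),
    ("phd_student", "experiment implementation"),
    ("phd_student", "report writing"),
    ("reviewer",    "report review"),
]

def run_pipeline(research_question: str) -> str:
    context = research_question
    for role, task in PIPELINE:
        context = call_llm(role, task, context)  # each stage consumes the previous output
    return context                               # final reviewed report

if __name__ == "__main__":
    print(run_pipeline("Does feature X improve prediction of outcome Y?"))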
Some of its conclusions are very interesting. For example, it tried three OpenAI models; at the beginning of this year the strongest available were the reasoning models o1-preview and o1-mini, plus GPT-4o. The researchers found that o1-preview was indeed the best and GPT-4o the worst, a reminder that each generation's claimed improvement ultimately has to be verified in practice. The experimental agent can run in fully automatic mode or in collaborative mode, and the results showed that collaborative mode still beats fully automatic mode. Collaborative mode means human experts give feedback at each node.
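The difference between fully automatic and collaborative mode can be sketched as an optional human checkpoint after each stage. The sketch below is purely illustrative; run_stage and expert_feedback are hypothetical placeholders, not the paper's implementation.

```python
# Sketch of fully automatic vs. collaborative mode: in collaborative mode a
# human expert may revise each stage's output before it is passed on.
from typing import Callable, Optional

def run_stage(role: str, task: str, context: str) -> str:
    return f"[{role}] draft for '{task}'"    # placeholder for an LLM call

def run(stages, context: str,
        human_review: Optional[Callable[[str, str], str]] = None) -> str:
    for role, task in stages:
        draft = run_stage(role, task, context)
        # Collaborative mode: a human expert edits or approves each draft.
        context = human_review(task, draft) if human_review else draft
    return context

def expert_feedback(task: str, draft: str) -> str:
    """Stand-in for an interactive review step (e.g. a CLI prompt or review UI)."""
    return draft + " (approved with minor edits)"

stages = [("postdoc", "planning"), ("phd_student", "experiment"), ("reviewer", "review")]
print(run(stages, "research question"))                                # automatic mode
print(run(stages, "research question", human_review=expert_feedback))  # collaborative mode
```

The point is simply that each stage's draft can be corrected before it propagates, which is where the human expert's judgment enters the loop.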
When experiments are completed by an agent, the cost essentially becomes a computing cost. One point to note: the 84% cost saving mentioned in the test results is measured relative to the benchmark and to other cutting-edge R&D agents, and mainly reflects the time and inference costs of actually running the experimental agent.
The research also found that using R&D agents brings some problems. The first is academic rigor: experimental reports and papers in medical biology still require review by human experts or peers. Second, biases inherent in the underlying datasets and algorithms may be carried into the entire research process, and the agent may accept them wholesale, leading to systematic bias. Finally, as agents become more autonomous, it must be clear whether humans or agents are accountable for the research results, which requires explicit disclosure of the degree of AI involvement.
The two examples above represent, respectively, actual application inside an enterprise and research underway at academic institutions. Next, let me share a paper that has been very popular recently; after reading it, I feel it is really describing the next generation of R&D agents.
(Source: Welcome to the Era of Experience)
This paper, "Welcome to the Age of Experience," co-authored by David Silver, vice president of reinforcement learning at Google DeepMind, and Richard Sutton, this year's Turing Award winner and founder of reinforcement learning, divides large models into three stages: the simulation stage, the human data stage, and the experimental data stage. One of its core points is that our current research on large models has not only hit a data wall in the pre-training stage, but also in the entire AI research. The quality human data we are using now has basically been exhausted, and more sources of truly high-quality data are outside the boundaries of humans. AI is entering the era of experience, that is, agents constantly learn from the experience of interacting with the environment.
They cite three examples. The first is the magical move 37 in the second game of AlphaGo's victory over Lee Sedol. The second is DeepSeek training the R1-Zero model directly with reinforcement learning, without supervised fine-tuning; the much-discussed "aha moment" was the emergence of new capability in the model. The third is Google's recent AlphaProof, which, in addition to training on the solutions provided by humans, tried many new solutions on its own and reached silver-medal level at the Mathematical Olympiad.
These three examples are about what AI learns after it has acquired prior knowledge: instead of being fed data by humans, it learns from data generated by the machine and its environment. That represents the future of large models. What the paper discusses most, though, is the development of agents. The first theme is continuous learning: the agent continuously interacts with its environment rather than just answering questions, and so can continuously adjust and adapt. Large models now offer increasingly durable memory, with context windows reaching into the millions of tokens and some companies claiming unlimited memory; only with continuous learning can something be called a true intelligent agent. The second is observation and action: the agent can interact with the real world, including the physical world, through digital interfaces, and in the future its perspective on the world, the way an agent "looks up at the stars", may differ from ours. The third is the reward mechanism: in the past it was set by humans, but in the future the agent will continuously generate reward signals for itself, through its own algorithms, based on its own experience. In that sense the agent may build a world model for itself.
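As a toy illustration of this experience loop, here is a minimal Python sketch of an agent that learns from its own interactions and derives part of its reward internally (a simple novelty bonus). The environment, reward values, and update rule are all hypothetical and far simpler than anything discussed in the paper.

```python
# Toy sketch of the "era of experience" loop: the agent learns from its own
# interaction data rather than human-labelled data, and here even derives
# part of its reward internally (a novelty bonus). Purely illustrative.
import random
from collections import defaultdict

ACTIONS = [0, 1]                    # a trivial two-action environment
q_values = defaultdict(float)       # state-action value estimates
visit_counts = defaultdict(int)     # used to compute the intrinsic novelty reward

def step(state: int, action: int) -> tuple[int, float]:
    """Hypothetical environment: returns the next state and an external reward."""
    next_state = (state + action) % 10
    external_reward = 1.0 if next_state == 7 else 0.0
    return next_state, external_reward

def intrinsic_reward(state: int, action: int) -> float:
    """Self-generated reward: prefer rarely tried state-action pairs."""
    return 1.0 / (1 + visit_counts[(state, action)])

state, alpha, epsilon = 0, 0.1, 0.2
for _ in range(10_000):                              # continuous interaction loop
    action = random.choice(ACTIONS) if random.random() < epsilon else \
             max(ACTIONS, key=lambda a: q_values[(state, a)])
    next_state, ext_r = step(state, action)
    reward = ext_r + 0.1 * intrinsic_reward(state, action)
    visit_counts[(state, action)] += 1
    # One-step Q-learning update from the agent's own experience.
    best_next = max(q_values[(next_state, a)] for a in ACTIONS)
    q_values[(state, action)] += alpha * (reward + 0.9 * best_next - q_values[(state, action)])
    state = next_state

print({s: max(ACTIONS, key=lambda a: q_values[(s, a)]) for s in range(10)})  # learned policy
```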
Of the three examples above, the first is an application in a large enterprise, used by more than a thousand scientists, showing that AI really can improve efficiency in R&D. The second is an experimental agent at a world-class university, which demonstrates both its effectiveness and its limitations, yet most professionals are willing to try it. The third illustrates that the ability of agents ultimately depends on breakthroughs in cutting-edge large models. We expect agents to be applied more quickly in R&D: on one hand, that depends on deeper integration of vertical-domain know-how with AI; on the other, innovation and breakthroughs in cutting-edge large models, including new algorithmic paradigms, remain the most fundamental and leading factor.
Finally, let me summarize a few points. First, on the current technical route, we can almost see agents in R&D departments advancing along the sequence single point - module - workflow - multi-agent collaboration - business - organization - ecosystem. At the same time, agent autonomy will keep growing along the path of tools - assistants - agents - experts - innovators - organizers, and along the way the boundary between professional agents and professionals will become increasingly blurred. Whether R&D agents become more trustworthy still depends on new large-model paradigms such as experiential learning, which is worth looking forward to. Lastly, all these technological breakthroughs will keep challenging the ethics of invention and creation: how to divide rights and responsibilities between agents and humans, and how to prevent abuse and malicious use. And when agents increasingly learn from their own lifelong experience, feed themselves data, and set their own reward mechanisms, their autonomy will only grow stronger; how, then, can humans keep them under control?