ZhenFund's Dai Yusen: A Long Talk about AI Agents, Every Industry Will Encounter Its "Lee Sedol Moment" (Part 2)

In-depth analysis of the changes and opportunities faced by various industries in the AI era.
Core content:
1. The trend of AI surpassing humans in specific fields and its impact
2. The impact of DeepSeek's open source approach on large technology companies
3. The strategic considerations behind Tencent's access to DeepSeek
Last month, Dai Yusen, managing partner of ZhenFund, had a long chat with LatePost about AI and Agents. We have compiled this interview into a complete transcript, which will be published in two parts: (Part 1) and (Part 2).
In "ZhenFund's Dai Yusen: Long Talk about AI Agents, All Industries Will Encounter "Lee Sedol Moments" (Part 1)", Yusen analyzed the breakthroughs of the o1 and R1 models and pointed out: "In the Agent era, Attention is not all you need." In 2025, such "Lee Sedol moments" will become more and more common. How will this change reshape the future, and what opportunities and challenges will it bring?
Q: Another thing that has had a big impact on the current landscape is that DeepSeek has gone viral. That matters in itself, and DeepSeek has also taken a very thorough and consistent open source approach. I think the impact can be divided into several levels. The first is the big technology companies. Those that were originally closed-source are now making a lot of moves: Tencent and Baidu, for example, have both connected to DeepSeek. Tencent has connected about a dozen products, including its main AI product Yuanbao and its largest, national-scale product, WeChat. Baidu has connected Wenxin, but Alibaba and ByteDance have not.
When do you think Doubao will be connected to DeepSeek?
Dai Yusen: I would be very surprised if Doubao chose to connect to DeepSeek. Because in my opinion, ByteDance is particularly eager to explore the frontier of intelligence and pays great attention to the research and development of its own basic models. If it connects to DeepSeek, it may be a big change in both its external image and internal morale.
But from another perspective, if Doubao users find DeepSeek more useful, then from the perspective of Doubao user value, this is reasonable. However, I think this is definitely not ByteDance's original intention for AI. As far as I know, they still want to achieve comprehensive leadership in the field of AI, and they have abundant manpower and resources.
Q: What about Tencent?
Dai Yusen: This is all hearsay. After all, as angel investors, we have no way of knowing what their decision-makers think. Some people have said that Tencent was a late mover in video, letting others run for three years first, because with so many WeChat users it can always mobilize them. I have also heard that Tencent takes the same late-mover approach to models: it has the user relationships and user data, and nobody can do without WeChat, so it can wait until model technology converges or matures before connecting. Moreover, WeChat is infrastructure for its users and cannot be changed too much, or a great many users would be affected. So I think Tencent's move to connect to DeepSeek deserves a thumbs up. I heard that AI search had been in the works since last year, but the decision to connect to DeepSeek must have been made by top management.
I think this is good for Tencent users. I heard that after connecting to DeepSeek, the metrics of many Tencent products grew very well, probably by double digits. In terms of DAU, many people now tap WeChat search and see a prompt to download "Yuanbao using DeepSeek-R1". This ability to drive traffic is simply unparalleled. So Yuanbao is now ranked second in the App Store, and it would be perfectly normal if it were first tomorrow.
Q: So do you think this is Tencent's deliberate choice? It is not so aggressive in developing its own large model and moves a little slower, knowing that someone will build a better model. When the time comes, it joins in with its trump card, WeChat. Do you think this is a proactive strategy that it set long ago?
Dai Yusen: I heard that this is a strategy Tencent chose, but I also heard that the Hunyuan large model team is hiring heavily to expand. Judging from the history of China's Internet, large companies rarely rely entirely on third parties for key infrastructure; they build it themselves. So on the one hand, I think Tencent's current decision is very bold, and perhaps it will usher in a new era. There are many such examples in the United States. Netflix, for instance, has always used Amazon Web Services (AWS). Although Amazon has Prime Video, a direct competitor of Netflix, Netflix still considers AWS the best choice commercially and technically. But in China, in the past, if there was Alipay there had to be WeChat Pay, and everyone wanted something of their own. Still, I think choosing DeepSeek is definitely a choice of a very neutral partner, because the DeepSeek team has no intention of making a super app, nor does it want to do consumer (to-C) products.
Q: So I think Ma Huateng knows that Liang Wenfeng is not that interested in making a product with a large DAU.
Dai Yusen: Yes, so I think at least their goals are clear for now, and there is a basis for cooperation between the two sides. But it is hard to say that Tencent will never want its own large model. After all, technology changes too fast. Everyone used to say that Microsoft relied on OpenAI, and then Microsoft seemed to plan to train its own models and even invested in Anthropic. So these situations may change. But I think the core question is who can stay at the forefront. Over the past two years, many who claimed to be building foundation models and pushing the limits of intelligence have gradually fallen behind, which is to be expected. After all, doing this requires talent, funding and a lot of innovation.
Q: So you just said that the only large company qualified to do this is ByteDance, and the only startup qualified to do this is Moonshot AI?
Dai Yusen: If we talk about the "Six Little Tigers" of AI that venture capital (VC) firms backed earlier, Kimi is the only one that has such capabilities in terms of talent, team, funding and users. Even the latest paper published by OpenAI cites the research results of R1 and K1.5. On the way here this afternoon, Kimi released Moonlight, its latest open source model. I think that being able to keep contributing to the technical community requires a high level of ability and a clear sense of direction from the team itself.
Q: Speaking of OpenAI's paper, it refers to both K1.5 and R1. These two results were actually released on the same day. After the results were released, I went to talk to people in the technical community. At that time, everyone gave me feedback that their recognition of K1.5 and R1 was not that different, but the actual impact they produced was very different. What do you think about this?
Dai Yusen: I think open source is a key difference. Some of the work in DeepSeek-R1 is indeed very significant, and because it was open-sourced, everyone can use it, which caused a big reaction, especially in the West.
In the past few years, some people in Silicon Valley have been questioning whether it is worth spending so much money on pre-training. At least among secondary market investors, people had begun to worry that too much money was being spent. Then the claim suddenly appeared that 5 million US dollars could train an o1-level model. Of course, this is a misunderstanding. The paper clearly states that the figure covers only the final training run. But some people wanted to make big news out of it, which caused a lot of concern in the United States and sent Nvidia's stock price down 16 percent on January 27. Once the matter became global news, its influence was on a completely different scale from Kimi simply publishing a paper or a technical innovation.
Someone who is very familiar with DeepSeek told me they think that OpenAI or Anthropic in the United States would not even need 5 million US dollars to train a model like V3, because they have larger clusters and more training experience. But at that time, many people who did not know the industry well saw this narrative and began to compare the 5 million US dollars with the 1 billion US dollars that others had raised. Now everyone gradually understands that this comparison does not hold. You see, Nvidia's stock price has nearly recovered, right?
In terms of training costs, people in the industry do not think 5 million US dollars is that amazing. They may care more about innovations like MLA that reduce inference costs. In addition, improvements in model intelligence and reductions in training and inference costs have been happening all along. For example, since the launch of the GPT-4 API, the cost has dropped by more than 90%, and it will definitely drop by more than 90% again this year. This is inevitable. Chips will become more powerful, and people will find more optimization methods to reduce costs. So I think the first thing people care about now is whether intelligence can be improved. As long as intelligence can be improved, costs will definitely drop rapidly, perhaps to one tenth or even one twentieth of the previous level every year. So I am not particularly worried about cost reduction. At least in the United States, everyone believes that this curve will definitely play out.
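To make the compounding in that claim concrete, here is a minimal arithmetic sketch. It assumes a purely hypothetical starting cost of 100 units and applies a fixed yearly drop of 90% (cost falls to one tenth) or 95% (cost falls to one twentieth); the function name `cost_after` and all numbers are illustrative assumptions, not figures from the interview or from any vendor's price list.

```python
def cost_after(initial_cost: float, annual_drop: float, years: int) -> float:
    """Return the cost left after `years` of a fixed fractional drop per year.

    annual_drop=0.90 means the cost falls by 90% each year (to one tenth);
    annual_drop=0.95 means it falls by 95% each year (to one twentieth).
    """
    return initial_cost * (1.0 - annual_drop) ** years


if __name__ == "__main__":
    start = 100.0  # hypothetical starting cost, in arbitrary units
    for years in range(4):
        to_tenth = cost_after(start, 0.90, years)
        to_twentieth = cost_after(start, 0.95, years)
        print(f"year {years}: 90% drop -> {to_tenth:.4f}, 95% drop -> {to_twentieth:.4f}")
```

Under these assumptions, the gap between the two rates widens every year, which is why a steady "one tenth per year" curve already makes today's absolute training or inference bill a poor predictor of future costs.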
Q: So cost reduction is actually on a trajectory. Later, didn't Dario, the founder of Anthropic, write a very long article? The first part of his analysis was quite thorough, saying that cost reduction follows the curve of the industry as a whole.
Dai Yusen: Yes, and the same goes for the improvement of intelligence. Of course, the latter part of the article reads a bit agitated, but I think his analysis of the technology in the first part is quite correct. According to him, they spent a long time on the alignment of the Sonnet model because they emphasize safety and so on. And Sonnet is not even a reasoning model, so they are still quite impressive.
I heard that they are going to release Claude 4 soon. I think this is partly thanks to DeepSeek. It is like the catfish effect: a strong competitor committed to open source has entered the model world, so everyone has to speed up. This is indeed a good thing. And looking back, DeepSeek has another advantage. Its app is brand new and combined R1 with search right from the start. It is a new product created from a blank sheet of paper, which is a major feature.
There is another feature that I only realized later. When people train reasoning models, they benchmark math and programming ability. The papers published by DeepSeek, OpenAI, and Kimi all use the American Invitational Mathematics Examination (AIME), the MATH benchmark, and code benchmarks. But when DeepSeek appeared, its writing style stood out. I heard that it was the first to do dedicated alignment work on writing style, and even asked people from the Chinese Department of Peking University to do the annotation.
When we first saw its answers, our first reaction was that they were a bit fanciful and often brought up quantum mechanics. In fact, OpenAI, Kimi and Doubao have always tried to avoid this in training because they are afraid that the model will talk nonsense.
But I think that, on the one hand, DeepSeek may have intentionally aligned its writing. On the other hand, it was previously positioned as a research lab, so it did not do much fine-tuning for so-called neutrality and truthfulness. So everyone simply used it as released, and unexpectedly this trait turned out to be an advantage.
We found that many people spread it because they thought its answers and thinking process were particularly creative. I don’t know if this is a coincidence, but in fact it will lead to a higher rate of spread.
Q: Have you talked to people in your circle? Do they think this is a coincidence, or did DeepSeek train the model's writing ability on purpose?
Dai Yusen: I heard from some people that they may have indeed strengthened the model in terms of writing ability, but some people also think that this may be the result of insufficient alignment. So I think both situations are possible, and I really don’t have a definite answer.
However, judging from the results alone, this is a very important reason for its popularity. Not many people actually use it to do math problems. Most people use it to tell fortunes, and then suddenly find that the results it gives make sense. The same goes for things like MBTI tests, which people would not expect to be what a cutting-edge AGI model is for.
Q: Another point that everyone is curious about is how does DeepSeek make money? We just talked about how Tencent, Baidu, and many other companies have connected to DeepSeek. But as far as I understand, it doesn't actually make money directly from these connections, right?
Dai Yusen: If you just use its model, it is already open source. DeepSeek's current way of making money is to sell API access, and I heard that its API has a positive gross margin. Because it has made a lot of infrastructure innovations in inference, its cost of serving its own models is lower than that of other companies.
Now many people want to use its API, but the problem it faces is insufficient computing power. Because it also needs to train models, it apparently closed API top-ups a while ago, which amounts to saying: don't give me money, I can't serve you. That in itself shows there is a business model. Many people ask whether they can pay for a stable version, a bit like the ChatGPT Plus subscription. So I always think that in the early stages of a technological revolution, we should not hold business models to the standards of a mature industry too early. We should first rely on technology to create value for users and customers, and then capture part of that value as revenue. I think this will be realized sooner or later; it just requires some patience.
Q: Did you have a clear understanding of this matter in 2024? Or did you have a clearer and more determined idea after the impact or enlightenment brought by DeepSeek?
Dai Yusen: I think this is also a process of continuous learning. When we, the post-80s generation, entered the industry, the mobile Internet had already begun to emerge, or the Internet had entered its second half. In the earliest days, in the 1990s, I was also an Internet user, but I did not think about business models at all. I think we should often learn from history and consider why many early Internet companies were built on strong technology.
Let's review the first problem Google encountered. It used a new technology, PageRank, to create a search engine with a 10 times better experience. Users liked it very much and spread it spontaneously. But at that time it did not know how to make money, because Google's search engine had no ads at the beginning and the interface was very clean. It went online in 1998, and an article in the New York Times in 2002 said that "the most difficult thing about Google is its own business model", criticizing it for not having one. But as everyone later learned, in 2002 it gradually found two business models, AdWords and AdSense, and after it went public in 2004, it became the best "money printing machine" there is today. This is a good example. At the beginning, when you asked Google what its business model was, it actually did not know. It first had a technological breakthrough, relied on technology to create a good product, and then monetized the value of the product.
Q: Do all technological breakthroughs follow this natural process? Or do we have a survivor bias and only see those technological breakthroughs that later achieved huge commercial success?
Dai Yusen: Of course, it is impossible for all technological breakthroughs to make money. But I think it depends on which stage of the development cycle the breakthrough is in. I still hold the view that now is a time when the slope of technological change is very steep. At such a time, forcing the existing technology to monetize is like asking a talented high school student to earn a living. He may only be able to do manual labor and cannot make much money. But if we train him further and wait for him to become a doctoral student, he can make a lot of money. So I think that when the development of a technology has reached a stable period, as with the mobile Internet, where the technology of five years ago is not much different from today's, that is the time for business models to flourish.
Let me give you another example. Not only Google but also Facebook launched with a very cutting-edge product that spread virally. But at that time, no one knew how Facebook would make money. It tried banner ads, tried local ads, and later tried in-game ads, but none of these made much money. It was not until 2012 that it changed the news feed from chronological sorting (like WeChat's sorting method) to recommendation-based sorting, forming the so-called recommended information feed. Only once the feed is sorted by recommendation can ads be inserted into it. So it launched news feed ads in 2012 and went public in the same year. Of course, news feed ads are now also a super "money printing machine" and the core business model of ByteDance. But Facebook was launched in 2005, the news feed in 2007, and the recommended news feed in 2012, which is also when it found a real business model. That took 6 to 8 years. Throughout that time, Facebook was a company that users liked very much but whose business model was unclear. Great companies often go through such a stage.
Q: Do you think ByteDance will open source?
Dai Yusen: First of all, is open source something that everyone must do? First, open source is valuable only if you are in a leading position. If you open source something that is not very good, just for the sake of open source, it is meaningless. Second, I think a slightly weaker form of open source is being free. Free plus leading is, I think, very powerful.
Is it necessary to open source? I think DeepSeek got a sweet payoff this time: it attracted great attention from the West after it was open-sourced. After it made big news in the United States and left the Americans rattled, everyone in China thought it was even more impressive. Of course, open source can also lead to cooperation such as the one with WeChat, but I think it is not just a question of open source; the company also has to stick to it. For example, if Doubao were open-sourced now, would WeChat connect to it? I don't think so. So this is not a simple question of whether to open source or not. Suppose Doubao were as powerful as DeepSeek is now and then open-sourced; I don't think WeChat would connect to it, and Alibaba's Qianwen probably would not either. This is not to say that they are not capable, but that is how it looks from where Alibaba and ByteDance stand. So I think Liang Wenfeng's strength is not just open source, but that they persist in open source and that their market positioning does not make others feel threatened.
Q: Yes, it persists in open source, remains neutral, and does not accept large investments from any big company.
Another recent change is that OpenAI is also considering open source. Altman posted a tweet giving everyone two options: one is to open source o3 mini, and the other is to open source a phone-sized model, that is, a model suitable for running on mobile phones. Which one are you looking forward to more?
Dai Yusen: Of course, I think it would be good to open source either of them, but I am definitely more interested in o3 mini. I think models on mobile phones may not be that useful at present, and what everyone needs now is breakthroughs at the frontier of intelligence. o3 mini is a very powerful model. Given a long inference time, that is, in the o3 mini pro and o3 mini high modes in ChatGPT, it performs very well. If a model of this level were open-sourced, so that everyone could see how it is made and what its characteristics are, I think it would be of great value. And I heard that the model is not large, so it could be a very useful reference for model training and applications.
Q: How big did you hear it was?
Dai Yusen: According to a fairly reliable source, it activates only about 3.7B parameters at a time, which really shocked me. That feels almost too small. But this size means that they really can turn a large o3 (o3 should be quite large) into a very small o3 mini, and then give o3 mini more thinking time to get good results. This is indeed great work.
Q: They have actually explained before why they were reluctant to open source. They believe that open-sourcing would weaken their competitive advantage, for example by giving Google an opening to exploit.
Dai Yusen: So I think this is what makes Liang Wenfeng great. He really has shared a lot of technical secrets with everyone, so that everyone can get better. But from the perspective of a purely commercial company, there are indeed many concerns. In addition to the problems just mentioned, OpenAI is also worried that powerful AI will be used by bad actors, which may also be a very reasonable concern.
Q: What impact do you think DeepSeek will have on companies that already want to dominate the open source ecosystem, such as Meta and Alibaba, which have always been open source?
Dai Yusen: I think it is definitely an incentive. Everyone has discovered that a more fiercely competitive opponent has arrived. The original open source community was, jokingly, a bit like a "cyber Buddha", a bit like charity. Alibaba and Meta are both big companies that provide computing power for everyone to use, driving the development of the entire industry. But now there is a faster-moving and more open DeepSeek, which is definitely both pressure and an incentive for everyone. That said, I think DeepSeek's neutrality is a fairly unique advantage. Tencent can use it, and Alibaba can use it too. This is not just a question of ability, but a question of whose side it sits on.
Q: Didn't Apple recently hold discussions with DeepSeek about cooperation? But in the end, it chose Alibaba.
Dai Yusen: Apple has talked to many companies, including Kimi. I think from Apple's perspective, it is easy to understand why it chose Alibaba. It must choose a partner with stable services, good ability to cope with large-scale users, and excellent infrastructure, services, and technical experience.
Q: Actually, Alibaba is relatively open this time.
Dai Yusen: Qianwen is quite compatible with Llama, and its model lineup is good and updated frequently, so many developers are using Qianwen. To be honest, DeepSeek R1 produces a lot of "hallucinations" in use, so if you are building applications, it may not be the best choice.
Q: Before DeepSeek became popular among the public, I felt that Qianwen and DeepSeek had comparable influence in overseas technology circles, because they are both open source series.
Dai Yusen: Indeed, looking back, we find that no matter how good Kimi's benchmarks are, if the model is not opened up to others, is not available as open source, and the app is not offered overseas, then it will indeed have no recognition overseas.
Q: What were the discussions at the time? Why didn't Kimi open source?
Dai Yusen: I think that even now, open source is not something that must be done. As I said just now, open source is only an option for companies in certain situations. For example, it will only be considered when there is no competitive pressure to keep things confidential and no financing pressure. What we are seeing now is judged in hindsight: open source plus some fortunate timing led to the current situation. So I think open source is not a must-have option. Of course, those who choose open source are very impressive and deserve a lot of respect. But for a commercial company, the core question is whether it can create user value and ultimately convert that into commercial value. So I think open source is not the only path, just a very interesting and innovative one.
Q: But today, all companies exploring AGI do not focus on user value.
Dai Yusen: Many of them still focus on technological value. As I just said, in a period of rapid technological growth, user value only comes once the technology itself improves. So I think it is very important to explore the technological frontier. After the emergence of large models, a group of so-called more pragmatic investors or entrepreneurs may appear who want to make money with existing technology. But I think Kimi definitely belongs to the other category: it pushes the technological frontier forward. This goes back to what we said at the beginning: create an amazing, magical product experience and ultimately gain commercial value.
In fact, Kimi became popular in 2023. One important reason is that it was the first product to combine chat, search and long text. At that time, ChatGPT could not search the web and was not very good at processing long text, multiple texts, or multiple files. So in the first two or three years, Kimi relied on the technical concept of long text processing, combined with search and chat, to deliver a different user experience, and thus broke through to a mainstream audience.
Q: Was the decision to focus on long text a non-consensus decision? Was it difficult to make?
Dai Yusen: Long text was definitely one of the options in technology selection at that time, but I don't think there was a consensus on whether to make it the top priority. There was a joke at the time, and I don't know whether it is true: after Kimi became popular, people at Baidu asked why Kimi had done long text and they had not. It seems that long text was not in their first batch of priorities, because there were many other things ranked higher at that time. Many people, for example, were building Character.AI-style products and working on emotional intelligence alignment. But Kimi firmly chose long text and pushed it to the extreme, because long text unlocks two key scenarios: one is processing multiple files, and the other is search, such as reading 100 web pages and then summarizing them. Neither scenario is possible without long text.
Especially at that time, Kimi had just been established and had not raised much money. The team was young and small, with limited resources, so it had to focus on one thing and choose the right direction. In fact, many of the factors that make DeepSeek popular now also applied to Kimi in 2023. When resources are limited, you have to break through at a key point and give users an amazing experience in order to stand out. So when I summed it up, I found many similarities. I am not saying this to flatter ourselves; I really think they have things in common.
Q: Does that long text capability help Kimi with what it is doing now?
Dai Yusen: For example, Kimi is actually better in terms of truthfulness and accuracy when doing the same retrieval. Of course, ordinary users may not make such a comparison. To be honest, many users of DeepSeek today do not notice the "hallucinations" it produces, but you may use it to write a report and get burned later. I ran into this yesterday. In a group chat whose members are quite capable, someone posted an article. I took a look and found that it had a strong DeepSeek flavor.
Q: What direct impact do you think DeepSeek's popularity will have on the large model "Six Little Tigers", which have often been compared with it recently?
Dai Yusen: To be honest, I think it really cleared the field. Before R1 came out, several of the "Six Little Tigers" had already stopped training their own models and had no intention of chasing SOTA. I think R1 also made everyone realize that if you cannot reach SOTA, it is better to work on vertical fields or application development.
Q: Why did they give up?
Dai Yusen: There are funding reasons, as well as reasons to do with the team and each company's own positioning. As Kimi's angel investor, we can say that the K1.5 model, and the model they are going to release next, will show further gains on the math and coding benchmarks we just mentioned. In terms of academic contribution, at least on reasoning, the long-to-short technique shared in the K1.5 work has been well received. In addition, Moonlight, released today, and MoBA, released two days ago, also show that the Kimi team can keep contributing to and exchanging with its technical peers.
At the same time, Kimi's user base has reached tens of millions of DAU, and it is still growing. To be honest, after using both DeepSeek and Kimi, many people still prefer Kimi in many scenarios. For example, Kimi has fewer "hallucinations" and performs better in some work scenarios. Some multimodal reasoning, such as taking a photo of a problem to look up the answer, is something DeepSeek has not yet done. So maybe I am a little biased, but I do think that in terms of team, funding, technical capabilities, and user products, Kimi is the only one among the "Six Little Tigers" capable of continuing to compete on SOTA models. Of course, this road is hard, and it requires money, people and other conditions, but I think it is at least worth a try.
Q: Will Kimi be more focused going forward? Will it cut some things?
Dai Yusen: They have cut a lot of things, such as overseas business, and now they just want to keep chasing SOTA.
Q: Are they officially no longer doing video generation?
Dai Yusen: At least for now, I think it is important to choose what not to do.
Q: Most of the "Six Little Tigers" had given up before DeepSeek came out. Was this within your expectations?
Dai Yusen: Actually, by mid-2024 we already felt that this would be the result. At that time, it was very obvious that several companies would find it difficult to continue, both in terms of willingness and resources. I think one good thing about Kimi is that its team is very stable. This is related to how the team was formed: the co-founders have worked together for a long time. Personnel turnover at the various model companies has been quite high. In fact, starting a business is like walking on a balance beam. As time goes by, fewer and fewer people are walking with you. In many cases, just staying at the table is already impressive.
Q: Just now we mainly talked about the impact of DeepSeek on model companies, including large companies, whether open source or closed source, and some startups. Next, can we talk about companies in other ecosystems? For example, what kind of impact will it have in the more open source trend brought by DeepSeek? I think of one type of company, which is the AI cloud platform. According to DeepSeek's announcement, it will open source some inference optimization technologies at the infrastructure layer in the next open source week. What impact may this have on companies such as Silicon Mobility and Wuwen Core Dome in terms of entrepreneurship?
Dai Yusen: We are angel investors in Wuwen Xinqiong. Their business volume has grown rapidly, and they have received a lot of requests. In particular, local state-owned enterprises and governments are scrambling to deploy DeepSeek, and demand in this area has skyrocketed.
They have done a lot of innovative work, including inference on Huawei chips, which is also very popular; many people want to use it. I think the popularity of open source models has indeed brought great opportunities to AI Infra companies. Which models would these companies otherwise serve? If all the models were closed-source, proprietary ones such as Doubao and Kimi, they really would have no role to play, because ByteDance, for example, serves its own model itself. But in the long run, it depends on whether they can keep serving customers well. After all, public cloud providers such as Tencent Cloud, Alibaba Cloud, and Volcano Engine have ample funds, and their Infra capabilities, resources, and customer service capabilities are also stronger. Customers are not doing charity: they will choose whoever offers good service at a good price. So there are still many challenges for startups.
Moreover, DeepSeek is going to open source these "black technologies", which means it actually has many advantages in serving its own model: its cost for the same service may be lower than others'. Because no one expected the short-term surge in demand for computing power, it is normal for DeepSeek to let others take on load that it cannot carry alone. But once things reach a steady state, whether these startups still have an edge against large public cloud companies and DeepSeek's first-party service remains to be seen. Overall, though, it has definitely created a lot of opportunities.
Q: In fact, the AI cloud platform is sandwiched between the cloud and the model, right? It may be squeezed by both sides, but it may also gain some opportunities due to changes in the ecosystem.
Dai Yusen: Yes. If open source leaves the middle layer with more choices, such as different frameworks and different models, then this middle layer will do better and better. But if it eventually converges to only a few choices, as operating systems did with iOS and Android, then the middle layer may still end up being provided by the system provider.
Q: What impact do you think it will have on the majority of companies that only make applications?
Dai Yusen: I think it is definitely positive. It means there is a better, open source model that they can tune themselves. That said, if you want to build something on the model's main track, it is still quite difficult. But if you want to enrich the ecosystem around the model, it is a different story. An analogy I often use is that the early days of a technological revolution are like the BlackBerry era. Because the BlackBerry's technical capabilities were limited, very few products could find PMF (product-market fit). The BlackBerry era was mainly about sending emails and messages. Even if Zhang Yiming had gone back to that era and wanted to build Douyin, he could not have, because the BlackBerry did not provide the conditions for it. So why did the mobile Internet flourish later? First of all, because of the iPhone, which was powerful enough to unlock many new scenarios. It had a good camera, a good screen, a good network connection, and a good chip, so it could unlock scenarios such as short video, mobile e-commerce, and social networking.
After the iPhone came Android, which made the market more open. More phone makers such as Xiaomi, OPPO, and vivo joined in, further popularizing smartphones. By analogy, Sonnet, 4o, and o1 are a bit like the iPhone moment, where closed-source advances let many people build applications on top of them. DeepSeek may be the Android moment: the frontier has moved from closed source to open source, and the model is strong enough to give everyone more choices when building applications. So technological progress brings a better product experience on the one hand, leading to "killer apps", and makes the ecosystem more prosperous on the other. Originally we could do only a limited number of things, but once we had the iPhone and Android, something like TikTok became possible.
Q: Next I would like to talk about something everyone cares about: the impact of o1 and R1 on infrastructure demand for computing power. DeepSeek R1 became very popular, and that is tied to the sharp drop in Nvidia's stock price we mentioned earlier. One view is that because its training cost is low, it will reduce demand for computing power. I saw that you posted about this on WeChat Moments, and people hold different opinions.
Dai Yusen: I think demand for computing power has structure. Originally it was split into training and inference. In the arms race of 2023 to 2024, everyone boiled it down to one phrase, "great effort makes miracles", as if buying enough cards would guarantee better results. Of course, back then, when pre-training had not hit a wall, or at least before everyone realized it had, that statement held.
But now we find that the marginal benefit of short-term, large-scale investment in pre-training is indeed limited. For example, Grok 3 was trained on 200,000 cards; there is progress, but the marginal benefit is shrinking. So you cannot say "great effort makes miracles" is wrong, only that each additional miracle costs more. What I think will happen is that model capabilities have reached, and are still pushing past, the critical point for building agent products. Once the agent product form works, the tokens and inference computing power it consumes will rise significantly. If you only make a chatbot, whether you chat with ChatGPT, Kimi, or Doubao, there is only so much to chat about, and you cannot use many tokens. When it can do more and more complex things for you, which takes more tool use and more thinking, demand for inference computing power may grow not 10 times but 100 or 1,000 times. This could not happen before because the technology was not there. Now I think the technology has reached that turning point, and demand for inference may increase significantly.
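To make the scale of this argument concrete, here is a minimal sketch of the token arithmetic. Every number in it is a hypothetical assumption chosen for illustration, not a figure from the interview or from any real product: a chat turn is assumed to use about 1,000 tokens, and an agent task is assumed to take 50 steps of about 2,000 tokens each (thinking plus tool calls).

```python
# All constants below are hypothetical, for illustration only.
CHAT_TOKENS_PER_TURN = 1_000      # assumed tokens for one chatbot exchange
AGENT_STEPS_PER_TASK = 50         # assumed number of think/tool-call steps
AGENT_TOKENS_PER_STEP = 2_000     # assumed tokens consumed per step


def agent_tokens_per_task(steps: int, tokens_per_step: int) -> int:
    """Tokens consumed by one multi-step agent task."""
    return steps * tokens_per_step


chat_tokens = CHAT_TOKENS_PER_TURN
agent_tokens = agent_tokens_per_task(AGENT_STEPS_PER_TASK, AGENT_TOKENS_PER_STEP)

# How many times more inference one agent task needs than one chat turn.
multiplier = agent_tokens / chat_tokens
print(f"chat turn: {chat_tokens} tokens, agent task: {agent_tokens} tokens, "
      f"ratio: {multiplier:.0f}x")
```

Under these assumptions a single agent task costs as much inference as a hundred chat turns; longer tasks or more steps push the ratio toward the thousandfold end of the range mentioned above.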
Q: Will the demand for inference computing power increase by a hundred or a thousand times by 2025?
Dai Yusen: First of all, from the perspective of the history of technological development, it doesn’t really matter whether it happens in 2025, 2026 or 2027. Just like autonomous driving, the most important thing is that it can happen in the end, and the specific year in which it is realized is actually not that important.
But I can feel that Agent products are about to go viral. Deep Research, for example, necessarily consumes far more tokens. This is why Altman said that although ChatGPT Pro costs two hundred dollars a month, it is still losing money: inference demand rose sharply. I think there are two things going on here. One is that the mix of pre-training, post-training, and inference will change; the other is that this will indeed affect NVIDIA's market position. As of February 2025, NVIDIA is definitely still the most powerful and efficient choice for both inference and training. However, we also saw that when R1 became popular, domestic chip makers began optimizing for R1, and this targeted optimization actually works better.
Q: Actually, people are already using Ascend.
Dai Yusen: They have already seen Ascend’s 910B.
Q: Even if you use NVIDIA products, you can still use FP4 inference technology optimization.
Dai Yusen: Yes, I think this has always been the case. While the technology has not converged, the GPU's versatility is a strength. Why does NVIDIA exist at all? At first everything ran on the CPU, which is the most general-purpose. Later people wanted to play games, and games have very specific requirements, so GPUs were built to accelerate them. Of course, GPUs later turned out to be useful for AI. At present, GPUs are still the most common choice for general AI training and inference. But if a chip only has to serve a specific model, there are two routes. One is targeted optimization, as with Ascend; the other is to build ASICs, as Broadcom and Marvell do.
Q: Or optimize according to your own needs, just like Google did with TPU.
Dai Yusen: Actually, that is also a kind of specialization. In chips, once the architecture is stable, specialization usually yields higher efficiency. So the question is whether the architecture will solidify, and I think this is hotly debated. If the current o1 and o-series approach can go a long way, ASICs may gradually work. But from another angle, suppose the basic architecture changes next year or the year after, and the Transformer is replaced by something else. Then the ASIC work may be wasted and everyone will have to fall back on GPUs, so there is a lot of uncertainty. That said, Nvidia does have one problem: its current market share is so high that it is hard to go any higher.
Q: Yes, it seems to have reached its peak.
Dai Yusen: Yes, its market share is above 90%, so the only direction left is down. That possibility worries many people. Everyone has high expectations for future demand for computing power, and also high expectations for Nvidia's market position and the gross margin that comes with it. If the market structure shifts, its gross margin may suffer too, and that is what people worry about most. But if you ask what everyone, DeepSeek included, wants most right now, it is Nvidia's products. They buy as many as they can, by whatever means they can.
Q: In fact, Broadcom is the most stable one in this wave.
Dai Yusen: Broadcom and Marvell have both performed very well. But as for ASICs, first, they will basically not be in use until 2027; second, some things, such as price changes, could still keep ASICs from working out. And getting an ASIC built and deployed raises many problems in production capacity, yield, efficiency, and so on. Designing one does not mean you can manufacture it, so there is a lot of uncertainty.
Of course, NVIDIA has run into problems of its own, such as liquid cooling issues and overall yield issues. In any case, I think the arrival of Agent products is definitely good for computing power as a whole. Everyone has heard of the Jevons paradox. As for whether NVIDIA's market position will change, all we can say is that some new possibilities have emerged. So for stock traders, the first reaction after DeepSeek came out may have been to sell on the news, and now that there seems to be no big problem, to buy back in.
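The Jevons paradox mentioned here can be illustrated with a small sketch. It assumes a textbook constant-elasticity demand curve, Q = A * p^(-e), which is a modeling assumption and not something stated in the interview; the function name and the elasticity values are likewise hypothetical. Under that assumption, if the unit price of compute falls by a factor k, total spending changes by a factor of k^(e - 1), so spending rises whenever demand is elastic (e > 1).

```python
def spend_ratio(price_drop: float, elasticity: float) -> float:
    """Factor by which total spend changes when the unit price falls by
    `price_drop`x, assuming constant-elasticity demand Q = A * p**(-elasticity).

    new_spend / old_spend = (p / k) * A * (p / k)**(-e) / (p * A * p**(-e))
                          = k**(e - 1)
    """
    return price_drop ** (elasticity - 1)


# Hypothetical elasticities, for illustration only.
for e in (0.5, 1.0, 1.5):
    print(f"elasticity {e}: a 10x cheaper token changes total spend by "
          f"{spend_ratio(10, e):.2f}x")
```

With elastic demand (the 1.5 case), cheaper inference leads to more total spending on compute, which is the Jevons-style outcome; with inelastic demand (the 0.5 case), total spending shrinks.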
Q: We just talked a lot about the future, some of which may appear this year, and some of which may take a long time. In summary, what do you think we are likely to see in 2025?
Dai Yusen: I think we will see more "Lee Sedol moments", where AI surpasses 99% of humans in some tasks. In fact, this has already happened. For example, AI's ability to write code should be better than 99% of humans.
Q: More than 99% of programmers, or 99% of humans?
Dai Yusen: I mean humans for now, but I think it may soon beat more than 99% of programmers as well, because in competition-level programming on Codeforces, AI has already surpassed 99% of programmers. Competition programming is different from everyday programming, though: everyday programming may require more context and reading through various code bases. Still, I think there will be more and more cases where AI beats humans, or even elite humans, and we will see more astonishing news of that kind. In addition, I think more Agent products will arrive in more convenient and practical forms and become breakout hits. They may not have hundreds of millions of users, but I think they can reach beyond their core audience the way Cursor has.
Q: What are Cursor's daily active users right now?
Dai Yusen: I do not know the daily active users, but its annual recurring revenue (ARR) is about 100 million US dollars. Daily active users are hard to measure, and I would not use them to judge AI products anyway. What matters is how much users are willing to pay for the value the product provides. I think model development will speed up, and open source and shared know-how will increase, which is quite interesting. In fact, China is only now experiencing what the United States felt when ChatGPT exploded, because governments all over the country have started using DeepSeek and everyone is integrating DeepSeek. I think this matters a great deal for raising awareness of AI: everyone will realize how powerful AI is. Previously, models such as Kimi and Doubao may have had only tens of millions of DAU, and monthly active users may have been below 200 million, which means only about one person in ten was using a more advanced AI model. If several tens of percent of people can try more advanced models and feel the power of AI, then from the perspective of entrepreneurs, users, and new products, as well as resources and capital invested, I think the whole industry will see an ecosystem boom like the Cambrian Explosion.
Q: It is 2025 now. You also said that in 2025, there may be a "Lee Sedol moment" in some fields, that is, AI will surpass 99% of humans, or even elite humans. I feel that the DeepSeek incident has made the entire industry develop faster. So what do you think will happen if we achieve AGI faster or unlock the "Lee Sedol moment" in more fields? I can't imagine what changes will happen now, such as what people will do and how the social structure will change.
Dai Yusen: I think we are in a very interesting period in human history. In fact, exponential growth is the norm in the world, because every year we grow on the basis of the previous year. But it is very rare to witness and experience exponential growth in person.
Q: What do you mean by exponential growth? Is it the total economic volume or something else?
Dai Yusen: GDP grows 2% to 3% every year, isn't that exponential growth? But generally speaking, this kind of exponential growth takes a lifetime to experience. For example, there may not be much change between this year and next year. But in AI, specifically, from o1, o1 Pro to Deep Research, I clearly felt its exponential growth in just a few months. This experience is very special. And I think this will greatly change our expectations for the future.
So now many people are asking what AGI is and what will happen once it is achieved. I personally think AGI will indeed have a great impact on productivity, society, and even politics and culture. But what exactly will the impact be when it arrives? I think we have to be prepared to absorb the shock, because issues such as safety and social welfare after a new technology emerges only get real attention once those situations actually occur.
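The point above that 2% to 3% annual GDP growth is exponential, yet takes a lifetime to feel, follows from the standard compound-growth formula. The sketch below is only an illustration of that arithmetic; the function name is chosen here for convenience.

```python
import math


def doubling_time(annual_rate: float) -> float:
    """Years for a quantity to double when it compounds at `annual_rate` per year.

    Solves (1 + r)**t = 2 for t.
    """
    return math.log(2) / math.log(1 + annual_rate)


for rate in (0.02, 0.03):
    print(f"{rate:.0%} annual growth doubles in about "
          f"{doubling_time(rate):.1f} years")
```

At these rates a doubling takes roughly two to three and a half decades, so year-to-year change is barely noticeable. A capability that visibly doubles within months follows the same kind of curve, just compressed enough to be felt directly.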
Q: And whoever holds this power actually affects the world situation.
Dai Yusen: So accelerationism believes that AI will definitely develop, and bad people will use AI to do bad things, so good people should develop AI faster.
Q: For example, there may be financial fraud, including the Deepfake AI pornography that appeared in South Korea before.
Dai Yusen: So we need more powerful means to detect Deepfakes, because people no longer have the energy to identify them. I think this will definitely have a huge impact. In fact, I am thinking that on the one hand, many people may lose their jobs. I think this is likely to happen. Now everyone defines AGI as how many people's jobs it can replace. If the role of AGI is to replace people's jobs, then the realization of AGI is equivalent to many people losing their jobs, right? Of course, this is from a social perspective. Some people also say that at that time, material things will be extremely abundant and everyone will be given money, but I don't know what will happen in the end. I think there will definitely be a lot of shocks.
But from another perspective, the reality we see will change dramatically, whether through video generation, image generation, or other content generation. I was born in 1986. Back then, all the information a person could access had been vetted by some authority, in books or newspapers; otherwise it could not be published and circulated. Later, the great significance of the Internet was that it let what ordinary people wrote be seen. Now AI can generate whatever you want. In fact, I have found that people, myself included, often cannot tell true from false. So how to adapt to such an environment and build one's own cognitive system is, I think, a very important question.
Q: There is a popular internet term that is becoming more and more meaningful. It roughly means "the video cannot be photoshopped, so it is real."
Dai Yusen: Yes, and now videos can be generated. I think this will greatly affect how we interact socially and how we perceive the world. I have noticed a pattern in the development of technology. In the first wave, the most capable people create the most powerful technology; in the second, that technology is turned into the most powerful tools for the most capable people. Take computers. At first they were built for nuclear weapons calculations or code-breaking. A super tool designed for "supermen" then gradually spreads to ordinary people, is miniaturized and enters homes, and then goes mobile and is everywhere.
We are still at the stage where the most capable people are building super tools for the elite. But I think this will eventually benefit the general public. When we invested in Wang Huiwen's Lightyear Away, the slogan was "Accelerate AGI to benefit mankind". I think benefiting the general public is definitely the end result. In between, though, it will be as William Gibson said: "The future is already here, it's just not evenly distributed." It is indeed unevenly distributed today. So whether it is open source like DeepSeek or products with large user bases like Kimi and Doubao, they are all speeding up a more even distribution, which matters a great deal. I think new technologies create real value only when they eventually benefit the general public and all of humanity, rather than staying in the hands of a few rich people or a few companies. That is the outcome I hope to see.
Q: I’m curious about what preparations you are making personally for AGI, which may come sooner?
Dai Yusen: Exercise. Beyond that, I think excellent entrepreneurial teams are very important in investing. The more technological innovation there is, the more entrepreneurs matter. Of course, Liang Wenfeng also started as an entrepreneur, but he was exceptionally good: he could make money through quantitative stock trading. There are many people who could become the next Liang Wenfeng but lack start-up capital. So I think VC, especially early-stage investment, matters a lot right now. In theory, early-stage investment carries the greatest risk; if everything were already certain, we would not be needed. But I think we are back in a period full of uncertainty, and not everyone can bring 10 billion of their own funding to the table the way Liang Wenfeng did.
Q: How do you think the next generation should be educated? What should they learn? I think this is a question that many people are thinking about.
Dai Yusen: I think the most important thing is the ability to ask questions. For example, I often run into this situation: faced with a very capable research agent like Deep Research, what should I ask it? How should I direct it? It is like running an AI company: you have to think about what everyone should do each day, what this year's direction is, and what this month's focus is. This takes a lot of thought, because things do not move forward by themselves; someone has to take the initiative and set the direction. But the current education system is more about teaching students "what to do" and having them master skills.
However, many skills can now be replaced by AI, or can be completed by commanding AI. So, in this case, what should we do ourselves? This becomes a very important question. Secondly, many of our current jobs are essentially a kind of "stitching" work - copying, splicing, and organizing various types of information, and finally forming a report. But AI is already better than humans in this regard. Therefore, we have to think about whether our content can add unique value to humans or the overall knowledge system.
Just like our current conversation, there may be some "stitching" elements, but at least some unique data can be generated. So, can our work create unique information that does not exist in the AI training data? Or is it just repeating what AI already has? This will have a significant impact on the nature of education and work.
Q: I find one remark by Elon Musk very interesting. In short: "I want to die on Mars, just not on impact."
Dai Yusen: Yes, the key is not to get killed.
Q: I have a more personal question. You are currently investing, and are also about to learn HI (Human Intelligence), and are also studying the secondary market. Faced with so many fields, how do you maintain an efficient learning speed?
Dai Yusen: Not particularly fast, otherwise I would have studied DeepSeek more deeply (laughs). In fact, on December 27, the day after V3 was released, I hosted a discussion at home and invited more than a dozen friends, including people from ByteDance and various AI research institutions, to talk about the latest developments in AI. DeepSeek V3 had just come out, which was very exciting. So I would say our ability to learn is still decent.
For example, the day after MLA was released, I thought it was amazing. I was in the United States at the time, discussing this technology with my friends. I think interest is very important - only when you are truly interested can you learn more effectively. I am also quite "nosy". For example, on the day ChatGPT was released, I used it until four in the morning and felt that this technology was completely different. This habit may have originated from the fact that I started surfing the Internet in 1998 and used Google for the first time in 1999. The search engines at the time were very weak and almost no valuable information could be found. Google's search results were completely different, which had a great impact on me.
There are many similar experiences. For example, I started using Xiaonei the day after it went online, and later I was deeply impressed by the development of the entire Internet entrepreneurship. After ChatGPT was released, I tried it as soon as possible and immediately organized a research group. The same is true for Devin. I think it has great potential, so I immediately organized a discussion.
Looking back at history, the first batch of Internet entrepreneurs were often the first to surf the Internet, the pioneers of mobile Internet were usually the first to buy iPhones, and even the Tesla investors who made money first were the first to buy Teslas. Therefore, it is still very important to be willing to spend a little money or even no money to experience the future. For example, the subscription fee for Devin is $500 a month, which is not cheap at first glance, but for investors in the circle of friends, it may only be the price of a bottle of Moutai, and this cost can help us see future trends in advance.
Q: Yes.
Dai Yusen: So the most important thing is to get hands-on, read papers on your own initiative, and follow the work of top researchers. Cutting-edge institutions such as OpenAI and DeepSeek make most of their high-quality information free and open, and it is worth studying. At the beginning of last year, many people in the secondary market believed that AI demand would hit a bottleneck and the industry might decline in 2025. But from what I observed inside the industry, that was not the case at all. AI training was still accelerating, the arms race was obvious, and companies were buying computing power on a large scale. I started investing in ASICs in the second half of last year. The logic at the time was that although ASICs might be important in the future, little of that would be realized in the short term. Similar stories are common in the industry. For example, AMD was once seen as a challenger to Nvidia, and now ASICs are also seen as a possible threat to Nvidia.
Q: ASICs have actually challenged NVIDIA several times before. Some companies in the 5G era were typical ASIC companies.
Dai Yusen: Yes, every time it looks as if there will be an impact, but in the end the impact is limited. The secondary market, however, often "trades on the story first and checks later", and whether the story is ever realized does not necessarily matter. One interesting thing about the secondary market is that it can serve as a tool for validating your understanding. For example, I knew very early that DeepSeek was very strong, but they did not need outside investment. In that case, the secondary market offers a way to place a "bet". Just as training a model needs a reward signal, market feedback tells you whether your thinking is correct. So I think the real value of the secondary market is not making money but providing a mechanism for constantly testing and correcting your understanding.
Q: So how do you use AI tools to make investment decisions now?
Dai Yusen: Deep Research gave me a very specific case. A while ago, Trump announced new tariff policies every Friday. At that time, I was studying the trading trend of US Treasury bonds. I asked Deep Research: "When Trump announced the tariff increase in 2018, how did the US long-term Treasury bond interest rate react?"
At that time, I had two speculations: one is that tariffs would push up inflation, and long-term inflation expectations would rise, leading to higher Treasury bond interest rates; the other is that market risk aversion would increase, and investors would sell stocks and buy Treasury bonds instead, leading to lower Treasury bond interest rates. Deep Research gave an analysis within 5 minutes, pointing out that historical data from 2018 showed that every time Trump announced a tariff policy, U.S. Treasury bond interest rates would fall, and the market tended to be risk-averse. This analysis helped me make the decision to buy U.S. Treasury bonds, which ultimately proved to be correct.
Q: This is indeed a good example of AI-enabled decision-making.
Dai Yusen: Yes, I asked it a question and got the answer in five minutes. If it were my assistant or some friends with rich experience in the secondary market, they might not tell me "it will go up" until the next day. In the financial market, quick response is really important.
Q: You just mentioned learning. It seems that you have a strong interest in AI Agent?
Dai Yusen: Yes, I love reading, which is why I often talk about Agent. I do think it has greatly changed my life. Sometimes when I read, I come across an interesting point of view that I want to study in depth, but if I look up the information myself, it may take a lot of time and even affect the rhythm of reading.
For example, Reid Hoffman's new book "Superagency" discusses the history of GPS in the United States. The United States was initially worried that highly accurate GPS would threaten national security, so it deliberately degraded the civilian signal, adding an error of roughly 100 meters, which meant it could only be used for very coarse applications. Later the United States realized that this limited the commercial value of GPS, so the Clinton administration eventually lifted the restriction. That made GPS fully open and gave rise to applications such as Meituan food delivery and Didi ride-hailing.
This example reminded me of the development of AI technology: should we restrict it on national security grounds, or should we choose openness, mutual benefit, and ecosystem building? So I asked Deep Research to look into the background of the 2000 decision to open up GPS and compare it with current LLM policy. If I had looked this up myself, it might have taken an hour, but I only needed to hand it to Deep Research and carry on reading, then look at the summary once it finished.
In the end, I learned that the key to opening up GPS was that the United States had developed the ability to selectively deny GPS signals in specific regions, so GPS could be partially shut off in wartime while remaining open in normal times. This answers a key question: how could the US government address national security concerns while opening up GPS? Doing this research myself would have taken a long time, but now Deep Research can do it for me. This is why I am willing to pay for it: measured by the value of my time, it is absolutely worth it.
Q: At $200 a month, do you think it is totally worth it?
Dai Yusen: Of course it is worth it. At $200 a month, it averages out to about $2 per research task, which is extremely cost-effective.
Q: Do you have any other book recommendations?
Dai Yusen: I particularly recommend a book called "A Brief History of Intelligence". The author is a technology entrepreneur. He covers everything from the origin of life on Earth to GPT-4, summarizes five key breakthroughs in the evolution of intelligence, and analyzes what drove each breakthrough and what it led to. This is one of my recommended books of the year for 2024.
I also recommended it to researchers at OpenAI, who also found it very inspiring after reading it. This book not only helps us understand the evolution of intelligence, but also makes us realize that we may be on the eve of the sixth explosion, or even have entered this era.
Q: Do you have any other recommendations?
Dai Yusen: There is a more specialized book called "The First Eye" (published in English as "In the Blink of an Eye"). It tells the story of the Cambrian explosion. Life on Earth had existed for 2 billion years but had remained slug-like, soft-bodied creatures. Then, within a few million years in the Cambrian period, life suddenly diversified into many groups, and biodiversity exploded.
Why did this happen? There are many theories, such as changes in the composition of the atmosphere or of seawater, but this book proposes the "Light Switch" hypothesis: some organisms happened to evolve photoreceptors, which let them sense light and gave them a survival advantage. As photoreceptors became more common and more capable, true eyes eventually evolved. When the first eye appeared, competition across the entire biosphere changed dramatically. Predators became stronger, and prey evolved defenses in response, such as shells or more agile movement.
This theory reminds me of the current state of AI development. The release of DeepSeek and other developments make me feel that AI is in a similar "Cambrian Explosion" stage. When competition becomes fierce, everyone must move forward quickly to avoid being eliminated. This is like the Red Queen hypothesis, named after the Red Queen in "Through the Looking-Glass": "It takes all the running you can do, to keep in the same place."
This competition has promoted technological progress and made AI develop faster and faster. But from an evolutionary perspective, this is both a competition for survival and an inevitable result of the development of intelligence.
Q: You mentioned the evolution of intelligence just now. Is language one of those breakthroughs?
Dai Yusen: Yes, language is actually a feature that appeared relatively late in the evolution of intelligence. It is a highly condensed way of expressing information. Today's AI is built mainly on language models because language itself carries extremely high information density.
But this also raises a question: if AI becomes smart enough, will it invent a language of its own and no longer be limited to human natural language? One of Liu Cixin's science fiction novels mentions that alien civilizations may regard human verbal communication as extremely inefficient.
So although AI currently relies mainly on language models, it may go beyond language in the future. AI thinks much faster than humans do. If it keeps using human language, it may be held back by that mode of expression. Looking back at the history of the evolution of intelligence can help us understand where AI may be heading.
Q: You mentioned reinforcement learning. How does it play a role in the evolution of intelligence?
Dai Yusen: This book also explores the origin of reinforcement learning and analyzes it through a large number of cases from evolutionary biology. I think these studies have a great impact on the field of AI.
Q: Thank you very much for joining us, Yusen. Today we started with two key advances, o1 and R1, and discussed their impact on the AI landscape and the changes that followed. In 2025, perhaps we will see more AI agents reach product-market fit (PMF) and more "Lee Sedol moments".
Dai Yusen: Thank you for the invitation. I am also looking forward to how AI develops in 2025. We are still on day one of the AI revolution. There will definitely be more surprises ahead!