Jensen Huang's latest interview: 50% of the world's AI researchers are Chinese, and the AI Diffusion Rule may backfire on the United States

Written by Silas Grey
Updated on: June 20, 2025

NVIDIA CEO Jensen Huang's in-depth views on the prospects of AI technology.
Core content:
1. NVIDIA's AI cooperation with Middle Eastern countries and the impact of the chip export ban on China
2. The potential impact of AI technology on the global economic landscape and the US trade deficit
3. The advantages of NVIDIA's full-stack AI solution and the key role of the Dynamo system


On May 20, after delivering a keynote speech at the Computex 2025 trade show in Taipei, NVIDIA CEO Jensen Huang sat down for an exclusive interview with Ben Thompson, author of the technology blog Stratechery.

In the interview, Huang discussed the series of AI cooperation agreements NVIDIA recently signed with Saudi Arabia and the UAE, as well as the H20 chip export ban on China, and spoke frankly about his concerns over current US chip export control policy, arguing that the strategy may end up weakening the future technological leadership of the United States, NVIDIA included.

Huang also laid out his view of the global economic landscape, arguing that AI has the potential not only to significantly boost global GDP growth, but also to help the United States alleviate its trade deficit to some extent.

During the interview, Huang described the core advantage of the "NVIDIA full stack": maximizing AI performance through deep integration of software and hardware. He explained that modular design gives customers the flexibility to choose system components according to their needs rather than buying everything as a package.

He also highlighted the key role of the Dynamo system in improving inference performance. Through this comprehensive layout, NVIDIA has built an AI infrastructure platform that runs from chips to software and from training to inference.

The following is the full text of Jensen Huang's interview:

AI itself constitutes an entire new industry

Powered by the AI Factory

Q: In past interviews, I could feel how much you wanted the world to understand the potential of the GPU. Back then ChatGPT had not yet come out; now the whole market seems to hang on your financial results. I know you are in a quiet period ahead of earnings, and I won't ask about them. But how does it feel to be pushed into such a position and become the focus of global technology attention?

Jensen Huang: To be honest, this hasn't had much emotional impact on me personally. But one thing I am always very clear about: throughout the process of constantly reshaping NVIDIA, promoting technological progress and leading industry development have always been the core mission of our work. We are determined to be at the forefront, take on the hardest technical problems, and keep creating value for the entire ecosystem.

Today, NVIDIA is no longer just a chip design company, but a company that provides a comprehensive computing platform with the data center at its core. We have not only built a full-stack AI platform covering training and inference, but also, for the first time, achieved deep integration of software and hardware alongside modular decoupling, providing the flexibility and scalability needed for broad ecosystem participation.

In my keynote speech at Computex this year, I emphasized that what we are building now is not just computer systems needed by the "tech industry", but building the infrastructure for a new industry form called "artificial intelligence". AI is not only a technological revolution, but also a labor revolution - it has significantly enhanced human work capabilities, especially in emerging fields such as robotics, and this enhancement will be more profound in the future.

More importantly, AI is not just a technological breakthrough, it is itself a huge and brand-new industrial system. And this industry will be driven by the infrastructure we call the "AI factory" - the core of which is the data center based on ultra-large-scale computing power. We are just beginning to realize that the focus of the times is shifting: in the future, data centers will no longer be just the carriers of cloud computing, but will become true AI factories, and their scale and importance will far exceed our imagination today.

Q:  Microsoft CEO Satya Nadella mentioned in the latest earnings call that they reported a token processing volume figure - I remember it was last quarter. Is this the earnings detail you pay most attention to?

Huang:  Actually, the actual number of tokens generated is much higher than that. The data released by Microsoft only covers the part they generate for third-party customers. The amount of tokens they process for their own use is actually much larger than that. In addition, this number does not include the total amount of tokens generated by OpenAI. So, just based on the numbers in Microsoft’s report, you can imagine how huge the actual number of tokens generated in the entire ecosystem is.

The AI Diffusion Rule May Backfire on the U.S.

Q: You recently reached a series of AI cooperation agreements with Saudi Arabia and the UAE. From your perspective, why are these partnerships important? Why did you go there in person? What do they mean to you?

Huang:  They personally invited me to come, and we are here to announce two very large AI infrastructure projects: one in Saudi Arabia and the other in Abu Dhabi. The leaders of both countries have realized that they have to participate in this AI revolution and they also recognize that their countries have a unique strategic advantage, which is abundant energy resources.

However, these countries have shortcomings in terms of labor. Their national development has long been restricted by the size of the labor force and population. The emergence of AI provides them with a historic opportunity: to achieve the transformation from an "energy economy" to a "digital labor force" and "robot labor force" economy. 

We helped to establish a new company in Saudi Arabia called "HUMAIN", whose goal is to enter the world stage, build a global AI factory, and attract international companies including OpenAI to participate in cooperation (OpenAI representatives also attended the event). This is a project of great significance.

Q: To some extent, doesn't this challenge the AI Diffusion Rule? I understand that the rule is particularly strict with these countries: limiting the number of chips that can be exported, requiring deployments to be controlled by American companies, and in some respects requiring reliance on US domestic manufacturing. Your opposition to the rule is more resolute this time than in the past. You used to be less directly involved in government policy, but NVIDIA has now become one of the core technology companies in the world. Have you been able to adapt quickly to that change of role?

Huang: It's not that I didn't want to participate; there was simply no need for it in the past. For most of NVIDIA's history, we focused on developing technology, building the company, cultivating the industry ecosystem, and competing. We are always building supply chains and ecosystems, which is in itself already very large and complex work.

But once the AI Diffusion Rule was introduced, we made our position clear. By now it is obvious to everyone that this policy is completely wrong: it is a fundamental strategic mistake for the United States. If the original intention of the AI Diffusion Rule was to ensure America's leading position in AI, it may actually have the opposite effect and cost us the advantage we started with.

AI is not as simple as a certain model or a certain layer of software, it is a complete technology stack. That is why when people talk about NVIDIA, they are not only talking about chips, but also systems, infrastructure, AI factories, and even the entire deployment framework. AI is multi-layered: from the chip layer to the factory layer, infrastructure layer, model layer, and application layer, each layer is crucial - real competitiveness comes from this complete stack.

If the United States wants to stay ahead in the global AI race, it must lead at every level. At this critical moment when our competitors are catching up and accelerating their deployment, we have chosen to limit the spread of our own technology around the world - this is undoubtedly "shooting ourselves in the foot". In fact, we have foreseen this result from the beginning.

It’s impossible to stop China from joining the AI revolution

DeepSeek is an outstanding representative

Q:  When you say “international competitors,” do you mean other model developers?

Huang:  China is doing very well in AI. About 50% of the world's AI researchers are Chinese. You can't stop them from participating in this technological revolution, and you can't stop them from moving forward. Frankly speaking, projects like DeepSeek are very outstanding representatives. If we are unwilling to admit this, it is a kind of self-deception, which I cannot accept at all.

Q:  Did the restrictions placed on them spur technological breakthroughs in areas like memory management and bandwidth efficiency?

Huang:  Competition is the engine that drives progress. Companies need competition to motivate themselves, and so do countries. There is no doubt that we have stimulated their technological progress.

But personally, I had expected China to advance rapidly at every stage of AI. Huawei, for example, is a very strong company and a world-class technology company. Chinese AI researchers and scientists are also world-class. If you have been to the offices of Anthropic, OpenAI, or DeepMind, you will find that there are many top talents from China. This is not surprising.

Moreover, the AI Diffusion Rule, which aims to limit other countries’ access to U.S. technology, was a wrong policy from the start. What we should really do is accelerate the spread of U.S. technology around the world, while it’s still not too late. If our goal is to keep the United States at the forefront of AI, this set of rules is doing the exact opposite.

The AI Diffusion Rule also ignores the nature of the AI technology “stack.” The AI stack is like a computing platform: the more powerful and broadly adopted the platform, the more developers it attracts, the stronger the applications built on it, and the higher the platform’s value. Conversely, the more developers there are, the more prosperous the ecosystem and the larger the platform’s installed base, which in turn attracts still more developers. This positive feedback loop is critical to any computing platform, and it is the fundamental reason NVIDIA is successful today.

You can't say, "The United States doesn't need to compete in the Chinese market." That's where half of the world's developers gather. From the perspective of computing architecture and infrastructure, this decoupling is completely untenable. We should give American companies the opportunity to compete in the Chinese market - reducing the trade deficit, creating tax revenue for the United States, developing industries, and providing jobs. This is not only good for the United States, but also good for the healthy development of the global technology ecosystem.

If we choose to give up participation and let China build a complete and prosperous local ecosystem, and American companies are completely absent, then the United States will no longer dominate this new platform in the future. AI technology is spreading rapidly around the world. If we do not actively participate in the competition, what will eventually spread out will be other people's technology and leadership.

Q: I agree with you very much. In my opinion, the current policy logic of trying to restrict chip sales while allowing the other party to obtain all chip manufacturing equipment is simply putting the cart before the horse. We know very well that it is much more difficult to track chips than to track equipment. There is a saying that in Washington, some semiconductor equipment manufacturers have been deeply involved for many years and are good at lobbying, while Nvidia has relatively little influence there, so it is at a disadvantage in the policy game. Do you think this statement is valid? Do you also think it is particularly difficult to make Washington understand your position?

Huang: We have spent a lot of effort over the past few years gradually establishing a presence in Washington. We do have a small team there; companies of our size usually keep hundreds of people in Washington on public relations and policy teams, and we have only a few. But I have to say those few people are very good. They are not just telling NVIDIA's story; they are helping policymakers understand how chips work, how the AI ecosystem works, and what unintended chain consequences certain policies will bring.

What we really want is for America to win in the competition. Every company should want their country to win, and every country should want their companies to win. This is not a wrong desire, it is a good thing. It is good for people to want to win, it is good to want to be great, and it is good to compete. If a country wants to be great, we should not be jealous of it, and if a company wants to be great, I will not be jealous of it. This motivation will drive everyone to keep moving forward and do better. I like to see people who want to be great.

There is no doubt that China aspires to become a powerful country, and there is nothing wrong with that. They should pursue greatness. And the AI ​​scientists and researchers I know have achieved what they have today precisely because of this aspiration. They are indeed very good.

What we need to do is not to try to trip others, but to run faster. NVIDIA's achievements today have never been due to any special treatment, but because we have been running desperately.

I think the mindset you mentioned of "protecting yourself by limiting your opponent" will only make the opponent stronger - because they are already amazing.

Q: The Trump administration banned your export of the H20 chip to China, which you had custom-designed based on the previous administration's policy framework. Then you were told "this won't work either." Now they're working on new restrictions. Do you think policymakers are finally realizing that the world is highly interconnected, and that actions in one place can have ripple effects in another? Are they finally beginning to realize that "complete decoupling" is unrealistic, and that it might be time to return to a more pragmatic, management-oriented approach? Are you optimistic about this, or are you prepared for the worst?

Huang:  The president of the United States has a vision that he wants to achieve. I support him and believe that he will eventually lead the United States to a positive outcome in a respectful way. He will be competition-oriented, but will also strive to find opportunities for cooperation. Of course, I am not in the White House, so I don’t know what they think internally, but this is my understanding of it.

Regarding the ban on H20 chips, we have designed it according to the maximum limit of what the Hopper architecture can do, and we have cut everything that can be cut. We have already done a large write-off for this, I remember it was $5.5 billion. In history, no company has ever written off such a large inventory. So this additional ban on H20 is extremely heavy and costly for us. Not only is the direct loss of $5.5 billion, but we have also voluntarily given up $15 billion in potential sales and about $3 billion in tax revenue.

You have to know that the annual demand for AI chips in the Chinese market is about 50 billion US dollars. Note that it is not 50 million, but 50 billion US dollars. What does this mean? It is equivalent to the annual revenue of the entire Boeing company. Let us give up such a market - not only the profits and revenue, but also the ecological construction and global influence that come with it. The cost cannot be ignored.

Q: If China eventually builds an alternative to CUDA, will that pose a long-term threat to Nvidia?

Huang:  That’s right. Anyone who naively believes that simply by imposing export controls and banning China from using H20 chips, they can stop their development in the field of AI is extremely ignorant.

AI will drive a significant increase in global GDP

Q: When did you really realize that Nvidia would become an “infrastructure company”?

Huang: If you look back at my past keynote speeches, you'll find I was already talking five years ago about many of the things happening today. Maybe I wasn't clear enough then, and my language wasn't as precise as it is now, but the direction we are moving in has always been clear, consistent and firm.

Q: So when you end every speech with a section on "robots," is that actually the "five-year preview" we should pay close attention to? In other words, this is not the distant future, but something that will become reality within a few years?

Huang: Yes, I think it’s really coming, it’s going to happen in the next few years.

One profound and significant thing in the industry is that for the past 60 years, we have been in the IT industry, which is an industry that provides tools and technology for humans. But now, for the first time, we are going beyond the scope of IT - all the products we used to sell were IT equipment, and now we are entering two new areas: manufacturing and operations.

By manufacturing, we mean we are making robots or using robotic systems to make other products; by operations, we mean we are providing “digital employees.” Global operating and capital expenditures combined are about $50 trillion, and the entire IT industry is about $1 trillion. Now, thanks to AI, we are about to move from that $1 trillion industry to a new market 50 times the size.

I believe that while some traditional jobs will be replaced and will indeed disappear, a large number of new jobs will emerge at the same time. In particular, as the new form of "agents" becomes widespread, agentic and robotic systems may directly drive a real expansion of global GDP.

The logic behind this is actually very simple: we are facing a labor shortage. The unemployment rate in the United States is at a historical low. You can see it everywhere in the country: restaurants can't hire waiters, and many factories are also having difficulty recruiting workers. In this context, many people will accept the concept of "spending $100,000 a year to hire a robot" without hesitation, because it can significantly increase their income and output capacity.

So I think that in the next five to ten years, we may experience a substantial GDP expansion and witness the birth of a whole new industry. The core of this industry is to produce digital results based on the system of "generating tokens" - this is something that the public is beginning to understand.

Q: Your two speeches at Computex 2025 and last month's GTC actually had completely different styles. My understanding is that GTC is for hyperscale cloud service providers, while Computex 2025 is for the enterprise IT market. So is your current focus on enterprise IT?

Huang: You could put it that way: enterprise IT, plus "agents and robots." The core carrier of enterprise IT is the agent, and the core application of manufacturing is the robot. Why is this so important? Because this is the starting point of the future ecosystem.

Will Dynamo become an AI factory operating system?

Q: In your recent GTC speech, you mentioned some limitations of traditional data centers and explained why Nvidia's solution is a more suitable choice. I understand this as your opposition to "dedicated chips" (that is, ASICs). On the one hand, you showed Nvidia's complete product roadmap, indicating that we have a long-term and clear technical direction; on the other hand, you talked about the balance of "latency and bandwidth", pointing out that GPUs can flexibly adapt to different types of AI workloads because of their programmability, unlike ASICs that can only do a single task. And these dedicated ASIC chips are built by some hyperscale cloud service providers themselves. In contrast, Nvidia provides a general and scalable solution that is more suitable for a rapidly changing AI world.

Huang: You understood correctly. I did convey those views, but my intention was not to oppose ASICs. Rather, I wanted to help people understand how the next-generation data center should be designed. We have been thinking about this for many years.

The key challenge is that the energy available to a data center is limited. So if you think of it as an "AI factory," the first task is to get as much computing throughput as possible out of every watt of electricity. The unit we use to measure that throughput is the token. You can produce extremely cheap tokens, such as free inference for open-source models; or you can produce high-quality, high-value tokens that users may pay $1,000 or even $10,000 a month for.
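The per-watt economics Huang describes can be sketched in a few lines. All figures below (rack power, token throughput, electricity price) are invented for illustration; they are not NVIDIA's numbers.

```python
# Illustrative sketch of the "AI factory" metric: tokens of throughput
# per watt, and what that implies for electricity cost per token.
# All numbers here are made-up assumptions, not vendor figures.

def tokens_per_joule(tokens_per_second: float, power_watts: float) -> float:
    """Tokens produced per joule of energy (1 W = 1 J/s)."""
    return tokens_per_second / power_watts

def cost_per_million_tokens(power_watts: float,
                            tokens_per_second: float,
                            dollars_per_kwh: float) -> float:
    """Electricity cost alone to generate one million tokens."""
    seconds = 1_000_000 / tokens_per_second
    kwh = power_watts * seconds / 3_600_000  # watt-seconds -> kWh
    return kwh * dollars_per_kwh

# Hypothetical rack: 120 kW draw, 500k tokens/s aggregate, $0.08/kWh.
rack_power_w = 120_000
rack_tokens_s = 500_000
print(tokens_per_joule(rack_tokens_s, rack_power_w))   # ~4.17 tokens/joule
print(cost_per_million_tokens(rack_power_w, rack_tokens_s, 0.08))
```

The point of the metric is that with fixed power, every gain in tokens per joule translates directly into more sellable output from the same facility.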

Q: You also mentioned a “$100,000 agent” in your speech.

Huang: Yes. You asked me if I would be willing to spend $100,000 a year to hire an AI assistant. My answer is yes. We hire talents with annual salaries of hundreds of thousands or even millions of dollars every day. If spending $100,000 can improve the productivity of an employee with an annual salary of $500,000, then of course it is worth it.
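The break-even logic behind that answer is simple arithmetic. The $100,000 and $500,000 figures come from the interview; the break-even framing is an illustration.

```python
# ROI logic for the "$100,000 agent": if an AI assistant costs $100k/yr
# and is applied to a $500k/yr employee, the productivity lift needed
# just to break even is the ratio of the two figures.
agent_cost = 100_000       # annual cost of the AI assistant (from the interview)
employee_salary = 500_000  # annual salary of the employee it augments

break_even_gain = agent_cost / employee_salary
print(f"{break_even_gain:.0%}")  # 20%
```

Any productivity improvement above that 20% threshold is net positive, which is why Huang treats the decision as obvious for highly paid staff.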

The point is that the quality of tokens you produce in your AI factory is varied. You need a large number of cheap tokens, but also high value-added tokens. If you build a chip or system that can only handle a certain type of token, it will be idle most of the time, resulting in wasted computing resources. So the essence of the problem is: how to design a platform that can handle a high throughput of free tokens while also being competent for high-quality tasks?

If your computing architecture is too decentralized, different types of tasks will be inefficient when migrating between different chips. If you only focus on high token rates, the overall throughput will usually decrease. If you pursue high throughput performance, the system interactivity will be limited and the user experience will decrease.

It's easy to optimize along the X axis or the Y axis alone; it's very difficult to fill the entire two-dimensional space. That is exactly what NVIDIA is trying to solve with the Blackwell architecture, the FP4 low-precision compute format, the NVLink 72 high-speed interconnect, HBM high-bandwidth memory, and, at the core, the Dynamo disaggregated inference system.

Q: Is Dynamo what you call a “data center operating system”?

Huang: You could say that. The starting point of its design is that inference in a large language model is not a single uniform process; it is divided into stages and varies by task.

We break this process down into two main stages:

  • Prefill stage: processing the context, the background work of "understanding who you are" and "what you care about";

  • Decode stage: generating the actual tokens, which often involves complex computation such as chain-of-thought and retrieval-augmented generation (RAG).

The demand for computing resources in the decode phase is highly dynamic: sometimes almost no floating-point operations are needed, and sometimes a great many are. The significance of Dynamo is that it can automatically break inference tasks apart, distribute them, and schedule them onto the optimal resource nodes across the entire data center.
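The disaggregation idea can be illustrated with a toy scheduler that routes the two stages to separately sized worker pools. This is a conceptual sketch only; the class and pool names are invented and bear no relation to NVIDIA's actual Dynamo API.

```python
# Toy sketch of disaggregated inference: prefill (context processing)
# and decode (token generation) run on separate worker pools sized to
# their very different compute profiles. Illustration only, not a real API.
from dataclasses import dataclass, field

@dataclass
class Pool:
    name: str
    workers: int
    queue: list = field(default_factory=list)

    def submit(self, request_id: str) -> str:
        self.queue.append(request_id)
        return f"{request_id} -> {self.name}"

class ToyScheduler:
    def __init__(self) -> None:
        # Prefill is compute-bound; decode is memory-bandwidth-bound,
        # so in practice the pools would use different hardware configs.
        self.prefill = Pool("prefill-pool", workers=4)
        self.decode = Pool("decode-pool", workers=16)

    def route(self, request_id: str, stage: str) -> str:
        pool = self.prefill if stage == "prefill" else self.decode
        return pool.submit(request_id)

sched = ToyScheduler()
print(sched.route("req-1", "prefill"))  # req-1 -> prefill-pool
print(sched.route("req-1", "decode"))   # req-1 -> decode-pool
```

The same request visits both pools in turn, which is the core of the disaggregated design: each stage lands on resources tuned for its bottleneck.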

Q: From an architectural perspective, is Dynamo the software system that schedules the entire data center as if it were a GPU?

Huang: That’s right. It’s essentially the operating system of the AI factory.

Q: How do you see the future of reasoning models? Will they be used more for agentic workflows? Or will they be used primarily to generate training data to help models improve themselves?

Huang: I think it depends on the cost. But the trend is that reasoning models will become the "default computing unit" of AI. With advances in hardware and software, we will be able to run inference at astonishing speed.

For example, the Grace Blackwell platform is 40 times more capable than the previous generation, and the generation after it is another 40 times more capable; meanwhile the models themselves are becoming more and more efficient. So a 100,000-fold increase in inference speed five years from now is entirely possible.
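The compounding arithmetic behind that claim is easy to check. The two 40x hardware figures are from the interview; the model-efficiency multiplier below is a hypothetical value chosen to show what it would take to reach 100,000x.

```python
# Compounding speedups: two 40x hardware generations alone give 1,600x;
# a hypothetical ~62.5x from model and software efficiency brings the
# product to 100,000x. The last factor is an assumed value, not a claim.
hw_gen_1 = 40            # Grace Blackwell vs. prior generation (interview)
hw_gen_2 = 40            # claimed next-generation gain (interview)
model_efficiency = 62.5  # hypothetical software/model-side improvement

total = hw_gen_1 * hw_gen_2 * model_efficiency
print(total)  # 100000.0
```

The takeaway is that no single factor needs to deliver 100,000x; multiplicative gains across hardware generations and model efficiency get there together.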

Today’s AI systems have actually completed a “mountain of thinking” in places you can’t see. It just doesn’t show you that it’s “thinking.” This is a “fast thinking” system—even tasks that originally require deep reasoning and “slow thinking” have become extremely fast in it.

Q: You mentioned that the construction of power infrastructure in the United States is full of difficulties, but in places like the Gulf countries and China, the speed of power acquisition and construction is much faster. Is it true that the problems that NVIDIA solves are not so urgent in these regions?

Huang: That's an interesting perspective; I hadn't thought of it that way before. But in any country, the scale of a data center is always limited, so efficiency per watt is always critical.

We can do a simple calculation: the shell, electricity, land and operating costs of a 1 GW data center are about $30 billion; plus computing, storage, network and other parts, it is about $50 billion; if you have to build two systems to achieve the same performance because the system is inefficient, then the initial construction cost will swell from $30 billion to $60 billion. So you have to use an extremely efficient architecture to offset the additional costs. In this world, "free computing" is sometimes not cheap enough.
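The back-of-the-envelope math in that answer can be written out directly. All dollar figures are the ones Huang quotes; the comparison at the end is the "free computing is not cheap enough" point made explicit.

```python
# Reproducing the interview's 1 GW data-center arithmetic: shell costs
# (building, power, land, operations) ~$30B; ~$50B all-in with compute,
# storage and networking. A half-as-efficient architecture needs two
# builds for the same output.
shell_cost_b = 30       # $B: shell, electricity, land, operations
full_cost_b = 50        # $B: including compute, storage, networking
efficiency_penalty = 2  # systems needed if the architecture is half as efficient

inefficient_shell_b = shell_cost_b * efficiency_penalty
print(inefficient_shell_b)  # 60, matching the $30B -> $60B figure

# Even if the compute itself were free, the extra shell cost exceeds
# what free compute would save, so the inefficient option still loses.
compute_cost_b = full_cost_b - shell_cost_b          # 20: value of "free" compute
extra_shell_b = inefficient_shell_b - shell_cost_b   # 30: cost of inefficiency
assert extra_shell_b > compute_cost_b
```

That inequality is the whole argument: once facility costs dominate, architectural efficiency matters more than the sticker price of the chips.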

NVIDIA's full-stack strategy

Q: You mentioned several times that "I hope you (customers) buy NVIDIA's full set of products, but you will be happy as long as you buy any part of it." This sentence sounds very pragmatic, like the CEO of an enterprise software company. If customers need to build a complete AI factory, according to you, NVIDIA's full-stack solution will undoubtedly bring the greatest benefit. But many customers don't need the "full stack", they only buy part of it. But once they start using a part of NVIDIA, they usually continue to use it. So from a strategic perspective, it is also very valuable to cover these customers, right?

Huang: Serving customers is the smart thing to do. If you look at NVIDIA's go-to-market strategy, we have always built end-to-end complete solutions, because software and hardware must be tightly integrated to achieve maximum performance. At the same time, we are able to decouple software and hardware well, allowing customers to choose individual components according to their needs.

If a customer doesn’t want to use our software – no problem. Our system is designed to be flexible enough that if a customer wants to replace certain components, we can do that too.

Grace Blackwell architecture has now been deployed in different cloud services around the world, and each cloud service provider integrates based on our standards, but their implementation methods are different. We are able to integrate into their systems very smoothly.

This is the real strength of NVIDIA’s business model, but it’s also the embodiment of our position as a “computing platform company.” What matters to us most is that customers use at least part of our technology stack: if they choose our compute stack, great; if they choose our networking stack (I care about networking as much as compute), great too; if they choose both, great!

I have always believed that NVIDIA can build the best overall system. If I don't believe we can do better, then there is something wrong with us and we must improve and regain confidence.

Our company has 36,000 to 38,000 employees, and everyone is working together to do one thing: to build the world's leading accelerated computing platform and AI computing platform. So if there is a company with only 14 people who can do better than us, it will be very painful for me, and we must work harder to catch up.

Q: But you also believe in the power of scale, and to maximize scale you have to sell products the way customers want to buy them.

Huang: Exactly, that's the key. We have our own preferences, but we will serve customers in the way they prefer.

Gaming: GeForce's Multiple Roles

Q: In your GTC speech, only 10% of the content is about GeForce, but it is still very important to us. Is this "important" because we are making GPUs and everything can be scaled? How do you explain the relationship between NVIDIA and games?

Huang: I would put it this way: without GeForce there would be no RTX PRO, no Omniverse, none of the pixels we see. Robots would not work without GeForce, and the same is true for Newton.

GeForce itself is not the core theme of GTC, because the latter focuses on high-performance computing, enterprise and AI, and we also have a dedicated game developer conference. So at GTC, GeForce product launches will not be the core focus like other areas, but everyone knows that GeForce plays a vital role in everything we do.

Q: Does this mean that gamers may not fully realize that GeForce has become much more than just a graphics rendering engine?

Huang: Exactly. We only render one out of every ten pixels, which is a staggering number. Suppose I give you a jigsaw puzzle and give you one of the ten pieces. I don’t give you the other nine pieces at all, and you have to figure out how to complete them yourself.

Q: I was trying to connect gaming to the other areas you mentioned. You said NVIDIA is very disciplined about designing modules to be separable, and that the software is likewise cleanly decoupled. That immediately reminded me of the driver problem on Windows. To be honest, this capability is itself one of your core technical advantages.

Huang: Drivers are indeed very low-level technology, and the work involved is extremely complex. In fact, the "abstraction" of drivers was itself a revolutionary concept, and Microsoft played a key role in promoting that system. It's fair to say that without the driver abstraction layer there would be no Windows ecosystem today. It is the establishment of the API abstraction layer that allows the hardware underneath to keep evolving without affecting the compatibility and stability of the software above it.
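The abstraction-layer idea can be sketched with a minimal interface-plus-backends pattern: applications target a stable API, and each hardware generation ships its own implementation behind it. Every name below is invented for illustration; this is not a real driver API.

```python
# Minimal sketch of the driver-abstraction concept: apps code against a
# stable interface, and new hardware generations swap in behind it
# without breaking the application. Names are hypothetical.
from abc import ABC, abstractmethod

class GpuDriver(ABC):
    """Stable interface exposed by the platform (e.g. a CUDA- or
    DirectX-style abstraction layer)."""
    @abstractmethod
    def draw_triangles(self, count: int) -> str: ...

class GenerationA(GpuDriver):
    def draw_triangles(self, count: int) -> str:
        return f"gen-A rasterized {count} triangles"

class GenerationB(GpuDriver):
    # New hardware, new internals; same interface, so apps don't change.
    def draw_triangles(self, count: int) -> str:
        return f"gen-B rasterized {count} triangles (new pipeline)"

def render_scene(driver: GpuDriver) -> str:
    # Application code written once against the abstraction.
    return driver.draw_triangles(1000)

print(render_scene(GenerationA()))
print(render_scene(GenerationB()))
```

The same `render_scene` call works against either generation, which is exactly the stability guarantee Huang credits the driver abstraction layer with providing.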

Our drivers are currently open source, but frankly, I don't see many people actually able to contribute to them. The reason is that every time we launch a new GPU, much of the work done in the old drivers has to be rewritten or replaced. Only a team with engineering capacity on NVIDIA's scale can keep driving the evolution of this system; for most companies it is a nearly impossible task.

However, it is precisely because we can provide deeply optimized dedicated drivers for each generation of GPU that we have built a stable and powerful abstraction and isolation layer. Whether based on CUDA or DirectX, developers can safely develop on these platforms without worrying about changes in the underlying hardware.