Tencent strikes back! The official version of Hunyuan T1 is released, matching DeepSeek-R1 in hands-on tests at as little as a quarter of the price

The official version of Tencent's Hunyuan T1 is released, with performance comparable to DeepSeek-R1 at a more affordable price.
Core content:
1. Performance overview of Tencent Hunyuan T1 official version and comparison with DeepSeek-R1
2. Hunyuan T1's performance in knowledge Q&A, mathematical reasoning, and complex instruction following
3. Hunyuan T1's price advantage and application prospects in the industry
Zhidongxi reported on March 22 that, the previous evening, Tencent officially upgraded the deep-thinking model in its Hunyuan large model series to the official version, Hunyuan-T1.
T1 is a strong reasoning model developed by Tencent. It generates text at 60 to 80 tokens per second, noticeably faster than DeepSeek-R1 in actual generation.
Its predecessor was the Hunyuan T1-Preview (Hunyuan-Thinker-1-Preview) reasoning model, built on a medium-sized Hunyuan base, which the Hunyuan team launched in the Tencent Yuanbao app in mid-February this year.
Compared with T1-Preview, the official version of T1 is built on TurboS, the industry's first ultra-large-scale Hybrid-Transformer-Mamba MoE fast-thinking base model, released by Tencent Hunyuan in early March. It expands reasoning capability through large-scale post-training and further aligns the model with human preferences. This is also the industry's first lossless application of the hybrid Mamba architecture to ultra-large reasoning models.
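To make the term "hybrid Transformer-Mamba MoE" more concrete, here is a minimal, purely illustrative sketch of what such a layer stack can look like structurally: most layers use a Mamba-style state-space mixer for long contexts, a minority keep full self-attention, and the feed-forward blocks are sparse mixture-of-experts. All layer counts and expert numbers below are invented for illustration; Tencent has not published TurboS/T1 internals.

```python
# Hypothetical layer layout for a hybrid Transformer-Mamba MoE stack.
# All numbers are illustrative assumptions, not disclosed specifications.
def hybrid_layout(n_layers: int = 32, attention_every: int = 8) -> list[dict]:
    layers = []
    for i in range(n_layers):
        # Most layers use a linear-time state-space (Mamba) mixer; every
        # `attention_every`-th layer keeps full self-attention.
        mixer = "attention" if (i + 1) % attention_every == 0 else "mamba"
        layers.append({
            "index": i,
            "mixer": mixer,          # token-mixing block: SSM or self-attention
            "ffn": "moe",            # sparse mixture-of-experts feed-forward
            "experts": 16,           # total experts per MoE layer (made up)
            "active_experts": 2,     # experts routed per token (made up)
        })
    return layers

layout = hybrid_layout()
print(sum(1 for l in layout if l["mixer"] == "attention"),
      "attention layers out of", len(layout))
```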
Currently, T1 is available on Tencent Cloud's official website. The input price is 1 yuan per million tokens, and the output price is 4 yuan per million tokens. The output price is 1/4 of DeepSeek's standard-period price and matches DeepSeek's discount-period price.
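As a quick back-of-the-envelope check of the pricing above, the sketch below estimates the cost of a single call from token counts. The T1 prices (1 yuan and 4 yuan per million tokens) come from the article; the DeepSeek-R1 standard-period output price of 16 yuan per million tokens is only inferred from the "1/4" comparison, and input prices are not compared.

```python
def api_cost(input_tokens: int, output_tokens: int,
             input_price_per_m: float, output_price_per_m: float) -> float:
    """Return the cost of one call in yuan for the given token counts."""
    return (input_tokens / 1_000_000) * input_price_per_m \
         + (output_tokens / 1_000_000) * output_price_per_m

# T1 prices from the article: 1 yuan / 1M input tokens, 4 yuan / 1M output tokens.
# Example: a request with 2,000 input tokens and 8,000 output tokens.
t1_cost = api_cost(2_000, 8_000, input_price_per_m=1.0, output_price_per_m=4.0)

# DeepSeek-R1 standard-period output price, inferred from the "1/4" comparison
# (assumption): 16 yuan / 1M output tokens.
r1_output_cost = (8_000 / 1_000_000) * 16.0

print(f"T1 cost: {t1_cost:.4f} yuan")                    # 0.0340 yuan
print(f"R1 output-only cost: {r1_output_cost:.4f} yuan") # 0.1280 yuan
```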
▲DeepSeek API price
In the knowledge question-answering scenario, Tencent's Hunyuan research team demonstrated a side-by-side comparison of the outputs of T1 and DeepSeek.
The first prompt was "Can ethyl acetate be mixed with water?" The outputs of T1 and DeepSeek-R1 are similar in both length and conclusions, but T1 generates noticeably faster.
The second challenge concerns scientific and mathematical reasoning, which places more constraints on the model and requires a longer thinking process. From the outputs, T1 and DeepSeek-R1 reach the same conclusion, and T1 is still faster.
The third challenge tests the ability to follow complex instructions. T1 was asked to come up with the second line of a couplet; the first line given in the prompt was "deep and shallow streams of water". The difficulty is that every character of the reply must keep the three-dot water radical and the first four characters must follow an AABB structure. In its thinking process, T1 accurately analyzed the characteristics of the first line and, after many wrong attempts, gave the answer: "surging waves".
The fourth problem is a general task with an open-ended prompt: "Write a WeChat Moments post on the theme of life's long journey." There are no explicit style instructions or requirements.
T1 can also serve as a productivity tool to improve work efficiency. The next demo shows T1's ability to summarize long articles.
The prompt was a roughly 4,000-word news report on Microsoft's acquisition of Blizzard, with T1 asked to summarize the article. In its output, T1 not only summarized the article's main content but also extracted several key figures from the report.
The last demonstration covered the model's role-playing ability. The prompt was: "Please play the role of Li Bai, and use a tone that matches Li Bai's character. Guess a riddle: Complaint is invalid." T1's thinking process focused on analyzing the riddle; after concluding that the answer was "Hao", it delivered the answer in Li Bai's voice and composed a poem.
On public benchmarks covering Chinese and English knowledge as well as competition-level mathematics and logical reasoning, such as MMLU-Pro, CEval, AIME, and Zebra Logic, Hunyuan-T1 performs on par with or slightly above R1. On Tencent's internal human-experience evaluation sets, it also scores slightly higher than R1 in cultural and creative instruction following, text summarization, and agent capabilities.
On MMLU-Pro, which tests a base model's memorization and generalization across a broad range of knowledge, T1's score is second only to o1.
During the post-training phase, Tencent's Hunyuan research team devoted 96.7% of its computing power to reinforcement learning, focusing on improving pure reasoning capability and optimizing alignment with human preferences.
In terms of data, T1's high-quality prompt collection focuses on diverse, complex instructions at varying difficulty levels. Drawing on science problems from around the world, the researchers collected datasets covering mathematics, logical reasoning, science, and code, ranging from basic mathematical reasoning to complex scientific problem solving, and combined them with ground-truth feedback to ensure the model performs well across reasoning tasks.
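The article does not publish the scoring code, but a common way to implement ground-truth feedback for math and science prompts is a rule-based verifier that extracts the model's final answer and compares it with a reference. The sketch below is a hypothetical minimal version; the \boxed{} answer convention and the exact-match normalisation are assumptions, not Tencent's implementation.

```python
import re

def extract_final_answer(response: str) -> str | None:
    """Pull the last \\boxed{...} answer out of a response.
    The \\boxed{} convention is an assumption for illustration."""
    matches = re.findall(r"\\boxed\{([^}]*)\}", response)
    return matches[-1].strip() if matches else None

def ground_truth_reward(response: str, reference: str) -> float:
    """Binary verifiable reward: 1.0 if the extracted answer matches the
    reference after light normalisation, else 0.0."""
    answer = extract_final_answer(response)
    if answer is None:
        return 0.0
    norm = lambda s: s.replace(" ", "").lower()
    return 1.0 if norm(answer) == norm(reference) else 0.0

# Example on a math-style sample.
sample = "Thinking step by step... the sum is \\boxed{42}"
print(ground_truth_reward(sample, "42"))   # 1.0
```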
In terms of the training plan, T1 adopts a curriculum learning approach that gradually increases data difficulty while progressively extending the model's context length, so that reasoning ability improves as the model learns to use tokens efficiently for reasoning.
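A minimal sketch of what such a curriculum schedule could look like is given below; the stage names, difficulty scale, context lengths, and step counts are all invented for illustration.

```python
from dataclasses import dataclass

@dataclass
class CurriculumStage:
    name: str
    max_difficulty: int   # only sample problems at or below this difficulty
    context_length: int   # maximum sequence length allowed at this stage
    steps: int            # training steps spent in this stage

# Hypothetical 3-stage schedule: harder problems and longer contexts over time.
schedule = [
    CurriculumStage("warmup",   max_difficulty=2, context_length=8_192,  steps=2_000),
    CurriculumStage("core",     max_difficulty=4, context_length=32_768, steps=6_000),
    CurriculumStage("frontier", max_difficulty=5, context_length=65_536, steps=4_000),
]

def select_batch(dataset: list[dict], stage: CurriculumStage, batch_size: int = 32) -> list[dict]:
    """Filter the prompt pool to the current stage's difficulty ceiling."""
    eligible = [ex for ex in dataset if ex["difficulty"] <= stage.max_difficulty]
    return eligible[:batch_size]

# Toy prompt pool with difficulty labels 1..5.
toy_pool = [{"prompt": f"q{i}", "difficulty": (i % 5) + 1} for i in range(100)]
for stage in schedule:
    batch = select_batch(toy_pool, stage)
    print(f"{stage.name}: difficulty<={stage.max_difficulty}, "
          f"ctx={stage.context_length}, steps={stage.steps}, batch={len(batch)}")
```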
In terms of training strategies, the researchers drew on classic reinforcement learning techniques such as data replay and phased policy resets, improving the long-term stability of model training by more than 50%.
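Tencent has not released its training loop, but the two named strategies can be illustrated with a toy loop: a replay buffer mixes earlier rollouts back into each update (data replay), and the policy is rolled back to the last stable snapshot when needed, with the snapshot refreshed at phase boundaries (phased policy reset). The rollouts, update step, and stability check below are stand-ins.

```python
import copy
import random

class ReplayBuffer:
    """Keeps earlier rollouts so they can be mixed back into later updates."""
    def __init__(self, capacity: int = 10_000):
        self.capacity = capacity
        self.items: list = []

    def add(self, rollouts: list) -> None:
        self.items.extend(rollouts)
        self.items = self.items[-self.capacity:]   # drop the oldest entries

    def sample(self, k: int) -> list:
        return random.sample(self.items, min(k, len(self.items)))

def train_with_replay_and_reset(total_steps: int = 300, reset_every: int = 100,
                                replay_ratio: float = 0.25) -> dict:
    """Toy loop illustrating data replay + phased policy resets."""
    policy = {"updates": 0}                        # stand-in for model weights
    snapshot = copy.deepcopy(policy)               # last known-stable snapshot
    buffer = ReplayBuffer()
    for step in range(1, total_steps + 1):
        fresh = [f"rollout-{step}-{i}" for i in range(8)]        # stand-in rollouts
        batch = fresh + buffer.sample(int(len(fresh) * replay_ratio))
        policy["updates"] += len(batch)                          # stand-in RL update
        buffer.add(fresh)
        if random.random() < 0.01:                 # stand-in instability check
            policy = copy.deepcopy(snapshot)       # roll back to the stable snapshot
        if step % reset_every == 0:
            snapshot = copy.deepcopy(policy)       # new anchor at each phase boundary
    return policy

print(train_with_replay_and_reset())
```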
In the human-preference alignment stage, T1 uses a unified reward-system feedback scheme combining self-rewarding (an earlier T1-preview version comprehensively evaluates and scores the model's outputs) with a reward model, guiding the model to improve itself.
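A minimal sketch of such a unified reward signal, assuming the two feedback sources are simply blended into one scalar, is shown below; the callables, weights, and toy scorers are hypothetical and only meant to show the shape of the scheme.

```python
def unified_reward(prompt: str, response: str, self_judge, reward_model,
                   w_self: float = 0.5, w_rm: float = 0.5) -> float:
    """Blend two feedback sources into one scalar reward.

    self_judge:   stand-in for an earlier checkpoint (T1-preview-style) asked
                  to grade the response; hypothetical callable.
    reward_model: stand-in for a trained preference reward model; hypothetical.
    """
    return w_self * self_judge(prompt, response) + w_rm * reward_model(prompt, response)

# Toy scorers so the sketch runs end to end (both return scores in [0, 1]).
def toy_self_judge(prompt: str, response: str) -> float:
    return 0.8 if len(response) > 20 else 0.3

def toy_reward_model(prompt: str, response: str) -> float:
    return 0.6

print(unified_reward("write a Moments post about life's long journey",
                     "a response long enough to pass the toy rubric",
                     toy_self_judge, toy_reward_model))   # 0.7
```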
On this basis, the Tencent Hunyuan team is exploring new research directions to find better ways to reduce large-model hallucinations, lower training costs, and more.