Harbin Institute of Technology launches nine AI models to unlock new intelligent paradigms across industries

Harbin Institute of Technology's nine AI models lead a new wave of industry intelligence and demonstrate China's breakthrough strength in AI.
Core content:
1. Harbin Institute of Technology ranks first in the global NLP university rankings, showing China's achievements in the field of AI
2. The nine AI models of Harbin Institute of Technology cover key fields such as agriculture, medical care, and aerospace
3. The innovative breakthroughs and applications of the "Movable Type" dialogue model and the "Ben Cao" medical AI model
According to CSRankings, Harbin Institute of Technology ranked first among universities worldwide in natural language processing (NLP) for 2022-2024, demonstrating China's breakthrough strength in AI. Behind this achievement is the university's approach to artificial intelligence research, summed up as "top technology and local application": cultivating the academic frontier in depth while keeping a close eye on the pulse of industry.
As early as 1979, the computer science program at Harbin Institute of Technology began research on natural language processing. Professor Wang Kaizhu, in collaboration with the university's Russian Language Teaching and Research Section, pioneered research on Russian-Chinese machine translation, setting a precedent for natural language processing at the university. Professor Li Sheng is one of the earliest experts in China to research and develop Chinese-English machine translation systems, and he led the development of CEMT-I, China's first Chinese-English machine translation system to pass technical appraisal. The intelligent pinyin sentence input technology invented by Professor Wang Xiaolong was rated "internationally leading" by the State Science and Technology Commission and, through technology transfer, went on to power the Microsoft Pinyin input method.
After more than 40 years of accumulation, Harbin Institute of Technology has built outstanding capabilities in artificial intelligence and natural language processing. It began studying pre-trained models in 2018 and formally started large model research in 2023.
Recently, the School of Computer Science at Harbin Institute of Technology has developed nine large models for general and vertical domains, covering key areas such as agriculture, medical care, aerospace, and education. These models tackle technical problems such as "catastrophic forgetting" and "multimodal interaction" and are deeply integrated into the industrial ecosystem, serving as a "Chinese engine" for industry intelligence.
Movable Type: A general-domain dialogue model
In 2023, faculty and students from the Institute of Natural Language Processing at Harbin Institute of Technology jointly developed "Movable Type", an open-source dialogue model that is available for commercial use. Its latest version, "Movable Type 3.5", builds on Chinese-Mixtral-8x7B, the first mixture-of-experts (MoE) model with an expanded Chinese vocabulary in China, which the team released earlier, and is further optimized with multi-stage incremental pre-training, instruction fine-tuning and model fusion. In Chinese and English knowledge understanding, mathematical reasoning, code generation, instruction following, content safety and other areas, "Movable Type" has shown a leading edge among open-source large models with the same number of activated parameters. The model is specially optimized for Chinese input: its Chinese encoding efficiency ranks among the best of domestic Chinese large models, and it supports long contexts of up to 32K in length. The "Movable Type" series and its Chinese base model have received more than a thousand stars on GitHub, and cumulative downloads in international open-source communities such as HuggingFace have reached 13,000.
By fully open-sourcing its training code, checkpoint weights and technical details, the "Movable Type" series aims to provide more choices and possibilities for research and applications in Chinese natural language processing.
Ben Cao (Materia Medica): The “super brain” of medical AI
The "Ben Cao" model was developed by the Social Computing and Interactive Robotics Research Center of the School of Computer Science at Harbin Institute of Technology. Built on the "Movable Type" base model, it is the first open-source medical model in China, and its accuracy in medical question answering has reached an internationally leading level. Its technical framework rests on an innovative closed loop of "knowledge graph constraints - generation logic optimization - credibility verification". Its open-source code on GitHub has received 4.7K stars (first among domestic medical models) and has been used by Huawei, Tencent, iFlytek and other companies to develop industry models.
The model ranked 27th on the "TOP70 Chinese Large Models" list (second among universities). The team has applied for 10 patents and published more than 10 papers at top conferences such as ACL and AAAI, building a professional and reliable methodology for medical AI.
"Ben Cao" has been deployed at institutions such as the Heilongjiang Provincial Center for Disease Control and Prevention and the Second Hospital of Harbin Medical University, supporting scenarios such as disease early warning and diagnostic assistance. It promotes the transition of medical AI from general-purpose generation to trusted decision-making and provides verifiable, traceable intelligent support for electronic medical record analysis and primary-level diagnosis and treatment, moving the industry toward the "Knowledge Enhancement 2.0 Era". The work has received support from the Heilongjiang Provincial Key R&D Program, as well as from companies such as Huawei and Shenrui Medical.
Qiaoban Qiaohuan (Jigsaw Puzzle): An AI psychologist for teenagers
The "Qiaoban Qiaohuan" model was developed by the Social Computing and Interactive Robotics Research Center of the School of Computer Science at Harbin Institute of Technology. It is China's first large model for empathetic companionship and psychological counseling for teenagers. It aims to provide psychological support for K12 students in scenarios such as learning and growth, adapting to life and interpersonal communication, and to effectively ease the emotional distress and psychological pressure they may experience.
The research behind this model has been published in more than 20 papers at top international conferences such as ACL and IJCAI, and more than 10 national patents have been applied for. The team has built an empathetic dialogue database and an empathy strategy preference dataset with 100,000 high-quality samples, providing solid data support for precise emotional counseling and psychological intervention.
At present, the technology behind the "Qiaoban Qiaohuan" large model is being applied in cooperation with companies such as iFlytek and Huawei. Some functions have been built into the "AI Psychological Partner" feature of the iFlytek learning machine, and the team has worked with the First Specialized Hospital of Harbin to bring AI to psychological counseling.
Kite: An "encyclopedia" of aerospace science
The "Kite" model was developed by the Social Computing and Interactive Robotics Research Center of the School of Computer Science at Harbin Institute of Technology. Built on the Qwen2 series of base models, it is China's first large model for Chinese-language aerospace science popularization, and its accuracy in answering aerospace knowledge questions has reached an industry-leading level. It was produced by multi-stage post-training on 230,000 high-quality aerospace-related data items and has rich domain knowledge and a strong ability to understand user instructions.
The core technology of the model has been published in more than ten papers at top international academic conferences such as NeurIPS, AAAI, ACL, and EMNLP, building a professional and reliable methodology for aerospace knowledge enhancement.
The "Space Science Assistant" built on this model has been officially launched on the homepage of "Satellite Encyclopedia", a well-known domestic community website for space enthusiasts, to provide authoritative and convenient space knowledge services for young people. Since its launch, the assistant has answered tens of thousands of space questions, serving hundreds of visitors in real time every day, and has been widely praised by users, showing notable results in science communication. By building an intelligent, unattended channel for space science popularization, it provides high-quality, interactive support for professional education and public science outreach.
Abacus: Programmers’ “Efficiency Tool”
The "Abacus" model was developed by the Social Computing and Interactive Robotics Research Center of the School of Computer Science at Harbin Institute of Technology. It is lighter, stronger, faster and more useful. "Abacus" introduces a continued pre-training technique that draws on research results in data composition and learning rate scheduling; while improving programming ability, it effectively overcomes catastrophic forgetting of general knowledge. With only 2.7B parameters, it is the first to surpass code large models of the same size at home and abroad in both code and general capabilities, reaching an internationally advanced level. "Abacus" also proposes a token recycling decoding strategy that speeds up code generation by more than 2 times, and a MultiPoT strategy that effectively supports applications such as numerical calculation, temporal and spatial reasoning, and structured table reasoning.
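The article does not give the details of the continued pre-training recipe. As a rough illustration only, one common way to limit catastrophic forgetting is to mix a share of general-domain text back into each code batch and to re-warm the learning rate before decaying it. The sketch below assumes that setup; the function names, the mixing ratio and the learning rate values are hypothetical and are not taken from "Abacus".

```python
import math
import random


def lr_at(step, total_steps, peak_lr=1e-4, min_lr=1e-5, warmup_steps=100):
    """Linear warmup to peak_lr, then cosine decay to min_lr.

    All default values are illustrative assumptions.
    """
    if step < warmup_steps:
        return peak_lr * (step + 1) / warmup_steps
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return min_lr + 0.5 * (peak_lr - min_lr) * (1 + math.cos(math.pi * progress))


def mixed_batches(code_docs, general_docs, batch_size, general_ratio,
                  num_batches, seed=0):
    """Yield batches in which a fixed share of examples is general-domain text.

    Replaying general data alongside code data is one way to keep the
    model from drifting away from what it already knows.
    """
    rng = random.Random(seed)
    n_general = round(batch_size * general_ratio)
    for _ in range(num_batches):
        batch = (rng.choices(general_docs, k=n_general)
                 + rng.choices(code_docs, k=batch_size - n_general))
        rng.shuffle(batch)  # avoid a fixed general-then-code ordering
        yield batch
```

In a real training loop, each batch from `mixed_batches` would be tokenized and fed to the optimizer with the learning rate returned by `lr_at` for the current step; the right ratio and schedule have to be found empirically.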
Based on the "Abacus" code model, the research center developed the "Abacus-vscode plug-in" and the "Abacus-SQL" system, which expand the ecosystem of automatic code generation applications, improve programming efficiency and have been widely used. On the first day of release, the number of reads exceeded 5,000.
The core technology of "Abacus" covers the complete large-model technology stack, including data cleaning, pre-training, post-training and lightweight deployment, and all of it is fully independent and controllable. By opening its weights and training details and providing supporting fine-tuning and adaptation platforms and plug-ins, it contributes to the development of the open source community.
EEGPT: A “decoder” for brain science
Developed by the Brain-like Intelligence and Neural Engineering Research Center of the School of Computer Science at Harbin Institute of Technology, EEGPT is the world's third large EEG model, surpassing existing models in scale and performance. The model covers 30 tasks and 8 million samples of brain activity data. It adopts an innovative dual self-supervised learning framework that overcomes differences across time and individuals and achieves deep, precise decoding of EEG signals. Its accuracy has reached an internationally advanced level, providing reliable technical support for in-depth analysis of brain function and for general-purpose BCI technology.
The model has been applied in major scientific research projects such as the National Defense Space Program, the National Natural Science Foundation, key R&D projects and provincial key R&D projects, supporting scenarios such as brain-computer interaction, disease warning and cognitive enhancement. The research results have been published in three papers at venues such as NeurIPS, and four patents have been applied for. The launch of EEGPT marks the transition of EEG analysis technology from single-task optimization to general intelligence, moving brain science toward the "precision decoding 2.0 era".
Financial Large Model: The “Intelligent Steward” of Corporate Decision-Making
The "Xingguang" financial vertical model, jointly developed by the Language Technology Research Center of the School of Computer Science of Harbin Institute of Technology and Beijing Huabo Cloud Company, is being deployed at major enterprises in China and will provide strong support for the comprehensive digital transformation of central and state-owned enterprises. Built on a domestic open-source base model, it uses self-developed tool calling, RAG and other technologies, with a large amount of real corporate financial data as fine-tuning knowledge. It focuses on 10 major large-model-assisted scenarios, such as law, risk control, auditing, business forecasting, chart generation and voucher generation, and has been well received by many enterprises and institutions.
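For readers unfamiliar with RAG (retrieval-augmented generation), the idea is to look up relevant records first and hand them to the model together with the question. The following is a minimal sketch under simplifying assumptions: documents are scored with a plain bag-of-words cosine similarity rather than a learned embedding, the sample records are invented, and none of the names come from the "Xingguang" system.

```python
import math
import re
from collections import Counter

# Invented sample records, for illustration only.
SAMPLE_DOCS = [
    "Invoice 17 was approved by the audit team.",
    "The cafeteria opens at noon.",
    "Audit rules require two approvals per invoice.",
]


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine(a, b):
    """Cosine similarity between two token-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0


def retrieve(query, documents, k=2):
    """Return up to k documents with non-zero similarity, best first."""
    q = Counter(tokenize(query))
    scored = [(cosine(q, Counter(tokenize(d))), d) for d in documents]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [d for score, d in scored[:k] if score > 0]


def build_prompt(query, documents, k=2):
    """Assemble the text that would be sent to a language model."""
    context = "\n".join(f"- {d}" for d in retrieve(query, documents, k))
    return ("Answer using only the context below.\n"
            f"Context:\n{context}\nQuestion: {query}")
```

Calling `build_prompt("Who approved invoice 17?", SAMPLE_DOCS)` produces a prompt that includes the two invoice-related records and leaves out the unrelated one; a production system would replace the scoring function with a proper search index.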
At present, many public institutions and large enterprise groups, including the Ministry of Finance, Zhejiang State-owned Assets Supervision and Administration Commission, Hubei State-owned Assets Supervision and Administration Commission, China National Petroleum Corporation, China Shipbuilding, China National Gold, China National Nuclear Corporation, and SF Express, have reached agreements with Huabo Cloud. The Language Technology Research Center provides technical support, contributing to the development of China's digital economy.
Security Large Model: The "Security Guard" of Data Center
Developed by the Computer Network and Information Security Technology Research Center of the School of Computer Science at Harbin Institute of Technology, the security large model can automatically analyze attack paths, accurately perceive attack intentions, quickly locate potential risk nodes, and carry out early warning and active defense. It has reached an internationally leading level in log understanding and analysis, anomaly detection and related tasks. The team has applied for more than 10 patents on related technologies and published more than 10 papers in top international journals such as IEEE TDSC, IEEE TC, and IEEE TIFS. The resulting AI-driven active defense methodology has been deployed at China Mobile and other institutions, supporting scenarios such as data centers and industrial control systems, and has received support from projects such as the Heilongjiang Provincial Key R&D Plan. It advances active defense from expert systems to intelligent perception and decision-making and provides a strong guarantee for the safe and stable operation of cloud data centers.
Heavenly Enlightenment: Making Agriculture “Think”
The "Heavenly Enlightenment" model was developed by the Harbin Institute of Technology team of the National Key Laboratory of "Smart Farm Technology and System" together with Beidahuang Group. It is the first large agricultural model in China to be registered with the Cyberspace Administration of China. It won the Diamond Award, the highest award of the 2024 International BIS Invention Exhibition, and was reported by the English edition of China Daily. The model can accurately predict more than 80 crop growth indicators and surpasses ChatGPT and DeepSeek in tasks such as agricultural knowledge question answering and agricultural decision-making. In terms of technological innovation, it integrates knowledge collaborative fine-tuning, hybrid retrieval enhancement and multimodal agricultural knowledge distillation. It targets four core scenarios: agricultural knowledge question answering, multimodal generation, crop growth prediction and embodied intelligent control, and it offers decision support across the entire agricultural production chain.
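The article does not say how the hybrid retrieval is implemented. One widely used way to combine a keyword search with a vector search is reciprocal rank fusion (RRF), sketched here as an assumption rather than as the method this model uses; the document IDs and the constant k=60 are illustrative.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Merge several ranked lists of document IDs into one.

    Each document earns 1 / (k + rank) from every list it appears in
    (rank starts at 1); documents are returned by descending total
    score, with ties broken by ID for a deterministic order.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


# Hypothetical results from a keyword index and a vector index.
keyword_hits = ["a", "b", "c"]
vector_hits = ["b", "c", "d"]
fused = reciprocal_rank_fusion([keyword_hits, vector_hits])
```

Documents that both retrievers rank highly rise to the top of `fused`, while a document found by only one retriever is kept but ranked lower, which is why this kind of fusion is popular when the two score scales are not directly comparable.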
The team has applied for 5 patents and 4 software copyrights for the model and published 6 papers in top journals such as TKDE, KBS, and ESWA, laying a solid foundation for the subsequent development of agricultural AI technology.
Currently, the model is available through Web, iOS, Android, WeChat mini program and API interfaces. It has been successfully deployed at multiple smart farms covering tens of thousands of acres in Heilongjiang Province, including Qing'an Agricultural High-tech Zone, Nenjiang Farm, and Jianshe Farm, and has received support from the Science and Technology Innovation 2030 Major Project and a Heilongjiang Province Key Project.