WeChat builds Yuanbao into the chat box, and another round of super-app evolution begins

WeChat has added the AI assistant "Yuanbao Red Envelope Cover Assistant" to its chat box, opening a new chapter in AI infrastructure.
Core content:
1. WeChat's built-in AI assistant "Yuanbao Red Envelope Cover Assistant" is based on Tencent's Hunyuan large model
2. Hands-on test of the Yuanbao assistant: chatting, answering questions, drawing pictures, customizing red envelope covers, and other features
3. Highlights of the DeepSeek V3 model update: 685 billion parameters, front-end development capability, open-source ecosystem, and more
01
Hands-on tests show a smoother AI
On March 24, 2025, DeepSeek released an important update to its V3 model. Although this version is not the anticipated V4 or R2 iteration, it delivers significant gains in both performance and open-source policy. The key points of this update:
Technical Specifications and Release Information
The model has 685 billion parameters; this is a minor version upgrade (V3-0324)
Post-training was optimized on a new 32K-GPU cluster
It is already available through multiple channels, including the official website, the app, and mini-programs
The open-source version launched simultaneously, continuing the first-generation V3's cost-effectiveness ("trained for about $5.576 million yet comparable to Claude 3.5")
Core capability improvement
Front-end development capability approaches Claude 3.7's expert level. User testing shows:
It can generate complete HTML files with CSS animations and JavaScript interaction (such as animated weather cards); a reproduction sketch follows this list
Code quality is clearly better than the older R1 model (comparison cases show an obvious gap in both visual effects and functional implementation)
Output in website-building tests is comparable to Claude 3.7 Sonnet
It supports complex instruction parsing (such as switching between multiple weather animations via functions and button groups)
Context understanding is improved, especially in multi-turn conversation scenarios
It can reportedly handle mixed-language programming needs accurately (not shown in the examples but implied in the text)
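As a rough illustration of how such user tests are run, here is a minimal Python sketch that asks the model to generate the weather card described above. It assumes DeepSeek's publicly documented OpenAI-compatible endpoint and the `deepseek-chat` model name; the prompt wording and output filename are invented for illustration.

```python
# Minimal sketch of a front-end generation test against DeepSeek's
# OpenAI-compatible API. Requires the `openai` package and a
# DEEPSEEK_API_KEY environment variable; the prompt is illustrative.
import os
from openai import OpenAI

client = OpenAI(
    api_key=os.environ["DEEPSEEK_API_KEY"],
    base_url="https://api.deepseek.com",  # DeepSeek's documented base URL
)

response = client.chat.completions.create(
    model="deepseek-chat",  # assumption: V3-0324 is served under this name
    messages=[{
        "role": "user",
        "content": (
            "Generate one self-contained HTML file that renders an animated "
            "weather card. Include CSS animations for sun, rain, and snow, "
            "plus a button group that switches between them."
        ),
    }],
)

# Save the generated page so it can be opened in a browser and inspected.
with open("weather_card.html", "w", encoding="utf-8") as f:
    f.write(response.choices[0].message.content)
```

Opening the saved file in a browser makes the comparison above concrete: the visual and functional gap between model generations is immediately visible.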
Open source ecosystem construction
A more permissive open-source license has been adopted, reducing restrictions on commercial applications
It continues the original V3's distinction as the first open-source model to enter the top ten of the Chatbot Arena leaderboard
Within an hour of release, developers worldwide were already testing and verifying it
Industry impact and user feedback
Professional users rate it as "equivalent to the upgrade from Claude Sonnet 3.5 to 3.6"
In the absence of official benchmarks, user test data is the main basis for evaluation
The developer community is paying close attention to the commercialization possibilities opened up by the license change
Future Outlook
DeepSeek has not disclosed development progress on V4/R2, but user expectations keep rising, centered on a further breakthrough in programming capability and enhanced multimodal understanding
The current version sets a new benchmark for open-source large models and may intensify industry competition
This update shows that DeepSeek has narrowed the gap with top closed-source models through technical iteration while keeping its cost advantage, with particularly disruptive potential in programming. Its open-source strategy adjustment may reshape the large-model ecosystem.
A sample from the assistant's Q&A test: asked for Qingming Festival activity suggestions, it replied as follows.
Qingming Festival, as one of China's traditional festivals, is not only a time to remember ancestors and express grief, but also a good opportunity to get close to nature and feel the breath of spring. Here are some suggested activities for the holiday:
Tomb sweeping: Visit the ancestral tomb or cemetery, clean the grave, and offer flowers, paper money, and the like to express remembrance and respect for deceased relatives.
Spring outings: Take advantage of the Qingming holiday to go out with family and friends, enjoy the scenery of spring, breathe fresh air, and relax.
Tree planting: Join tree-planting activities and add a touch of green to the earth; it is both a return to nature and a tribute to life.
Seasonal food: Make or buy Qingming-specific foods, such as green rice balls (qingtuan) and Qingming fruit, to taste the flavor of the traditional festival.
Cultural experiences: Visit museums and cultural sites to learn about the historical and cultural background of Qingming Festival.
Family gatherings: Use the holiday to reunite with family, share affection, and enjoy the warmth of home.
Quiet meditation: In a quiet setting, meditate or contemplate, reflect on life, and plan for the future.
Charity activities: Join charity events organized by the community or charitable organizations, help those in need, and spread positive energy.
Outdoor sports: Go hiking, cycling, or picnicking to exercise and enjoy the outdoors.
Photography: Bring a camera to capture the beautiful moments of spring and record the beauty of life.
Whichever activities you choose, what matters is to feel the meaning of the festival with your heart, cherish the time spent with family and friends, and mind safety and environmental protection, making Qingming a meaningful and enjoyable holiday.
Technical analysis and industry insights on the document's content
1. Evolution of Language Model Technology
Core breakthroughs:
The paradigm shift from N-gram models to the Transformer raised language-model parameter counts from the millions (GPT-1) to the trillions (GPT-4)
Self-supervised learning (MLM/NSP) removes the need to label massive datasets, and the pre-training + fine-tuning paradigm has become the industry standard (a masking sketch follows this list)
The word-embedding dimension has grown from 768 (BERT) to 12288 (GPT-3), greatly improving semantic representation capacity
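To make the MLM objective concrete, here is a minimal sketch of BERT-style input masking over a toy vocabulary. The 15% / 80-10-10 proportions follow the original BERT recipe; everything else (function name, vocabulary) is invented for illustration.

```python
import random

MASK, MASK_PROB = "[MASK]", 0.15  # BERT masks ~15% of input tokens

def mask_tokens(tokens, vocab):
    """BERT-style masking: choose ~15% of positions as prediction targets."""
    inputs, labels = [], []
    for tok in tokens:
        if random.random() < MASK_PROB:
            labels.append(tok)                       # model must recover this
            r = random.random()
            if r < 0.8:
                inputs.append(MASK)                  # 80%: replace with [MASK]
            elif r < 0.9:
                inputs.append(random.choice(vocab))  # 10%: random token
            else:
                inputs.append(tok)                   # 10%: left unchanged
        else:
            inputs.append(tok)
            labels.append(None)                      # not a target; no loss
    return inputs, labels

vocab = ["the", "cat", "sat", "on", "mat"]
print(mask_tokens("the cat sat on the mat".split(), vocab))
```

Because the prediction targets come from the text itself, no human labeling is needed, which is exactly why this objective scales to terabyte-sized corpora.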
Key technical indicators:
Training data volume: GPT-3 used 45 TB of text (about 1 trillion words), roughly the content of 13.51 million Oxford dictionaries
Compute cost: training ChatGPT reportedly required some 10,000 V100 GPUs, costing over 1 billion RMB
Model efficiency: DeepSeek cuts training cost to about one third that of same-scale models through techniques such as sparse attention
2. Transformer Architecture Innovation
Revolutionary techniques:
The self-attention mechanism performs O(n²) global dependency modeling, a significant break from RNNs' sequential dependency (a minimal sketch follows this list)
The number of attention layers grew from 12 (BERT) to 96 (GPT-3), and the context window expanded from 512 tokens to 32K
Position encoding has evolved from absolute (sinusoidal) to relative (RoPE), which handles long sequences better
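For reference, here is a minimal NumPy sketch of single-head scaled dot-product attention (names and toy sizes are invented). The (n, n) score matrix it builds is the O(n²) cost mentioned above: every position attends to every other.

```python
import numpy as np

def scaled_dot_product_attention(Q, K, V):
    """softmax(Q K^T / sqrt(d_k)) V for one head, one sequence."""
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)                 # (n, n): all position pairs
    scores -= scores.max(axis=-1, keepdims=True)    # numerical stability
    weights = np.exp(scores)
    weights /= weights.sum(axis=-1, keepdims=True)  # row-wise softmax
    return weights @ V                              # (n, d_v): contextualized output

rng = np.random.default_rng(0)
n, d = 6, 8                                         # toy sequence length and width
x = rng.normal(size=(n, d))
print(scaled_dot_product_attention(x, x, x).shape)  # self-attention: (6, 8)
```

Multi-head attention simply runs several independent copies of this on learned projections of the input and concatenates the results.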
Engineering practice breakthroughs:
Mixed-precision training (FP16/FP32) cuts GPU memory consumption by about 40% (a PyTorch-style sketch follows this list)
Gradient checkpointing enables roughly 100-fold expansion of trainable sequence length by recomputing activations instead of storing them
Tensor parallelism combined with pipeline parallelism improves training efficiency for hundred-billion-parameter models by 80%
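Here is a minimal PyTorch sketch combining the first two techniques: FP16 autocast with loss scaling, plus gradient checkpointing. The tiny model, batch, and hyperparameters are placeholders for illustration, not a production recipe.

```python
import torch
import torch.nn as nn
from torch.utils.checkpoint import checkpoint

# Toy model and data; real setups would shard these across many GPUs.
model = nn.Sequential(nn.Linear(512, 2048), nn.GELU(), nn.Linear(2048, 512)).cuda()
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-4)
scaler = torch.cuda.amp.GradScaler()      # rescales FP16 gradients

x = torch.randn(8, 512, device="cuda")
target = torch.randn(8, 512, device="cuda")

optimizer.zero_grad()
with torch.autocast(device_type="cuda", dtype=torch.float16):
    # checkpoint() drops intermediate activations in the forward pass and
    # recomputes them during backward: extra compute, far less memory.
    y = checkpoint(model, x, use_reentrant=False)
    loss = nn.functional.mse_loss(y, target)

scaler.scale(loss).backward()             # scale the loss to avoid FP16 underflow
scaler.step(optimizer)                    # unscale gradients, apply the update
scaler.update()
```

The two tricks are complementary: halving activation precision shrinks each stored tensor, while checkpointing shrinks how many must be stored at once.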
3. DeepSeek’s Technological Breakthrough
Contributions to the open-source ecosystem:
Model architecture: a proposed dynamic sparse attention mechanism delivers 2.3x faster inference than Llama
Training efficiency: the MoE architecture makes training trillion-parameter models feasible on thousand-GPU clusters (see the routing sketch after this list)
Chinese optimization: a bilingual Chinese-English pre-training corpus of 2.6 trillion tokens
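The document does not describe the MoE internals, so the sketch below shows the generic idea with a top-k router (all names and sizes invented): each token activates only k of the experts, so total parameter count can grow enormously while per-token compute stays nearly flat.

```python
import numpy as np

def moe_layer(x, experts, gate_w, k=2):
    """Generic top-k MoE routing; each token runs through only k experts."""
    logits = x @ gate_w                          # (tokens, n_experts) gate scores
    top_k = np.argsort(logits, axis=-1)[:, -k:]  # the k best experts per token
    out = np.zeros_like(x)
    for t in range(x.shape[0]):
        sel = top_k[t]
        probs = np.exp(logits[t, sel])
        probs /= probs.sum()                     # renormalize over chosen experts
        for w, e in zip(probs, sel):
            out[t] += w * np.tanh(x[t] @ experts[e])  # toy expert: tanh(x W_e)
    return out

rng = np.random.default_rng(0)
d, n_experts, tokens = 16, 8, 4
experts = rng.normal(size=(n_experts, d, d))     # one weight matrix per expert
gate_w = rng.normal(size=(d, n_experts))         # learned routing weights
x = rng.normal(size=(tokens, d))
print(moe_layer(x, experts, gate_w).shape)       # (4, 16)
```

With k=2 of 8 experts active, only a quarter of the expert parameters are touched per token, which is how trillion-parameter totals remain trainable on thousand-GPU clusters.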
Performance comparison:
Mathematical reasoning: DeepSeek-Math-7B reaches 83.5% accuracy on GSM8K (GPT-4: 92%)
Code generation: 68.9% on HumanEval (CodeLlama-34B: 53.7%)
Multimodal understanding: ViT-22B achieves 88.7% top-1 accuracy on ImageNet-21K
4. Industry Development Trends
Frontier technology directions:
Multimodal fusion: GPT-4o achieves roughly 200 ms cross-modal response latency (average human reaction time is about 250 ms)
Reasoning breakthroughs: DeepSeek-V3 reportedly reaches IMO gold-medalist level on theorem-proving tasks
Energy efficiency: new hybrid architectures (such as DeepSeek-R1) deliver 5x more computing power per watt
China-US competition:
The gap between open-source models has shrunk from 12 months to 3 months (Llama 3 vs DeepSeek-V2)
Computing infrastructure: China's intelligent computing centers under construction have a planned capacity of 2000 EFLOPS (the US currently operates 1200 EFLOPS)
Industry penetration: AI quality-inspection deployment in Chinese manufacturing has reached 37%, surpassing the US at 29%
5. Key data insights
Economic perspective:
Marginal cost curve of large-model training: each 10-fold increase in parameter count lowers the per-token training cost by 28% (see the worked formula after this list)
ROI cycle: the commercialization payback period of leading firms' models has shortened from 36 months to 18 months
Talent density: the salary gap between top AI researchers in China and the US has narrowed from 50% to 15%
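Written out as a formula (our interpretation of the claim above, not a sourced scaling law): if each tenfold increase in parameter count P multiplies the per-token cost c by 0.72, then

$$c(P) = c_0 \cdot 0.72^{\log_{10}(P/P_0)} = c_0 \left(\frac{P}{P_0}\right)^{\log_{10} 0.72} \approx c_0 \left(\frac{P}{P_0}\right)^{-0.143}$$

so scaling from 10B to 1T parameters (a 100-fold increase) would cut per-token cost to about 0.72² ≈ 52% of the starting value.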
Technology ethics considerations:
Safety alignment: the latest RLHF techniques reduce the probability of harmful output from 3.2% to 0.07%
Energy optimization: liquid-cooled clusters improved PUE from 1.5 to 1.08, cutting carbon emissions by 40%
Explainability: causal attribution algorithms can visualize 85% of decision paths
Note: the companies and technical parameters mentioned in this article come from public data. The analysis is based on patterns of technological evolution; concrete implementations should be adjusted to industry conditions. For the latest research results, follow the official website of the Zhejiang University CCAI Center.
02
Accelerating AI deployment