Financial payments × real-time recommendation: how Milvus supports the "Guess What You Like" service behind tens of billions of transactions worldwide

How do you build an intelligent recommendation system behind tens of billions of transactions worldwide? This article explores how Milvus is applied in financial payments.
Core content:
1. How a multinational fintech giant designed its intelligent recommendation system
2. Technology selection in the face of a data torrent, and the challenge of choosing a vector database
3. Milvus's performance and value in supporting global transactions
Preface
This article is contributed by a senior member of the Milvus community, the technical lead for AI, ML, and platform architecture at a multinational fintech giant. In it, he shares the design of a post-payment recommendation system, the trade-offs behind the database selection, and some lessons learned from working in a large enterprise.
Everyone who works in the payment industry knows the old saying: no matter how many systems and machines run behind the scenes, what decides success or failure is the one second after the user clicks "Pay", and whether you can slip in an accurate recommendation along the way.
Our company runs one of the world's largest digital payment platforms: hundreds of millions of active users, tens of billions of transactions processed a year, money flowing through the backend like electricity.
On the business side, we support cross-border settlement in 25 currencies, cover tens of millions of merchants in more than 200 countries, and connect to tens of millions of websites, providing everything from person-to-person transfers to enterprise-grade payment solutions.
The user base is large enough, the concurrency is high enough, and the cultural gaps across the regions we serve are deep enough, but none of that is the hardest part. The hardest part is launching a generative-AI-driven intelligent recommendation system inside a system of such hellish complexity.
This task fell mainly to my company's AI, ML, and platform solutions team.
However, the company's requirement was not just an intelligent recommendation system. On top of it, we also had to build a reusable AI/ML infrastructure for multiple business scenarios, one that could continuously improve customer experience, raise the level of operational automation, and open up new growth through cutting-edge technologies such as real-time event stream processing and generative AI (GenAI).
To sum up the situation in one sentence: the need was urgent and the difficulty was anything but low.
The following is a review of our complete project experience:
01
Business challenge: how can an intelligent system support tens of billions of transactions?
In 2023, we launched a strategic project: use AI to recommend, at the moment the user checks out, "this would be a better buy", drawing on every available signal, such as merchant inventory, purchase context, user behavior, and language preference.
It sounds simple, just another "Guess what you like" feature. But during the project, the team hit two major technical barriers:
First, the data flood threatened to overwhelm the existing platform: with tens of billions of transactions a year plus daily inventory changes, even data ingestion was straining, never mind running model inference. Because nothing on the market at the time met our performance and scalability needs, our team had already built its own graph database a few years earlier.
Second, existing vector databases were not powerful enough: millisecond-level personalized recommendation depends on efficient vector retrieval. But when the project started, vector database products on the market were generally still in their early stages. They could not sustain high-throughput real-time updates, could not meet enterprise-grade production requirements for stability and low latency, and some could not even pass our production stress tests.
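To make the retrieval step above concrete, here is a minimal, purely illustrative sketch of top-k similarity search in plain Python. The item ids and 3-dimensional vectors are made up for illustration; a production system like the one described stores high-dimensional embeddings in a vector database such as Milvus and searches them with approximate-nearest-neighbor indexes.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb)

def top_k(query, items, k=2):
    """Return the ids of the k items most similar to the query embedding."""
    ranked = sorted(items, key=lambda it: cosine(query, it["vector"]), reverse=True)
    return [it["id"] for it in ranked[:k]]

# Hypothetical item embeddings; real ones come from a trained model
# and carry hundreds of dimensions, not three.
catalog = [
    {"id": "umbrella",   "vector": [0.9, 0.1, 0.0]},
    {"id": "raincoat",   "vector": [0.8, 0.2, 0.1]},
    {"id": "sunglasses", "vector": [0.0, 0.1, 0.9]},
]

# A user who just bought rain boots: the query embedding sits near rain gear,
# so rain-related items rank first.
print(top_k([0.85, 0.15, 0.05], catalog))
```

Brute-force scoring like this is O(n) per query and collapses at billions of vectors, which is exactly why an indexed, horizontally scalable vector database became a hard requirement.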
After all, at this scale we wanted it all: performance, stability, low latency, and low cost.
At that point, we assumed we would have to build a vector database ourselves, just as we had before.
02
Technology selection: Why Milvus stands out
We tested many well-known products on the market, from Weaviate to AlloyDB, and the results were all borderline: call them bad, and they seemed to run fine; call them fine, and problems kept surfacing.
Milvus was the exception. Its measured performance and horizontal scalability exceeded expectations and met every technical bar for the backlog of AI projects we needed to handle.
The specific advantages fall into three areas:
Breaking the performance ceiling: our commodity inventory data is updated every hour, which places strict demands on system throughput. In our tests, Milvus imported the full dataset 5-10x faster than the alternatives; a task that took a competitor 8 hours finished in Milvus in under 1 hour. That import speed directly sets the ceiling on how real-time the system can be.
A flexible, elegant architecture: China has Double 11 and the world has Black Friday; traffic peaks and troughs are routine for payment systems. Milvus's storage-compute separation and dynamic scaling greatly improve resource-allocation efficiency and have carried us through one shopping spree after another.
A surprisingly smooth developer experience: Milvus's community building is well known worldwide. Vector databases are a new category of product, but Milvus's clear documentation, friendly developer toolchain, and active community sharply reduced our learning curve. That ease of use laid the groundwork for rapid iteration on subsequent AI applications.
With those questions settled, our one remaining concern was: do we dare put an open-source project into our production chain?
Later, when engineers from Zilliz (the team behind Milvus) showed up, their professionalism and commercial support sealed the deal.
After the recommendation system went live, the results were striking. With real-time recommendations, dynamic inventory response, and flexible scheduling of product pools, we improved not only conversion rates but also user-satisfaction scores. In payments, never underestimate a 1% lift in a metric: spread across hundreds of millions of transactions, it is real money.
03
A surprise after launch: vector search can do more than recommendation
As mentioned earlier, the company's requirement was to carry the lessons of the recommendation system into other businesses and build reusable AI/ML infrastructure for multiple scenarios. So after the recommendation system launched and stabilized, we began extending Milvus into intelligent customer service.
Compared with the clumsy bots of the past, which irritated users with irrelevant answers, the new generation of multilingual customer-service bots can automatically handle more than 80% of routine inquiries through vector-based semantic understanding, a step change over our original keyword-based bot.
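The gap between the old keyword bot and the new semantic one can be shown with a toy sketch. The FAQ entries and 2-dimensional embeddings below are made up for illustration; the real system embeds text with a multilingual model and searches the vectors in Milvus.

```python
import math

# Hypothetical FAQ entries with toy keyword sets and toy embeddings.
faq = {
    "How do I get my money back?": {"keywords": {"money", "back"},
                                    "vector": [0.9, 0.1]},
    "How do I change my password?": {"keywords": {"change", "password"},
                                     "vector": [0.1, 0.9]},
}

def keyword_match(query_words):
    """Old approach: return FAQs sharing at least one keyword, else nothing."""
    return [q for q, e in faq.items() if e["keywords"] & query_words]

def semantic_match(query_vec):
    """New approach: return the FAQ whose embedding is closest to the query."""
    def cos(a, b):
        dot = sum(x * y for x, y in zip(a, b))
        return dot / (math.hypot(*a) * math.hypot(*b))
    return max(faq, key=lambda q: cos(query_vec, faq[q]["vector"]))

# "I want a refund" shares no keywords with the refund FAQ,
# so the keyword bot comes back empty-handed...
print(keyword_match({"i", "want", "a", "refund"}))
# ...but its (hypothetical) embedding lands right next to the refund
# question in vector space, so the semantic bot finds it.
print(semantic_match([0.85, 0.15]))
```

The same mechanism is what makes the bot language-agnostic: a multilingual embedding model maps paraphrases and translations of a question into nearby vectors, so no per-language keyword lists are needed.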
At the same time, a vector database lets the customer-service system stay connected to the knowledge base in real time. In the past, synchronizing system updates and business changes across global customer-service teams, so that everyone was working from the same information, was slow and labor-intensive. Often the system had been updated while agents were still answering from the previous version; users came away with the wrong understanding, and complaints followed.
Now, with a vector database, aligning information is just a database write. Labor costs drop, and customer-service response times shrink to seconds.
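Here is a minimal sketch of that property, with a tiny in-memory index standing in for Milvus and hypothetical embeddings: the knowledge-base update is a single write, and the very next query already reflects it, with no redeploy and no retraining.

```python
import math

class TinyVectorIndex:
    """In-memory stand-in for a vector database. The point it illustrates is
    that an insert is visible to the very next search."""

    def __init__(self):
        self.rows = []  # (text, vector) pairs

    def insert(self, text, vector):
        self.rows.append((text, vector))

    def search(self, query):
        """Return the stored text whose vector is most similar to the query."""
        def cos(a, b):
            dot = sum(x * y for x, y in zip(a, b))
            na = math.sqrt(sum(x * x for x in a))
            nb = math.sqrt(sum(y * y for y in b))
            return dot / (na * nb)
        return max(self.rows, key=lambda r: cos(query, r[1]))[0]

# Hypothetical embeddings; a real pipeline embeds policy text with a model.
kb = TinyVectorIndex()
kb.insert("Refunds take 5 business days.", [0.9, 0.1])

query = [0.95, 0.05]  # a user asking about refund timing
print(kb.search(query))  # answered from the old policy

# The policy changes: one write to the index is the whole rollout.
kb.insert("Refunds are now instant.", [0.95, 0.05])
print(kb.search(query))  # the very next query reflects the new policy
```

In production, the same pattern would be a write to the Milvus collection backing the bot; how quickly inserts become searchable is governed by the database's consistency settings, which is worth checking against the Milvus documentation for your version.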
Next, we are evaluating a migration from self-hosted Milvus to Zilliz Cloud. Our self-built clusters work, but honestly, they take real effort to maintain. A fully managed service would cut the operations headcount and let engineers focus on business innovation rather than infrastructure maintenance.
A brief summary from a technologist
Looking back on the project, I took away several insights:
Insight 1: in deployment, the model is not the bottleneck; the system is. When people talk about AI, they love to discuss how powerful the model is and how accurate its inference is, but in reality the model is only a small part. Some 80% of the problems in AI projects eventually come down to "the system can't keep up with the model": compute, storage, retrieval, concurrency, each has to be optimized in turn.
Insight 2: don't underestimate the strategic value of infrastructure. Milvus's 5-10x data-throughput advantage and elastic architecture were, for us, not just good benchmark numbers; they determined the launch cadence and quality of the whole AI project, and even how much business our team could win later. Without a fast, solid overhaul of the recommendation system, we would never have had the chance to upgrade the customer-service system.
Insight 3: don't chain engineers to operations. Self-hosting brings freedom, but managed services bring speed. If you work in an enterprise with an A/B-testing culture, surrounded by strong teams, believe me: compared with saving the company a few dollars by building in-house, your boss cares far more about whether you can deliver, first and fastest, what others cannot.
Going forward, the pace of enterprise AI adoption will depend increasingly on the maturity of platform infrastructure. Using the best products and building reusable architecture is always sound logic.
That is true in financial payments, and in every industry.