Kubernetes powers secure innovation at Mastercard’s AI workbench

How Mastercard uses Kubernetes to improve AI productivity and security
Core content:
1. Mastercard builds an AI workbench based on Kubernetes and uses Red Hat OpenShift to ensure data security
2. Use Jupyter Notebooks to accelerate experiments and dynamic GPU allocation to improve resource utilization efficiency
3. Integrate Kubeflow and Spark Operator to achieve AI/ML model training and workflow automation
Mastercard built an AI workbench on Kubernetes, using Red Hat OpenShift to run disconnected clusters that keep data secure. With Jupyter Notebooks for rapid experimentation, dynamic GPU allocation, and integrated Kubeflow and Spark Operator, the platform delivers efficient AI/ML model training and workflow automation, improving data scientist productivity and accelerating AI innovation.
Artificial intelligence is no longer a foreign concept to most businesses. With the right tools, platforms, and teams, effective application of AI and machine learning can grow as a natural extension of a company's existing infrastructure. For Mastercard, a deep understanding of data science [2] is already embedded within the global payments company.
Alexander Hughes, director of software engineering at Mastercard, said the company’s mission is to “connect and power an inclusive digital economy that benefits everyone by making transactions safe, simple, smart and accessible. We leverage secure data and networks, partnerships and passion to deliver solutions and innovation that help individuals, financial institutions, governments and businesses realize their greatest potential.”
Creating space for experimentation
He and Ravishankar Rao, Chief Kubernetes Architect at Mastercard, spoke at the OpenShift Commons Gathering [3] , a co-located event at KubeCon Salt Lake City [4] . In their talk, they detailed the transformation they had achieved for data scientists at Mastercard. They began by envisioning a new platform built around the idea of providing a space for experimentation that could be seamlessly and securely moved into production.
“We wanted to ensure that the platform provided a rapid experimentation space for data engineers and data scientists, and we used Jupyter Notebooks [5] to achieve this,” said Hughes. “This, combined with fine-tuned CPU and GPU profiles, enabled efficient resource utilization, supporting rapid iteration and innovative solutions.”
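Mastercard has not published its profile definitions, but a CPU/GPU profile of this kind typically maps to Kubernetes resource requests and limits on the notebook pod. A minimal sketch, in which the profile label, image, and resource values are all illustrative assumptions:

```yaml
# Illustrative only: a Jupyter notebook pod shaped by a "gpu-small" profile.
# The profile name, image, and sizing are assumptions, not Mastercard's
# actual configuration.
apiVersion: v1
kind: Pod
metadata:
  name: notebook-gpu-small
  labels:
    workbench/profile: gpu-small   # hypothetical profile label
spec:
  containers:
    - name: notebook
      image: jupyter/scipy-notebook:latest
      resources:
        requests:
          cpu: "2"
          memory: 8Gi
          nvidia.com/gpu: 1        # requires the NVIDIA device plugin
        limits:
          cpu: "4"
          memory: 16Gi
          nvidia.com/gpu: 1        # GPU requests must equal limits
```

Because `nvidia.com/gpu` is not overcommittable, the request and limit must match; the CPU and memory bounds are what a per-profile tuning exercise would adjust.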
“Next, we started to address the workflow orchestration problem for training,” Hughes said. “As a result, we achieved efficient and scalable machine learning model training through dynamic GPU allocation and a dedicated GPU cluster environment.”
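The talk did not include configuration details, but dynamic GPU allocation in Kubernetes is commonly expressed as a batch Job that requests `nvidia.com/gpu` resources and tolerates the taint on a dedicated GPU node pool, so GPUs are bound only for the lifetime of the training run. A hedged sketch, in which the taint key, image, and command are assumptions:

```yaml
# Illustrative training Job for a dedicated GPU environment.
apiVersion: batch/v1
kind: Job
metadata:
  name: train-model
spec:
  backoffLimit: 2
  template:
    spec:
      restartPolicy: Never
      # Assumed taint keeping general-purpose workloads off GPU nodes.
      tolerations:
        - key: nvidia.com/gpu
          operator: Exists
          effect: NoSchedule
      containers:
        - name: trainer
          image: registry.example.com/ml/trainer:latest  # hypothetical image
          command: ["python", "train.py"]
          resources:
            limits:
              nvidia.com/gpu: 2   # GPUs are released when the Job completes
```

Tainting the GPU nodes and tolerating the taint only in training workloads is one standard way to keep scarce accelerators from being consumed by ordinary pods.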
He continued, “We have centralized collaboration capabilities that further enhance the training workflow, making it seamless and efficient. The platform provides the ability to seamlessly register, manage, and share features. This facilitates collaborative feature engineering, ensuring these teams can work together effectively and leverage shared resources.”
Building an AI Workbench on Kubernetes
Key features of Mastercard's AI Workbench on Kubernetes. Source: Mastercard
The new platform, called AI Workbench, also has extremely high security requirements [6] : few organizations hold more payment card information than the card networks themselves, and it is precisely this private data that forms the core of Mastercard's datasets.
Ideally, an AI workbench like this runs in offline mode, with neither the workloads nor the cluster having any open internet access. That is why Mastercard built it on Red Hat OpenShift [7] : its clusters can run in a disconnected environment.
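OpenShift supports disconnected operation by mirroring all required container images into an internal registry and redirecting image pulls to it. Mastercard's setup is not public; as a sketch, one mechanism OpenShift provides for this is an ImageContentSourcePolicy, where the internal mirror hostname below is an assumption:

```yaml
# Illustrative: redirect image pulls to an internal mirror so the cluster
# never needs internet access. The mirror hostname is hypothetical.
apiVersion: operator.openshift.io/v1alpha1
kind: ImageContentSourcePolicy
metadata:
  name: mirror-registry
spec:
  repositoryDigestMirrors:
    - source: quay.io/openshift-release-dev/ocp-release
      mirrors:
        - mirror.internal.example.com/ocp-release
```

Workload images for the workbench itself would be mirrored and redirected the same way.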
“Everything we do in this workbench is based on Kubernetes [8] , but we want to ensure that the resources of the Kubernetes cluster we are talking about are protected and isolated from general-purpose workloads, and we do this through purpose-built pure AI/ML clusters,” Hughes said. “This ensures a dedicated ecosystem that is tailored for these advanced purposes.”
“In the AI product development space, we noticed that our engineers often performed repetitive tasks, so we implemented automated workflow instantiations to assist with activities such as hyperparameter optimization, model selection, and feature selection,” Hughes explained.
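Hughes did not name the tooling behind these automated workflows. In the Kubeflow ecosystem, hyperparameter optimization is commonly driven by a Katib Experiment resource; the sketch below is an assumption about how such a workflow could look, with the metric name, parameter range, and trial image all hypothetical:

```yaml
# Illustrative Katib Experiment for automated hyperparameter search.
apiVersion: kubeflow.org/v1beta1
kind: Experiment
metadata:
  name: hp-tuning
spec:
  objective:
    type: maximize
    objectiveMetricName: accuracy   # assumed metric name
  algorithm:
    algorithmName: random
  maxTrialCount: 12
  parallelTrialCount: 3
  parameters:
    - name: lr
      parameterType: double
      feasibleSpace:
        min: "0.0001"
        max: "0.1"
  trialTemplate:
    primaryContainerName: training
    trialParameters:
      - name: learningRate
        reference: lr
        description: Learning rate for the training job
    trialSpec:
      apiVersion: batch/v1
      kind: Job
      spec:
        template:
          spec:
            restartPolicy: Never
            containers:
              - name: training
                image: registry.example.com/ml/trainer:latest  # hypothetical
                command:
                  - python
                  - train.py
                  - "--lr=${trialParameters.learningRate}"
```

Each trial runs as its own Job with a sampled learning rate, which is the kind of repetitive task the quote describes automating away.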
Technical diagram of Mastercard AI Workbench components. Source: Mastercard
Rao described the final steps of building AI Workbench. “We integrated all the components of Kubeflow and the Spark Operator so that data scientists can run their AI/ML workloads,” he said. “In the end, we were able to deploy much of AI Workbench across development, staging, and production environments in an automated way, onboard a large number of data scientists to the platform, and deliver several value-added solutions. Most importantly, we also introduced GPU computing, which accelerated training from weeks to days.”
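The Spark Operator integration Rao mentions means Spark jobs are submitted declaratively as SparkApplication custom resources rather than via `spark-submit`. A minimal sketch using the operator's v1beta2 API, where the namespace, image, application file, and sizing are assumptions:

```yaml
# Illustrative SparkApplication; names and sizing are hypothetical.
apiVersion: sparkoperator.k8s.io/v1beta2
kind: SparkApplication
metadata:
  name: feature-etl
  namespace: ml-workloads
spec:
  type: Python
  mode: cluster
  image: registry.example.com/ml/spark:3.5.0
  mainApplicationFile: local:///opt/jobs/feature_etl.py
  sparkVersion: "3.5.0"
  driver:
    cores: 1
    memory: 2g
    serviceAccount: spark-operator-spark
  executor:
    instances: 4
    cores: 2
    memory: 4g
```

Submitting Spark work this way lets the same GitOps pipeline that deploys the workbench also deploy and version data-processing jobs, which fits the automated multi-environment rollout Rao describes.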
Mastercard conducted an internal survey to understand what users think of AI Workbench. Internal data scientists and developers called it a “great platform for experimentation,” and teams were pleased with how easy it was to access data from AI Workbench on OpenShift, since the platform already supported the necessary libraries and tools.