Note
- All links in this article lead to the page github.com
Ever wondered how a major financial institution like PostFinance keeps its digital operations running smoothly and efficiently? Dive into the world of Kubernetes at PostFinance: learn how we tackle the complexities of cloud orchestration, discover the solutions we’ve developed, such as the open-source kubenurse monitoring tool, and find out how these tools both improve our own operations and contribute to the global tech community.
Kubernetes has become the de facto orchestration platform in the industry. It allows PostFinance to dynamically and reliably scale banking workloads across multiple data centers according to customer needs. In addition, because Kubernetes is open source and widely adopted, we benefit from the latest developments in cloud orchestration technologies, and we even have the chance to play a leading role by publishing open-source software ourselves and by contributing to Kubernetes itself.
We operate around 30 shared Kubernetes clusters across multiple network zones and environments, all based on the vanilla, open-source distribution of Kubernetes. Unlike many companies, we use only open-source tools (such as kubeadm and Ansible) to provision and manage our clusters, which gives us a great degree of freedom and allows us to update to and benefit from the latest releases.
Given the highly dynamic and distributed nature of Kubernetes, where applications are spread across a multitude of compute nodes (or servers, if you prefer) and are expected to move around frequently, ensuring adequate network performance and latency between all nodes is a complex task. This is all the more important because, for some of the applications running on Kubernetes, even the slightest network issue can result in a major delay when customers try to pay by card or withdraw money.
Kubenurse is an open-source distributed network-monitoring application written and maintained by PostFinance. It runs on all our Kubernetes compute nodes and constantly performs a series of network checks towards neighboring nodes, towards the HTTP(S) entry point of the cluster itself and more. It helps us to monitor network stability and latency and allows us to quickly identify nodes that are showing suboptimal performance or that are losing network packets.
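To make the idea of such checks concrete, here is a minimal sketch in Python of a loop that times a request towards each target and records the latency or the error. It is not kubenurse's actual implementation: the names `run_checks` and `http_probe`, the target names and the URLs are all hypothetical, chosen only for illustration.

```python
import time
import urllib.request
from typing import Callable, Dict, Optional


def http_probe(url: str, timeout_s: float = 5.0) -> None:
    """Hypothetical probe: fetch the URL and raise if the request fails."""
    with urllib.request.urlopen(url, timeout=timeout_s) as response:
        response.read()


def run_checks(
    targets: Dict[str, str],
    probe: Callable[[str], None] = http_probe,
    clock: Callable[[], float] = time.monotonic,
) -> Dict[str, Dict[str, Optional[object]]]:
    """Time one probe per target and record its latency and any error.

    `targets` maps an illustrative check name (for example a neighboring
    node or the cluster entry point) to the URL to probe.
    """
    results: Dict[str, Dict[str, Optional[object]]] = {}
    for name, url in targets.items():
        start = clock()
        try:
            probe(url)
            error = None
        except Exception as exc:  # a failed check is a result, not a crash
            error = str(exc)
        results[name] = {"latency_s": clock() - start, "error": error}
    return results
```

Calling `run_checks({"neighbor-1": "http://10.0.0.2:8080/alwayshappy"})` would return one latency and error entry per target; the address and path here are placeholders, not real endpoints. Injecting `probe` and `clock` keeps the sketch testable without network access.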
First, we benefit a great deal from the fact that Kubernetes is open source, and it is therefore only fair that we contribute to the community by making a network-monitoring tool available to all other users, especially as the tool doesn’t contain any confidential data or business logic.
It also allows us to benefit from contributions from users across the globe (such as this improvement, this fix, or this minor improvement), and to open discussions on issues we haven’t encountered (yet), such as this discussion about optimizing the tool for extremely large clusters, in which cross-company exchanges between engineers led to an appropriate solution.
The fact that kubenurse is open source means that its source code is readable by anyone, and that any user can contribute to the project or raise an issue and start a discussion.
It also typically implies that the project will be published and made available to all users, and ideally that PostFinance will continue to maintain it (i.e. update the tool, implement new features and fix potential bugs or issues).
The tool is distributed across all compute nodes on the cluster, and a series of checks are performed to ensure that the various network paths across the nodes are valid. kubenurse also collects precise latency measurements and exposes them as metrics, which later allow us to build visualizations/dashboards and to issue an alert when a given metric exceeds our service-level objective.
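The alerting idea can be illustrated with a small Python sketch that compares a latency percentile against a service-level objective. This is an assumption-laden illustration rather than how PostFinance evaluates its alerts: the function names, the nearest-rank percentile method and the threshold values are hypothetical, and in practice such rules are usually evaluated by the monitoring system rather than in application code.

```python
import math
from typing import Sequence


def percentile(samples: Sequence[float], q: int) -> float:
    """Return the q-th percentile (1..100) using the nearest-rank method."""
    if not samples:
        raise ValueError("at least one sample is required")
    if not 1 <= q <= 100:
        raise ValueError("q must be between 1 and 100")
    ordered = sorted(samples)
    # Nearest rank: the smallest value with at least q% of samples at or below it.
    rank = math.ceil(q * len(ordered) / 100)
    return ordered[rank - 1]


def breaches_slo(samples: Sequence[float], q: int, threshold_s: float) -> bool:
    """True when the q-th percentile latency exceeds the (hypothetical) SLO."""
    return percentile(samples, q) > threshold_s
```

For example, with 98 measurements of 10 ms plus two outliers of 200 ms and 500 ms, the 99th percentile is 200 ms, so `breaches_slo(samples, 99, 0.1)` reports a breach of a 100 ms objective while the median stays at 10 ms.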
A Grafana dashboard is made readily available in the project’s open-source code repository, and kubenurse users can import it into their Grafana instance to start visualizing and interpreting measurements carried out by kubenurse.
Once this has been done, a dashboard like the one shown below typically becomes available in a matter of minutes, providing detailed real-time insights into the overall network health of the cluster.
PostFinance benefits from open-source projects such as the Kubernetes cloud orchestration platform, and contributes to the ecosystem by making several projects, including the kubenurse network-monitoring tool, available to the community.
Bi-directional, cross-company collaboration of this kind enables PostFinance and others to benefit from cutting-edge software and to work together on complex issues, ultimately establishing a state-of-the-art cloud platform whose performance and reliability ensure a dependable experience for users and developers alike.