High availability (HA)

Automatic, autonomous and self-healing HA

Mission-critical applications require a resilient Kubernetes cluster that can lose any node and still provide reliable services. Zero downtime is a necessity in production environments. Fault tolerance is especially critical for remote, unattended clusters, appliances and industrial IoT, where access is limited and deployments are distributed across a country, a continent or the globe.

MicroK8s makes high availability resilient and self-healing with no administrative interventions required.

What is high availability Kubernetes?

A highly available Kubernetes cluster can withstand the failure of any single component and continue serving workloads without interruption. Three factors are necessary for a highly available Kubernetes cluster:

  • There must be more than one worker node. Since MicroK8s uses every node as a worker node, there is always more than one worker if there is more than one node in the cluster.
  • The Kubernetes API services must be running on more than one node so that losing a single node does not render the cluster inoperable. Every node in the MicroK8s cluster is an API server, which simplifies load-balancing and means clients can switch to a different API endpoint as soon as one fails.
  • The cluster state must be in a reliable datastore. By default, MicroK8s uses Dqlite, a highly available, replicated form of SQLite, as its datastore.
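Because every node serves the API, a client only needs a list of node addresses and a way to tell whether one is reachable. The following is a minimal sketch of that failover idea, not part of MicroK8s itself: the node addresses are hypothetical, and the `probe` callable is a stand-in for whatever health check a real client would use.

```python
from typing import Callable, Iterable, Optional


def first_healthy(endpoints: Iterable[str],
                  probe: Callable[[str], bool]) -> Optional[str]:
    """Return the first endpoint whose probe succeeds, or None if all fail."""
    for endpoint in endpoints:
        try:
            if probe(endpoint):
                return endpoint
        except OSError:
            # A connection error means this node is unreachable; try the next.
            continue
    return None


# Hypothetical node addresses, for illustration only.
NODES = [
    "https://node-a.example",
    "https://node-b.example",
    "https://node-c.example",
]


def fake_probe(endpoint: str) -> bool:
    """Stand-in health check that pretends the first node is down."""
    if endpoint == NODES[0]:
        raise ConnectionRefusedError("node-a is down")
    return True


chosen = first_healthy(NODES, fake_probe)
print(chosen)
```

Injecting the probe keeps the selection logic independent of any particular HTTP client, so the same function can be exercised without a running cluster.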

All that is required for HA MicroK8s is three or more nodes in the cluster, at which point Dqlite is automatically highly available. If the cluster has more than three nodes, then additional nodes will be standby candidates for the datastore and promoted automatically if the datastore loses one of its nodes. The automatic promotion of standby nodes into the voting cluster of Dqlite makes MicroK8s HA autonomous and ensures that quorum is maintained even if no administrative action is taken.
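The three-node minimum follows from majority quorum arithmetic, which is standard for Raft-based datastores. The sketch below is illustrative only (the function names are not a MicroK8s or Dqlite API) and shows how many voter failures a group can absorb while a majority remains.

```python
def quorum(voters: int) -> int:
    """Smallest number of voters that forms a strict majority."""
    return voters // 2 + 1


def tolerated_failures(voters: int) -> int:
    """Voters that can fail while a majority is still available."""
    return voters - quorum(voters)


for n in (1, 2, 3, 5):
    print(f"voters={n} quorum={quorum(n)} tolerated={tolerated_failures(n)}")
```

With one or two voters no failure can be tolerated, which is why the datastore only becomes highly available once a third node joins.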

HA Kubernetes cluster with MicroK8s

MicroK8s delivers a production-grade Kubernetes cluster simply by adding more MicroK8s nodes. No extra configuration is required: install MicroK8s on three machines, run the join command to link them together, and in moments you have a cluster with HA enabled automatically.

HA MicroK8s provides API services on all nodes. This means that any node in the cluster can be a target for kubectl. Administrators can perform tasks on any node. Three of the nodes are automatically selected to provide the datastore for the Kubernetes control plane, based on their capacity and utilisation. In the event of a datastore node failure, a different node is promoted to participate in the datastore consensus.
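The promotion behaviour can be pictured with a small model. This is a simplified sketch under stated assumptions, not Dqlite's implementation: the class and field names are hypothetical, and the first standby in the list is promoted, whereas MicroK8s selects nodes based on capacity and utilisation.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class DatastoreModel:
    """Toy model of a datastore with a fixed target number of voters."""
    voters: List[str]
    standbys: List[str] = field(default_factory=list)
    target_voters: int = 3

    def fail(self, node: str) -> None:
        """Drop a failed node, then promote standbys to refill the voters."""
        if node in self.voters:
            self.voters.remove(node)
        elif node in self.standbys:
            self.standbys.remove(node)
        # Assumption: promote in list order; the real selection differs.
        while len(self.voters) < self.target_voters and self.standbys:
            self.voters.append(self.standbys.pop(0))


model = DatastoreModel(voters=["n1", "n2", "n3"], standbys=["n4", "n5"])
model.fail("n2")
print(model.voters, model.standbys)
```

After the failure the voter set is back to three members, so the datastore can again tolerate the loss of one voter.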

Autonomous HA Kubernetes

HA Kubernetes can be complex, but MicroK8s automates set-up and configuration: a single command installs it, and a single command joins a node to a cluster. As soon as the cluster includes three or more nodes, Dqlite is resilient and the API services are distributed across all of them. If one node fails or is restarted, Kubernetes keeps running and recovers to full HA when the node becomes available again, with no administrative action. Autonomy combined with high availability delivers a full Kubernetes with minimal setup, able to support mission-critical workloads with operational efficiency.

Dqlite datastore for autonomous high availability

MicroK8s supports high availability using Dqlite as the datastore for cluster state. Dqlite is a fast, embedded, persistent SQL database that is perfect for fault-tolerant IoT devices and micro clouds. In conjunction with MicroK8s, it removes process overhead by embedding the database inside Kubernetes itself and reduces the overall memory footprint of the cluster.

Raft Consensus Algorithm

Fault-tolerant distributed systems need to reach consensus. Multiple servers need to ‘agree’ on replicated state values. Once consensus is reached, the values are final, even if a minority of the servers fail. Raft is a widely used consensus algorithm for this purpose.

Using Raft, Dqlite automatically handles leader election, elects a replacement leader if one fails and ensures that the cluster state is always preserved. Nodes that are not part of the voting group can either be standby, waiting to be added to the voting quorum, or spare, ready to be added if required. The processes of HA cluster formation, Dqlite syncing, and voter and leader elections are fully autonomous, so that minimal administration is needed.
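The core rule behind leader election is that each voter grants at most one vote per term and a candidate needs a majority to win. The sketch below illustrates only that rule; it is not Dqlite's code, all names are hypothetical, and it omits the timeouts, log comparison and heartbeats that a full Raft implementation requires.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Voter:
    name: str
    alive: bool = True
    # Records which candidate this voter backed in each term.
    votes_cast: Dict[int, str] = field(default_factory=dict)

    def request_vote(self, term: int, candidate: str) -> bool:
        """Grant the vote if alive and not already pledged in this term."""
        if not self.alive:
            return False
        return self.votes_cast.setdefault(term, candidate) == candidate


def wins_election(candidate: Voter, voters: List[Voter], term: int) -> bool:
    """A candidate wins when a strict majority of all voters back it."""
    granted = sum(v.request_vote(term, candidate.name) for v in voters)
    return granted >= len(voters) // 2 + 1


a, b, c = Voter("a"), Voter("b"), Voter("c")
group = [a, b, c]
a.alive = False  # the previous leader fails

b_wins = wins_election(b, group, term=2)  # b votes for itself, c backs b
c_wins = wins_election(c, group, term=2)  # votes for term 2 are already cast
print(b_wins, c_wins)
```

Because votes are counted against the full voter group rather than the surviving nodes, a group that has lost its majority cannot elect a leader, which is what keeps the replicated state consistent.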

Read more about how HA MicroK8s works in the documentation.
