Javatar81 Javatar81 - 9 months ago 56
MySQL Question

Is it recommended to run clustered database with Kubernetes in production environment?

Is it reasonable to use Kubernetes for a clustered database such as MySQL in production environment?

There are example configurations such as mysql galera example. However, most examples do not make use of persistent volumes. As far as I've understood persistent volumes must reside on some shared file system as defined here Kubernetes types of persistent volumes. A shared file system will not guarantee that the database files of the pod will be local to the machine hosting the pod. It will be accessed over network which is rather slow. Moreover, there are issues with MySQL and NFS, for example.

This might be acceptable for a test environment. However, what should I do in a production environment? Is it better to run the database cluster outside Kubernetes and run only application servers with Kubernetes?

Answer Source

The Kubernetes project introduced PetSets, a new pod management abstraction, intended to run stateful applications. It is an alpha feature at present (as of version 1.4) and moving rapidly. A list of the various issues as we move to beta are listed here. Quoting from the section on when to use petsets:

A PetSet ensures that a specified number of "pets" with unique identities are running at any given time. The identity of a Pet is comprised of:

  • a stable hostname, available in DNS
  • an ordinal index
  • stable storage: linked to the ordinal & hostname

In addition to the above, it can be coupled with several other features which help one deploy clustered stateful applications and manage them. Coupled with dynamic volume provisioning for example, it can be used to provision storage automatically.

There are several YAML configuration files available (such as the ones you referenced) using ReplicaSets and Deployments for MySQL and other databases which may be run in production and are probably being run that way as well. However, PetSets are expected to make it a lot easier to run these types of workloads, while supporting upgrades, maintenance, scaling and so on.

You can find some examples of distributed databases with petsets here.

The advantage of provisioning persistent volumes which are networked and non-local (such as GlusterFS) is realized at scale. However, for relatively small clusters, there is a proposal to allow for local storage persistent volumes in the future.