Recommended Kubernetes cluster sizing

Rhize runs on Kubernetes.

This document provides compute recommendations for the nodes, pods, and services of your Rhize installation. Some services also have recommended replication factors to increase reliability.

Node recommendations

The following tables list the minimum recommended sizes for provisioning your cluster for Rhize 3.2.1.

Rhize nodes

For high availability, Rhize recommends a minimum of three nodes with the following specifications.

| Property | Value |
| --- | --- |
| Number of nodes | 3 |
| CPU speed (GHz) | 3.3 |
| vCPU per node | 16 |
| Memory per node (GiB) | 32 (64 is better) |
| Persisted volumes | 12 |
| Persisted volume IOPS | 5000 |
| PV throughput (MBps) | 500 |
| Total disk space (TB) | 3 |
| Disk IOPS | 5000 |
| Disk throughput (MBps) | 500 |
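
As a rough illustration of what the minimum node specification adds up to, the following Python sketch multiplies the per-node values from the table above. The function name `cluster_capacity` and the one-node-down scenario are illustrative assumptions, not part of Rhize. The totals are raw capacity: Kubernetes reserves part of each node for system components, so the amount that pods can actually be scheduled against is lower.

```python
# Minimum node specification, taken from the table above.
NODE_COUNT = 3
VCPU_PER_NODE = 16
MEMORY_PER_NODE_GIB = 32


def cluster_capacity(nodes: int, vcpu_per_node: int, memory_per_node_gib: int) -> dict:
    """Return the raw vCPU and memory totals for a cluster of identical nodes."""
    return {
        "vcpu": nodes * vcpu_per_node,
        "memory_gib": nodes * memory_per_node_gib,
    }


full = cluster_capacity(NODE_COUNT, VCPU_PER_NODE, MEMORY_PER_NODE_GIB)
# Capacity that remains if one node is unavailable (hypothetical failure scenario).
degraded = cluster_capacity(NODE_COUNT - 1, VCPU_PER_NODE, MEMORY_PER_NODE_GIB)

print(full)      # {'vcpu': 48, 'memory_gib': 96}
print(degraded)  # {'vcpu': 32, 'memory_gib': 64}
```

With 64 GiB per node instead of 32, the same calculation gives 192 GiB of raw memory across three nodes.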

Rhize agent

The Rhize agent typically runs on the edge, entirely outside the cluster. Its minimum recommended specifications are as follows:

| Property | Value |
| --- | --- |
| CPU speed (GHz) | 2.8 |
| vCPU per node | 2 |
| Memory per node (GiB) | 1 |
| Persisted volumes | 1 |

Service-level recommendations

The following table lists the minimum recommended specifications for the main services. Services with a stateful PV have one persistent volume per pod.

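| Service | Pods for HA (replica count) | vCPU per pod | Memory per pod | Stateful PV | Disk size (GiB) | Comments |
| --- | --- | --- | --- | --- | --- | --- |
| baas-alpha | 3 | 8 | 16 (at least) | Yes | 750 | High throughput and IOPS |
| baas-zero | 3 | 2 | 2 | Yes | 350 | High throughput and IOPS |
| libre-core | 3 | 1 | 2 | No | N/A | HA requires 2 pods, but 3 is to avoid hotkey issues and balance load |
| bpmn-engine | 3 | 1 | 2 | No | N/A | HA requires 2 pods, but 3 is to avoid hotkey issues and balance load |
| nats | 3 | 1 | 2 | Yes | 100 | High IOPS |
| nats-box | 1 | 0.25 | 0.25 | No | N/A | |
| libre-audit | 2 | 1 | 1 | No | N/A | |
| libre-audit-postgres | 2 | 1 | 2 | Yes | 250 | Runs in pod with libre-audit |
| libre-ui | 3 | 0.25 | 0.25 | No | N/A | |
| keycloak | 2 | 1 | 2 | No | N/A | |
| keycloak-postgres | 2 | 1 | 2 | No | 200 | Runs in pod with keycloak |
| router | 2 | 1 | 2 | Yes | <1 | Requires volume to compose supergraph |
| grafana* | 3 | 0.5 | 2 | No | 20-50 | Storage can be in host or in object bucket. |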
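
To show how a row of this table might translate into a Kubernetes manifest, the following Python sketch formats the per-pod values as Kubernetes resource quantities (millicores for CPU, `Mi` for memory). It rests on two assumptions that the table does not state: that the memory column is in GiB, as in the node table, and that the values are used as resource requests rather than limits. The helper names are hypothetical and are not part of any Rhize chart.

```python
import json


def cpu_quantity(vcpu: float) -> str:
    """Format a vCPU count as a Kubernetes CPU quantity in millicores."""
    return f"{round(vcpu * 1000)}m"


def memory_quantity(gib: float) -> str:
    """Format a memory size as a Kubernetes quantity in Mi (assumes the input is GiB)."""
    return f"{round(gib * 1024)}Mi"


def resources_block(vcpu: float, memory_gib: float) -> dict:
    """Build a container `resources` mapping that treats the table values as requests."""
    return {
        "requests": {
            "cpu": cpu_quantity(vcpu),
            "memory": memory_quantity(memory_gib),
        }
    }


# Example rows: nats-box (0.25 vCPU, 0.25 memory) and baas-alpha (8 vCPU, 16 memory).
nats_box = resources_block(0.25, 0.25)
baas_alpha = resources_block(8, 16)

print(json.dumps(nats_box, indent=2))
print(json.dumps(baas_alpha, indent=2))
```

The printed JSON can be pasted into the `resources` field of a container spec. Whether to also set limits, and how far above the requests, depends on the deployment.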

Monitoring stack

The following table provides minimal compute recommendations for the monitoring stack.

The default recommendation is to run your Rhize observability stack on the same nodes that run the Rhize application. However, some deployments prefer to separate monitoring into its own cluster.

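| Service | Pods for HA (replica count) | vCPU cores per pod | Memory per pod | Disk size (GiB) |
| --- | --- | --- | --- | --- |
| grafana | 3 | 0.5 | 2 | 50 |
| prometheus-node | 4 | 0.25 | 0.05 | N/A |
| prometheus-server | 1 per pod | 1 | 2 | 1 |
| promtail | 4 | 0.25 | 0.2 | N/A |
| loki | 1 | 1 | 1 | 1 |
| loki-logs | 1 per pod | 0.25 | 0.1 | N/A |
| loki-canary | 4 | 0.25 | 0.1 | N/A |
| loki-gateway | 1 | 0.25 | 0.05 | 0.25 |
| loki-grafana-operator | 1 | 0.25 | 0.1 | 0.25 |
| tempo-compactor | 1 | 0.25 | 2 | 0.25 |
| tempo-ingester | 3 | 0.5 | 0.75 | 1.5 |
| tempo-querier | 1 | 0.25 | 0.5 | 0.25 |
| tempo-distributor | 1 | 0.25 | 0.5 | 0.25 |
| tempo-query-frontend | 1 | 0.25 | 0.5 | 0.25 |
| temp-memcache | 1 | 0.25 | 0.1 | 0.25 |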
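
Because the default is to run monitoring on the same nodes as the application, it can help to estimate how much compute the monitoring stack takes on its own. The following Python sketch multiplies replicas by per-pod values for the rows above that have a fixed replica count. It assumes the column values are read as replica count, vCPU, and memory in that order, and that memory is in GiB (the table does not state a unit). The function name `stack_totals` is illustrative.

```python
# (service, replicas, vCPU per pod, memory per pod), copied from the table above.
# prometheus-server and loki-logs are omitted because their replica count
# is given as "1 per pod" rather than as a fixed number.
MONITORING_ROWS = [
    ("grafana", 3, 0.5, 2),
    ("prometheus-node", 4, 0.25, 0.05),
    ("promtail", 4, 0.25, 0.2),
    ("loki", 1, 1, 1),
    ("loki-canary", 4, 0.25, 0.1),
    ("loki-gateway", 1, 0.25, 0.05),
    ("loki-grafana-operator", 1, 0.25, 0.1),
    ("tempo-compactor", 1, 0.25, 2),
    ("tempo-ingester", 3, 0.5, 0.75),
    ("tempo-querier", 1, 0.25, 0.5),
    ("tempo-distributor", 1, 0.25, 0.5),
    ("tempo-query-frontend", 1, 0.25, 0.5),
    ("temp-memcache", 1, 0.25, 0.1),
]


def stack_totals(rows):
    """Sum replicas * per-pod vCPU and replicas * per-pod memory over all rows."""
    total_vcpu = sum(replicas * vcpu for _, replicas, vcpu, _ in rows)
    total_memory = sum(replicas * memory for _, replicas, _, memory in rows)
    return total_vcpu, total_memory


vcpu, memory = stack_totals(MONITORING_ROWS)
print(f"Monitoring stack (fixed-replica rows): {vcpu:.2f} vCPU, {memory:.2f} memory")
```

The result is a lower bound, since the two omitted services add further load that scales with the cluster.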