Skip to main content
Version: 3.11.0

Self-healing

NebulaGraph Operator calls the interface provided by NebulaGraph clusters to dynamically sense cluster service status. Once an exception is detected (for example, a component in a NebulaGraph cluster stops running), NebulaGraph Operator automatically performs fault tolerance. This topic shows how Nebular Operator performs self-healing by simulating cluster failure of deleting one Storage service Pod in a NebulaGraph cluster.

Prerequisites

Install NebulaGraph Operator

Steps

  1. Create a NebulaGraph cluster. For more information, see Create a NebulaGraph clusters.

  2. Delete the Pod named <cluster_name>-storaged-2 after all pods are in the Running status.

kubectl delete pod <cluster-name>-storaged-2 --now

<cluster_name> is the name of your NebulaGraph cluster.

  1. NebulaGraph Operator automates the creation of the Pod named <cluster-name>-storaged-2 to perform self-healing.

Run the kubectl get pods command to check the status of the Pod <cluster-name>-storaged-2.

...
nebula-cluster-storaged-1 1/1 Running 0 5d23h
nebula-cluster-storaged-2 0/1 ContainerCreating 0 1s
...
...
nebula-cluster-storaged-1 1/1 Running 0 5d23h
nebula-cluster-storaged-2 1/1 Running 0 4m2s
...

When the status of <cluster-name>-storaged-2 is changed from ContainerCreating to Running, the self-healing is performed successfully.