Self-healing
NebulaGraph Operator calls the interface provided by NebulaGraph clusters to dynamically sense cluster service status. Once an exception is detected (for example, a component in a NebulaGraph cluster stops running), NebulaGraph Operator automatically performs fault tolerance. This topic shows how NebulaGraph Operator performs self-healing by simulating a cluster failure: deleting one Storage service Pod in a NebulaGraph cluster.
Prerequisites
NebulaGraph Operator is installed in your Kubernetes cluster.
Steps
- Create a NebulaGraph cluster. For more information, see Create a NebulaGraph cluster.
- Delete the Pod named <cluster-name>-storaged-2 after all Pods are in the Running status.
kubectl delete pod <cluster-name>-storaged-2 --now
<cluster-name> is the name of your NebulaGraph cluster.
- NebulaGraph Operator automatically recreates the Pod named <cluster-name>-storaged-2 to perform self-healing.
Run the kubectl get pods command to check the status of the Pod <cluster-name>-storaged-2.
...
nebula-cluster-storaged-1 1/1 Running 0 5d23h
nebula-cluster-storaged-2 0/1 ContainerCreating 0 1s
...
After a few minutes, run the command again:
...
nebula-cluster-storaged-1 1/1 Running 0 5d23h
nebula-cluster-storaged-2 1/1 Running 0 4m2s
...
When the status of <cluster-name>-storaged-2 changes from ContainerCreating to Running, self-healing has succeeded.
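The steps above can be scripted so that the check does not depend on rerunning kubectl get pods by hand. The following is a minimal sketch, not part of NebulaGraph Operator: the function names (pod_phase, wait_for_running, simulate_failure), the default of 60 attempts, and the 5-second interval are all assumptions chosen for illustration. It polls the Pod's status.phase field, which reports Pending while the STATUS column shows ContainerCreating. Note that a Running phase does not by itself guarantee that the container is ready (1/1).

```shell
#!/bin/sh
# Hypothetical helpers; names and defaults are assumptions for illustration.

# Print the phase of a Pod (for example, Pending or Running).
# Prints nothing if the Pod does not exist yet.
pod_phase() {
  kubectl get pod "$1" -o jsonpath='{.status.phase}' 2>/dev/null
}

# Poll until the Pod reaches the Running phase or the attempts run out.
# Usage: wait_for_running <pod> [attempts] [interval_seconds]
wait_for_running() {
  wfr_pod="$1"
  wfr_attempts="${2:-60}"
  wfr_interval="${3:-5}"
  wfr_i=0
  while [ "$wfr_i" -lt "$wfr_attempts" ]; do
    if [ "$(pod_phase "$wfr_pod")" = "Running" ]; then
      return 0
    fi
    sleep "$wfr_interval"
    wfr_i=$((wfr_i + 1))
  done
  return 1
}

# Delete the Storage Pod with ordinal 2, then wait for its replacement.
# Usage: simulate_failure <cluster-name> [attempts] [interval_seconds]
simulate_failure() {
  sf_pod="$1-storaged-2"
  kubectl delete pod "$sf_pod" --now || return 1
  wait_for_running "$sf_pod" "${2:-60}" "${3:-5}"
}
```

For a cluster named nebula-cluster, as in the sample output, calling simulate_failure nebula-cluster returns 0 once the recreated Pod reports Running and 1 if it does not do so within the allotted attempts. Once the replacement Pod exists, kubectl wait --for=condition=Ready is a stricter alternative because it checks readiness rather than phase.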