How Do You Troubleshoot a Kubernetes Cluster That is Not Working at the Node Level?

Problem scenario
The nodes in a Kubernetes cluster are not working. What should you do?

Possible Solution #1
Run this command to list the nodes and their status: kubectl get nodes
For any node that is not in the Ready state, describe it to see its conditions and recent events. Assuming the node is named "foobar", run this command: kubectl describe node foobar
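
With many nodes, the two commands above can be chained. The following is a minimal sketch, not a definitive script: the function names not_ready_nodes and describe_not_ready_nodes are hypothetical, and it assumes the node name and status are the first two columns of the kubectl get nodes output.

```shell
#!/bin/sh
# Hypothetical helper: read `kubectl get nodes --no-headers` output on stdin
# and print the name of every node whose status does not start with "Ready".
not_ready_nodes() {
  awk 'NF >= 2 {
    split($2, status, ",")   # STATUS may look like "Ready,SchedulingDisabled"
    if (status[1] != "Ready") print $1
  }'
}

# Hypothetical wrapper: describe each node that is not Ready.
# Nothing runs until this function is called, and it needs a working kubectl.
describe_not_ready_nodes() {
  kubectl get nodes --no-headers | not_ready_nodes | while read -r node; do
    kubectl describe node "$node"
  done
}
```

After sourcing the file, calling describe_not_ready_nodes prints the description of each unhealthy node. Cordoned nodes that are otherwise Ready are deliberately skipped.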

Possible Solution #2
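If kubectl get nodes returns no nodes but pods exist (which can happen), list the pods:
kubectl get pods
Then view the logs of a pod from that list, replacing POD_NAME with the pod's name (kubectl logs fails without a pod name):
kubectl logs POD_NAME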
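
To avoid copying pod names by hand, the listing and the log lookup can be combined. This is a sketch under stated assumptions: the function names unhealthy_pods and logs_for_unhealthy_pods are hypothetical, the column order (namespace, name, ready, status) is that of kubectl get pods --all-namespaces, and treating only Running and Completed as healthy is a simplification.

```shell
#!/bin/sh
# Hypothetical helper: read `kubectl get pods --all-namespaces --no-headers`
# output on stdin and print "namespace pod-name" for every pod whose STATUS
# column is neither Running nor Completed.
unhealthy_pods() {
  awk 'NF >= 4 && $4 != "Running" && $4 != "Completed" { print $1, $2 }'
}

# Hypothetical wrapper: print the last 50 log lines of each unhealthy pod.
# Nothing runs until this function is called, and it needs a working kubectl.
logs_for_unhealthy_pods() {
  kubectl get pods --all-namespaces --no-headers | unhealthy_pods |
    while read -r ns pod; do
      echo "== $ns/$pod =="
      kubectl logs --namespace "$ns" "$pod" --all-containers --tail=50
    done
}
```

Pods stuck in Pending may have no logs yet. For those, kubectl describe pod usually gives more information than kubectl logs.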

Possible Solution #3
If you are using EKS, look at the user guide:
https://docs.aws.amazon.com/eks/latest/userguide/eks-ug.pdf

Possible Solution #4
Run this command: kubectl describe pods
If there is an error about a subnet.env file not existing, see this posting.
If the output says something like "cni config uninitialized", see this posting.
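
The describe output can be long, so it may help to filter it for the two messages mentioned above. This is a minimal sketch: the function names cni_hints and find_cni_errors are hypothetical, and the search patterns are taken only from the wording in this post, so the exact error text in your cluster may differ.

```shell
#!/bin/sh
# Hypothetical helper: keep only the lines on stdin that mention the
# subnet.env file or an uninitialized CNI config (case-insensitive).
cni_hints() {
  grep -iE 'subnet\.env|cni config uninitialized'
}

# Hypothetical wrapper: scan the description of every pod in every namespace.
# Nothing runs until this function is called, and it needs a working kubectl.
find_cni_errors() {
  kubectl describe pods --all-namespaces | cni_hints
}
```

No output from find_cni_errors means neither message was found in the pod descriptions.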
