Watch and wait for POD deletion with Golang k8s client

Issue

I need to watch (and wait) until a POD is deleted, because I need to start a second pod (with the same name) immediately after the first one has been deleted.

This is what I’m trying:

func (k *k8sClient) waitPodDeleted(ctx context.Context, resName string) error {
    watcher, err := k.createPodWatcher(ctx, resName)
    if err != nil {
        return err
    }

    defer watcher.Stop()

    for {
        select {
        case event := <-watcher.ResultChan():

            if event.Type == watch.Deleted {
                k.logger.Debugf("The POD \"%s\" is deleted", resName)

                return nil
            }

        case <-ctx.Done():
            k.logger.Debugf("Exit from waitPodDeleted for POD \"%s\" because the context is done", resName)
            return nil
        }
    }
}

The problem with this approach is that the Deleted event seems to arrive when the POD receives the deletion request, not when it is actually deleted. After some extra tests I ended up debugging the process with this code:

case event := <-watcher.ResultChan():

    if event.Type == watch.Deleted {
        pod := event.Object.(*v1.Pod)
        k.logger.Debugf("EVENT %s PHASE %s MESSAGE %s", event.Type, pod.Status.Phase, pod.Status.Message)
    }

The log result for this is:

2022-02-15T08:21:51 DEBUG EVENT DELETED PHASE Running MESSAGE
2022-02-15T08:22:21 DEBUG EVENT DELETED PHASE Running MESSAGE

I’m getting two Deleted events: the first one right after I send the delete command, and the second one when the pod is effectively deleted from the cluster.

My questions are:

  • Why am I getting two Deleted events? How can I tell one from the other? I’ve compared the two events and they seem exactly the same (except for the timestamps).
  • What is the best approach to watch and wait for a pod deletion, so I can immediately relaunch it? Should I poll the API until the pod is no longer returned?

The use case I’m trying to solve:
My application has a feature that replaces a tool with another one that has different options. The feature needs to delete the pod that contains the tool and relaunch it with another set of options. In this scenario I need to wait for the pod deletion so I can start it again.

Thanks in advance!

Solution

As I said in the comments, the real problem was the watcher I was creating for the pod I wanted to delete. Its LabelSelector was matching two pods instead of one, so the watch was reporting events for both of them. This is the complete solution, including the watcher.

func (k *k8sClient) createPodWatcher(ctx context.Context, resName string) (watch.Interface, error) {
    labelSelector := fmt.Sprintf("app.kubernetes.io/instance=%s", resName)
    k.logger.Debugf("Creating watcher for POD with label: %s", labelSelector)

    opts := metav1.ListOptions{
        TypeMeta:      metav1.TypeMeta{},
        LabelSelector: labelSelector,
        FieldSelector: "",
    }

    return k.clientset.CoreV1().Pods(k.cfg.Kubernetes.Namespace).Watch(ctx, opts)
}

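func (k *k8sClient) waitPodDeleted(ctx context.Context, resName string) error {
    watcher, err := k.createPodWatcher(ctx, resName)
    if err != nil {
        return err
    }

    defer watcher.Stop()

    for {
        select {
        case event, ok := <-watcher.ResultChan():
            // A closed channel means the watch ended before the pod was deleted.
            if !ok {
                return fmt.Errorf("watch closed before POD %q was deleted", resName)
            }

            if event.Type == watch.Deleted {
                k.logger.Debugf("The POD \"%s\" is deleted", resName)

                return nil
            }

        case <-ctx.Done():
            k.logger.Debugf("Exit from waitPodDeleted for POD \"%s\" because the context is done", resName)
            // Return the context error so the caller can tell a timeout from a real deletion.
            return ctx.Err()
        }
    }
}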

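func (k *k8sClient) waitPodRunning(ctx context.Context, resName string) error {
    watcher, err := k.createPodWatcher(ctx, resName)
    if err != nil {
        return err
    }

    defer watcher.Stop()

    for {
        select {
        case event, ok := <-watcher.ResultChan():
            // A closed channel means the watch ended before the pod was running.
            if !ok {
                return fmt.Errorf("watch closed before POD %q was running", resName)
            }

            // Skip events that do not carry a pod (for example error events),
            // and Deleted events, which can still report the Running phase.
            pod, isPod := event.Object.(*v1.Pod)
            if !isPod || event.Type == watch.Deleted {
                continue
            }

            if pod.Status.Phase == v1.PodRunning {
                k.logger.Infof("The POD \"%s\" is running", resName)
                return nil
            }

        case <-ctx.Done():
            k.logger.Debugf("Exit from waitPodRunning for POD \"%s\" because the context is done", resName)
            // Return the context error so the caller can tell a timeout from success.
            return ctx.Err()
        }
    }
}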

Answered By – Alejandro González

Answer Checked By – Dawn Plyler (GoLangFix Volunteer)
