Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Change antctl traceflow command to output results at timeout/failure #1879

Merged
merged 1 commit into from
Feb 23, 2021
Merged
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
29 changes: 19 additions & 10 deletions pkg/antctl/raw/traceflow/command.go
Original file line number Diff line number Diff line change
Expand Up @@ -18,6 +18,7 @@ import (
"bytes"
"context"
"encoding/json"
"errors"
"fmt"
"net"
"strconv"
Expand All @@ -26,7 +27,7 @@ import (

"github.com/spf13/cobra"
"gopkg.in/yaml.v2"
"k8s.io/apimachinery/pkg/api/errors"
k8serrors "k8s.io/apimachinery/pkg/api/errors"
metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
"k8s.io/apimachinery/pkg/util/rand"
"k8s.io/apimachinery/pkg/util/wait"
Expand Down Expand Up @@ -137,23 +138,31 @@ func runE(cmd *cobra.Command, _ []string) error {
return nil
}

if err := wait.Poll(1*time.Second, 15*time.Second, func() (bool, error) {
tf, err := client.OpsV1alpha1().Traceflows().Get(context.TODO(), tf.Name, metav1.GetOptions{})
var res *v1alpha1.Traceflow
err = wait.Poll(1*time.Second, 15*time.Second, func() (bool, error) {
res, err = client.OpsV1alpha1().Traceflows().Get(context.TODO(), tf.Name, metav1.GetOptions{})
if err != nil {
return false, err
}
if tf.Status.Phase != v1alpha1.Succeeded {
if res.Status.Phase != v1alpha1.Succeeded && res.Status.Phase != v1alpha1.Failed {
return false, nil
}
if err := output(tf); err != nil {
return false, fmt.Errorf("error when outputing result: %w", err)
}
return true, nil
}); err != nil {
})
if err == wait.ErrWaitTimeout {
err = errors.New("timeout waiting for Traceflow done")
Copy link
Contributor

@lzhecheng lzhecheng Feb 23, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please correct me if my understanding is wrong.
It seems error "timeout waiting for Traceflow done" is only for L156 to return. On L162, its value is overwritten by output(). Does it really output "timeout waiting for Traceflow done" after outputting partial traceflow result?
I think we can update code like below:

        var timeoutErr error
	if err == wait.ErrWaitTimeout {
		timeoutErr = errors.New("timeout waiting for Traceflow done")
		// Still output the Traceflow results if any.
		if res == nil {
			return timeoutErr
		}
	} else if err != nil {
		return fmt.Errorf("error when retrieving Traceflow: %w", err)
	}

	if err := output(res); err != nil {
		return fmt.Errorf("error when outputing result: %w", err)
	}

	return timeoutErr

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is not a mistake. Plesae ignore.

// Still output the Traceflow results if any.
if res == nil {
return err
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

shouldn't we still return a timeout error eventually after outputting the (partial?) result?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We can do that too. In your mind should antctl exit with an error in that case?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes I believe so

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok. Updated the code.

Copy link
Contributor Author

@jianjuns jianjuns Feb 23, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if err := output(res); err != nil { should not override err?
It is a new err variable defined in the if block.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, sorry, you're right... Become careless after holiday :(

} else if err != nil {
return fmt.Errorf("error when retrieving Traceflow: %w", err)
}

return nil
if err := output(res); err != nil {
return fmt.Errorf("error when outputting result: %w", err)
}
return err
}

func newTraceflow(client kubernetes.Interface) (*v1alpha1.Traceflow, error) {
Expand Down Expand Up @@ -224,7 +233,7 @@ func dstIsPod(client kubernetes.Interface, ns string, name string) (bool, error)
defer cancel()
_, err := client.CoreV1().Pods(ns).Get(ctx, name, metav1.GetOptions{})
if err != nil {
if errors.IsNotFound(err) {
if k8serrors.IsNotFound(err) {
return false, nil
}
return false, fmt.Errorf("failed to get Pod from Kubernetes API: %w", err)
Expand Down