Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Added Support for enum query capability of Nexthop Group Type. #989

Merged
merged 3 commits into from
Jan 11, 2022

Conversation

abdosi
Copy link
Contributor

@abdosi abdosi commented Dec 27, 2021

What/Why I did:
Added Support for enum query capability of Nexthop Group Type.
This is needed for DVS testing of Ordered ECMP Feature.
Design Doc: sonic-net/SONiC#896

Signed-off-by: Abhishek Dosi abdosi@microsoft.com

This is needed for DVS testing of Ordered ECMP Feature.

Signed-off-by: Abhishek Dosi <abdosi@microsoft.com>
@abdosi
Copy link
Contributor Author

abdosi commented Dec 29, 2021

/azp run

@azure-pipelines
Copy link

Azure Pipelines successfully started running 1 pipeline(s).

Signed-off-by: Abhishek Dosi <abdosi@microsoft.com>
Signed-off-by: Abhishek Dosi <abdosi@microsoft.com>
@abdosi
Copy link
Contributor Author

abdosi commented Jan 3, 2022

@shi-su i see vnet orch test case 7/8/9/11 are failing in the PR job. Can you please check ? I ran job again and it is still failing. Looks to be consistent failure.

cc @prsunny

@shi-su
Copy link
Contributor

shi-su commented Jan 3, 2022

@shi-su i see vnet orch test case 7/8/9/11 are failing in the PR job. Can you please check ? I ran job again and it is still failing. Looks to be consistent failure.

cc @prsunny

Ack checking

@shi-su
Copy link
Contributor

shi-su commented Jan 4, 2022

@shi-su i see vnet orch test case 7/8/9/11 are failing in the PR job. Can you please check ? I ran job again and it is still failing. Looks to be consistent failure.

cc @prsunny

This is a very weird failure. It seems that the same nexthop group gets recreated when a new route is pointing to it. This is an unexpected behavior since the nexthop group should be reused. I did not find any clue why the same nexthop group could get created multiple times. Interestingly, this failure only seems to happen in sairedis pipeline vs test while it is not observed in swss pipeline. On the other hand, I do not think there is anything in sairedis that may affect the behavior. Still investigating.

@shi-su
Copy link
Contributor

shi-su commented Jan 4, 2022

I tried to overwrite swss deb package with the one built from swss pipeline. The failure does not happen anymore after doing so. This seems to indicate that the swss package built from swss and sairedis pipelines are somewhat different. Any idea about what might cause such a difference?

@abdosi
Copy link
Contributor Author

abdosi commented Jan 4, 2022

@shi-su i see vnet orch test case 7/8/9/11 are failing in the PR job. Can you please check ? I ran job again and it is still failing. Looks to be consistent failure.
cc @prsunny

This is a very weird failure. It seems that the same nexthop group gets recreated when a new route is pointing to it. This is an unexpected behavior since the nexthop group should be reused. I did not find any clue why the same nexthop group could get created multiple times. Interestingly, this failure only seems to happen in sairedis pipeline vs test while it is not observed in swss pipeline. On the other hand, I do not think there is anything in sairedis that may affect the behavior. Still investigating.

@shi-su may be it is not binary based of your code . Can we put breakpoint and see the code flow when it is failure.

@shi-su
Copy link
Contributor

shi-su commented Jan 4, 2022

Actually, passes and failures are both pretty consistent from my observation.

@shi-su
Copy link
Contributor

shi-su commented Jan 4, 2022

Actually, passes and failures are both pretty consistent from my observation.

The test cases constantly fail with the deb package from sairedis pipeline and they constantly pass with the package from swss pipeline..

@shi-su
Copy link
Contributor

shi-su commented Jan 4, 2022

I suspect the problem is because the variable weight here https://github.com/Azure/sonic-swss/blob/master/orchagent/nexthopkey.h#L22 is not initialized for nexthops in vnetorch. Created this PR sonic-net/sonic-swss#2096 and let's check if it resolves the problem.

@abdosi
Copy link
Contributor Author

abdosi commented Jan 11, 2022

/azp run

@azure-pipelines
Copy link

Azure Pipelines successfully started running 1 pipeline(s).

@abdosi
Copy link
Contributor Author

abdosi commented Jan 11, 2022

I suspect the problem is because the variable weight here https://github.com/Azure/sonic-swss/blob/master/orchagent/nexthopkey.h#L22 is not initialized for nexthops in vnetorch. Created this PR Azure/sonic-swss#2096 and let's check if it resolves the problem.

@shi-su thanks the change worked..

@abdosi abdosi merged commit f36f7ce into sonic-net:master Jan 11, 2022
@abdosi abdosi deleted the ordered_ecmp branch January 11, 2022 06:32
pettershao-ragilenetworks pushed a commit to pettershao-ragilenetworks/sonic-sairedis that referenced this pull request Nov 18, 2022
…-net#989)

What/Why I did:
Added Support for enum query capability of Nexthop Group Type.
This is needed for DVS testing of Ordered ECMP Feature.
Design Doc: sonic-net/SONiC#896
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants