-
Notifications
You must be signed in to change notification settings - Fork 531
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Mclag enhacements support code changes. #1331
Conversation
Hi @Praveen-Brcm |
Thanks kelly for checking. Yes, we have the plan and the changes will be added soon. |
Hi @Praveen-Brcm
|
Can you please resolve conflicts |
@Praveen-Brcm Wonder if we can arrange some time for you to walk through the design/changes together? |
@gechiang : Sure we can have a call to go over the design(captured in HLD)/code changes. please suggest your convenient time this week .? Thanks. |
@Praveen-Brcm Thanks! my calendar is open from 2 to 5 PM on all workdays except Tuesday. Let me know! |
retest this please |
|
@Praveen-Brcm - please add vs test cases for mlag. See https://github.com/Azure/sonic-swss/tree/master/tests |
@rlhui Thank you for the input. |
@Praveen-Brcm 3PM 11/18/2020 works for me. Thanks! |
orchagent/fdborch.cpp
Outdated
{ | ||
observer->update(SUBJECT_TYPE_FDB_FLUSH_CHANGE, &flushUpdate); | ||
} | ||
notify(SUBJECT_TYPE_FDB_CHANGE, &flushUpdate); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
SUBJECT_TYPE_FDB_FLUSH_CHANGE?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@pettershao-ragilenetworks: Thanks for the comment. Corrected the type.
{ | ||
//If the MAC is dynamic_local change the origin accordingly | ||
//MAC is added/updated as dynamic to allow aging. | ||
storeFdbData.origin = FDB_ORIGIN_LEARN; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
when modify to FDB_ORIGIN_LEARN, removeFdbEntry() 's condition (fdbData.origin != origin) will always be true, remove remote entry will directly return, this is not expective, right?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@pettershao-ragilenetworks: Thanks for the comment. Once the MAC is converted to Local, Remote FDB will not be expected from ICCPD and function remoteFdbEntry will not get called. MAC is supposed to be aged out via the event SAI_FDB_EVENT_AGED. Hope this clarifies.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
but from my testing, removeFdbEntry will be called, and remote fdb entry will remove failed:
Jul 30 07:30:15.847477 sonic INFO swss#orchagent: :- removeFdbEntry: 1633: FdbOrch RemoveFDBEntry: mac=00:10:94:00:00:33 fdb origin is different; found_origin:1 delete_origin:8
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@pettershao-ragilenetworks: Since the MAC is updated as local MAC, the origin is set to LOCAL. When the MAC moves from remove to local, ICCPD further updates the MCLAG_FDB_TABLE to delete the MAC entry added earlier. In the case of remove FDB called with different origin than existing origin is always ignored. This is TRUE for any MAC type L2, VxLAN, or MCLAG.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
so fdb entry syncd from remote peer will not be remoted immediately when remote peer do fdb flush action, right?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
so, when remote fdb added, and remote fdb deleted(remote peer flush fdb maybe), local peer will not remove the remote fdb entry by event but by age event, right?
remove
so fdb entry syncd from remote peer will not be remoted immediately when remote peer do fdb flush action, right?
@pettershao-ragilenetworks: Yes, If the ICCPD adds the MAC as local ( in certain scenarios like Iccpd session down..etc) the MAC entry is as good as Local HW learned. Remote peer flush will not cause delete , it can only be deleted by Local AGE or FLUSH events.
but the remote fdb is programed as STATIC, AGE or FLUSH will not remove them, right?
orchagent/fdborch.cpp
Outdated
|
||
m_entries[entry] = storeFdbData; | ||
|
||
string key = "Vlan" + to_string(vlan.m_vlan_info.vlan_id) + ":" + entry.mac.to_string(); | ||
|
||
if (fdbData.origin != FDB_ORIGIN_VXLAN_ADVERTIZED) | ||
status = sai_fdb_api->create_fdb_entry(&fdb_entry, (uint32_t)attrs.size(), attrs.data()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why set again since line 1362 may do the same action.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@pettershao-ragilenetworks: Thanks for the comments. this duplicate piece looks got introduced while rebase/merging. Corrected.
orchagent/observer.h
Outdated
@@ -2,12 +2,14 @@ | |||
#define SWSS_OBSERVER_H | |||
|
|||
#include <list> | |||
#include <algorithm> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please remove this include
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@prsunny :Thanks, Done.
orchagent/observer.h
Outdated
|
||
using namespace std; | ||
using namespace swss; | ||
|
||
enum SubjectType | ||
{ | ||
SUBJECT_TYPE_ALL_CHANGES, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Where is this used? If not, please remove
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@prsunny :Thanks, the code used to use was part of optimization for the subject calss, which is no more part of the PR. Removed the attribute.
@@ -16,8 +18,14 @@ enum SubjectType | |||
SUBJECT_TYPE_MIRROR_SESSION_CHANGE, | |||
SUBJECT_TYPE_INT_SESSION_CHANGE, | |||
SUBJECT_TYPE_PORT_CHANGE, | |||
SUBJECT_TYPE_BRIDGE_PORT_CHANGE, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Who is sending this update? Can you point me to code?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@prsunny : This is been processed in isolationgrouporch.cpp:233. The update is sent from portsOrch part of addBridgePort and removeBridgePort calls. thanks.
orchagent/fdborch.cpp
Outdated
|
||
m_entries[entry] = storeFdbData; | ||
|
||
string key = "Vlan" + to_string(vlan.m_vlan_info.vlan_id) + ":" + entry.mac.to_string(); | ||
|
||
if (fdbData.origin != FDB_ORIGIN_VXLAN_ADVERTIZED) | ||
if ((fdbData.origin != FDB_ORIGIN_MCLAG_ADVERTIZED) || |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
i think should be &&, right?
if ((fdbData.origin != FDB_ORIGIN_MCLAG_ADVERTIZED) &&
(fdbData.origin != FDB_ORIGIN_VXLAN_ADVERTIZED))
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@pettershao-ragilenetworks : Thanks for pointing to the error. You're right and is corrected now.
SWSS_LOG_INFO("FdbOrch RemoveFDBEntry: mac=%s fdb origin is different; found_origin:%d delete_origin:%d", | ||
entry.mac.to_string().c_str(), fdbData.origin, origin); | ||
if ((origin == FDB_ORIGIN_MCLAG_ADVERTIZED) && (fdbData.origin == FDB_ORIGIN_LEARN) && | ||
(port.m_oper_status == SAI_PORT_OPER_STATUS_DOWN) && (gMlagOrch->isMlagInterface(port.m_alias))) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
one question:
when remote fdb added, and remote fdb deleted(flush for example), local peer will not remove the remote fdb entry(sync early)(peerlink up), but by local age event, right?
show mac(sync, port already modify to peerlink)
No. Vlan MacAddress Port Type
----- ------ ----------------- ------------ ------
1 10 00:10:94:00:00:33 PortChannel2 dynamic
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@pettershao-ragilenetworks: If the MAC gets added by ICCPd as remote ( Sync MAC from peer MCLAG node) the entry will be programmed as STATIC ( while allowing MAC move) There is no AGE event expected, also as you rightly said the FLUSH will not let the HW remove the MAC entry. MAC gets removed only when ICCPd sends delete event for MAC entry.
{ | ||
//If the MAC is dynamic_local change the origin accordingly | ||
//MAC is added/updated as dynamic to allow aging. | ||
storeFdbData.origin = FDB_ORIGIN_LEARN; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
so fdb entry syncd from remote peer will not be remoted immediately when remote peer do fdb flush action, right?
@@ -300,6 +389,47 @@ void FdbOrch::update(sai_fdb_event_t type, | |||
} | |||
} | |||
|
|||
// If MAC is MCLAG remote do not delete for age event, Add the MAC back.. | |||
if (existing_entry->second.origin == FDB_ORIGIN_MCLAG_ADVERTIZED) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
i think this can't be reach since addFdbEntry() will modify origin to FDB_ORIGIN_LEARN to store.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@pettershao-ragilenetworks : This is to address timing scenarios and platforms which doesnt support SAI_FDB_ENTRY_ATTR_ALLOW_MAC_MOVE. If the MAC is added by ICCPD then the origin is set as FDB_ORIGIN_MCLAG (not to age) If the FdbOrch receives an AGE event ( in timing scenarios or on the platform which doesnt have support for SAI_FDB_ENTRY_ATTR_ALLOW_MAC_MOVE) and MAC entry origin is FDB_ORIGIN_MCLAG, since the MAC is remotely leared it should not get deleted so FdbOrch re-adds the MAC entry.
so fdb entry syncd from remote peer will not be remoted immediately when remote peer do fdb flush action, right? |
@gechiang , @prsunny , @akokhan ,@qiluo-msft ,@pettershao-ragilenetworks : Can you please help approve the PR and merge. Thanks . |
@prsunny @akokhan @qiluo-msft @pettershao-ragilenetworks, please be reminded to approve the code PR today, this is blocking the other code PRs from sonic-buildimage |
@Praveen-Brcm Please submit a submodule PR for SWSS so that this change is included in submodule. |
This reverts commit 7aca82d.
attrs.push_back(attr); | ||
|
||
if ((fdbData.origin == FDB_ORIGIN_VXLAN_ADVERTIZED) && (fdbData.type == "dynamic")) | ||
if ((fdbData.origin == FDB_ORIGIN_VXLAN_ADVERTIZED) || (fdbData.origin == FDB_ORIGIN_MCLAG_ADVERTIZED) | ||
|| (fdbData.type == "dynamic")) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
THIS modification does not look RIGHT.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@lguohan : Thanks for pointing out. Thats right as part of WB, when the MAC addresses are added via the bake function, the attribute SAI_FDB_ENTRY_ATTR_ALLOW_MAC_MOVE is set wrongly.
Will raise a PR and fix this today. Thanks.
* Mclag enhacements support code changes. * Adding change to allow MCLAG remote MAC move. * Added support for adding mclag remote mac to kernel, on top of PR-1276 * Updating the change from PR1276 and PR885. * Adding new orchfiles to mock_tests * MCLAG Unique IP support changes. * Removed dependency with PR 885. * Adding observer support for mlagorch. * Fixed FDB notifiation issue * Fixing the test_mclag_fdb type attributes. * Remove as the change may not be supported on non-brcm for PortChannel settings. * Removing the isolation group handling from Mlagorch, Isolation group now will be added/updated only via mclagsyncd updates. * Added back the update function. Co-authored-by: Tapash Das <tapash.das@broadcom.com>
* Mclag enhacements support code changes. * Adding change to allow MCLAG remote MAC move. * Added support for adding mclag remote mac to kernel, on top of PR-1276 * Updating the change from PR1276 and PR885. * Adding new orchfiles to mock_tests * MCLAG Unique IP support changes. * Removed dependency with PR 885. * Adding observer support for mlagorch. * Fixed FDB notifiation issue * Fixing the test_mclag_fdb type attributes. * Remove as the change may not be supported on non-brcm for PortChannel settings. * Removing the isolation group handling from Mlagorch, Isolation group now will be added/updated only via mclagsyncd updates. * Added back the update function. Co-authored-by: Tapash Das <tapash.das@broadcom.com>
Hi @Praveen-Brcm but portsorch don't handle this field in doLagTask |
@chenkelly can you please open a new issue to track this? Please show a real example on how to configure it and produce the doLagTask issue as you described in that new issue creation and I will assign it to BRCM team to investigate/fix. |
What I did
This PR contains the code changes to support MCLAG enhancements feature #MCLAG Enhancements HLD SONiC#596
Why I did it
Implemented as per the MCLAG enhancements HLD mentioned at : MCLAG Enhancements HLD SONiC#596
How I verified it
Mclag specific VS tests included part of check-in.
Details:
The PR includes the code changes for the MCLAG enhancements support.
This PR must work with submitted PR's in other sonic repositories which are listed below.
> Iccpd: sonic-net/sonic-buildimage#4819
> swss-common: sonic-net/sonic-swss-common#405
> Mclagsyncd: #1349
> Utilities: sonic-net/sonic-utilities#1138
> Mgmt-FW: sonic-net/sonic-mgmt-framework#59
> Mgmt-Common: sonic-net/sonic-mgmt-common#25