-
Notifications
You must be signed in to change notification settings - Fork 531
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[swss] L2 Forwarding Enhancements #1716
Conversation
This reverts commit 047841e.
- Invoke flush Fdb entries per port/vlan - Instead of maintaining the vlan-members inside Port structure, added a new portVlanmap for vlan members. /* Performance issue are seen during mac event processing with large number of vlans on a * port. Large data should not be added to port structure. */ - created a portOidToName map to improve performace - delay lag delete until bridge port is deleted - delay bridge port removal until all asociated fdb entries are removed. delete bridge port from portsorch instead of fdborch
@prsunny , can you take a look at this? |
orchagent/port.h
Outdated
* port. Large data should not be added to this | ||
* structure. | ||
*/ | ||
//vlan_members_t m_vlan_members; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please remove the commented section. You can mention it in the PR description.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
orchagent/portsorch.cpp
Outdated
} | ||
} | ||
auto itr = portOidToName.find(id); | ||
if (itr == portOidToName.end()) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please add {
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
orchagent/portsorch.cpp
Outdated
auto itr = portOidToName.find(id); | ||
if (itr == portOidToName.end()) | ||
return false; | ||
else { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
{
-> next line
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
orchagent/portsorch.cpp
Outdated
} | ||
else | ||
/* Cannot locate the VLAN */ | ||
if ((ret == true) && m_portVlanMember[port.m_alias].empty()) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is the logic changed here?. Just keep the previous one and use m_portVlanMember
instead of port.m_vlan_members
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
also checking the return value of removeBridgePort() and retry if it returns false
orchagent/portsorch.cpp
Outdated
@@ -3671,6 +3646,36 @@ bool PortsOrch::addBridgePort(Port &port) | |||
|
|||
if (port.m_bridge_port_id != SAI_NULL_OBJECT_ID) | |||
{ | |||
/* If the port is being added to the first VLAN, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I didn't quite understand this section. Is this for the previous bridge port? In my understanding, this API is for the first bridge port addition, correct?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The previous bridge port delete is still not done, as fdb flush is in progress. But the port is added to a vlan again and now we want to reuse the previous bridgeport, so just make the admin state up.
When removing bridgeport, if fdb entries exist, we just disable admin state of the bridgeport and keep retrying until no more references exist before deleting the bridgeport.
orchagent/portsorch.cpp
Outdated
|
||
sai_status_t status = sai_bridge_api->set_bridge_port_attribute(port.m_bridge_port_id, &attr); | ||
if (status != SAI_STATUS_SUCCESS) | ||
if (port.m_bridge_port_admin_state) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In this flow, in which case the m_bridge_port_admin_state
would be false
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
bridgeport deletion is not done if fdb entries exist, only admin state is set to down. bridgeport deletion will be retried and deleted when fdb entries are deleted.
orchagent/portsorch.cpp
Outdated
@@ -3876,21 +3888,31 @@ bool PortsOrch::addVlan(string vlan_alias) | |||
} | |||
} | |||
|
|||
SWSS_LOG_NOTICE("Create an empty VLAN %s vid:%hu", vlan_alias.c_str(), vlan_id); | |||
SWSS_LOG_NOTICE("Create an empty VLAN %s vid:%hu vlan_oid 0x%lx", vlan_alias.c_str(), vlan_id, (long unsigned int)vlan_oid); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please use PRIx64. You can refer other logs
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
/azp run |
Azure Pipelines successfully started running 1 pipeline(s). |
/azp run |
Azure Pipelines successfully started running 1 pipeline(s). |
@prsunny please help complete the review and merge |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Minor comments
orchagent/portsorch.cpp
Outdated
{ | ||
getPort(itr->second, port); | ||
return true; | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pls fix alignment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
orchagent/portsorch.cpp
Outdated
auto itr = portOidToName.find(bridge_port_id); | ||
if (itr == portOidToName.end()) | ||
return false; | ||
else { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
{ -> new line
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
orchagent/portsorch.cpp
Outdated
return true; | ||
} | ||
auto itr = portOidToName.find(bridge_port_id); | ||
if (itr == portOidToName.end()) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please have {
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
@dgsudharsan , can you please review? |
orchagent/portsorch.cpp
Outdated
@@ -4236,6 +4217,36 @@ bool PortsOrch::addBridgePort(Port &port) | |||
|
|||
if (port.m_bridge_port_id != SAI_NULL_OBJECT_ID) | |||
{ | |||
/* If the port is being added to the first VLAN, | |||
* set bridge port admin status to UP. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Won't the iterator in the remove_vlan member come back and delete this since it loops until last FDB is removed and then does remove_bridge_port?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Agree with @dgsudharsan that there could be race condition since the removeBridgePort
is not complete and the iterator is retried for removal after this addition. I think this logic is slightly complicated. Since the main motivation is to separate the vlan member map from the Port class, could you please make that change and get it merged for this release. We can have the bridge port handling as a different PR.
@@ -4894,6 +4931,16 @@ bool PortsOrch::addLag(string lag_alias, uint32_t spa_id, int32_t switch_id) | |||
{ | |||
SWSS_LOG_ENTER(); | |||
|
|||
auto lagport = m_portList.find(lag_alias); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please add some comments. I believe this is the scenario where removeBridgePort is not yet successful due to fdb flush right?
orchagent/portsorch.cpp
Outdated
if (!setHostIntfsStripTag(port, SAI_HOSTIF_VLAN_TAG_KEEP)) | ||
{ | ||
SWSS_LOG_ERROR("Failed to set %s for hostif of port %s", | ||
hostif_vlan_tag[SAI_HOSTIF_VLAN_TAG_KEEP], port.m_alias.c_str()); | ||
return false; | ||
} | ||
m_portList[port.m_alias] = port; | ||
portOidToName[port.m_bridge_port_id] = port.m_alias; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Are we saving both port object id to alias an bridge pot id to alias in data structure? It also has vlan and lag. Shouldn't we rename this to something like saiOidToAlias?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes. will change the name to saiOidToAlias.
@dr412113 , @anilkpan , @anilkpandey , please check the compilation issue. |
@prsunny the compilation error does not seem to be related to any changes in this PR. There is no change in aclorch.
/usr/include/c++/8/bits/stl_map.h: In member function 'virtual void AclRuleDTelFlowWatchListEntry::update(SubjectType, void*)': |
/azp run |
Azure Pipelines successfully started running 1 pipeline(s). |
not sure why its consistently failing for this PR.. could you please rebase once and lets retry? |
The GCC 7.1 warnings are not causing the failure, failure is due to a type mismatch in logging. I just fixed and updated. |
/azpw run |
1 similar comment
/azpw run |
vs test passed earlier. The only change in my last commit was to fix a type mismatch in SWSS_LOG. |
/AzurePipelines run |
Commenter does not have sufficient privileges for PR 1716 in repo Azure/sonic-swss |
This seems to be the failure: [2021-11-19 00:47:19] [build-stderr] saiaclschema_ut.cpp:1:10: fatal error: gmock/gmock.h: No such file or directory |
@anilkpan , retrying now! |
@anilkpan , could you please rebase and push? |
@prsunny Its passing now after rebase |
What I did
/* Performance issue are seen during mac event processing with large number of vlans on a
removed. delete bridge port from portsorch instead of fdborch
Why I did it
To improve FDB performance.
How I verified it
Verified all mac operations.
Details if related