Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Retrieve subport from CONFIG_DB to enable breakout support #342

Merged
merged 12 commits into from
Apr 4, 2023

Conversation

mihirpat1
Copy link
Contributor

@mihirpat1 mihirpat1 commented Mar 7, 2023

The changes with this PR needs to be merged along with sonic-net/sonic-platform-common#352

Description

Currently, xcvrd has a fixed logic to calculate active host lanes for a port. This needs to be changed to support ports with breakout cable.

Motivation and Context

The fixed host lane mapping will now be replaced by reading the subport from the CONFIG_DB. The subport will then be used to calculate active host lanes by reading the Application Descriptors.

Finding the desired application
The HostInterfaceID and HostLaneCount from Application Descriptors will be used to find the desired application for the port with breakout cable by comparing these with speed and lanes retrieved from CONFIG_DB.

Finding active host lane
Once the desired application is found, the HostLaneAssignmentOptions (bitmap of active host lanes) for the corresponding application and the subport from CONFIG_DB will be used to find the active host lanes.

In addition to the above, check_datapath_init_pending is now being called after ConfigStatusLane register has the value "ConfigSuccess". This has been done to be inline with the CMIS spec.

Note
Once the port_config.ini file has a new column for subport, the CONFIG_DB will be updated with the corresponding subport values (requires minigraph reload though).
The field "subport" will have the below value:
0 - This means the xcvr in non-breakout type
non-zero - The value here corresponds to the subport group (formed by a group of active lanes) to which the port belongs to

How Has This Been Tested?

Testcase summary
The value of host_lanes can be seen in the logs by finding lanemask.
400G related TCs

  1. SFP EEPROM dump
  2. show int status
  3. LLDP o/p
  4. Log dump for CMIS init

200G breakout related TCs - 200G side

  1. SFP EEPROM dump
  2. show int status
  3. LLDP o/p
  4. Log dump for CMIS init

200G breakout related TCs - 100G side

  1. SFP EEPROM dump

  2. show int status

  3. LLDP o/p

  4. Log dump for CMIS init

  5. Shut/no shut on subports (200G and 100G sides)

  6. config reload (of Fixed chassis) with both native and break-out optics inserted

  7. reboot (of Fixed chassis) with both native and break-out optics inserted

  8. Remove and re-insert 2x100G break-out optic at run-time (system running)

  9. xcvrd restart - Ensure link flap is not seen

  10. PMON restart - Ensure link flap is not seen

  11. SWSS restart

  12. SYNCD restart

  13. CONFIG_DB for a port after adding subport to port_config.ini

For additional details, please refer to Unit-test_breakout_channel_v5.txt

The code changes have been tested on a chassis with 200G AOC breakout cable.
Also, the code changes have been tested on a chassis with 400ZR transceiver and 10GBase-SR

Additional Information (Optional)

@mihirpat1 mihirpat1 changed the title Channel breakout support Retrieve channel from CONFIG_DB to enable breakout support Mar 7, 2023
@mihirpat1 mihirpat1 requested a review from prgeor March 7, 2023 07:50
@mihirpat1 mihirpat1 marked this pull request as ready for review March 7, 2023 07:50
@mihirpat1
Copy link
Contributor Author

@prgeor @jaganbal-a @keboliu @shyam77git - It will be great if you can help in reviewing this PR

@prgeor prgeor requested a review from keboliu March 7, 2023 18:09
@prgeor
Copy link
Collaborator

prgeor commented Mar 7, 2023

@shyam77git @jaganbal-a please review

@prgeor
Copy link
Collaborator

prgeor commented Mar 7, 2023

@keboliu please review

@prgeor prgeor added the CMIS label Mar 7, 2023
@jaganbal-a
Copy link

how the 4x10 or 4x25 breakout is taken cared for 100G/40G module ?

@mihirpat1
Copy link
Contributor Author

how the 4x10 or 4x25 breakout is taken cared for 100G/40G module ?

The API get_cmis_application_desired will return the application ID by matching the speed and number of active host lanes from CONFIG_DB. This application ID will then be used by get_cmis_host_lanes function to find the appropriate active host lanes.

@jaganbal-a
Copy link

I get_cmis_application_desired will return the application ID by matching the speed and number of active host lanes from CONFIG_DB. This application ID will then be used by get_cmis_host_lanes function to find the appropriate active host lanes.

get_cmis_host_lanes() is not supposed to be invoked for non-CMIS module isn't it ?
the question is for breakout 4x25G and 4x10G on non-cmis (100G/40G modules)

sonic-xcvrd/xcvrd/xcvrd.py Outdated Show resolved Hide resolved
sonic-xcvrd/xcvrd/xcvrd.py Outdated Show resolved Hide resolved
sonic-xcvrd/xcvrd/xcvrd.py Outdated Show resolved Hide resolved
sonic-xcvrd/xcvrd/xcvrd.py Outdated Show resolved Hide resolved
sonic-xcvrd/xcvrd/xcvrd.py Outdated Show resolved Hide resolved
sonic-xcvrd/xcvrd/xcvrd.py Outdated Show resolved Hide resolved
sonic-xcvrd/xcvrd/xcvrd.py Outdated Show resolved Hide resolved
@mihirpat1
Copy link
Contributor Author

hannel > host_lane_

I get_cmis_application_desired will return the application ID by matching the speed and number of active host lanes from CONFIG_DB. This application ID will then be used by get_cmis_host_lanes function to find the appropriate active host lanes.

get_cmis_host_lanes() is not supposed to be invoked for non-CMIS module isn't it ? the question is for breakout 4x25G and 4x10G on non-cmis (100G/40G modules)

Yes, get_cmis_host_lanes() will not be invoked for non-CMIS module.
Since 4x25G and 4x10G are non-CMIS modules, datapath initialization from xcvrd is not required and hence, the current changes are not needed for such modules.
The expectation is that existing bring-up sequence outside of xcvrd should take care of initial reset for such modules and then, it should work as is without the intervention of xcvrd.

@shyam77git
Copy link

hannel > host_lane_

I get_cmis_application_desired will return the application ID by matching the speed and number of active host lanes from CONFIG_DB. This application ID will then be used by get_cmis_host_lanes function to find the appropriate active host lanes.

get_cmis_host_lanes() is not supposed to be invoked for non-CMIS module isn't it ? the question is for breakout 4x25G and 4x10G on non-cmis (100G/40G modules)

Yes, get_cmis_host_lanes() will not be invoked for non-CMIS module. Since 4x25G and 4x10G are non-CMIS modules, datapath initialization from xcvrd is not required and hence, the current changes are not needed for such modules. The expectation is that existing bring-up sequence outside of xcvrd should take care of initial reset for such modules and then, it should work as is without the intervention of xcvrd.

@mihirpat1 , @prgeor
For non-CMIS compliant modules (i.e. SFF* spec based ones), one cannot expect them to work seamlessly.
Beside xcvrd, following areas to be checked from code-flow and validation standpoint:

  • syncd vertical (syncd-> SAI-> SDK) w.r.t handling the port brea-out
  • use case: shut, no-shut on a particular channel (broken-down/logical port) of an interface
  • though xcvrd vertical seems to have no role (per above-mentioned explanation), still its part of end-to-end validation to ensure no open item/issue
    I'd suggest to file a separate git issue for this

@shyam77git
Copy link

@mihirpat1
Besides the UT cases you mentioned in description section and attached UT logs, could you validate the following

  1. config reload (of Fixed chassis) with both native and break-out optics inserted
  2. reboot (of Fixed chassis) with both native and break-out optics inserted
  3. Mode change: Change existing (inserted) optics to 4x100G or 2x100G breakout mode
  4. Mode change: Change/revert from breakout mode to native mode
  5. Insert break-out optics at run-time (system running): 4x100G or 2x100G break-out optics

@shyam77git
Copy link

Wanted to keep following comment separate from above UT comment:
How about port break-out validation on Modular chassis's LC?
IMO, should be covered here to ensure no issue

@shyam77git
Copy link

@mihirpat1
Don't see mention and/or validation of SONiC breakout CLI !
Could you configure and validate following SONiC breakout CLIs?

#config interface breakout <interface_name>
show interfaces breakout
show interfaces breakout current-mode
show interfaces breakout current-mode <interface_name>

@@ -1121,7 +1159,7 @@ def is_cmis_application_update_required(self, api, channel, speed):
if speed == 0 or channel == 0 or api.is_flat_memory():
return False

app_new = self.get_cmis_application_desired(api, channel, speed)
app_new = self.get_cmis_application_desired(api, host_lane_count, speed)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  1. current logic checks for all 8 host lanes whereas we should care for the requested host lanes only
  2. Also we should make use of the selected/desired application during init state and compare it with the current active application
  3. Also we should check for DP != Activated early in the function

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have handled #2 now. As discussed, we are already handling #1 and #3 and have changed CMIS_NUM_CHANNELS to CMIS_MAX_HOST_LANES.

2. Removed unwanted parameter for is_cmis_application_update_required
3. Changed channel to host_lanes_mask in multiple functions to avoid confusion
4. Reading application code and host_lanes_mask as part of just CMIS_STATE_INSERTED
@keboliu
Copy link
Collaborator

keboliu commented Mar 23, 2023

In the Notes, "Once the port_config.ini file has a new column for channel, the CONFIG_DB will be updated with the corresponding channel values (requires minigraph reload though)."

I am a bit of concern that to extend "port_config.ini" for this purpose, AFAIK, "platform.json" was introduced for the dynamic port breakout, I think we should consider extending it instead of "port_config.ini".
by the way, have you already had a PR for this part?

@prgeor
Copy link
Collaborator

prgeor commented Mar 23, 2023

In the Notes, "Once the port_config.ini file has a new column for channel, the CONFIG_DB will be updated with the corresponding channel values (requires minigraph reload though)."

I am a bit of concern that to extend "port_config.ini" for this purpose, AFAIK, "platform.json" was introduced for the dynamic port breakout, I think we should consider extending it instead of "port_config.ini". by the way, have you already had a PR for this part?

@keboliu we can remove the reference to port_config.ini as its not relevant to this Xcvrd changes. Please see my email.

@keboliu
Copy link
Collaborator

keboliu commented Mar 23, 2023

As I commented in PR, sonic-net/SONiC#1290

Using 'channel' to refer to a 'sub-port' is misleading. XCVRD is mainly dealing with optical cables. In the optical cables context, 'channel' is a concept which already widely used and has a clear definition, we even have SONiC platform APIs defined to handle optical cable 'channels'. I suggest to use use another name to avoid confusion.

Another point is that not sure the 'channel' number should be provided explicitly from some configuration, can it be deduced by some logic? It's shouldn't be too complicated.

@mihirpat1 mihirpat1 changed the title Retrieve channel from CONFIG_DB to enable breakout support Retrieve subport from CONFIG_DB to enable breakout support Mar 27, 2023
keboliu
keboliu previously approved these changes Mar 28, 2023
@rajann
Copy link
Contributor

rajann commented Mar 29, 2023

I believe that instead of explicitly giving the subport/channel info in port_config.ini it should be derived from the existing information in port_config.ini (i.e.) we have defined the rules for "name" , "alias" and "index". Same convention should be following in DPB as well. If there is difference in DPB with the above rules, it should be addressed in the DPB document to match it. If we diverge here then we will have to do the same additional information in DPB as well.

@mihirpat1
Copy link
Contributor Author

I believe that instead of explicitly giving the subport/channel info in port_config.ini it should be derived from the existing information in port_config.ini (i.e.) we have defined the rules for "name" , "alias" and "index". Same convention should be following in DPB as well. If there is difference in DPB with the above rules, it should be addressed in the DPB document to match it. If we diverge here then we will have to do the same additional information in DPB as well.

As discussed, having a logic in xcvrd to calculate subport info introduces dependencies on neighboring subports belonging to the same physical group. This also relies on the fact that redis-db will always populate info for all subports in a physical port group before xcvrd actually starts using info for a subport.
In addition to this, the current physical to logical port list is not in sorted order and will need to be sorted in appropriate order before using it.

host_lane_count:
Integer, number of lanes on the host side
subport:
Integer, subport id of the group which the host lanes belong to (1 based),
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@mihirpat1 subport:
1-based logical port number of the physical port after breakout

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Addressed in the new diff now.

prgeor
prgeor previously approved these changes Apr 4, 2023
@mihirpat1 mihirpat1 dismissed stale reviews from prgeor and keboliu via 9b61a80 April 4, 2023 02:02
@mihirpat1 mihirpat1 merged commit 9f3a124 into sonic-net:master Apr 4, 2023
@prgeor
Copy link
Collaborator

prgeor commented Apr 4, 2023

@yxieca please help cherry pick to 202205
@StormLiangMS please help cherry pick to 202211

yxieca pushed a commit that referenced this pull request Apr 5, 2023
Retrieve subport from CONFIG_DB to enable breakout support

---------

Signed-off-by: Mihir Patel <patelmi@microsoft.com>
lguohan pushed a commit to sonic-net/sonic-buildimage that referenced this pull request Apr 6, 2023
Based on the port breakout HLD, we are now using subport instead of channel in the CONFIG_DB PORT table to handle port breakout. The yang schema needs to be modified accordingly to handle the corresponding change.
The corresponding code changes have been merged through sonic-net/sonic-platform-daemons/pull/342 merged

Signed-off-by: Mihir Patel <patelmi@microsoft.com>
StormLiangMS pushed a commit that referenced this pull request Apr 11, 2023
Retrieve subport from CONFIG_DB to enable breakout support

---------

Signed-off-by: Mihir Patel <patelmi@microsoft.com>
mssonicbld pushed a commit to mssonicbld/sonic-buildimage that referenced this pull request Apr 19, 2023
…14519)

Based on the port breakout HLD, we are now using subport instead of channel in the CONFIG_DB PORT table to handle port breakout. The yang schema needs to be modified accordingly to handle the corresponding change.
The corresponding code changes have been merged through sonic-net/sonic-platform-daemons/pull/342 merged

Signed-off-by: Mihir Patel <patelmi@microsoft.com>
mssonicbld pushed a commit to mssonicbld/sonic-buildimage that referenced this pull request Apr 20, 2023
…14519)

Based on the port breakout HLD, we are now using subport instead of channel in the CONFIG_DB PORT table to handle port breakout. The yang schema needs to be modified accordingly to handle the corresponding change.
The corresponding code changes have been merged through sonic-net/sonic-platform-daemons/pull/342 merged

Signed-off-by: Mihir Patel <patelmi@microsoft.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

9 participants