
Generic_patch: AddRack: Port should be turned as admin up as last change #1930

Closed
renukamanavalan opened this issue Nov 16, 2021 · 2 comments · Fixed by #1998


renukamanavalan commented Nov 16, 2021

Description

In the AddRack scenario, where a port and its related config are added, the port should be marked admin-up only after all other configuration is complete.

Steps to reproduce the issue

  1. Create a config_db.json without the port and its related settings.
  2. Create a patch from the diff between that config_db.json and the existing config_db.json that has all the config.
  3. Apply the patch (a repro sketch follows).
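
A minimal repro sketch for step 2, assuming the python `jsonpatch` package; the file names are placeholders for the two config snapshots:

```python
# Sketch of step 2, assuming the python "jsonpatch" package is installed.
# config_without_port.json / config_with_port.json are placeholder names.
import json
import jsonpatch

with open("config_without_port.json") as f:
    without_port = json.load(f)
with open("config_with_port.json") as f:
    with_port = json.load(f)

# Diff the two snapshots into an RFC 6902 JSON patch that re-adds the
# port and all of its related config.
patch = jsonpatch.make_patch(without_port, with_port)

with open("add_rack.patch.json", "w") as f:
    f.write(patch.to_string())
```

Step 3 would then be `config apply-patch add_rack.patch.json`, which runs the generic config updater and emits the sorted change list below.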

Describe the results you received

Patch Applier: The patch was sorted into 48 changes:
Patch Applier:   * [{"op": "add", "path": "/ACL_TABLE/EVERFLOW/ports/8", "value": "Ethernet64"}]
Patch Applier:   * [{"op": "add", "path": "/ACL_TABLE/EVERFLOWV6/ports/8", "value": "Ethernet64"}]
Patch Applier:   * [{"op": "add", "path": "/BGP_NEIGHBOR/10.0.0.33", "value": {"admin_status": "up"}}]
Patch Applier:   * [{"op": "add", "path": "/BGP_NEIGHBOR/10.0.0.33/asn", "value": "64001"}]
Patch Applier:   * [{"op": "add", "path": "/BGP_NEIGHBOR/10.0.0.33/holdtime", "value": "10"}]
Patch Applier:   * [{"op": "add", "path": "/BGP_NEIGHBOR/10.0.0.33/keepalive", "value": "3"}]
Patch Applier:   * [{"op": "add", "path": "/BGP_NEIGHBOR/10.0.0.33/local_addr", "value": "10.0.0.32"}]
Patch Applier:   * [{"op": "add", "path": "/BGP_NEIGHBOR/10.0.0.33/name", "value": "ARISTA01T0"}]
Patch Applier:   * [{"op": "add", "path": "/BGP_NEIGHBOR/10.0.0.33/nhopself", "value": "0"}]
Patch Applier:   * [{"op": "add", "path": "/BGP_NEIGHBOR/10.0.0.33/rrclient", "value": "0"}]
Patch Applier:   * [{"op": "add", "path": "/BGP_NEIGHBOR/fc00::42", "value": {"admin_status": "up"}}]
Patch Applier:   * [{"op": "add", "path": "/BGP_NEIGHBOR/fc00::42/asn", "value": "64001"}]
Patch Applier:   * [{"op": "add", "path": "/BGP_NEIGHBOR/fc00::42/holdtime", "value": "10"}]
Patch Applier:   * [{"op": "add", "path": "/BGP_NEIGHBOR/fc00::42/keepalive", "value": "3"}]
Patch Applier:   * [{"op": "add", "path": "/BGP_NEIGHBOR/fc00::42/local_addr", "value": "fc00::41"}]
Patch Applier:   * [{"op": "add", "path": "/BGP_NEIGHBOR/fc00::42/name", "value": "ARISTA01T0"}]
Patch Applier:   * [{"op": "add", "path": "/BGP_NEIGHBOR/fc00::42/nhopself", "value": "0"}]
Patch Applier:   * [{"op": "add", "path": "/BGP_NEIGHBOR/fc00::42/rrclient", "value": "0"}]
Patch Applier:   * [{"op": "add", "path": "/BUFFER_PG/Ethernet64|3-4", "value": {"profile": "pg_lossless_40000_40m_profile"}}]
Patch Applier:   * [{"op": "add", "path": "/BUFFER_PG/Ethernet64|0", "value": {"profile": "ingress_lossy_profile"}}]
Patch Applier:   * [{"op": "add", "path": "/BUFFER_QUEUE/Ethernet64|0-2", "value": {"profile": "egress_lossy_profile"}}]
Patch Applier:   * [{"op": "add", "path": "/BUFFER_QUEUE/Ethernet64|3-4", "value": {"profile": "egress_lossless_profile"}}]
Patch Applier:   * [{"op": "add", "path": "/BUFFER_QUEUE/Ethernet64|5-6", "value": {"profile": "egress_lossy_profile"}}]
Patch Applier:   * [{"op": "add", "path": "/DEVICE_NEIGHBOR/Ethernet64", "value": {"name": "ARISTA01T0"}}]
Patch Applier:   * [{"op": "add", "path": "/DEVICE_NEIGHBOR/Ethernet64/port", "value": "Ethernet1"}]
Patch Applier:   * [{"op": "add", "path": "/DEVICE_NEIGHBOR_METADATA/ARISTA01T0", "value": {"hwsku": "Arista-VM"}}]
Patch Applier:   * [{"op": "add", "path": "/DEVICE_NEIGHBOR_METADATA/ARISTA01T0/lo_addr", "value": "None"}]
Patch Applier:   * [{"op": "add", "path": "/DEVICE_NEIGHBOR_METADATA/ARISTA01T0/mgmt_addr", "value": "172.16.131.40"}]
Patch Applier:   * [{"op": "add", "path": "/DEVICE_NEIGHBOR_METADATA/ARISTA01T0/type", "value": "ToRRouter"}]
Patch Applier:   * [{"op": "add", "path": "/INTERFACE/Ethernet64", "value": {}}]
Patch Applier:   * [{"op": "add", "path": "/INTERFACE/Ethernet64|10.0.0.32~131", "value": {}}]
Patch Applier:   * [{"op": "add", "path": "/INTERFACE/Ethernet64|FC00::41~1126", "value": {}}]
Patch Applier:   * [{"op": "replace", "path": "/PORT/Ethernet64/description", "value": "ARISTA01T0:Ethernet1"}]
Patch Applier:   * [{"op": "add", "path": "/PORT/Ethernet64/admin_status", "value": "up"}]
Patch Applier:   * [{"op": "add", "path": "/PORT_QOS_MAP/Ethernet64", "value": {"dscp_to_tc_map": "AZURE"}}]
Patch Applier:   * [{"op": "add", "path": "/PORT_QOS_MAP/Ethernet64/pfc_enable", "value": "3,4"}]
Patch Applier:   * [{"op": "add", "path": "/PORT_QOS_MAP/Ethernet64/pfc_to_queue_map", "value": "AZURE"}]
Patch Applier:   * [{"op": "add", "path": "/PORT_QOS_MAP/Ethernet64/tc_to_pg_map", "value": "AZURE"}]
Patch Applier:   * [{"op": "add", "path": "/PORT_QOS_MAP/Ethernet64/tc_to_queue_map", "value": "AZURE"}]
Patch Applier:   * [{"op": "add", "path": "/QUEUE/Ethernet64|1", "value": {"scheduler": "scheduler.0"}}]
Patch Applier:   * [{"op": "add", "path": "/QUEUE/Ethernet64|2", "value": {"scheduler": "scheduler.0"}}]
Patch Applier:   * [{"op": "add", "path": "/QUEUE/Ethernet64|5", "value": {"scheduler": "scheduler.0"}}]
Patch Applier:   * [{"op": "add", "path": "/QUEUE/Ethernet64|4", "value": {"scheduler": "scheduler.1"}}]
Patch Applier:   * [{"op": "add", "path": "/QUEUE/Ethernet64|4/wred_profile", "value": "AZURE_LOSSLESS"}]
Patch Applier:   * [{"op": "add", "path": "/QUEUE/Ethernet64|6", "value": {"scheduler": "scheduler.0"}}]
Patch Applier:   * [{"op": "add", "path": "/QUEUE/Ethernet64|3", "value": {"scheduler": "scheduler.1"}}]
Patch Applier:   * [{"op": "add", "path": "/QUEUE/Ethernet64|3/wred_profile", "value": "AZURE_LOSSLESS"}]
Patch Applier:   * [{"op": "add", "path": "/QUEUE/Ethernet64|0", "value": {"scheduler": "scheduler.0"}}]

Describe the results you expected

Same as above, except that the change marking the port admin-up should be ordered last:

Patch Applier: * [{"op": "add", "path": "/PORT/Ethernet64/admin_status", "value": "up"}]

Additional information you deem important (e.g. issue happens only occasionally)

Sample files to help with repro

issue_repro.tar.gz


ghooo commented Nov 19, 2021

[from email thread]
Subject: Generic Patch Updater: Port admin status change

Hi @mohamed Ghoneim,

The topic below needs vetting by SMEs: @prince Sunny, @neetha John, @guohan Lu

Request:
Whenever a patch has admin status down for a port, order it first
“[{"op": "add", "path": "/PORT/Ethernet64/admin_status", "value": "down"}]”

Conversely, whenever a patch has admin status up for a port, order it last.
“[{"op": "add", "path": "/PORT/Ethernet64/admin_status", "value": "up"}]”

Additionally, whenever a patch has only “port up” but not “port down”, explicitly insert a down at the start.
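
As a rough illustration of the requested ordering (not the actual patch-sorter code; `changes` is assumed to be the list of single-operation changes like those shown above), a naive reordering pass might look like:

```python
# Illustrative sketch only, not the actual patch-sorter implementation.
# Reorder single-operation changes so that port admin-down changes come
# first and port admin-up changes come last.
import re

ADMIN_STATUS_RE = re.compile(r"^/PORT/[^/]+/admin_status$")

def admin_status_rank(change):
    """0 = admin down (first), 1 = everything else, 2 = admin up (last)."""
    for op in change:  # each change is a list of JSON patch operations
        if ADMIN_STATUS_RE.match(op.get("path", "")):
            return 0 if op.get("value") == "down" else 2
    return 1

def reorder_changes(changes):
    # sorted() is stable, so the relative order of all other changes is kept.
    return sorted(changes, key=admin_status_rank)
```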

Reason:
When a patch has “admin down” for a port, it is likely accompanied by more changes related to that port, for example cable length, QoS parameters, or the like. If those changes reach consumers while the port is still up, handling them on an active port can be a complex challenge. Marking the port as down at the start and then sending the rest of the changes lets consumers process the other updates on that port with ease.

Along the same lines, when a port is marked up (as in the AddRack case; sample attached), the patch is often accompanied by other changes related to the port. So first send a change that explicitly marks the port down, even if it is already down (a benign, redundant no-op). Then send all the other changes, and at the end send the change that marks it up. This way consumers configure the port in an inactive state and make it go active at the end.

Even when the patch has no other updates for the port, since the port admin-status acts like a main switch, turning it down at the start and up at the end works in all scenarios.

Thank you,
Renuka


ghooo commented Nov 19, 2021

[replying to the email thread, i.e., the last comment]
I have thought about this problem for a bit, and I think the solution would be:

  • If there is a change to a “port-critical” config that needs the port to be down:
    • we can inject a port admin-down before it, if the port is not already admin down
    • the port will then be brought back up afterwards, only if up is its final state in the patch

This will take care of the 3 scenarios mentioned.

  1. Whenever a patch has admin status down for a port, order it first:
    “[{"op": "add", "path": "/PORT/Ethernet64/admin_status", "value": "down"}]”

If the patch contains "port-critical" config, then we will inject the port down first if the port is not already down. This means the port admin down happens first.

  2. Conversely, whenever it has admin status up, order it last:
    “[{"op": "add", "path": "/PORT/Ethernet64/admin_status", "value": "up"}]”

If the patch contains "port-critical" config, we will start by checking whether the port is admin down; most probably it already is. In this case we will just turn it back up after the "port-critical" config. This means the port admin up happens last.

  3. Additionally, whenever a patch has only “port up” but not down, insert a down at the start explicitly.
    If the patch contains "port-critical" config already, then we will turn the port down first and then turn it back up afterwards.

Identifying port-critical changes

  • Optimally, we should mark them in the YANG models using a YANG extension
    • similar to the CreateOnly flag extension in the patch-sorter design doc here
  • Alternatively, we can just identify these configs in the code, maybe hard-coded into a python list or something (a minimal sketch follows)
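
A minimal sketch of the hard-coded alternative; the table names below are hypothetical, not a vetted set:

```python
# Hypothetical hard-coded set; the actual list of port-critical tables
# would need vetting by SMEs.
PORT_CRITICAL_TABLES = {"BUFFER_PG", "BUFFER_QUEUE", "QUEUE", "PORT_QOS_MAP"}

def is_port_critical(path):
    """Return True if a JSON patch path touches a port-critical table,
    e.g. "/BUFFER_PG/Ethernet64|3-4"."""
    parts = path.split("/")
    return len(parts) > 1 and parts[1] in PORT_CRITICAL_TABLES
```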

ghooo added a commit that referenced this issue Feb 18, 2022
#### What I did
Fixes #1930 

The issue can be summarized into these 2 cases:
* A patch can contain an operation to turn a port `admin up` -- do it at the end
* A patch can modify config that needs the port to be down -- start by bringing the port down, then do the update, then bring it back up

#### How I did it
We will call the configs that need the port to be down `critical-port` configs.

* Added a move validator to validate `critical-port` configs, which does the following:
  * If a port is up, make sure the move does not make changes to related `critical-port` configs
  * If the move is turning a port up, make sure no `critical-port` configs are still left in the patch

* Added a move extender for `critical-port` changes (a simplified sketch follows):
  * If a port is up, bring it down if there are `critical-port` changes
  * If the move is turning a port up, flip it to down if there are still `critical-port` configs left. In other words, do not turn the port back up until all `critical-port` changes are done.
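
A simplified sketch of the two hooks; `affected_port()`, `touches_critical_config()`, `turns_port_up()`, and `admin_status_move()` are hypothetical helpers, not the actual sonic-utilities classes:

```python
# Simplified illustration only; not the real patch-sorter classes.
# The move/helper methods below are hypothetical names.

def port_is_up(config, port):
    return config.get("PORT", {}).get(port, {}).get("admin_status") == "up"

def validate_move(move, current_config, remaining_moves):
    port = move.affected_port()
    if port is None:
        return True
    # Rule 1: no critical-port change may land while the port is admin up.
    if port_is_up(current_config, port) and move.touches_critical_config():
        return False
    # Rule 2: the port may only come back up once no critical-port
    # changes for it are left in the patch.
    if move.turns_port_up() and any(m.touches_critical_config()
                                    for m in remaining_moves):
        return False
    return True

def extend_move(move, current_config, remaining_moves):
    port = move.affected_port()
    if port is None:
        return
    # If the port is up and the move touches critical-port config,
    # propose bringing the port down first.
    if port_is_up(current_config, port) and move.touches_critical_config():
        yield admin_status_move(port, "down")
    # If the move turns the port up too early, flip it to down instead.
    if move.turns_port_up() and any(m.touches_critical_config()
                                    for m in remaining_moves):
        yield admin_status_move(port, "down")
```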

#### How to verify it
* Added `AddRack` unit-test to `tests/generic_config_updater/files/patch_sorter_test_success.json`
* Other unit-tests

#### Previous command output (if the output of a command-line utility has changed)
Check issue #1930 for the old sorting order

#### New command output (if the output of a command-line utility has changed)
Check the `AddRack` unit-test in `tests/generic_config_updater/files/patch_sorter_test_success.json` for the new sorting order