Updated with chassisd config-db changes
mprabhu-nokia committed Sep 2, 2020
1 parent 106c511 commit 7e1e1d8
Showing 1 changed file with 25 additions and 5 deletions.
30 changes: 25 additions & 5 deletions doc/pmon/pmon-chassis-design.md
@@ -157,16 +157,16 @@ Modular Chassis has control-cards, line-cards and fabric-cards along with other
* The UP/DOWN events will be added to syslog.

shyam77git (Contributor) commented on Oct 1, 2020:

I went through the doc and these revised changes. Can you please confirm or correct the following understanding?
ChassisD is a daemon and part of the PMON container.
From a platform stack standpoint, there would be two DBs in total on every card (CC aka Supervisor, LC).
One DB is referred to as the local REDIS-DB and the other as the global REDIS-DB.

Local REDIS-DB content/ownership:
The state DB and config DB described below are both part of the local REDIS-DB on the CC/Supervisor.
show platform output would come from the local REDIS-DB of the respective card (CC/Supervisor, LC).

There is only one global REDIS-DB, and it would reside on the CC/Supervisor. Its content/ownership:
local environmental sensors of the CC (and the FCs on it) plus sensors of all remote LCs.

shyam77git (Contributor) commented on Oct 1, 2020:

Follow-on question:

  • Is the data for 'show platform psustatus' and 'show platform fan' stored in the local REDIS-DB or the global REDIS-DB of PMON/ChassisD?
* Vendor-specific API will be provided to take action on any change event.

#### Schema
#### State-DB Schema

The schema for CHASSIS_CARD_INFO table in State DB is:
The schema for CHASSIS_MODULE_TABLE in State DB is:
```
key = CHASSIS_CARD_INFO | <card index>;
key = CHASSIS_MODULE_INFO | <card index>;
; field = value
name = STRING ; name of the card
slot = 1*2DIGIT ; slot number in the chassis
instance = 1*2DIGIT ; slot number in the chassis for cards
status = "Empty" | "Online" | "Offline" ; status of the card
type = "control"| "line" | "fabric" ; card-type
device-type = "CONTROL-CARD"| "LINE-CARD" | "FABRIC-CARD" ; card-type
```
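As a rough illustration of the updated schema lines above, the sketch below builds one state-DB entry as a key plus a field dictionary and checks the enumerated values. The helper name `make_state_entry`, the use of a plain dictionary instead of a Redis client, and the key written without spaces around `|` are all assumptions made for this example; they are not taken from the design doc.

```python
# Hypothetical helper: builds one CHASSIS_MODULE_INFO entry per the schema above.
# A plain dict stands in for the Redis hash; no database is contacted.
VALID_STATUS = {"Empty", "Online", "Offline"}
VALID_DEVICE_TYPES = {"CONTROL-CARD", "LINE-CARD", "FABRIC-CARD"}


def make_state_entry(card_index, name, instance, status, device_type):
    """Return (key, fields) for one card, rejecting values outside the schema."""
    if status not in VALID_STATUS:
        raise ValueError("invalid status: {}".format(status))
    if device_type not in VALID_DEVICE_TYPES:
        raise ValueError("invalid device-type: {}".format(device_type))
    if not 0 <= instance <= 99:  # schema allows one or two digits (1*2DIGIT)
        raise ValueError("instance must have one or two digits")
    key = "CHASSIS_MODULE_INFO|{}".format(card_index)  # separator spacing is assumed
    fields = {
        "name": name,
        "instance": str(instance),
        "status": status,
        "device-type": device_type,
    }
    return key, fields


# Example usage with made-up card values.
key, fields = make_state_entry(1, "LINE-CARD1", 1, "Online", "LINE-CARD")
```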

#### Prototype Code
@@ -242,6 +242,26 @@ class LineCard(LineCardBase):

Additionally, *get_change_event()* can be implemented to handle asynchronous notification of the line-card UP/DOWN events.

shyam77git (Contributor) commented on Oct 9, 2020:

The get_change_event() async notification from the platform to PMON (chassisD) should be usable beyond card UP/DOWN events.
For example:
a) Some platform LCs or LC variants may come up after chassisD has invoked get_card_name() on the platform. In that case, the platform would update the state of these LCs later via the get_change_event() async call.
b) A power-zone failure is an event, accompanied by an opaque string indicating which voltage rail(s) failed.
c) The platform should also notify the expected_action to take on detecting a fault.

The current signature of get_change_event is a dictionary of {'device_type': {'device_id':'device_event', 'device_id':'device_event', ...}}

In order to cater to aforementioned use-cases, recommend updating it to:
dictionary of {'device_type': {'device_id':'device_event':'opaque_string':'expected_action',
'device_id':'device_event': 'opaque_string':'expected_action',
...},
}
device_event enum list : insert, remove, fault, card_name

  • with device_event as card_name: opaque_string would be filled with card_name i
  • possible expected_action (on 'fault' as device_event) could be: none, reload CPU, power-cycle/reload whole-board, shutdown board etc.

Basically, the idea is to build the hierarchy on the device_event (enum) list so that it is extensible/scalable for current and future use-cases.
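One possible reading of the proposed signature, written as a valid nested Python dictionary, is sketched below. The per-device keys (`device_event`, `opaque_string`, `expected_action`) and the event names come from the comment above. Nesting them in an inner dictionary, the default values, and the `summarize_change_events` helper are assumptions for illustration and do not describe the existing platform API.

```python
# Event names listed in the comment above.
DEVICE_EVENTS = {"insert", "remove", "fault", "card_name"}


def summarize_change_events(events):
    """Flatten {device_type: {device_id: {...}}} into a list of tuples.

    Each inner dict is assumed to carry 'device_event' and, optionally,
    'opaque_string' and 'expected_action'.
    """
    summary = []
    for device_type, devices in events.items():
        for device_id, info in devices.items():
            event = info["device_event"]
            if event not in DEVICE_EVENTS:
                raise ValueError("unknown device_event: {}".format(event))
            summary.append((
                device_type,
                device_id,
                event,
                info.get("opaque_string", ""),        # assumed default: empty
                info.get("expected_action", "none"),  # assumed default: none
            ))
    return summary


# Example with made-up device ids and strings.
events = {
    "line-card": {
        "3": {
            "device_event": "fault",
            "opaque_string": "voltage rail failure",
            "expected_action": "power-cycle",
        },
        "4": {"device_event": "insert"},
    }
}
rows = summarize_change_events(events)
```

Keeping `device_event` as the discriminating field means new event kinds only extend the enum, which is the extensibility argument the comment makes.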


#### Configuration
Configuration will be provided to administratively bring down a line-card or fabric-card. This can be further extended to other components of the chassis like PSUs and FANs.

```
Configuration to administratively bring down the module
#config chassis_modules admin_down <module_name> <device_type> <instance_number>
Configuration to remove the administrative down state of module
#config chassis_modules del <module_name>
```

#### Config-DB Schema
The schema for CHASSIS_MODULE table in Config DB is:
```
key = CHASSIS_MODULE | <unique-name>;
; field = value
instance = 1*2DIGIT ; instance number of the device-type
device-type = "LINE-CARD" | "FABRIC-CARD" |"PSU" | "FAN" ; device-type
admin-status = "up" | "down" ; admin-status

shyam77git (Contributor) commented on Oct 2, 2020:

status = "Empty" | "Online" | "Offline" ; status of the card

  1. Would the status field hold the current state of the card?
    a) Does the Online state mean that card presence is detected, without implying anything further about the card's bootup or current state (booting / fully booted up / SONiC running)?
    b) Does Empty mean the card (LC/FC) is not present?
    c) Does the 'Offline' card state imply that the card is present in the chassis but brought down or powered off due to a HW- or SW-detected fault?
    d) Who would decide this, and on what criteria? ChassisD on its own, or through input from the platform (via the get_status() and get_change_event() APIs)?

  2. admin-status = "up" | "down" ; admin-status
    This means an administrative bring-down of the card (LC/FC), i.e. a user-initiated, CLI-driven bring-down of the card.
    I believe this config would persist in the local REDIS-DB of the CC/Supervisor. Is that right?

Can you please confirm the following?
Without this admin config (i.e. default case), card state = online ; card admin-status = up;
On doing this admin config, card state = online ; card admin-status = down;
On removing/deleting this config, card state = online ; card admin-status = up;

```
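The interaction between the two configuration commands and the `admin-status` field can be sketched as follows. This is a minimal sketch under stated assumptions: a plain dictionary stands in for the CHASSIS_MODULE table, the helper names are hypothetical, and treating a missing entry as administratively "up" is inferred from the `del` command description rather than confirmed by the design doc.

```python
def _key(module_name):
    # Key layout follows the Config-DB schema; separator spacing is assumed.
    return "CHASSIS_MODULE|{}".format(module_name)


def admin_down(config_db, module_name, device_type, instance):
    """Mimic 'config chassis_modules admin_down <module_name> <device_type> <instance_number>'."""
    config_db[_key(module_name)] = {
        "instance": str(instance),
        "device-type": device_type,
        "admin-status": "down",
    }


def admin_del(config_db, module_name):
    """Mimic 'config chassis_modules del <module_name>'."""
    config_db.pop(_key(module_name), None)


def admin_status(config_db, module_name):
    """Return the admin-status; assumed to be 'up' when no entry exists."""
    entry = config_db.get(_key(module_name))
    return entry["admin-status"] if entry else "up"


# Example: default, after admin_down, and after del.
config_db = {}
before = admin_status(config_db, "LINE-CARD1")
admin_down(config_db, "LINE-CARD1", "LINE-CARD", 1)
during = admin_status(config_db, "LINE-CARD1")
admin_del(config_db, "LINE-CARD1")
after = admin_status(config_db, "LINE-CARD1")
```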

shyam77git (Contributor) commented on Oct 1, 2020:

The config-DB has all the same fields as the state-DB, plus one additional field: admin-status.
What is the reason for maintaining these two as separate DBs?
I wonder whether config-DB could be enhanced to add 'admin-status', so that only one DB (with all required fields) needs to be maintained.

#### Show command

shyam77git (Contributor) commented on Oct 1, 2020:
  1. "The show platform command is enhanced to show chassis information"
    Besides cards (FCs, LCs), show platform would display PSU and fan information (as shown in the DB schema above).
    Is this right?
    In that case, what additional info would show platform psustatus show?

  2. The show platform output mentioned below shows optics modules as well, one per row.
    No optics module entry would show up on the Supervisor/CC, as each LC would handle and display its own local optics modules. Is this the right understanding?

The *show platform* command is enhanced to show chassis information

shyam77git (Contributor) commented on Oct 5, 2020:

Assuming that the local REDIS-DB is initialized and maintained by chassisD:

  1. ChassisD, as part of its initialization, would first determine, via calls to the platform/vendor, which cards/device_types (FCs, LCs etc.) are present in the chassis and in which slots. It would then populate its local REDIS-DB.
  2. After that, it would make an iterative call for each card_type to the platform/vendor to gather card inventory details (card_type, state, description etc.) and populate the local REDIS-DB entry corresponding to each card/device_type.
  3. The platform/vendor would not deal with the local REDIS-DB of ChassisD/PMON.
  4. show platform (whenever invoked by the user) would simply pick the data/entries from this local REDIS-DB.
  5. Any further update (such as a change in card state) would be passed from the platform/vendor to PMON/chassisD via the get_change_event() call.

Can you confirm whether this is how the workflow would be, or correct it if it is different?
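The workflow described in the five steps above can be sketched roughly as below. Everything here is hypothetical: `FakePlatform` and its methods stand in for the vendor layer, a plain dictionary stands in for the local REDIS-DB, and the function names are invented for this example. It only illustrates the division of responsibilities the comment asks about, not how chassisD is implemented.

```python
class FakePlatform:
    """Stand-in for the vendor platform layer; method names are hypothetical."""

    def __init__(self, cards):
        self._cards = cards  # {slot: {"name": ..., "device-type": ..., "status": ...}}

    def get_present_slots(self):
        return sorted(self._cards)

    def get_card_info(self, slot):
        return dict(self._cards[slot])  # copy, so the DB owns its own data


def db_key(slot):
    return "CHASSIS_MODULE_INFO|{}".format(slot)


def chassisd_init(platform, db):
    """Steps 1-2: discover slots, then fetch inventory per slot into the DB."""
    for slot in platform.get_present_slots():
        db[db_key(slot)] = platform.get_card_info(slot)


def on_change_event(db, slot, status):
    """Step 5: chassisD, not the platform, writes the update (step 3)."""
    db[db_key(slot)]["status"] = status


def show_platform(db):
    """Step 4: the show command reads only from the local DB."""
    return [(key, f["name"], f["status"]) for key, f in sorted(db.items())]


# Example with made-up inventory.
platform = FakePlatform({
    1: {"name": "LINE-CARD1", "device-type": "LINE-CARD", "status": "Online"},
    2: {"name": "FABRIC-CARD1", "device-type": "FABRIC-CARD", "status": "Empty"},
})
local_db = {}
chassisd_init(platform, local_db)
on_change_event(local_db, 2, "Online")
rows = show_platform(local_db)
```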

