Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[mlnx-sfp-plugin] enhancement to support transceiver sensor monitoring #1839

Merged
merged 4 commits into from
Aug 2, 2018
Merged
Show file tree
Hide file tree
Changes from 2 commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
63 changes: 62 additions & 1 deletion device/mellanox/x86_64-mlnx_lssn2700-r0/plugins/sfputil.py
Original file line number Diff line number Diff line change
Expand Up @@ -10,6 +10,34 @@
except ImportError as e:
raise ImportError("%s - required module not found" % str(e))

def determine_in_docker():
with open('/proc/self/cgroup') as f:
lines = [line.rstrip('\n') for line in f]
for line in lines:
words = line.split('/')
if len(words) > 1 and words[1] == 'docker':
return True

return False

INSIDE_DOCKER = determine_in_docker()

if INSIDE_DOCKER:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why only connect to the Redis DB if running inside a Docker container? If this is to determine whether or not the plugin is being used by xcvrd, it's not an intuitive check. I think it would be better to create a function to connect to the State DB and call that function in get_transceiver_change_event().

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

sfputil plugin is also called by 'show interface transceiver' CLI on the host, but the swsscommon is not available on the host, to make it work both on the host and the docker, need to conditionally import swsscommon.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

revised.

from swsscommon import swsscommon
Copy link
Collaborator

@qiluo-msft qiluo-msft Jul 9, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

swsscommon [](start = 9, length = 10)

Changing behaviors based on library availability or docker environment is not a good idea. #Closed

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

revised. the swsscommon only imported in the new add function 'get_transceiver_change_event', since it will only be called from pmon container.

REDIS_HOSTNAME = "localhost"
REDIS_PORT = 6379
REDIS_TIMEOUT_USECS = 0

state_db = swsscommon.DBConnector(swsscommon.STATE_DB,
REDIS_HOSTNAME,
REDIS_PORT,
REDIS_TIMEOUT_USECS)

# Subscribe to state table for SFP change notifications
sel = swsscommon.Select()
sel_tbl = swsscommon.NotificationConsumer(state_db, 'TRANSCEIVER_NOTIFY')
Copy link
Collaborator

@qiluo-msft qiluo-msft Jul 7, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

TRANSCEIVER_NOTIFY [](start = 57, length = 18)

Just curious which component provide the notification? As https://github.com/Azure/SONiC/blob/gh-pages/doc/OIDsforSensorandTransciver.MD, why let the provider write STATE DB directly? #Closed

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@qiluo-msft on top of the previous comments, on mlnx platform, this notification is exposed from another daemon inside syncd(##1841), and that daemon post the notification to the DB. For other vendors, they may have their implementation, for example, get this notification directly from sysfs.

sel.addSelectable(sel_tbl)


class SfpUtil(SfpUtilBase):
"""Platform-specific SfpUtil class"""
Expand Down Expand Up @@ -39,7 +67,7 @@ def port_to_eeprom_mapping(self):
return self._port_to_eeprom_mapping

def __init__(self):
eeprom_path = "/bsp/qsfp/qsfp{0}"
eeprom_path = "/sys/class/i2c-adapter/i2c-2/2-0048/hwmon/hwmon7/qsfp{0}_eeprom"

for x in range(0, self.port_end + 1):
self._port_to_eeprom_mapping[x] = eeprom_path.format(x + self.EEPROM_OFFSET)
Expand All @@ -64,6 +92,21 @@ def get_presence(self, port_num):
return True

return False

def get_presence_status_file_path(self, port_num):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Most of the code added in this PR is duplicated among all the plugins. Is there a reason this functionality can't live in sfputilbase.py? Will it not be shared by other platforms?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the file path is specific for each platform, it's even various among platforms from one vendor, so this has to be specific for each platform.

Copy link
Collaborator

@qiluo-msft qiluo-msft Jul 9, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

get_presence_status_file_path [](start = 8, length = 29)

Who is calling this function? #Closed

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

true, not called anymore, call in the original solution. removed.

# Check for invalid port_num
if port_num < self.port_start or port_num > self.port_end:
return ""

try:
path = "/sys/class/i2c-adapter/i2c-2/2-0048/hwmon/hwmon7/qsfp%d_status" % (port_num+1)
reg_file = open(path)
except IOError as e:
print "Error: unable to open file: %s" % str(e)
return ""

reg_file.close()
return path

def get_low_power_mode(self, port_num):
# Check for invalid port_num
Expand Down Expand Up @@ -149,3 +192,21 @@ def reset(self, port_num):
return False

return False

def get_transceiver_change_event(self, timeout=0):
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

get_transceiver_change_event [](start = 8, length = 28)

This is vendor independent code. Let's move it out of plugin.
Who is using it?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@qiluo-msft this function is intended to let each vendor have their own implementation, because each vendor may have different ways to get the SFP plug in/out notification. Here is the implementation on mlnx platform.

Copy link
Collaborator

@qiluo-msft qiluo-msft Jul 9, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would argue https://github.com/Azure/SONiC/blob/gh-pages/doc/transceiver-monitor-hld.md is not a good design, at least for Mellanox platform.

Mellanox syncd docker could directly get SFP notification and write to STATE_DB. There is no benefit to write to redis once, blocking waiting on it in another container (pmon) and write redis again. The disadvantages are new dependencies (like in this file), fragile system, and hard to debug.

If this is true for Mellanox, how could it benefit other vendors?


In reply to: 200867307 [](ancestors = 200867307)

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would agree with you if we only consider mlnx platform, and at the very beginning, my design is to post the transceiver info to DB from syncd since mlnx can get the sfp change event from there. But after discussing with Guohan and community, we should have a common solution for all the vendors, design as common sfp change event API and left it to be implemented by each vendor.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Totally understood. This is a perfect to time to revisit the design.


In reply to: 201230542 [](ancestors = 201230542)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just checking in -- are we or are we not revisiting the design? I believe the current design is the common solution for all vendors that @keboliu mentioned above. Therefore, I don't believe there is a need to revisit the design.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@jleveque Yes, this the common solution for all vendors that have discussed.
@qiluo-msft would like to get your view on this.
more background information about this, for other vendors, may possible to get this event from sysfs, which will be very easy for them to implement this common API since sysfs is accessible from pmon. you may check with Hui who have some investigation on other platforms.

phy_port_dict = {}
if INSIDE_DOCKER:
status = True
(state, c) = sel.select(timeout)

if state == swsscommon.Select.TIMEOUT:
status = true
elif state != swsscommon.Select.OBJECT:
status = False
else:
(key, op, fvp) = sel_tbl.pop()
phy_port_dict[key] = op

return status, phy_port_dict
else:
return False, phy_port_dict
63 changes: 62 additions & 1 deletion device/mellanox/x86_64-mlnx_msn2100-r0/plugins/sfputil.py
Original file line number Diff line number Diff line change
Expand Up @@ -10,6 +10,34 @@
except ImportError as e:
raise ImportError("%s - required module not found" % str(e))

def determine_in_docker():
with open('/proc/self/cgroup') as f:
lines = [line.rstrip('\n') for line in f]
for line in lines:
words = line.split('/')
if len(words) > 1 and words[1] == 'docker':
return True

return False

INSIDE_DOCKER = determine_in_docker()

if INSIDE_DOCKER:
from swsscommon import swsscommon
REDIS_HOSTNAME = "localhost"
REDIS_PORT = 6379
REDIS_TIMEOUT_USECS = 0

state_db = swsscommon.DBConnector(swsscommon.STATE_DB,
REDIS_HOSTNAME,
REDIS_PORT,
REDIS_TIMEOUT_USECS)

# Subscribe to state table for SFP change notifications
sel = swsscommon.Select()
sel_tbl = swsscommon.NotificationConsumer(state_db, 'TRANSCEIVER_NOTIFY')
sel.addSelectable(sel_tbl)


class SfpUtil(SfpUtilBase):
"""Platform-specific SfpUtil class"""
Expand Down Expand Up @@ -39,7 +67,7 @@ def port_to_eeprom_mapping(self):
return self._port_to_eeprom_mapping

def __init__(self):
eeprom_path = "/bsp/qsfp/qsfp{0}"
eeprom_path = "/sys/class/i2c-adapter/i2c-2/2-0048/hwmon/hwmon4/qsfp{0}_eeprom"

for x in range(0, self.port_end + 1):
self._port_to_eeprom_mapping[x] = eeprom_path.format(x + self.EEPROM_OFFSET)
Expand All @@ -64,6 +92,21 @@ def get_presence(self, port_num):
return True

return False

def get_presence_status_file_path(self, port_num):
# Check for invalid port_num
if port_num < self.port_start or port_num > self.port_end:
return ""

try:
path = "/sys/class/i2c-adapter/i2c-2/2-0048/hwmon/hwmon4/qsfp%d_status" % (port_num+1)
reg_file = open(path)
except IOError as e:
print "Error: unable to open file: %s" % str(e)
return ""

reg_file.close()
return path

def get_low_power_mode(self, port_num):
# Check for invalid port_num
Expand Down Expand Up @@ -149,3 +192,21 @@ def reset(self, port_num):
return False

return False

def get_transceiver_change_event(self, timeout=0):
phy_port_dict = {}
if INSIDE_DOCKER:
status = True
(state, c) = sel.select(timeout)

if state == swsscommon.Select.TIMEOUT:
status = true
elif state != swsscommon.Select.OBJECT:
status = False
else:
(key, op, fvp) = sel_tbl.pop()
phy_port_dict[key] = op

return status, phy_port_dict
else:
return False, phy_port_dict
68 changes: 66 additions & 2 deletions device/mellanox/x86_64-mlnx_msn2410-r0/plugins/sfputil.py
Original file line number Diff line number Diff line change
Expand Up @@ -10,13 +10,43 @@
except ImportError as e:
raise ImportError("%s - required module not found" % str(e))

def determine_in_docker():
with open('/proc/self/cgroup') as f:
lines = [line.rstrip('\n') for line in f]
for line in lines:
words = line.split('/')
if len(words) > 1 and words[1] == 'docker':
return True

return False

INSIDE_DOCKER = determine_in_docker()

if INSIDE_DOCKER:
from swsscommon import swsscommon
REDIS_HOSTNAME = "localhost"
REDIS_PORT = 6379
REDIS_TIMEOUT_USECS = 0

state_db = swsscommon.DBConnector(swsscommon.STATE_DB,
REDIS_HOSTNAME,
REDIS_PORT,
REDIS_TIMEOUT_USECS)

# Subscribe to state table for SFP change notifications
sel = swsscommon.Select()
sel_tbl = swsscommon.NotificationConsumer(state_db, 'TRANSCEIVER_NOTIFY')
sel.addSelectable(sel_tbl)


class SfpUtil(SfpUtilBase):
"""Platform-specific SfpUtil class"""


PORT_START = 0
PORT_END = 55
PORTS_IN_BLOCK = 56
QSFP_PORT_START = 48

EEPROM_OFFSET = 1

Expand All @@ -32,20 +62,21 @@ def port_end(self):

@property
def qsfp_ports(self):
return range(0, self.PORTS_IN_BLOCK + 1)
return range(QSFP_PORT_START, self.PORTS_IN_BLOCK + 1)

@property
def port_to_eeprom_mapping(self):
return self._port_to_eeprom_mapping

def __init__(self):
eeprom_path = "/bsp/qsfp/qsfp{0}"
eeprom_path = "/sys/class/i2c-adapter/i2c-2/2-0048/hwmon/hwmon6/qsfp{0}_eeprom"

for x in range(0, self.port_end + 1):
self._port_to_eeprom_mapping[x] = eeprom_path.format(x + self.EEPROM_OFFSET)

SfpUtilBase.__init__(self)


def get_presence(self, port_num):
# Check for invalid port_num
if port_num < self.port_start or port_num > self.port_end:
Expand All @@ -64,6 +95,21 @@ def get_presence(self, port_num):
return True

return False

def get_presence_status_file_path(self, port_num):
# Check for invalid port_num
if port_num < self.port_start or port_num > self.port_end:
return ""

try:
path = "/sys/class/i2c-adapter/i2c-2/2-0048/hwmon/hwmon6/qsfp%d_status" % (port_num+1)
reg_file = open(path)
except IOError as e:
print "Error: unable to open file: %s" % str(e)
return ""

reg_file.close()
return path

def get_low_power_mode(self, port_num):
# Check for invalid port_num
Expand Down Expand Up @@ -149,3 +195,21 @@ def reset(self, port_num):
return False

return False

def get_transceiver_change_event(self, timeout=0):
phy_port_dict = {}
if INSIDE_DOCKER:
status = True
(state, c) = sel.select(timeout)

if state == swsscommon.Select.TIMEOUT:
status = true
elif state != swsscommon.Select.OBJECT:
status = False
else:
(key, op, fvp) = sel_tbl.pop()
phy_port_dict[key] = op

return status, phy_port_dict
else:
return False, phy_port_dict
63 changes: 62 additions & 1 deletion device/mellanox/x86_64-mlnx_msn2700-r0/plugins/sfputil.py
Original file line number Diff line number Diff line change
Expand Up @@ -10,6 +10,34 @@
except ImportError as e:
raise ImportError("%s - required module not found" % str(e))

def determine_in_docker():
with open('/proc/self/cgroup') as f:
lines = [line.rstrip('\n') for line in f]
for line in lines:
words = line.split('/')
if len(words) > 1 and words[1] == 'docker':
return True

return False

INSIDE_DOCKER = determine_in_docker()

if INSIDE_DOCKER:
from swsscommon import swsscommon
REDIS_HOSTNAME = "localhost"
REDIS_PORT = 6379
REDIS_TIMEOUT_USECS = 0

state_db = swsscommon.DBConnector(swsscommon.STATE_DB,
REDIS_HOSTNAME,
REDIS_PORT,
REDIS_TIMEOUT_USECS)

# Subscribe to state table for SFP change notifications
sel = swsscommon.Select()
sel_tbl = swsscommon.NotificationConsumer(state_db, 'TRANSCEIVER_NOTIFY')
sel.addSelectable(sel_tbl)


class SfpUtil(SfpUtilBase):
"""Platform-specific SfpUtil class"""
Expand Down Expand Up @@ -39,7 +67,7 @@ def port_to_eeprom_mapping(self):
return self._port_to_eeprom_mapping

def __init__(self):
eeprom_path = "/bsp/qsfp/qsfp{0}"
eeprom_path = "/sys/class/i2c-adapter/i2c-2/2-0048/hwmon/hwmon7/qsfp{0}_eeprom"

for x in range(0, self.port_end + 1):
self._port_to_eeprom_mapping[x] = eeprom_path.format(x + self.EEPROM_OFFSET)
Expand All @@ -64,6 +92,21 @@ def get_presence(self, port_num):
return True

return False

def get_presence_status_file_path(self, port_num):
# Check for invalid port_num
if port_num < self.port_start or port_num > self.port_end:
return ""

try:
path = "/sys/class/i2c-adapter/i2c-2/2-0048/hwmon/hwmon7/qsfp%d_status" % (port_num+1)
reg_file = open(path)
except IOError as e:
print "Error: unable to open file: %s" % str(e)
return ""

reg_file.close()
return path

def get_low_power_mode(self, port_num):
# Check for invalid port_num
Expand Down Expand Up @@ -149,3 +192,21 @@ def reset(self, port_num):
return False

return False

def get_transceiver_change_event(self, timeout=0):
phy_port_dict = {}
if INSIDE_DOCKER:
status = True
(state, c) = sel.select(timeout)

if state == swsscommon.Select.TIMEOUT:
status = true
elif state != swsscommon.Select.OBJECT:
status = False
else:
(key, op, fvp) = sel_tbl.pop()
phy_port_dict[key] = op

return status, phy_port_dict
else:
return False, phy_port_dict
Loading