[Auto Techsupport] Event driven Techsupport Changes #1796

Merged on Nov 16, 2021 (72 commits)
Commits
999983c
Commit for Ref
vivekrnv Aug 7, 2021
104a412
TechSupport Tests Completed
vivekrnv Aug 8, 2021
6bcfb5d
auto_techsupport helper added
vivekrnv Aug 8, 2021
6b5ba3f
coredump_gen_script in progress
vivekrnv Aug 8, 2021
843d329
CoredumpHandler UT's completd and script fixed
vivekrnv Aug 9, 2021
7050c38
Merge branch 'master' of https://github.com/Azure/sonic-utilities int…
vivekrnv Aug 9, 2021
4cc830d
Merge branch 'Azure:master' into event_driven_ts
vivekrnv Aug 9, 2021
e32ea79
Merge branch 'event_driven_ts' of https://github.com/vivekreddynv/son…
vivekrnv Aug 9, 2021
c89f15f
Added original Setup.py
vivekrnv Aug 9, 2021
4b1faa2
Test Changes
vivekrnv Aug 11, 2021
4e9a2a2
Merge branch 'master' of https://github.com/Azure/sonic-utilities int…
vivekrnv Aug 11, 2021
7be9ee4
CLI GEN-1 merged
vivekrnv Aug 11, 2021
4fdf805
CLI GEN-2 merged
vivekrnv Aug 11, 2021
7def4b7
Removed a few tests
vivekrnv Aug 11, 2021
6bfb465
Added setup.py
vivekrnv Aug 11, 2021
eba8261
Revert "Removed a few tests"
vivekrnv Aug 11, 2021
e89fac3
Revert "CLI GEN-2 merged"
vivekrnv Aug 11, 2021
7b08c27
Revert "CLI GEN-1 merged"
vivekrnv Aug 11, 2021
a708f06
config feature added
vivekrnv Aug 12, 2021
e6122e9
Revert "[config][generic-update] Implementing patch sorting (#1599)"
vivekrnv Aug 13, 2021
6c4f96c
Added new state table schema
vivekrnv Aug 14, 2021
d911b67
Merge branch 'master' of https://github.com/Azure/sonic-utilities int…
vivekrnv Aug 14, 2021
1fbe04e
UT's updated for new design
vivekrnv Aug 16, 2021
195b5ad
Updated the Script and UT's
vivekrnv Aug 17, 2021
9a50c0f
Minor Change to test
vivekrnv Aug 17, 2021
3e66b70
Auto GEN CLI's added
vivekrnv Aug 17, 2021
b6ae7bb
Updated Setup.py
vivekrnv Aug 17, 2021
e8bfd2e
Tests Updated to use default mock infra
vivekrnv Aug 18, 2021
68f7e5c
scripts updated
vivekrnv Aug 18, 2021
c446e5f
CLI added
vivekrnv Aug 19, 2021
a5cf16e
show auto_ts history added
vivekrnv Aug 19, 2021
903a2f0
Revert "Revert "[config][generic-update] Implementing patch sorting (…
vivekrnv Aug 19, 2021
16fa940
generate_dump updated
vivekrnv Aug 19, 2021
556756b
setup.py updated
vivekrnv Aug 19, 2021
c08644f
Beautifier changes
vivekrnv Aug 20, 2021
f407171
Minor Changes
vivekrnv Aug 20, 2021
89bbc36
Merge branch 'master' of https://github.com/Azure/sonic-utilities int…
vivekrnv Aug 30, 2021
b2ff906
Handled comments
vivekrnv Aug 31, 2021
5d01336
Updated auto-gen CLI
vivekrnv Aug 31, 2021
7457607
CLI/UT Updated based on schme changes
vivekrnv Aug 31, 2021
d14fd6d
Added Newline Termintor
vivekrnv Aug 31, 2021
ac17a0a
Added Newline Termintor
vivekrnv Aug 31, 2021
469b9d1
Minor indentation issues handled
vivekrnv Aug 31, 2021
9250b20
named declaration removed
vivekrnv Aug 31, 2021
bc8b8b3
Auto GEN CLI updated based on updated YANG
vivekrnv Aug 31, 2021
f447137
CLI changes based on schema completed
vivekrnv Sep 1, 2021
643f316
Minor Fixes
vivekrnv Sep 1, 2021
38a3333
Backend Updated based on schema
vivekrnv Sep 1, 2021
b850efe
Auto GEN CLI Updated for the new YANG
vivekrnv Sep 1, 2021
35f80d7
FEATURE table dependency removed
vivekrnv Sep 1, 2021
64d53f7
syslog msg's updated
vivekrnv Sep 1, 2021
3757219
Removed mock data file
vivekrnv Sep 2, 2021
8a4bd9d
Removed Redundant Constants
vivekrnv Sep 2, 2021
790d3fe
Merge branch 'master' of https://github.com/Azure/sonic-utilities int…
vivekrnv Sep 8, 2021
e3628d4
plugin updated
vivekrnv Sep 9, 2021
1ba9174
Migrated from exit event approach and removed dependency
vivekrnv Sep 14, 2021
2b4d1b5
Minor update in coredump-compress
vivekrnv Sep 14, 2021
468c930
Modified Auto-Gen Plugins
vivekrnv Sep 14, 2021
e983744
Changes after testing
vivekrnv Sep 14, 2021
5f3a958
masic docker instances handled
vivekrnv Sep 14, 2021
a0011ff
Added units info
vivekrnv Sep 14, 2021
bae3d56
Handled LGTM comment
vivekrnv Sep 14, 2021
652ed92
Removed redundant code
vivekrnv Sep 29, 2021
7249ae1
Merge branch 'master' into event_driven_ts
vivekrnv Sep 29, 2021
df4aef1
comment handled
vivekrnv Oct 16, 2021
64ed590
Merge branch 'event_driven_ts' of https://github.com/vivekreddynv/son…
vivekrnv Oct 16, 2021
466a8aa
Added Python2.7 to the pythonpath
vivekrnv Oct 26, 2021
5a4dcfc
Merge branch 'master' of https://github.com/Azure/sonic-utilities int…
vivekrnv Oct 26, 2021
86ea3fd
env vars related and other minor comments addressed
vivekrnv Oct 28, 2021
0fc5391
Merge branch 'Azure:master' into event_driven_ts
vivekrnv Nov 9, 2021
3d37043
Merge branch 'Azure:master' into event_driven_ts
vivekrnv Nov 12, 2021
12f75f1
Update coredump-compress
vivekrnv Nov 14, 2021
350 changes: 350 additions & 0 deletions config/plugins/auto_techsupport.py
@@ -0,0 +1,350 @@
"""
Autogenerated config CLI plugin.
"""

import click
import utilities_common.cli as clicommon
import utilities_common.general as general
from config import config_mgmt


# Load sonic-cfggen from source since /usr/local/bin/sonic-cfggen does not have .py extension.
sonic_cfggen = general.load_module_from_source('sonic_cfggen', '/usr/local/bin/sonic-cfggen')


def exit_with_error(*args, **kwargs):
    """ Print a message and abort CLI. """

    click.secho(*args, **kwargs)
    raise click.Abort()


def validate_config_or_raise(cfg):
    """ Validate config db data using ConfigMgmt """

    try:
        cfg = sonic_cfggen.FormatConverter.to_serialized(cfg)
        config_mgmt.ConfigMgmt().loadData(cfg)
    except Exception as err:
        raise Exception('Failed to validate configuration: {}'.format(err))


def add_entry_validated(db, table, key, data):
    """ Add new entry in table and validate configuration """

    cfg = db.get_config()
    cfg.setdefault(table, {})
    if key in cfg[table]:
        raise Exception(f"{key} already exists")

    cfg[table][key] = data

    validate_config_or_raise(cfg)
    db.set_entry(table, key, data)


def update_entry_validated(db, table, key, data, create_if_not_exists=False):
    """ Update entry in table and validate configuration.
    If attribute value in data is None, the attribute is deleted.
    """

    cfg = db.get_config()
    cfg.setdefault(table, {})

    if create_if_not_exists:
        cfg[table].setdefault(key, {})

    if key not in cfg[table]:
        raise Exception(f"{key} does not exist")

    for attr, value in data.items():
        if value is None and attr in cfg[table][key]:
            cfg[table][key].pop(attr)
        else:
            cfg[table][key][attr] = value

    validate_config_or_raise(cfg)
    db.set_entry(table, key, cfg[table][key])


def del_entry_validated(db, table, key):
    """ Delete entry in table and validate configuration """

    cfg = db.get_config()
    cfg.setdefault(table, {})
    if key not in cfg[table]:
        raise Exception(f"{key} does not exist")

    cfg[table].pop(key)

    validate_config_or_raise(cfg)
    db.set_entry(table, key, None)


def add_list_entry_validated(db, table, key, attr, data):
    """ Add new entry into list in table and validate configuration"""

    cfg = db.get_config()
    cfg.setdefault(table, {})
    if key not in cfg[table]:
        raise Exception(f"{key} does not exist")
    cfg[table][key].setdefault(attr, [])
    for entry in data:
        if entry in cfg[table][key][attr]:
            raise Exception(f"{entry} already exists")
        cfg[table][key][attr].append(entry)

    validate_config_or_raise(cfg)
    db.set_entry(table, key, cfg[table][key])


def del_list_entry_validated(db, table, key, attr, data):
    """ Delete entry from list in table and validate configuration"""

    cfg = db.get_config()
    cfg.setdefault(table, {})
    if key not in cfg[table]:
        raise Exception(f"{key} does not exist")
    cfg[table][key].setdefault(attr, [])
    for entry in data:
        if entry not in cfg[table][key][attr]:
            raise Exception(f"{entry} does not exist")
        cfg[table][key][attr].remove(entry)
    if not cfg[table][key][attr]:
        cfg[table][key].pop(attr)

    validate_config_or_raise(cfg)
    db.set_entry(table, key, cfg[table][key])


def clear_list_entry_validated(db, table, key, attr):
    """ Clear list in object and validate configuration"""

    update_entry_validated(db, table, key, {attr: None})


@click.group(name="auto-techsupport",
             cls=clicommon.AliasedGroup)
def AUTO_TECHSUPPORT():
    """ AUTO_TECHSUPPORT part of config_db.json """

    pass


@AUTO_TECHSUPPORT.group(name="global",
                        cls=clicommon.AliasedGroup)
@clicommon.pass_db
def AUTO_TECHSUPPORT_GLOBAL(db):
    """ """

    pass


@AUTO_TECHSUPPORT_GLOBAL.command(name="state")
@click.argument(
    "state",
    nargs=1,
    required=True,
)
@clicommon.pass_db
def AUTO_TECHSUPPORT_GLOBAL_state(db, state):
    """ Knob to make techsupport invocation event-driven based on core-dump generation """

    table = "AUTO_TECHSUPPORT"
    key = "GLOBAL"
    data = {
        "state": state,
    }
    try:
        update_entry_validated(db.cfgdb, table, key, data, create_if_not_exists=True)
    except Exception as err:
        exit_with_error(f"Error: {err}", fg="red")


@AUTO_TECHSUPPORT_GLOBAL.command(name="rate-limit-interval")
@click.argument(
    "rate-limit-interval",
    nargs=1,
    required=True,
)
@clicommon.pass_db
def AUTO_TECHSUPPORT_GLOBAL_rate_limit_interval(db, rate_limit_interval):
    """ Minimum time in seconds between two successive techsupport invocations. Configure 0 to explicitly disable """

    table = "AUTO_TECHSUPPORT"
    key = "GLOBAL"
    data = {
        "rate_limit_interval": rate_limit_interval,
    }
    try:
        update_entry_validated(db.cfgdb, table, key, data, create_if_not_exists=True)
    except Exception as err:
        exit_with_error(f"Error: {err}", fg="red")


@AUTO_TECHSUPPORT_GLOBAL.command(name="max-techsupport-limit")
@click.argument(
    "max-techsupport-limit",
    nargs=1,
    required=True,
)
@clicommon.pass_db
def AUTO_TECHSUPPORT_GLOBAL_max_techsupport_limit(db, max_techsupport_limit):
    """ Max Limit in percentage for the cumulative size of ts dumps.
    No cleanup is performed if the value isn't configured or is 0.0
    """

    table = "AUTO_TECHSUPPORT"
    key = "GLOBAL"
    data = {
        "max_techsupport_limit": max_techsupport_limit,
    }
    try:
        update_entry_validated(db.cfgdb, table, key, data, create_if_not_exists=True)
    except Exception as err:
        exit_with_error(f"Error: {err}", fg="red")


@AUTO_TECHSUPPORT_GLOBAL.command(name="max-core-limit")
@click.argument(
    "max-core-limit",
    nargs=1,
    required=True,
)
@clicommon.pass_db
def AUTO_TECHSUPPORT_GLOBAL_max_core_limit(db, max_core_limit):
    """ Max Limit in percentage for the cumulative size of core dumps.
    No cleanup is performed if the value isn't configured or is 0.0
    """

    table = "AUTO_TECHSUPPORT"
    key = "GLOBAL"
    data = {
        "max_core_limit": max_core_limit,
    }
    try:
        update_entry_validated(db.cfgdb, table, key, data, create_if_not_exists=True)
    except Exception as err:
        exit_with_error(f"Error: {err}", fg="red")


@AUTO_TECHSUPPORT_GLOBAL.command(name="since")
@click.argument(
    "since",
    nargs=1,
    required=True,
)
@clicommon.pass_db
def AUTO_TECHSUPPORT_GLOBAL_since(db, since):
    """ Only collect the logs & core-dumps generated since the time provided.
    A default value of '2 days ago' is used if this value is not set explicitly or a non-valid string is provided """

    table = "AUTO_TECHSUPPORT"
    key = "GLOBAL"
    data = {
        "since": since,
    }
    try:
        update_entry_validated(db.cfgdb, table, key, data, create_if_not_exists=True)
    except Exception as err:
        exit_with_error(f"Error: {err}", fg="red")


@click.group(name="auto-techsupport-feature",
             cls=clicommon.AliasedGroup)
def AUTO_TECHSUPPORT_FEATURE():
    """ AUTO_TECHSUPPORT_FEATURE part of config_db.json """
    pass


@AUTO_TECHSUPPORT_FEATURE.command(name="add")
@click.argument(
    "feature-name",
    nargs=1,
    required=True,
)
@click.option(
    "--state",
    help="Enable auto techsupport invocation on the processes running inside this feature",
)
@click.option(
    "--rate-limit-interval",
    help="Rate limit interval for the corresponding feature. Configure 0 to explicitly disable",
)
@clicommon.pass_db
def AUTO_TECHSUPPORT_FEATURE_add(db, feature_name, state, rate_limit_interval):
    """ Add object in AUTO_TECHSUPPORT_FEATURE. """

    table = "AUTO_TECHSUPPORT_FEATURE"
    key = feature_name
    data = {}
    if state is not None:
        data["state"] = state
    if rate_limit_interval is not None:
        data["rate_limit_interval"] = rate_limit_interval

    try:
        add_entry_validated(db.cfgdb, table, key, data)
    except Exception as err:
        exit_with_error(f"Error: {err}", fg="red")


@AUTO_TECHSUPPORT_FEATURE.command(name="update")
@click.argument(
    "feature-name",
    nargs=1,
    required=True,
)
@click.option(
    "--state",
    help="Enable auto techsupport invocation on the processes running inside this feature",
)
@click.option(
    "--rate-limit-interval",
    help="Rate limit interval for the corresponding feature. Configure 0 to explicitly disable",
)
@clicommon.pass_db
def AUTO_TECHSUPPORT_FEATURE_update(db, feature_name, state, rate_limit_interval):
    """ Update object in AUTO_TECHSUPPORT_FEATURE. """

    table = "AUTO_TECHSUPPORT_FEATURE"
    key = feature_name
    data = {}
    if state is not None:
        data["state"] = state
    if rate_limit_interval is not None:
        data["rate_limit_interval"] = rate_limit_interval

    try:
        update_entry_validated(db.cfgdb, table, key, data)
    except Exception as err:
        exit_with_error(f"Error: {err}", fg="red")


@AUTO_TECHSUPPORT_FEATURE.command(name="delete")
@click.argument(
    "feature-name",
    nargs=1,
    required=True,
)
@clicommon.pass_db
def AUTO_TECHSUPPORT_FEATURE_delete(db, feature_name):
    """ Delete object in AUTO_TECHSUPPORT_FEATURE. """

    table = "AUTO_TECHSUPPORT_FEATURE"
    key = feature_name
    try:
        del_entry_validated(db.cfgdb, table, key)
    except Exception as err:
        exit_with_error(f"Error: {err}", fg="red")


def register(cli):
    cli_node = AUTO_TECHSUPPORT
    if cli_node.name in cli.commands:
        raise Exception(f"{cli_node.name} already exists in CLI")
    cli.add_command(AUTO_TECHSUPPORT)
    cli_node = AUTO_TECHSUPPORT_FEATURE
    if cli_node.name in cli.commands:
        raise Exception(f"{cli_node.name} already exists in CLI")
    cli.add_command(AUTO_TECHSUPPORT_FEATURE)
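
The merge behaviour of update_entry_validated in the plugin above (attributes set to None are dropped, everything else is overwritten) may be easier to follow in isolation. The following is a minimal sketch, not the plugin itself: `FakeConfigDb` is a hypothetical in-memory stand-in for `db.cfgdb`, the YANG validation step is left out, and the table values ("enabled", "180") are sample data chosen for illustration.

```python
import copy


class FakeConfigDb:
    """Hypothetical in-memory stand-in for the config DB handle (db.cfgdb)."""

    def __init__(self, initial=None):
        self._cfg = copy.deepcopy(initial or {})

    def get_config(self):
        # Return a copy so callers can edit it without touching the store.
        return copy.deepcopy(self._cfg)

    def set_entry(self, table, key, data):
        # Assumption for this sketch: None deletes the key,
        # anything else replaces the whole entry.
        if data is None:
            self._cfg.get(table, {}).pop(key, None)
        else:
            self._cfg.setdefault(table, {})[key] = dict(data)


def update_entry(db, table, key, data, create_if_not_exists=False):
    """Same merge logic as update_entry_validated, minus the validation call."""
    cfg = db.get_config()
    cfg.setdefault(table, {})
    if create_if_not_exists:
        cfg[table].setdefault(key, {})
    if key not in cfg[table]:
        raise Exception(f"{key} does not exist")
    for attr, value in data.items():
        if value is None and attr in cfg[table][key]:
            cfg[table][key].pop(attr)
        else:
            cfg[table][key][attr] = value
    db.set_entry(table, key, cfg[table][key])


db = FakeConfigDb()
# Two separate updates accumulate on the same GLOBAL entry.
update_entry(db, "AUTO_TECHSUPPORT", "GLOBAL", {"state": "enabled"},
             create_if_not_exists=True)
update_entry(db, "AUTO_TECHSUPPORT", "GLOBAL", {"rate_limit_interval": "180"},
             create_if_not_exists=True)
after_set = db.get_config()["AUTO_TECHSUPPORT"]["GLOBAL"]

# Passing None removes an attribute that is currently present.
update_entry(db, "AUTO_TECHSUPPORT", "GLOBAL", {"state": None})
after_unset = db.get_config()["AUTO_TECHSUPPORT"]["GLOBAL"]

print(after_set)
print(after_unset)
```

Because the helper edits a copy of the whole configuration and writes back only the one entry, a failed validation in the real plugin would leave the stored entry untouched.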
17 changes: 17 additions & 0 deletions scripts/coredump-compress
@@ -7,11 +7,28 @@ while [[ $# > 1 ]]; do
shift
done

CONTAINER_ID=""
if [ $# -gt 0 ]; then
    CONTAINER_ID=$(xargs -0 -L1 -a /proc/${1}/cgroup | grep -oP "pids:/docker/\K\w+")
    ns=`xargs -0 -L1 -a /proc/${1}/environ | grep -e "^NAMESPACE_ID" | cut -f2 -d'='`
    if [ ! -z ${ns} ]; then
        PREFIX=${PREFIX}${ns}.
    fi
fi

/bin/gzip -1 - > /var/core/${PREFIX}core.gz

if [[ ! -z $CONTAINER_ID ]]; then
    CONTAINER_NAME=$(docker inspect --format='{{.Name}}' ${CONTAINER_ID} | cut -c2-)
    if [[ ! -z ${CONTAINER_NAME} ]]; then
        # coredump_gen_handler invokes techsupport if all the other required conditions are met
        # explicitly passing in the env vars because coredump-compress's namespace doesn't have these set by default
        for path in $(find /usr/local/lib/python3*/dist-packages -maxdepth 0); do
            PYTHONPATH=$PYTHONPATH:$path
        done
        setsid $(echo > /tmp/coredump_gen_handler.log;
                 export PYTHONPATH=$PYTHONPATH;
                 python3 /usr/local/bin/coredump_gen_handler.py ${PREFIX}core.gz ${CONTAINER_NAME} &>> /tmp/coredump_gen_handler.log) &

Review comment from @qiluo-msft (Contributor), Nov 13, 2021, on "python3":

If you add a shebang to the py file, do you still need to use python3 and PYTHONPATH? #Closed

Reply from the Contributor Author:

I can avoid python3, but PYTHONPATH is still required.

    fi
fi
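
To make the two lookups in the script above more concrete, here is a rough Python sketch of what the `grep -oP "pids:/docker/\K\w+"` and `NAMESPACE_ID` pipelines extract. It is only an illustration: the function names and the sample cgroup and environ contents are hypothetical, the `\K` in the grep pattern is replaced by a capture group, and the environ lookup matches the variable name exactly, which is slightly stricter than the `^NAMESPACE_ID` prefix match used in the script.

```python
import re

# Same shape as the grep -oP pattern; the capture group plays the role of \K.
CGROUP_DOCKER_RE = re.compile(r"pids:/docker/(\w+)")


def container_id_from_cgroup(cgroup_text):
    """Return the docker container ID found in /proc/<pid>/cgroup text, or ''."""
    match = CGROUP_DOCKER_RE.search(cgroup_text)
    return match.group(1) if match else ""


def namespace_from_environ(environ_bytes):
    """Return the NAMESPACE_ID value from NUL-separated /proc/<pid>/environ data, or ''."""
    for item in environ_bytes.split(b"\0"):
        name, sep, value = item.partition(b"=")
        if sep and name == b"NAMESPACE_ID":
            return value.decode()
    return ""


# Hypothetical sample inputs, shaped like the files the script reads.
sample_cgroup = "12:pids:/docker/0123abcd\n11:memory:/docker/0123abcd\n"
sample_environ = b"PATH=/usr/bin\0NAMESPACE_ID=1\0"

print(container_id_from_cgroup(sample_cgroup))
print(namespace_from_environ(sample_environ))
```

An empty container ID corresponds to the case where the crashing process runs on the host, in which case the script skips the coredump_gen_handler invocation.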
