[ASAN]Increasing switch create timeout for ASAN images #2777

Merged: 5 commits merged on May 30, 2023

@dgsudharsan dgsudharsan commented May 15, 2023

What I did
When ASAN images run on systems with limited CPU, the additional overhead of ASAN causes switch creation to time out, as seen in the logs below. This change therefore doubles the switch create timeout for ASAN builds. For example, for a regular switch the timeout becomes 120 seconds instead of 60 seconds.

Apr 18 20:23:46.749074 arc-switch1004 NOTICE swss#orchagent: :- create: request switch create with context 0
Apr 18 20:23:46.749074 arc-switch1004 NOTICE swss#orchagent: :- allocateNewSwitchObjectId: created SWITCH VID oid:0x21000000000000 for hwinfo: ''
Apr 18 20:24:46.816998 arc-switch1004 ERR swss#orchagent: :- wait: SELECT operation result: TIMEOUT on getresponse
Apr 18 20:24:46.817243 arc-switch1004 ERR swss#orchagent: :- wait: failed to get response for getresponse
Apr 18 20:24:46.817405 arc-switch1004 ERR swss#orchagent: :- create: create status: SAI_STATUS_FAILURE
Apr 18 20:24:46.817552 arc-switch1004 ERR swss#orchagent: :- main: Failed to create a switch, rv:-1

Why I did it
To avoid switch create timeouts when testing ASAN builds on systems with limited CPU.

How I verified it
Loaded a build with the changes and verified that no issues were seen.
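The scaling described above can be sketched roughly as follows. This is a minimal illustration, not the code from this PR: `kBaseSwitchCreateTimeout`, `asanDelayFactor`, and `scaledTimeout` are hypothetical names, and the 60-second baseline is taken from the description. Only the `ASAN_ENABLED` macro and the idea of a delay factor come from the diff.

```cpp
#include <chrono>

// Hypothetical baseline: the description cites 60 seconds for a regular switch.
constexpr std::chrono::seconds kBaseSwitchCreateTimeout{60};

// Hypothetical helper: 2 when the build defines ASAN_ENABLED, otherwise 1.
constexpr int asanDelayFactor()
{
#ifdef ASAN_ENABLED
    return 2;
#else
    return 1;
#endif
}

// Multiply a base timeout by the delay factor.
constexpr std::chrono::seconds scaledTimeout(std::chrono::seconds base, int factor)
{
    return base * factor;
}

// Compile-time checks of the arithmetic from the description.
static_assert(scaledTimeout(kBaseSwitchCreateTimeout, 1).count() == 60,
              "regular build keeps the 60 second timeout");
static_assert(scaledTimeout(kBaseSwitchCreateTimeout, 2).count() == 120,
              "a factor of 2 doubles the timeout to 120 seconds");
```

A caller would pass `asanDelayFactor()` as the factor, so a regular build keeps the baseline and an ASAN build waits twice as long.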

Details if related

@@ -585,7 +585,13 @@ int main(int argc, char **argv)
attr.value.u64 = gSwitchId;
attrs.push_back(attr);

auto delay_factor = 1;

#ifdef ASAN_ENABLED

I'd prefer not to use #ifdefs in code. @xumia, do you have any suggestions?
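One possible way to keep the preprocessor out of the main logic, sketched here as an assumption and not as what was merged: consult the macro in a single place to define a constant, then use ordinary C++ everywhere else. `kAsanBuild` and `responseTimeoutMs` are hypothetical names, and the millisecond values are illustrative.

```cpp
#include <cstdint>

// The only place the ASAN_ENABLED macro is consulted.
#ifdef ASAN_ENABLED
constexpr bool kAsanBuild = true;
#else
constexpr bool kAsanBuild = false;
#endif

// Hypothetical helper: doubles the base timeout for ASAN builds.
// The flag is a parameter so both paths can be exercised in any build.
constexpr std::uint64_t responseTimeoutMs(std::uint64_t baseMs, bool asanBuild = kAsanBuild)
{
    return asanBuild ? baseMs * 2 : baseMs;
}

static_assert(responseTimeoutMs(60000, false) == 60000, "regular build: unchanged");
static_assert(responseTimeoutMs(60000, true) == 120000, "ASAN build: doubled");
```

Because the flag is an ordinary constant, both branches are compiled and type-checked in every build, which is one common argument for this style over inline `#ifdef` blocks.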

@prsunny prsunny requested a review from xumia May 17, 2023 23:34

@prsunny prsunny left a comment

lgtm. @xumia to review as well.

@dgsudharsan

@xumia Can you please help close the review?

@prsunny

prsunny commented May 27, 2023

@dgsudharsan, can you please update the description to state the previous timeout and the new value in seconds?

@dgsudharsan

> @dgsudharsan, can you please update the description to state the previous timeout and the new value in seconds?

Done

@qiluo-msft qiluo-msft merged commit 90b34d4 into sonic-net:master May 30, 2023
StormLiangMS pushed a commit that referenced this pull request Jun 10, 2023
theasianpianist pushed a commit to theasianpianist/sonic-swss that referenced this pull request Jul 20, 2023