Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

swss: flush g_asicState after each event is done #570

Merged

Conversation

dzhangalibaba
Copy link
Collaborator

  • add flush() after event is handled in case some entries are still in buffer, don't wait

  • with the changes in sairedis and swss-common, route performance improved by 200~300 routes/sec

    Signed-off-by: Dong Zhang d.zhang@alibaba-inc.com

What I did

Why I did it

How I verified it

Details if related

* add flush() after event is handled in case some entries are still in buffer, don't wait
* with the changes in sairedis and swss-common, route performance improved by 200~300 routes/sec

  Signed-off-by: Dong Zhang d.zhang@alibaba-inc.com
@lguohan
Copy link
Contributor

lguohan commented Aug 10, 2018

you have a few PR submitted in sairedis, common, can you give dependencies on these PR? are they all independent or some pr requires other pr to be merge first?

@dzhangalibaba
Copy link
Collaborator Author

dzhangalibaba commented Aug 10, 2018

swss-common#218, this one should be merged first. sairedis#335 depends on swss-common#218. swss-common#218 and sairedis#335 are one feature.

The other three, swss#570 and sairedis#336 depends on swss-common#219, but they are independent in compiling point of view. So swss-common#219 should be merged first for these feature.

@@ -313,5 +313,6 @@ void OrchDaemon::start()
for (Orch *o : m_orchList)
o->doTask();

flush(); //flush after each event is handled, don't wait
Copy link
Contributor

@qiluo-msft qiluo-msft Aug 10, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

//flush after e [](start = 17, length = 15)

Please help follow the coding style of other comments. #Closed

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure. Will do

@qiluo-msft
Copy link
Contributor

qiluo-msft commented Aug 10, 2018

        flush();

So the flush() is not needed at all? #Closed


Refers to: orchagent/orchdaemon.cpp:302 in 8fa353c. [](commit_id = 8fa353c, deletion_comment = False)

@@ -313,5 +313,6 @@ void OrchDaemon::start()
for (Orch *o : m_orchList)
o->doTask();

flush(); //flush after each event is handled, don't wait
Copy link
Contributor

@qiluo-msft qiluo-msft Aug 10, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

flush [](start = 8, length = 5)

About the performance "route performance improved by 200~300 routes/sec". Could you provide more details?

  1. What is the test environment and test steps?
  2. What is the performance before this PR and after?
  3. If it is a small scope test, could you add a unit test or vs test case to automate it?
    #Closed

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  1. the environment and test steps are listed on the sildes we discussed about two weeks ago . I already sent the slides that time internally, if you didn't get it, I can resent it to you.
  2. The performance is listed in the slides as well. Only one PR didn't make sense, it involved many PRs for each optimization. We need to look at them together.
  3. We tested the routing performance on physical switch , it matched what I listed on the slides.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  1. the environment and test steps are listed on the sildes we discussed about two weeks ago . I already sent the slides that time internally, if you didn't get it, I can resent it to you.
  2. The performance is listed in the slides as well. Only one PR didn't make sense, it involved many PRs for each optimization. We need to look at them together.
  3. We tested the routing performance on physical switch , it matched what I listed on the slides.

Copy link
Contributor

@qiluo-msft qiluo-msft Aug 10, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK. I just want to know performance before this PR and after? You only mentioned improved by 200~300 routes/sec. Rough numbers are ok. #Closed

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

before changes it is 1300 routes/sec, platform is brcm and SAI is 3.1, CPU is Intel(R) Atom(TM) CPU C2558 @ 2.40GHz. After this enabling pipeline changes only, it is about 1500-1600 routes/sec, if plus the syncd buffer changes in other PR, it is about 1700-1800 routes/sec

Copy link
Contributor

@qiluo-msft qiluo-msft left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As comments

@dzhangalibaba
Copy link
Collaborator Author

dzhangalibaba commented Aug 10, 2018

Also , the flush() is needed, currently it will flush() when there is nothing to select, the SELECT_WAIT time is 1000ms, this is too long for routes, we need to flush them right away, don't wait the timeout

@qiluo-msft
Copy link
Contributor

qiluo-msft commented Aug 10, 2018

I mean the above flush() for timeout, since you already flush() after execute(). #Closed

@dzhangalibaba
Copy link
Collaborator Author

got it, then that flush is not necessary, we can remove it.

…mment

  *remove unnecessary flush() in timeout case and update comment

  Signed-off-by: Dong Zhang d.zhang@alibaba-inc.com
…mment

  *remove unnecessary flush() in timeout case and update comment

  Signed-off-by: Dong Zhang d.zhang@alibaba-inc.com
@qiluo-msft qiluo-msft merged commit fa7008f into sonic-net:master Aug 11, 2018
@dzhangalibaba dzhangalibaba deleted the enable-g_asicState-pipeline branch September 27, 2019 23:07
qiluo-msft added a commit to qiluo-msft/sonic-swss that referenced this pull request Oct 23, 2020
qiluo-msft added a commit that referenced this pull request Nov 5, 2020
**What I did**
Revert #570
We should only `flush` the orchagent/syncd communication channel when there is no active tasks in orchagent. This will not influence the end-to-end performance in long run but introduce SELECT_TIMEOUT (1 s) latency if there is remaining data left inside the orchagent/syncd communication channel after previous `flush`, which is not a big deal.

Fix sonic-net/sonic-buildimage#5570
daall pushed a commit to daall/sonic-swss that referenced this pull request Dec 7, 2020
sonic-net#1478)

**What I did**
Revert sonic-net#570
We should only `flush` the orchagent/syncd communication channel when there is no active tasks in orchagent. This will not influence the end-to-end performance in long run but introduce SELECT_TIMEOUT (1 s) latency if there is remaining data left inside the orchagent/syncd communication channel after previous `flush`, which is not a big deal.

Fix sonic-net/sonic-buildimage#5570
oleksandrivantsiv pushed a commit to oleksandrivantsiv/sonic-swss that referenced this pull request Mar 1, 2023
* [syncd] Fix RPC compilation issues
* Add missing rpc link for vssyncd
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants