diff --git a/docs/articles/casestudy.html b/docs/articles/casestudy.html
index 8959c1b..1192d72 100644
--- a/docs/articles/casestudy.html
+++ b/docs/articles/casestudy.html
@@ -213,8 +213,8 @@

 1. Have I used profvis to profile my code and determine what’s actually taking so long? (Human intuition is a notoriously bad profiler!)
 2. Can I perform any calculations, summarizations, and aggregations offline, when my Shiny app isn’t even running, and save the results to .rds files to be read by the app?
-3. Are there any opportunities to cache–that is, save the results of my calculations and use them if I get the same request later? (See memoise, or roll your own.)
-4. Am I effectively leveraging reactive programming to make sure my reactives are doing as little work as possible?
+3. Are there any opportunities to cache–that is, save the results of my calculations and use them if I get the same request later? (See memoise, or roll your own.)
+4. Am I effectively leveraging reactive programming to make sure my reactives are doing as little work as possible?
 5. When deploying my app, am I load balancing across multiple R processes and/or multiple servers? (Shiny Server Pro, RStudio Connect, ShinyApps.io)

These options are more generally useful than using async techniques because they can dramatically speed up the performance of an app even if only a single user is using it. While it obviously depends on the particulars of the app itself, a few lines of precomputation or caching logic can often lead to 10X-100X better performance. Async, on the other hand, generally doesn’t help make a single session faster. Instead, it helps a single Shiny process support more concurrent sessions without getting bogged down.
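To make the caching item on the checklist concrete, here is a minimal sketch using memoise; `expensive_summary()` is a hypothetical stand-in for whatever slow computation your app repeats with the same inputs:

```r
library(memoise)

# Hypothetical slow computation, often called with the same argument.
expensive_summary <- function(date) {
  Sys.sleep(2)                       # simulate heavy work
  data.frame(date = date, total = 42)
}

# memoise() wraps the function so repeated calls with the same
# arguments return the saved result instead of recomputing it.
cached_summary <- memoise(expensive_summary)

cached_summary("2018-01-01")   # slow: computed, then cached
cached_summary("2018-01-01")   # fast: served from the cache
```

The same wrapper works inside a Shiny server function, where several sessions frequently request identical summaries.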

@@ -537,7 +537,7 @@

Load testing with Shiny (coming soon)

-At the time of this writing, we are working on a suite of load testing tools for Shiny that is not publicly available yet, but was previewed by Sean Lopp during his epic rstudio::conf 2018 talk about running a Shiny load test with 10,000 simulated concurrent users.
+At the time of this writing, we are working on a suite of load testing tools for Shiny that is not publicly available yet, but was previewed by Sean Lopp during his epic rstudio::conf 2018 talk about running a Shiny load test with 10,000 simulated concurrent users.

You use these tools to easily record yourself using your Shiny app, which creates a test script; then play back that test script, but multiplied by dozens/hundreds/thousands of simulated concurrent users; and finally, analyze the timing data generated during the playback step to see what kind of latency the simulated users experienced.
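For reference, the record/play back/analyze workflow described above was later released as the shinyloadtest package plus the shinycannon command-line tool; a sketch of how it looks in practice (the app URL, run directory, and user count here are placeholders):

```r
library(shinyloadtest)

# 1. Record: opens a proxy to the running app; interact with it in the
#    browser, then close the tab to write recording.log.
record_session("http://localhost:3838/cranwhales/")

# 2. Play back (from a terminal): shinycannon replays the recording,
#    multiplied across many simulated concurrent users:
#      shinycannon recording.log http://localhost:3838/cranwhales/ \
#        --workers 50 --output-dir run1

# 3. Analyze: load the timing data and generate an HTML latency report.
df <- load_runs("50 users" = "run1")
shinyloadtest_report(df, "report.html")
```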

To examine the effects of my async refactor, I recorded a simple test script by loading up the app, waiting for the first tab to appear, then clicking through each of the other tabs, pausing for several seconds each time before moving on to the next. When using the app without any other visitors, the homepage fully loads in less than a second, and the initial loading of data and rendering of the plot on the default tab takes about 7 seconds. After that, each tab takes no more than a couple of seconds to load. Overall, the entire test script, including time where the user is thinking, takes about 40 seconds under ideal settings (i.e. only a single concurrent user).

I then used this test script to generate load against the Shiny app running in my local RStudio. With the settings I chose, the playback tool introduced one new “user” session every 5 seconds, until 50 sessions total had been launched; then it waited until all the sessions were complete. I ran this test on both the sync and async versions in turn, which generated the following results.

@@ -567,8 +567,8 @@

Take a look at the code diff for async vs. async2. While the code has not changed very dramatically, it has lost a little elegance and maintainability: the code for each of the affected outputs now has one foot in the render function and one foot in the future. If your app’s total audience is a team of a hundred analysts and execs, you may choose to forgo the extra performance and stick with the original async (or even sync) code. But if you have serious scaling needs, the refactoring is probably a small price to pay.
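The "one foot in each" shape looks roughly like this; `expensive_plot_data()` is a hypothetical helper, and the key constraint is that reactive values must be read in the render function before the work is handed off to the future:

```r
library(shiny)
library(future)
library(promises)
plan(multisession)

output$downloads_plot <- renderPlot({
  # Foot 1: reactive reads must happen here, in the render function,
  # on the main R process...
  day <- input$date

  # Foot 2: ...while the heavy lifting runs inside the future, off the
  # main process, returning a promise that Shiny knows how to render.
  future({ expensive_plot_data(day) }) %...>%
    plot()
})
```

Splitting each output this way is what buys the extra concurrency, at the cost of the reads and the computation no longer living in one expression.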

-Let’s get real for a second, though. If this weren’t an example app written for exposition purposes, but a real production app that was intended to scale to thousands of concurrent users across dozens of R processes, we wouldn’t download and parse CSV files on the fly. Instead, we’d establish a proper ETL procedure to run every night and put the results into a properly indexed database table, or RDS files with just the data we need. As I said earlier, a little precomputation and caching can make a huge difference!
-Much of the remaining latency for the async2 branch is from ggplot2 plotting. Sean’s talk alluded to some upcoming plot caching features we’re adding to Shiny, and I imagine they will have as dramatic an effect for this test as they did for Sean.
+Let’s get real for a second, though. If this weren’t an example app written for exposition purposes, but a real production app that was intended to scale to thousands of concurrent users across dozens of R processes, we wouldn’t download and parse CSV files on the fly. Instead, we’d establish a proper ETL procedure to run every night and put the results into a properly indexed database table, or RDS files with just the data we need. As I said earlier, a little precomputation and caching can make a huge difference!
+Much of the remaining latency for the async2 branch is from ggplot2 plotting. Sean’s talk alluded to some upcoming plot caching features we’re adding to Shiny, and I imagine they will have as dramatic an effect for this test as they did for Sean.
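Those plot caching features subsequently shipped in Shiny as `renderCachedPlot()`; a minimal sketch, assuming (hypothetically) that the plot depends only on a `downloads()` reactive keyed by the selected date:

```r
library(shiny)
library(ggplot2)

output$downloads_plot <- renderCachedPlot({
  # Potentially expensive ggplot2 code runs only on a cache miss...
  ggplot(downloads(), aes(time, count)) +
    geom_line()
},
  # ...and the rendered plot is reused across sessions whenever the
  # cache key matches.
  cacheKeyExpr = list(input$date)
)
```

Because the cache is shared across sessions, many simulated users requesting the same date would pay the ggplot2 rendering cost only once.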

diff --git a/vignettes/casestudy.Rmd b/vignettes/casestudy.Rmd
index 6e4988d..95aaeba 100644
--- a/vignettes/casestudy.Rmd
+++ b/vignettes/casestudy.Rmd
@@ -133,8 +133,8 @@ While this article is specifically about async, this is a good time to remind yo
 1. Have I used [profvis](https://rstudio.github.io/profvis/) to **profile my code** and determine what's actually taking so long? (Human intuition is a notoriously bad profiler!)
 2. Can I perform any **calculations, summarizations, and aggregations offline**, when my Shiny app isn't even running, and save the results to .rds files to be read by the app?
-3. Are there any opportunities to **cache**--that is, save the results of my calculations and use them if I get the same request later? (See [memoise](https://cran.r-project.org/web/packages/memoise/index.html), or roll your own.)
-4. Am I effectively leveraging [reactive programming](https://rstudio.com/resources/videos/effective-reactive-programming/) to make sure my reactives are doing as little work as possible?
+3. Are there any opportunities to **cache**--that is, save the results of my calculations and use them if I get the same request later? (See [memoise](https://cran.r-project.org/package=memoise), or roll your own.)
+4. Am I effectively leveraging [reactive programming](https://resources.rstudio.com/shiny-developer-conference/shinydevcon-reactivity-joecheng-part-1-1080p) to make sure my reactives are doing as little work as possible?
 5. When deploying my app, am I load balancing across multiple R processes and/or multiple servers? ([Shiny Server Pro](http://docs.rstudio.com/shiny-server/#utilization-scheduler), [RStudio Connect](http://docs.rstudio.com/connect/admin/appendix-configuration.html#appendix-configuration-scheduler), [ShinyApps.io](https://shiny.rstudio.com/articles/scaling-and-tuning.html))
 
 These options are more generally useful than using async techniques because they can dramatically speed up the performance of an app even if only a single user is using it. While it obviously depends on the particulars of the app itself, a few lines of precomputation or caching logic can often lead to 10X-100X better performance. Async, on the other hand, generally doesn't help make a single session faster. Instead, it helps a single Shiny process support more concurrent sessions without getting bogged down.
@@ -555,7 +555,7 @@ It was a fair amount of work to do the sync-to-async conversion. Now we'd like t
 
 ### Load testing with Shiny (coming soon)
 
-At the time of this writing, we are working on a suite of load testing tools for Shiny that is not publicly available yet, but was previewed by Sean Lopp during his [epic rstudio::conf 2018 talk](https://rstudio.com/resources/videos/scaling-shiny/) about running a Shiny load test with 10,000 simulated concurrent users.
+At the time of this writing, we are working on a suite of load testing tools for Shiny that is not publicly available yet, but was previewed by Sean Lopp during his [epic rstudio::conf 2018 talk](https://resources.rstudio.com/shiny-2/scaling-shiny-sean-lopp) about running a Shiny load test with 10,000 simulated concurrent users.
 
 You use these tools to easily **record** yourself using your Shiny app, which creates a test script; then **play back** that test script, but multiplied by dozens/hundreds/thousands of simulated concurrent users; and finally, **analyze** the timing data generated during the playback step to see what kind of latency the simulated users experienced.
@@ -599,9 +599,9 @@ To get a visceral sense for what it feels like to use the app under load, here's
 
 Take a look at the [code diff for async vs. async2](https://github.com/rstudio/cranwhales/compare/async...async2?diff=split). While the code has not changed very dramatically, it has lost a little elegance and maintainability: the code for each of the affected outputs now has one foot in the render function and one foot in the future. If your app's total audience is a team of a hundred analysts and execs, you may choose to forgo the extra performance and stick with the original async (or even sync) code. But if you have serious scaling needs, the refactoring is probably a small price to pay.
 
-Let's get real for a second, though. If this weren't an example app written for exposition purposes, but a real production app that was intended to scale to thousands of concurrent users across dozens of R processes, we wouldn't download and parse CSV files on the fly. Instead, we'd establish a proper [ETL procedure](https://solutions.rstudio.com/twitter_etl/) to run every night and put the results into a properly indexed database table, or RDS files with just the data we need. As I said [earlier](#improving-performance-and-scalability), a little precomputation and caching can make a huge difference!
+Let's get real for a second, though. If this weren't an example app written for exposition purposes, but a real production app that was intended to scale to thousands of concurrent users across dozens of R processes, we wouldn't download and parse CSV files on the fly. Instead, we'd establish a proper [ETL procedure](https://solutions.rstudio.com/examples/apps/twitter-etl/) to run every night and put the results into a properly indexed database table, or RDS files with just the data we need. As I said [earlier](#improving-performance-and-scalability), a little precomputation and caching can make a huge difference!
 
-Much of the remaining latency for the async2 branch is from ggplot2 plotting. [Sean's talk](https://rstudio.com/resources/videos/scaling-shiny/) alluded to some upcoming plot caching features we're adding to Shiny, and I imagine they will have as dramatic an effect for this test as they did for Sean.
+Much of the remaining latency for the async2 branch is from ggplot2 plotting. [Sean's talk](https://resources.rstudio.com/shiny-2/scaling-shiny-sean-lopp) alluded to some upcoming plot caching features we're adding to Shiny, and I imagine they will have as dramatic an effect for this test as they did for Sean.
 
 ## Summing up