Increase benchmark iters for Android jobs #6297

guangy10 · 2024-10-16T18:21:38Z

Per the test here, the run-to-run variance of the data is still very large on Android.

Tested several runs from different commits:

1k iter (2336f0d): https://github.com/pytorch/executorch/actions/runs/11375393544
1k iter (529c161): https://github.com/pytorch/executorch/actions/runs/11376915381
2k iter (7723492): https://github.com/pytorch/executorch/actions/runs/11379065750
2k iter (df0db16): https://github.com/pytorch/executorch/actions/runs/11379252072

Metrics Comparison on the dashboard:

https://hud.pytorch.org/benchmark/llms?startTime=Thu%2C%2010%20Oct%202024%2005%3A31%3A43%20GMT&stopTime=Thu%2C%2017%20Oct%202024%2005%3A31%3A43%20GMT&granularity=hour&lBranch=increase_benchmark_iter&lCommit=529c1619ef609aeda14f1724d20733b5c57dbbfc&rBranch=increase_benchmark_iter&rCommit=2336f0d4a1046c5a37861623ac19cf075109a852&repoName=pytorch%2Fexecutorch&modelName=All%20Models&dtypeName=All%20DType&deviceName=All%20Devices

pytorch-bot · 2024-10-16T18:21:41Z

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/executorch/6297

📄 Preview Python docs built from this PR

Note: Links to docs will display an error until the docs builds have been completed.

✅ No Failures

As of commit df0db16 with merge base 5e44991:
💚 Looks good so far! There are no failures yet. 💚

This comment was automatically generated by Dr. CI and updates every 15 minutes.

guangy10 · 2024-10-16T18:23:09Z

@huydhn I scheduled 4 runs from different commits and ensured all 4 runs finished successfully. However, the dashboard only shows 2 commits (expected 4).

huydhn · 2024-10-16T23:11:27Z

Hmm, I'm seeing 4 commits on your branch?

Also, GitHub data from around 11AM to 4PM today was lost due to the HUD outage. I think I'm seeing the same thing on my test branch.

As the issue is mitigated now, you probably need to schedule a new run (or just rerun the missing ones, I think).

guangy10 · 2024-10-16T23:17:56Z

> Also, GitHub data from around 11AM to 4PM today was lost due to the HUD outage. I think I'm seeing the same thing on my test branch.

I expected 4 points from today. The missing ones are maybe due to the HUD outage.

guangy10 · 2024-10-16T23:32:55Z

@kirklandsign It's surprising that the total time in the Running state varies so much. Take dl3 for example: in job_1 (link) it's in the Running state (executing 100 iterations) for only 3 mins, in job_2 (link) for 12 mins, and in job_3 (link) for 1 min.

The 12-min job seems to be an outlier (maybe due to context switching during execution?), and its data is not shown in the dashboard. Its execution latency is 881.29 ms, far from 752.77 ms and 712.32 ms.
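To put a number on "far from", here is a minimal sketch that computes the relative difference between the latencies quoted above. The class and method names are made up for illustration, and using 712.32 ms as the baseline is an arbitrary choice, not something the dashboard is known to do.

```java
// Hypothetical helper, not part of the benchmark app or the dashboard.
final class RunToRunSpread {
    // Relative change of `current` against `baseline`, in percent.
    static double percentChange(double baseline, double current) {
        return (current - baseline) / baseline * 100.0;
    }

    public static void main(String[] args) {
        double baselineMs = 712.32;          // latency quoted in this thread
        double[] othersMs = {752.77, 881.29}; // the other two quoted latencies
        for (double ms : othersMs) {
            System.out.println(ms + " ms vs " + baselineMs + " ms: "
                    + percentChange(baselineMs, ms) + " %");
        }
    }
}
```

By this measure the two ordinary runs differ by a few percent, while the 12-min job is more than 20% away from the fastest run.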

1-3 mins is expected: 752.77 ms x 100 / 60000 = 1.25 mins for execution, plus an additional 1-2 mins of overhead for setup and teardown.
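The same arithmetic, applied to all three latencies from this thread, as a small sketch. The class and method names are hypothetical, and the only assumption is that execution time is simply latency times iteration count.

```java
// Hypothetical helper for the back-of-the-envelope estimate above.
final class RunTimeEstimate {
    // Minutes spent executing `iterations` runs at `latencyMs` each.
    static double executionMinutes(double latencyMs, int iterations) {
        return latencyMs * iterations / 60_000.0;
    }

    public static void main(String[] args) {
        double[] latenciesMs = {752.77, 712.32, 881.29}; // values quoted in this thread
        for (double latencyMs : latenciesMs) {
            System.out.println(latencyMs + " ms x 100 iters = "
                    + executionMinutes(latencyMs, 100) + " min");
        }
    }
}
```

Even at 881.29 ms per iteration, 100 iterations account for only about 1.5 mins, so most of the 12 mins in the outlier job was not spent executing the model.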

kirklandsign · 2024-10-16T23:38:55Z

> In the 12-min job (seems to be an outlier, maybe due to context switching during execution??), the data is not shown in the dashboard, its execution latency is 881.29ms, far different from 752.77 and 712.32.

Good point. I also see something wrong in the log:

2024-10-16T19:08:35.7452548Z Starting: Intent { cmp=org.pytorch.minibench/.BenchmarkActivity (has extras) }
2024-10-16T19:08:35.7453249Z Status: timeout
2024-10-16T19:08:35.7453625Z LaunchState: UNKNOWN (-1)
2024-10-16T19:08:35.7454097Z Activity: org.pytorch.minibench/.BenchmarkActivity
2024-10-16T19:08:35.7454617Z WaitTime: 10129

guangy10 · 2024-10-16T23:58:02Z

> Good point. I also see something wrong in the log
>
> 2024-10-16T19:08:35.7452548Z Starting: Intent { cmp=org.pytorch.minibench/.BenchmarkActivity (has extras) }
> 2024-10-16T19:08:35.7453249Z Status: timeout
> 2024-10-16T19:08:35.7453625Z LaunchState: UNKNOWN (-1)
> 2024-10-16T19:08:35.7454097Z Activity: org.pytorch.minibench/.BenchmarkActivity
> 2024-10-16T19:08:35.7454617Z WaitTime: 10129

That's another issue. Actually, all jobs report that for dl3 xnnpack.
What does that error mean? Why "Status: timeout" even when the job runs for only 1 or 3 mins? If this is a state we should track in the benchmark result to exclude bad results, we should add it.
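If we do decide to track it, a rough sketch of pulling the status out of log lines shaped like the ones quoted above could look like this. `LaunchLogParser` and its methods are hypothetical names, and the `Key: value` layout with an optional CI timestamp prefix is assumed from this one log excerpt only.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical parser; the log format is inferred from the excerpt in this thread.
final class LaunchLogParser {
    // Collects "Key: value" pairs, dropping a leading CI timestamp if present.
    static Map<String, String> parse(String log) {
        Map<String, String> fields = new LinkedHashMap<>();
        for (String line : log.split("\\R")) {
            // Strip prefixes such as "2024-10-16T19:08:35.7453249Z ".
            String body = line.replaceFirst("^\\d{4}-\\d{2}-\\d{2}T\\S+Z\\s+", "").trim();
            int sep = body.indexOf(": ");
            if (sep > 0) {
                // Keep the first occurrence of each key.
                fields.putIfAbsent(body.substring(0, sep), body.substring(sep + 2));
            }
        }
        return fields;
    }

    // True when the log contains a "Status: timeout" line.
    static boolean launchTimedOut(String log) {
        return "timeout".equals(parse(log).get("Status"));
    }

    public static void main(String[] args) {
        String log = String.join("\n",
                "2024-10-16T19:08:35.7453249Z Status: timeout",
                "2024-10-16T19:08:35.7453625Z LaunchState: UNKNOWN (-1)",
                "2024-10-16T19:08:35.7454617Z WaitTime: 10129");
        System.out.println("timed out: " + launchTimedOut(log));
        System.out.println("fields: " + parse(log));
    }
}
```

Whether a timed-out launch should actually disqualify a result is still an open question; the sketch only shows that the flag is cheap to extract.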

guangy10 · 2024-10-17T05:41:03Z

With 1000 iters, we still end up with many green/red spots. Since it still takes less than 10 mins, we could experiment by bumping iters up to 2k.

kirklandsign · 2024-10-17T06:03:13Z

> That's another issue. All jobs report that actually, for dl3 xnnpack.

Sorry, let me fix the app. The run itself is probably good, but I ran it on the UI thread, which we shouldn't do. Let me use a background thread for it. Try #6320
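For context, the general idea of moving the timed loop off the calling thread can be sketched in plain Java as below. This is not the change in #6320; `BackgroundBenchmark`, `runAsync`, and `runAndWait` are hypothetical names, and `forward` stands in for whatever one model invocation is. In the real app the caller would be the Android main (UI) thread and the result would be posted back rather than waited on.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical sketch; not the code in the benchmark app.
final class BackgroundBenchmark {
    // Submits the timed loop to `executor` so the calling thread stays free.
    // The future yields the average latency per iteration in milliseconds.
    static Future<Double> runAsync(ExecutorService executor, Runnable forward, int numIter) {
        Callable<Double> task = () -> {
            long startNs = System.nanoTime();
            for (int i = 0; i < numIter; i++) {
                forward.run();
            }
            double avgMs = (System.nanoTime() - startNs) / 1e6 / numIter;
            return avgMs;
        };
        return executor.submit(task);
    }

    // Demo-only helper that blocks until the worker finishes.
    static double runAndWait(Runnable forward, int numIter) {
        ExecutorService executor = Executors.newSingleThreadExecutor();
        try {
            return runAsync(executor, forward, numIter).get();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } catch (ExecutionException e) {
            throw new IllegalStateException(e.getCause());
        } finally {
            executor.shutdown();
        }
    }

    public static void main(String[] args) {
        // Math.sqrt is a placeholder workload, not a model call.
        double avgMs = runAndWait(() -> Math.sqrt(12345.678), 2000);
        System.out.println("avg latency (ms): " + avgMs);
    }
}
```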

kirklandsign · 2024-10-17T23:13:29Z

...n/benchmark/android/benchmark/app/src/main/java/org/pytorch/minibench/BenchmarkActivity.java

@@ -42,7 +42,7 @@ protected void onCreate(Bundle savedInstanceState) {
            .findFirst()
            .get();

-    int numIter = intent.getIntExtra("num_iter", 50);
+    int numIter = intent.getIntExtra("num_iter", 2000);


Honestly, when I tried locally, 100 was already stable :p

facebook-github-bot added the CLA Signed label Oct 16, 2024

guangy10 force-pushed the increase_benchmark_iter branch 3 times, most recently from b962c9b to 4855002 October 16, 2024 18:35

guangy10 had a problem deploying to upload-benchmark-results October 16, 2024 18:58 — with GitHub Actions Failure

guangy10 temporarily deployed to upload-benchmark-results October 16, 2024 20:33 — with GitHub Actions Inactive

guangy10 had a problem deploying to upload-benchmark-results October 16, 2024 21:13 — with GitHub Actions Failure

guangy10 temporarily deployed to upload-benchmark-results October 16, 2024 21:31 — with GitHub Actions Inactive

kirklandsign approved these changes Oct 16, 2024


guangy10 force-pushed the increase_benchmark_iter branch from 4855002 to 2336f0d October 16, 2024 23:20

guangy10 marked this pull request as ready for review October 16, 2024 23:33

guangy10 force-pushed the increase_benchmark_iter branch from 2336f0d to 47a8aed October 16, 2024 23:41

guangy10 temporarily deployed to upload-benchmark-results October 17, 2024 00:19 — with GitHub Actions Inactive

guangy10 force-pushed the increase_benchmark_iter branch 3 times, most recently from 11a15cf to 529c161 October 17, 2024 01:15

guangy10 temporarily deployed to upload-benchmark-results October 17, 2024 02:33 — with GitHub Actions Inactive

guangy10 force-pushed the increase_benchmark_iter branch from 529c161 to 7723492 October 17, 2024 05:41

Increase benchmark iters for Android jobs

df0db16

guangy10 force-pushed the increase_benchmark_iter branch from 7723492 to df0db16 October 17, 2024 06:01

guangy10 temporarily deployed to upload-benchmark-results October 17, 2024 06:38 — with GitHub Actions Inactive

kirklandsign reviewed Oct 17, 2024
