New blog: Midterm report for Static-Python (SoR) (#645)

* mrigankpawagi: updated profile description
* mrigankpawagi: first blog for SoR24 - project introduction
* Update index.md
* new blog: midterm report for mrigankpawagi
* fixed typos

---------

Co-authored-by: Carlos Maltzahn <carlosm@ucsc.edu>
ucsc-ospo · Sep 16, 2024 · 7e4f93f
1 parent ab40073
commit 7e4f93f

Showing 2 changed files with 43 additions and 0 deletions.
diff --git a/content/report/osre24/uutah/static-python-perf/20240909-mrigankpawagi/featured.png b/content/report/osre24/uutah/static-python-perf/20240909-mrigankpawagi/featured.png
diff --git a/content/report/osre24/uutah/static-python-perf/20240909-mrigankpawagi/index.md b/content/report/osre24/uutah/static-python-perf/20240909-mrigankpawagi/index.md
@@ -0,0 +1,43 @@
+---
+title: "Midterm Report: Deriving Realistic Performance Benchmarks for Python Interpreters"
+# subtitle: "YOUR SUBTITLE (OPTIONAL)"
+summary:
+authors: [mrigankpawagi]
+#   - USERNAME1
+#   - USERNAME2
+tags: ["osre24", reproducibility, python, "performance benchmarks", "load testing"]
+categories: []
+date: 2024-08-17
+lastmod: 2024-08-17
+featured: false
+draft: false
+
+# Featured image
+# To use, add an image named `featured.jpg/png` to your page's folder.
+# Focal points: Smart, Center, TopLeft, Top, TopRight, Left, Right, BottomLeft, Bottom, BottomRight.
+image:
+  caption: "Snapshot from a run of load testing with Locust"
+  focal_point: "Center"
+  preview_only: false
+---
+Hi, I am Mrigank. As a _Summer of Reproducibility 2024_ fellow, I am working on [deriving realistic performance benchmarks for Python interpreters](/report/osre24/uutah/static-python-perf/20240817-mrigankpawagi/) with {{% mention bennn %}} from the University of Utah. In this post, I will provide an update on the progress we have made so far.
+
+## Creating a Performance Benchmark
+
+We are currently focusing on applications built on top of Django, a widely used Python web framework. For our first benchmark, we chose [Wagtail](https://github.com/wagtail/wagtail), a popular content management system. We created a pipeline with Locust to simulate real-world load on the application. All of our work is open source and available in our [GitHub repository](https://github.com/utahplt/static-python-perf/blob/main/Benchmark/wagtail/locustfile.py).
+
+This load-testing pipeline creates hundreds of users who independently create many blog posts on a Wagtail blog site. At the same time, thousands of users are spawned to view these blog posts. Wagtail does not have a built-in API for creating content, so it took some initial effort to figure out which endpoints to hit. I found them by inspecting the network logs in the browser while interacting with the Wagtail admin interface.
+
+A snapshot from a run of the load test with Locust is shown in the featured image above. This snapshot was generated by spawning users from 24 parallel Locust processes. The run was performed on a local server, and we plan to repeat the same experiments on CloudLab soon.
+
+## Profiling
+
+When we ran the load tests under a profiler, we found that the performance bottlenecks came from the Django codebase rather than from Wagtail. In particular, we identified three Django modules that consumed the most time during the load tests: `django.db.backends.sqlite3._functions`, `django.utils.functional`, and `django.views.debug`. [Dibri](https://github.com/dibrinsofor), a graduate student in Ben's lab, is helping us add types to these modules.
+
+## Next Steps
+
+Based on these findings, we are now working on typing these modules to see whether Static Python can improve the performance of the application. Typing Django is a non-trivial task, and while there have been some efforts to do so, previous attempts such as [django-stubs](https://github.com/typeddjango/django-stubs) are incomplete for our purpose.
+
+We are also writing scripts to mix untyped, shallow-typed, and advanced-typed versions of a Python file, and run each mixed version several times to obtain a narrow confidence interval for the performance of each version.
+
+We will be posting more updates as we make progress. Thank you for reading!
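
As an illustration of the repeated-run measurement mentioned under Next Steps, here is a minimal sketch of how a script might time a workload until the confidence interval for its mean running time is narrow enough. It is not the project's actual script: the function names (`time_once`, `confidence_interval`, `measure`), the 5% relative-width stopping rule, and the normal approximation are all assumptions made for this example.

```python
import statistics
import time
from typing import Callable, List, Sequence, Tuple


def time_once(workload: Callable[[], object]) -> float:
    """Return the wall-clock seconds taken by one call to `workload`."""
    start = time.perf_counter()
    workload()
    return time.perf_counter() - start


def confidence_interval(
    samples: Sequence[float], level: float = 0.95
) -> Tuple[float, float]:
    """Normal-approximation confidence interval for the mean of `samples`.

    Assumption: the sample mean is roughly normally distributed. For very
    small sample counts a t-distribution would give a wider interval.
    """
    if len(samples) < 2:
        raise ValueError("need at least two samples")
    mean = statistics.fmean(samples)
    stderr = statistics.stdev(samples) / len(samples) ** 0.5
    z = statistics.NormalDist().inv_cdf(0.5 + level / 2)
    return mean - z * stderr, mean + z * stderr


def measure(
    workload: Callable[[], object],
    min_runs: int = 10,
    max_runs: int = 200,
    rel_width: float = 0.05,  # hypothetical target: interval within 5% of the mean
) -> List[float]:
    """Repeat `workload` until the interval is narrow or `max_runs` is reached."""
    samples: List[float] = []
    for _ in range(max_runs):
        samples.append(time_once(workload))
        if len(samples) >= min_runs:
            low, high = confidence_interval(samples)
            mean = statistics.fmean(samples)
            if mean > 0 and (high - low) <= rel_width * mean:
                break
    return samples


if __name__ == "__main__":
    # Placeholder workload; a real run would exercise one mixed-typing variant.
    timings = measure(lambda: sum(range(100_000)))
    low, high = confidence_interval(timings)
    print(f"{len(timings)} runs, 95% CI for the mean: [{low:.6f}s, {high:.6f}s]")
```

Stopping on relative interval width rather than a fixed run count lets noisy variants get more repetitions than stable ones, at the cost of unequal sample sizes across variants.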