Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Memory] Add options to reduce memory consumed by RawSource #155

Merged
merged 6 commits into from
Jul 22, 2024

Conversation

mknichel
Copy link
Contributor

@mknichel mknichel commented May 20, 2024

This PR adds two options to reduce memory used by RawSource: disable dual caching of string and buffer content and intern strings across source objects. These issues were observed while investigating memory usage of Next.js applications where Webpack had duplicate copies of strings/buffers in memory.

These options are optional so that they should not affect existing users but can be enabled for users who wish to reduce memory usage at the potential of a slight compilation time increase.

Disable dual caching of string and buffer contents

Upon creation, RawSource stores either a string or buffer. When requested, it will create the other type and then store it in memory for any future calls:

		if (this._valueAsBuffer === undefined) {
			this._valueAsBuffer = Buffer.from(this._value, "utf-8");
		}
		return this._valueAsBuffer;

This will cause the same content to be stored twice in memory, roughly doubling the memory consumption for RawSource objects (a buffer will be slightly bigger than the string, but not by too much).

This will save any future conversion that has to be done. However, these conversions aren't too expensive CPU wise (Buffer.from runs at 34000-36000 ops/sec on an Apple M1 Pro on a 150kb file), so it makes sense for large applications or memory constrained applications to disable this with only slight increases in compile times.

Intern strings across Raw Sources

Files that are repeated in multiple Webpack layers will be loaded into multiple RawSource objects and repeat the file contents in memory. Next.js App Router uses Webpack layers and therefore we are seeing file contents in next build appear duplicate times.

An option to intern strings is added in this pull request. With string interning, the same reference will be used across multiple RawSources so the memory isn't duplicated. v8 doesn't support string interning for dynamically allocated strings, but this can be implemented in JavaScript with a Map. The caller is responsible for deleting the contents of the string intern cache when it is no longer needed (such as at the end of a compilation).

Impact

I tested next build for 3 different Next.js applications with these changes: nodejs.org, and two private Next.js applications of medium and large sizes.

  • Nodejs.org would save ~20 MB of heap by enabling these optimizations out of ~315MB total used by the compilation and increase compile times by < 1 second
  • Internal medium sized Next.js application would save ~270MB of memory with these optimizations out of ~2.1GB heap total and would increase compile times by ~1-2 seconds.
  • Internal large Next.js application would save ~330MB of memory with these optimizations and would increase compile times by ~10 seconds.

Copy link

linux-foundation-easycla bot commented May 20, 2024

CLA Signed

The committers listed above are authorized under a signed CLA.

Copy link
Member

@alexander-akait alexander-akait left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see, good catch, maybe we should review out codebase and try to only strings when it is possible, it can decrease build time for all too (feel free to do it), also can you accept CLA?

*
* @returns {void}
*/
function disableStringInterning() {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Disable sounds too much like it would stop interning unconditionally and doesn't mention the ref counting.
Maybe call it start/endStringInterningRange or something else

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done. Switched to enter/exitStringInterningRange

ijjk pushed a commit to vercel/next.js that referenced this pull request May 22, 2024
…s in webpack-sources (#66003)

This PR adds a flag to Next.js to enable Webpack options to improve
memory usage. See webpack/webpack-sources#155
for a full description of the changes and impact on memory.

This PR adds a patch to `webpack-sources` temporarily that contains the
fixes as the real changes are iterated on to merge upstream in the
`webpack/webpack-sources` repository. After that is done, the patch will
be reverted and the latest `webpack-sources` version will be updated in
Next.js.
@mknichel
Copy link
Contributor Author

I see, good catch, maybe we should review out codebase and try to only strings when it is possible, it can decrease build time for all too (feel free to do it), also can you accept CLA?

Thank you for the review. I signed the CLA and I applied the optimization to the other Source classes that cached copies of string and buffer. I didn't apply the string interning optimization to those classes since it can increase memory usage for some Source objects if they aren't stored for the entire Compilation so I wanted to limit it to just RawSource right now.

@mknichel mknichel requested review from alexander-akait and sokra May 23, 2024 23:01
@alexander-akait
Copy link
Member

/cc @sokra can you review it?

Copy link

codecov bot commented Jun 7, 2024

Codecov Report

Attention: Patch coverage is 95.45455% with 7 lines in your changes missing coverage. Please review.

Project coverage is 93.14%. Comparing base (19d4e8e) to head (dc41538).

Files Patch % Lines
lib/RawSource.js 83.33% 3 Missing and 1 partial ⚠️
lib/SourceMapSource.js 97.36% 2 Missing ⚠️
lib/CachedSource.js 94.11% 1 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main     #155      +/-   ##
==========================================
- Coverage   93.48%   93.14%   -0.34%     
==========================================
  Files          23       24       +1     
  Lines        1703     1779      +76     
  Branches      539      566      +27     
==========================================
+ Hits         1592     1657      +65     
- Misses        104      112       +8     
- Partials        7       10       +3     
Flag Coverage Δ
integration 93.14% <95.45%> (-0.34%) ⬇️

Flags with carried forward coverage won't be shown. Click here to find out more.

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@alexander-akait
Copy link
Member

/cc @mknichel Hello, What is status?

@mknichel
Copy link
Contributor Author

Hi @alexander-akait, Sorry I have been OOO and I believe @sokra has too. I pinged @sokra to take a look.

lib/RawSource.js Outdated Show resolved Hide resolved
lib/RawSource.js Outdated Show resolved Hide resolved
lib/RawSource.js Outdated
Comment on lines 71 to 83
if (this._value === undefined && isDualStringBufferCachingEnabled()) {
this._value = Buffer.from(this._valueAsBuffer, "utf-8");
}
if (this._valueAsString === undefined) {
this._valueAsString =
let strValue = this._valueAsString;
if (strValue === undefined) {
strValue =
typeof this._value === "string"
? this._value
: this._value.toString("utf-8");
: internString(this._value.toString("utf-8"));
if (isDualStringBufferCachingEnabled()) {
this._valueAsString = strValue;
}
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks like we want to call const strValue = this.source()?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The logic is a bit strange since source() can return a Buffer (when convertToString = false) but it looks a bit cleaner now.

@@ -9,6 +9,9 @@ const splitIntoLines = require("./helpers/splitIntoLines");
const getGeneratedSourceInfo = require("./helpers/getGeneratedSourceInfo");
const Source = require("./Source");
const splitIntoPotentialTokens = require("./helpers/splitIntoPotentialTokens");
const {
isDualStringBufferCachingEnabled
} = require("./helpers/stringBufferUtils");

class OriginalSource extends Source {
constructor(value, name) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we want to internal OriginalSource value too?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I intentionally limited string interning just to RawSource for now since during my experiments, it seemed like memory may increase if applying the interning in other source objects (if it retains memory beyond the lifetime of the source object). I would need to go back and more rigorously determine the behavior for each class to understand where the memory is coming from, but at least for RawSource this was an improvement.

@@ -8,6 +8,9 @@ const Source = require("./Source");
const streamChunksOfSourceMap = require("./helpers/streamChunksOfSourceMap");
const streamChunksOfCombinedSourceMap = require("./helpers/streamChunksOfCombinedSourceMap");
const { getMap, getSourceAndMap } = require("./helpers/getFromStreamChunks");
const {
isDualStringBufferCachingEnabled
} = require("./helpers/stringBufferUtils");

class SourceMapSource extends Source {
constructor(
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we want to internal value and originalSource here?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@sokra sokra merged commit 88d0e04 into webpack:main Jul 22, 2024
19 of 22 checks passed
@alexander-akait
Copy link
Member

@sokra Do we need to make release?

ForsakenHarmony pushed a commit to vercel/next.js that referenced this pull request Aug 15, 2024
…s in webpack-sources (#66003)

This PR adds a flag to Next.js to enable Webpack options to improve
memory usage. See webpack/webpack-sources#155
for a full description of the changes and impact on memory.

This PR adds a patch to `webpack-sources` temporarily that contains the
fixes as the real changes are iterated on to merge upstream in the
`webpack/webpack-sources` repository. After that is done, the patch will
be reverted and the latest `webpack-sources` version will be updated in
Next.js.
ForsakenHarmony pushed a commit to vercel/next.js that referenced this pull request Aug 16, 2024
…s in webpack-sources (#66003)

This PR adds a flag to Next.js to enable Webpack options to improve
memory usage. See webpack/webpack-sources#155
for a full description of the changes and impact on memory.

This PR adds a patch to `webpack-sources` temporarily that contains the
fixes as the real changes are iterated on to merge upstream in the
`webpack/webpack-sources` repository. After that is done, the patch will
be reverted and the latest `webpack-sources` version will be updated in
Next.js.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants