
Align output to buffer size in rcfile writer #9001

Merged: 1 commit into prestodb:master on Sep 19, 2017

Conversation

@highker (Contributor) commented Sep 19, 2017

Flushing data in arbitrary sizes can cause memory fragmentation. This is generally fine unless the output stream allocates native memory (e.g., gzip streams). A native memory allocator may not be able to compact fragmented allocations, which can cause a native memory OOM.

Resolves #8993
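The idea behind the fix can be sketched as follows: hand the (possibly native) output stream only writes of exactly one chunk size, and keep the sub-chunk remainder buffered. The class, the `CHUNK_SIZE` constant value, and the method names below are illustrative, not the actual rcfile writer API:

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;

// Sketch of chunk-aligned flushing: write full CHUNK_SIZE blocks straight to
// the stream, and leave only the remainder for the caller to buffer.
public class AlignedFlushSketch
{
    private static final int CHUNK_SIZE = 4096;

    public static int flushAligned(OutputStream out, byte[] source, int sourceIndex, int length)
            throws IOException
    {
        // flush whole chunks directly; every write handed to the underlying
        // stream is exactly CHUNK_SIZE bytes, so allocation sizes are uniform
        while (length >= CHUNK_SIZE) {
            out.write(source, sourceIndex, CHUNK_SIZE);
            sourceIndex += CHUNK_SIZE;
            length -= CHUNK_SIZE;
        }
        // the remaining (length < CHUNK_SIZE) bytes stay in the write buffer
        return length;
    }

    public static void main(String[] args)
            throws IOException
    {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        int remainder = flushAligned(out, new byte[10000], 0, 10000);
        // 10000 = 2 * 4096 + 1808, so 8192 bytes are flushed and 1808 remain
        System.out.println(out.size() + " flushed, " + remainder + " buffered");
    }
}
```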

length -= flushLength;
}

// line up the chuck to buffer size and flush directly to OutputStream

Typo

@highker highker force-pushed the rcfile branch 2 times, most recently from a188bec to 887e3a9 Compare September 19, 2017 18:35

// line up the chunk to chunk size and flush directly to OutputStream
int flushLength = length - length % CHUNK_SIZE;
for (int i = 0; i < flushLength / CHUNK_SIZE; i++) {

I think it would be simpler to do:

while (length >= CHUNK_SIZE) {
    writeToOutputStream(source, sourceIndex, CHUNK_SIZE);
    sourceIndex += CHUNK_SIZE;
    length -= CHUNK_SIZE;
}

then after the loop you don't need to touch length or bufferOffset.
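The two forms compute the same number of flushed bytes. A small self-contained sketch (hypothetical class; the loop condition written as `length >= CHUNK_SIZE` so it terminates) comparing the modulo arithmetic from the diff with the loop form suggested above:

```java
public class LoopEquivalenceSketch
{
    private static final int CHUNK_SIZE = 4096;

    // arithmetic form used in the diff: round length down to a chunk multiple
    public static int flushLengthByModulo(int length)
    {
        return length - length % CHUNK_SIZE;
    }

    // loop form from the review suggestion: count whole chunks one at a time
    public static int flushLengthByLoop(int length)
    {
        int flushed = 0;
        while (length >= CHUNK_SIZE) {
            flushed += CHUNK_SIZE;
            length -= CHUNK_SIZE;
        }
        return flushed;
    }

    public static void main(String[] args)
    {
        for (int length : new int[] {0, 100, 4096, 10000, 65537}) {
            System.out.println(length + " -> "
                    + flushLengthByModulo(length) + " / " + flushLengthByLoop(length));
        }
    }
}
```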

}

// line up the chunk to chunk size and flush directly to OutputStream
int flushLength = length - length % CHUNK_SIZE;

same here

// test some different buffer sizes
for (int bufferSize : new int[] {4096, 4345, 65535, 65536, 65537, 100000}) {
// check byte array version
ByteArrayOutputStream byteOutputStream = new ByteArrayOutputStream(length);

Make a subclass that captures the flush sizes and then validate you don't get flushes > 4k
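The capturing-stream idea from this comment could look something like the following sketch (class name and structure hypothetical; the actual test added to the PR may differ):

```java
import java.io.ByteArrayOutputStream;
import java.util.ArrayList;
import java.util.List;

// Test helper in the spirit of the review comment: a stream that records the
// size of every bulk write, so a test can assert no flush exceeds 4k.
public class FlushCapturingOutputStream
        extends ByteArrayOutputStream
{
    private final List<Integer> flushSizes = new ArrayList<>();

    @Override
    public synchronized void write(byte[] buffer, int offset, int length)
    {
        flushSizes.add(length);
        super.write(buffer, offset, length);
    }

    public List<Integer> getFlushSizes()
    {
        return flushSizes;
    }

    public static void main(String[] args)
    {
        FlushCapturingOutputStream out = new FlushCapturingOutputStream();
        out.write(new byte[4096], 0, 4096);
        out.write(new byte[100], 0, 100);
        // validate that no single flush was larger than 4096 bytes
        boolean allSmall = out.getFlushSizes().stream().allMatch(size -> size <= 4096);
        System.out.println(allSmall);
    }
}
```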


@dain dain left a comment


One comment, but otherwise looks good

}

// buffer the remaining data

wrap this in if (length != 0)

}

// buffer the remaining data

same

@highker highker merged commit 7e0c47d into prestodb:master Sep 19, 2017
@highker highker deleted the rcfile branch September 22, 2017 01:20