-
Notifications
You must be signed in to change notification settings - Fork 45
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add Compression and File utilities for Zip and GZ handling #22
Conversation
I think I want to add one or two more options for the utilities. give me a min |
cd6228b
to
253e643
Compare
Added more |
253e643
to
309955e
Compare
get --> build in the docs shouldn't be a change. Will fix |
309955e
to
3da9158
Compare
I'm adding a few more cases, closing until they are in |
3da9158
to
280fe0a
Compare
83437d3
to
994a976
Compare
} | ||
} | ||
|
||
public static InputStream fix7036144(final InputStream in) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can we call this something else and reference the bug?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Bug is referenced below, but I'll put it in the method comments and change the name.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed.
It was asked which parts are new. A very large part is new. Here's the prior CompressionUtil which did not really have retries or data consistency checks. FileUtils is new |
b676362
to
95776df
Compare
1ea906d
to
15151b0
Compare
import java.util.zip.ZipInputStream; | ||
import java.util.zip.ZipOutputStream; | ||
|
||
public class CompressionUtils |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think all the functions in this class need at least a little bit of documentation, especially around what their return values mean, when they will or won't throw exceptions, and what the retry behavior is.
Also, do we actually need all of these? More functions are more convenient but also more code to test and maintain.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed
8a22f18
to
406a579
Compare
} | ||
|
||
// Use unzip(ByteStream, File) if possible | ||
public static FileUtils.FileCopyResult unzip(InputStream in, File outDir) throws IOException |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this one's missing javadocs for some reason
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For some of the "older" functions I did not add java docs, but rather copied them directly from druid-api. I can add for these though
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it makes sense to document those as well, since they are new to java-util. Otherwise, we would just leave them in druid-api. We also made some changes to the return values.
406a579
to
50f7642
Compare
@@ -35,32 +39,36 @@ | |||
// The default buffer size to use (from IOUtils) | |||
private static final int DEFAULT_BUFFER_SIZE = 1024 * 4; | |||
|
|||
public static void copyToFileAndClose(InputStream is, File file) throws IOException | |||
// It is highly advised to use FileUtils.retryCopy whenever possible, and not use a raw `InputStream` | |||
// This should only be used if there is absolutely no way to replay the InputStream |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can we make those javadocs as well?
50f7642
to
7f3239d
Compare
7f3239d
to
368f7c3
Compare
Add Compression and File utilities for Zip and GZ handling
* Requires druid-io/druid-api#37 * Requires metamx/java-util#22 * Moves the puller logic to use a more standard workflow going through java-util helpers instead of re-writing the handlers for each impl * General workflow goes like this: 1) LoadSpec makes sure the correct Puller is called with the correct parameters. 2) The Puller sets up general information like how to make an InputStream, how to find a file name (for .gz files for example), and when to retry. 3) CompressionUtils does most of the heavy lifting when it can
* Requires druid-io/druid-api#37 * Requires metamx/java-util#22 * Moves the puller logic to use a more standard workflow going through java-util helpers instead of re-writing the handlers for each impl * General workflow goes like this: 1) LoadSpec makes sure the correct Puller is called with the correct parameters. 2) The Puller sets up general information like how to make an InputStream, how to find a file name (for .gz files for example), and when to retry. 3) CompressionUtils does most of the heavy lifting when it can
I moved these from druid-api and added / expanded the functionality.