
Use URI scheme instead of FileSystem type to write the correct value in the DB #726

Closed · wants to merge 1 commit

Conversation

davideanastasia (Contributor):

On EMR, the default FileSystem for S3 is com.amazon.ws.emr.hadoop.fs.EmrFileSystem, which is automatically configured with the access key and secret key during cluster startup. However, Druid expects NativeS3FileSystem when writing to S3, and fails on EMR with the following stack trace:

Error: com.metamx.common.ISE: Unknown file system[class com.amazon.ws.emr.hadoop.fs.EmrFileSystem]
	at io.druid.indexer.IndexGeneratorJob$IndexGeneratorReducer.serializeOutIndex(IndexGeneratorJob.java:453)
	at io.druid.indexer.IndexGeneratorJob$IndexGeneratorReducer.reduce(IndexGeneratorJob.java:387)
	at io.druid.indexer.IndexGeneratorJob$IndexGeneratorReducer.reduce(IndexGeneratorJob.java:252)
	at org.apache.hadoop.mapreduce.Reducer.run(Reducer.java:171)
	at org.apache.hadoop.mapred.ReduceTask.runNewReducer(ReduceTask.java:635)
	at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:390)
	at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:167)
	at java.security.AccessController.doPrivileged(Native Method)
	at javax.security.auth.Subject.doAs(Subject.java:415)
	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1548)
	at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:162)

While this can be corrected by passing

-Dhadoop.fs.s3n.impl=org.apache.hadoop.fs.s3native.NativeS3FileSystem

together with several other options to set up the access key and secret key, this patch removes the problem at the source.
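For reference, the same workaround can also be expressed in Hadoop's core-site.xml rather than on the command line (a sketch; the credential values are placeholders, and the fs.s3n.* property names are the standard Hadoop ones, not anything Druid-specific):

```xml
<!-- Force the Hadoop-bundled native S3 filesystem for the s3n:// scheme.
     The credential values below are placeholders. -->
<property>
  <name>fs.s3n.impl</name>
  <value>org.apache.hadoop.fs.s3native.NativeS3FileSystem</value>
</property>
<property>
  <name>fs.s3n.awsAccessKeyId</name>
  <value>YOUR_ACCESS_KEY</value>
</property>
<property>
  <name>fs.s3n.awsSecretAccessKey</name>
  <value>YOUR_SECRET_KEY</value>
</property>
```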

loadSpec = ImmutableMap.<String, Object>of(
    "type", "hdfs",
    "path", indexOutURI.getPath()
);
} else {
-  throw new ISE("Unknown file system[%s]", outputFS.getClass());
+  throw new ISE("Unknown file system[%s] for URI[%s]", outputFS.getClass(), indexOutURI.toString());
Member:

Since we now check the URI scheme, it might be more useful to indicate the unknown scheme, as opposed to the filesystem, in the error message, i.e. "Unknown scheme[%s] for output URI[%s]".

@fjy (Contributor) commented Sep 16, 2014:

Hi @davideanastasia, this is great. We have your personal CLA, do you know if you need to have a corporate one signed? If not, we can merge this I think.

@davideanastasia (Contributor, Author):

Hi, my employer is happy for my patches to go under my personal CLA. Hence, I will keep using my personal email address to do the commits/pull requests. Thanks :)

-if (outputFS instanceof NativeS3FileSystem) {
+String indexOutScheme = indexOutURI.getScheme();
+
+if ((indexOutScheme == null) || (indexOutScheme.compareToIgnoreCase("file") == 0)) {
Member:

Is it always true that a null scheme maps to the local filesystem? Could this depend on what the default Hadoop filesystem is set to? It would be good to double-check the behavior there.

Contributor (Author):

It is null when the string is of the form /home/davide/hadoop. To maintain backwards compatibility, I have to assume that no scheme means the local filesystem.

Further, it shouldn't depend on Hadoop, as the URI class does not use the Configuration class in any way.
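The point about java.net.URI can be checked with plain JDK code, no Hadoop on the classpath (a small sketch; the example paths are illustrative):

```java
import java.net.URI;

public class SchemeCheck {
    public static void main(String[] args) throws Exception {
        // A bare filesystem path parses with no scheme component at all
        URI bare = new URI("/home/davide/hadoop");
        System.out.println(bare.getScheme());    // prints: null

        // An explicit S3 output location carries its scheme
        URI s3 = new URI("s3n://bucket/druid/segments");
        System.out.println(s3.getScheme());      // prints: s3n
    }
}
```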

Contributor:

Hadoop interprets null schemes using fs.defaultFS, which might not be local, so I think this patch will break things for users that expect to provide "/foo/bar" and have the segments actually go to HDFS.

It should work to do FileSystem.get(new Path(schema.getIOConfig().getSegmentOutputPath()), context.getConfiguration()) to get the right fileSystem for the base segment output path, then call config.makeSegmentOutputPath on that, and keep the instanceof checks that were originally in the code.

Contributor (Author):

outputFS instanceof NativeS3FileSystem

is false on AWS EMR even though you want to write to S3.

Contributor:

Agree with gian; for us a null URI scheme means putting segments on HDFS, not local. We never explicitly set "hdfs://.../" in the segmentOutputPath, just "/foo/bar".

@fjy fjy force-pushed the master branch 2 times, most recently from 8b0ec82 to d05032b Compare February 1, 2015 04:57
@gianm (Contributor) commented Apr 27, 2015:

@davideanastasia, would it work to add some OR'ed checks to the s3_zip conditional, saying that we should do that if it's an S3NativeFileSystem or if finalIndexZipFilePath's scheme is "s3" or "s3n"? I think that should fix your problem without breaking the behavior for null schemes.
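A sketch of the proposed OR'ed condition (the helper and its boolean parameter are hypothetical, standing in for the real instanceof check so the snippet runs without Hadoop on the classpath):

```java
public class S3OutputCheck {
    // Hypothetical helper: the first argument stands in for the original
    // "outputFS instanceof NativeS3FileSystem" test.
    static boolean useS3Zip(boolean isNativeS3FileSystem, String scheme) {
        return isNativeS3FileSystem
            || "s3".equalsIgnoreCase(scheme)
            || "s3n".equalsIgnoreCase(scheme);
    }

    public static void main(String[] args) {
        // EMR case: EmrFileSystem fails the instanceof check,
        // but the URI scheme still identifies the output as S3
        System.out.println(useS3Zip(false, "s3n"));  // prints: true
        // A null scheme on a non-S3 filesystem keeps the old behavior
        System.out.println(useS3Zip(false, null));   // prints: false
    }
}
```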

@davideanastasia (Contributor, Author):

@gianm what you are proposing should work. Would you like me to do the change?

@gianm (Contributor) commented Apr 27, 2015:

@davideanastasia, if you could, that would be awesome.

@drcrallen (Contributor):

I have also been pushing for more and better URI support. This patch, which went in recently, started adding more standard URI usage to the code:

#1132

@davideanastasia (Contributor, Author):

@gianm Done! Sorry for the long wait, I have been quite busy in the last week or so at work :(

@nishantmonu51 (Member):

Can you also fix the merge conflicts?

@davideanastasia (Contributor, Author):

@nishantmonu51 wow, I didn't realize this piece of code had changed this way. I have updated my patch to accommodate the new code structure with the fully qualified class name (which I took straight from a valid EMR cluster I currently have live).

@drcrallen (Contributor):

I actually started looking into this code path for other tasks, and I would like to propose using outputFs.getScheme() to determine the type of FS.

@drcrallen (Contributor):

Ideally, omitting the scheme in the URI would then fall back to whatever method Hadoop uses to resolve URIs, and we can handle the known schemes explicitly until Druid has fuller URI support.
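To illustrate the scheme-driven dispatch being discussed, here is a hypothetical sketch; the mapping of schemes to loadSpec types is an assumption pieced together from the types mentioned in this thread ("hdfs", "s3_zip", "local"), not the actual Druid code, and it follows the reviewers' point that a null scheme should defer to fs.defaultFS:

```java
public class SchemeDispatch {
    // Hypothetical scheme -> loadSpec "type" mapping. A null scheme
    // defers to Hadoop's default filesystem resolution (fs.defaultFS),
    // which on the reviewers' clusters is HDFS.
    static String loadSpecType(String scheme) {
        if (scheme == null || scheme.equalsIgnoreCase("hdfs")) {
            return "hdfs";
        }
        if (scheme.equalsIgnoreCase("s3") || scheme.equalsIgnoreCase("s3n")) {
            return "s3_zip";
        }
        if (scheme.equalsIgnoreCase("file")) {
            return "local";
        }
        throw new IllegalArgumentException("Unknown scheme[" + scheme + "]");
    }

    public static void main(String[] args) {
        System.out.println(loadSpecType("s3n"));  // prints: s3_zip
        System.out.println(loadSpecType(null));   // prints: hdfs
    }
}
```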

@gianm (Contributor) commented Jun 24, 2015:

@davideanastasia, sorry for letting this fall by the wayside. The patch looks good and I'd like to merge it. Can you please fill out the Druid CLA at http://druid.io/community/cla.html?

Don't worry about the merge conflicts; I can fix those when merging the patch.

@davideanastasia (Contributor, Author):

@gianm you already have a CLA that I signed when I made a small pull request to PyDruid. Is that enough?

@drcrallen (Contributor):

I believe this is solved by #1428 which uses scheme instead of class name.

@gianm (Contributor) commented Jun 24, 2015:

Ah, okay, I see: druid-io/pydruid#12 (comment). That's enough. If you don't mind filling out the form, that'd still be nice, but it's not required.

@drcrallen actually just pointed out to me that #1428 was recently merged and should solve this same problem a different way. So, I'm going to close this PR, but please let us know if you still have problems after running a version of Druid that includes #1428.

@gianm gianm closed this Jun 24, 2015
paul-rogers pushed a commit to paul-rogers/druid that referenced this pull request Jan 17, 2022
…e#726)

* make clarity-emitter http client worker pool size configurable

* add default worker pool size of min of 10 or 2 * number cores
7 participants