Support for multipart resources. #15

Closed
georgeslabreche opened this issue Nov 12, 2017 · 7 comments

Comments

@georgeslabreche

Support for multipart resources.

@georgeslabreche
Author

georgeslabreche commented Nov 13, 2017

@roll this is already supported, unless I am missing something? Please have a look at the README, where I document how to instantiate this datapackage with multipart resources. We can also add and remove Resources.

@roll
Member

roll commented Nov 13, 2017

@georgeslabreche
This means having an individual resource split into a few parts:

  • data-part1.csv -> headers/row1/row2
  • data-part2.csv -> row3/row4/etc

http://frictionlessdata.io/specs/data-resource/#data-in-multiple-files


This is the lowest priority feature from the specs-v1.

Currently it's implemented only for Python - https://github.com/frictionlessdata/datapackage-py/blob/master/datapackage/resource.py#L505 - and it's kinda tricky on some platforms.

@georgeslabreche
Author

@roll IteratorChain saved the day!
https://commons.apache.org/proper/commons-collections/javadocs/api-2.1.1/org/apache/commons/collections/iterators/IteratorChain.html

Is it OK that we have to explicitly set a base path for the Resource object if the datapackage is using relative file paths?
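
For readers following along, the chaining idea can be sketched in plain Java. This is only an illustration of the technique, not the code used in this library: ChainedIterator is a hypothetical class that mimics what Commons Collections' IteratorChain does, and the in-memory lists stand in for rows already read from data-part1.csv and data-part2.csv.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;

// Hypothetical helper: walks several iterators back to back as if they were one.
class ChainedIterator<T> implements Iterator<T> {
    private final Iterator<? extends Iterator<? extends T>> sources;
    private Iterator<? extends T> current = Collections.emptyIterator();

    ChainedIterator(List<? extends Iterator<? extends T>> sources) {
        this.sources = sources.iterator();
    }

    @Override
    public boolean hasNext() {
        // Skip over exhausted (or empty) parts until one has a row left.
        while (!current.hasNext() && sources.hasNext()) {
            current = sources.next();
        }
        return current.hasNext();
    }

    @Override
    public T next() {
        if (!hasNext()) {
            throw new NoSuchElementException();
        }
        return current.next();
    }

    public static void main(String[] args) {
        // Assumed sample data: only the first part carries the header row.
        List<String> part1 = Arrays.asList("id,name", "1,a", "2,b");
        List<String> part2 = Arrays.asList("3,c", "4,d");

        List<Iterator<String>> parts = new ArrayList<>();
        parts.add(part1.iterator());
        parts.add(part2.iterator());

        Iterator<String> rows = new ChainedIterator<>(parts);
        while (rows.hasNext()) {
            System.out.println(rows.next());
        }
    }
}
```

Because the parts are consumed one after another, a caller sees a single stream of rows and does not need to know how many files the resource was split into.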

@roll
Member

roll commented Nov 30, 2017

@georgeslabreche
Great! (heh, at first I was trying to figure out who has the nickname IteratorChain =)

Base path is even a part of the reference - https://github.com/frictionlessdata/implementations#datapackage

I think in Python/JavaScript we now use the current directory as the default base path.

@georgeslabreche
Author

@roll so current directory as in the directory where the library resides in the file system, or the directory where the app using the library resides? I'm not 100% convinced this is the best approach, with Java at least, because of the different places where the library could end up. I'm not yet sure which path is retrieved when using something like System.getProperty("user.dir"); from within the library when it is invoked by some other Main application. Hmm.

@roll
Member

roll commented Nov 30, 2017

@georgeslabreche
Can't say for Java, but the idea is that the user has a working directory containing e.g. datapackage.json. So instead of pushing them to write something like package = Package('descriptor.json', base_path='.'), other libs just use the current working directory by default.
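
A rough Java equivalent of that default might look like the sketch below. BasePathResolver and its resolve method are hypothetical names, not part of this library. The sketch relies on the fact that the user.dir system property holds the working directory the JVM process was started from, rather than the location of the library jar.

```java
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical helper: resolves a resource path against an optional base path.
class BasePathResolver {

    // If basePath is null, fall back to the JVM's working directory (user.dir).
    static Path resolve(String basePath, String relativePath) {
        Path base = (basePath != null)
                ? Paths.get(basePath)
                : Paths.get(System.getProperty("user.dir"));
        return base.toAbsolutePath().resolve(relativePath).normalize();
    }

    public static void main(String[] args) {
        // Default: relative to wherever the calling application was launched.
        System.out.println(resolve(null, "data/data-part1.csv"));
        // Explicit base path (an assumed example directory).
        System.out.println(resolve("some-package", "data/data-part1.csv"));
    }
}
```

With this approach the explicit base path stays available for callers whose descriptor lives somewhere other than the working directory.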

@georgeslabreche
Author

georgeslabreche commented Nov 30, 2017

@roll I will test the path retrieval behaviour in Java to make sure I use the proper way of getting the working directory in a predictable manner.
