Skip to content
This repository has been archived by the owner on Jan 11, 2021. It is now read-only.

Update documentation #81

Merged
merged 32 commits into from
Apr 13, 2018
Merged

Update documentation #81

merged 32 commits into from
Apr 13, 2018

Conversation

sadikovi
Copy link
Collaborator

This PR updates crate documentation and makes some very minor changes, either in code or comments.

Following is done:

  • Updated docs, so now we can generate more or less usable docs with cargo doc. I learned that it is good to run cargo doc --no-deps, to generate docs for this crate only when developing.
  • Added small example in README + usage + link to documentation.
  • Added comment for column/reader.rs tests to give an overview on how setup works.

I would encourage to view changes as generated documentation as well as code. You can check out this branch and run cargo doc --no-deps --open to open documentation in your browser.

Closes #58
Closes #77
Closes #78

@sadikovi
Copy link
Collaborator Author

@sunchao Could you review this PR? The goal of this work is having documentation/some starting point for people who want to read Parquet files.

Publishing docs

I added badge in README that points to docs.rs for parquet (it is broken currently). I thought about this and I think that publishing on docs.rs is arguably the best way to do it right now, which is what you also mentioned on the issue #78 .

My main points are:

  • No setup required, it is sort of generated automatically as we release.
  • Most of the people publish on docs.rs.

What stops us from doing this currently is generating parquet.rs file. Though I understand why you would want to have it generated (this is how it works right now), I would suggest that we actually commit the parquet.rs as part of the repository alongside parquet.thrift.

When we need to update format, we will create a special PR that updates parquet.rs only and potentially fixes code to work with updated format. We will document or have a script that would generate parquet.thrift -> parquet.rs and will run manually when we decide to update the format.

Reasons for committing parquet.rs into repository:

  • Fixes documentation build.
  • Simplifies travis setup, and speeds up build time (it is not really a big deal, tests run reasonably fast anyway).
  • parquet.rs does not change often, actually only changes when we change format.
  • Rarely thrift repo may be broken, so in this case we would not notice the impact of those breaking changes.
  • Assumption that parquet.rs is generated identically regardless of OS.
  • Helps other people to import the crate, without requirement of setting up thrift (and we would only use it to generate single file), all they would need is just adding parquet to their Cargo.toml.

Let me know your thoughts, as you are more familiar with overall setup and parquet.thrift generation. If so, I will be more than happy to open other pull requests to update the code.

@coveralls
Copy link

coveralls commented Apr 10, 2018

Coverage Status

Coverage decreased (-0.02%) to 94.763% when pulling 1a11586 on sadikovi:update-documentation into 1fd277a on sunchao:master.

Copy link
Owner

@sunchao sunchao left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @sadikovi . This is excellent work!

README.md Outdated
}
```
See crate documentation on available API.

## Requirements
- Rust nightly
- Thrift 0.11.0 or higher
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not quite related, but we may want to also point out the parquet format version that we support - currently it is 2.3.2?

src/basic.rs Outdated
/// - FLOAT - 4 bytes per value, stored as little-endian.
/// - DOUBLE - 8 bytes per value, stored as little-endian.
/// - BYTE_ARRAY - 4 byte length stored as little endian, followed by bytes.
/// FIXED_LEN_BYTE_ARRAY - just the bytes are stored.
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: missed - at the beginning.

src/lib.rs Outdated
#[macro_use]
pub mod errors;
pub mod basic;
pub mod data_type;
mod parquet_thrift;

#[macro_use]
pub mod util;
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This will break cargo bench. We need to find a way to expose only the necessary APIs.

/// Conversion from Thrift equivalents

/// Conversion from Thrift equivalents.
/// Should be considered private.
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why pub can't be removed? add a TODO?

@sadikovi
Copy link
Collaborator Author

I just need to document encoding and decoding modules, and review again.

@sadikovi
Copy link
Collaborator Author

@sunchao I think it is ready for review now. Would you mind having a look again? Thanks!

Copy link
Owner

@sunchao sunchao left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM! with a few minor comments.

README.md Outdated
```
See crate documentation on available API.

## Versions
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

instead of Versions, maybe Supported Parquet Version?
Also, the version we support right now is 2.4.0

// Creates a new `PrimitiveType` instance from the gathered attributes.
// This also checks various illegal conditions and returns `Err` if that that happen.
/// Creates a new `PrimitiveType` instance from the gathered attributes.
/// Also checks various illegal conditions and returns `Err` if that that happen.
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: extra that. happen -> happens.

@sadikovi
Copy link
Collaborator Author

I updated the code, can you have a look again? Thanks!

Copy link
Owner

@sunchao sunchao left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM!

@sunchao sunchao merged commit 5e4758c into sunchao:master Apr 13, 2018
@sadikovi sadikovi deleted the update-documentation branch April 13, 2018 22:45
@sadikovi
Copy link
Collaborator Author

Thanks!

@sunchao sunchao mentioned this pull request Apr 14, 2018
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants