Do not do rereading step in fread (#2558)
Since the very early days, the design of fread was such that if any "type-bumps" happened in the middle of a file, the corresponding column(s) would be marked as "require re-reading", and at the end we would re-parse the entire file using the new column type.

This approach has obvious drawbacks:
- the amount of time necessary to read the file almost doubles (and type bumps happen more often than you'd think);
- the logic behind type-bumping is very error-prone;
- it is impossible to read a stream-like input, where the data cannot be arbitrarily rewound.

The new approach for handling type-bumping is the following:
- When a type-bump occurs while reading a chunk, we enter the ordered section and temporarily suspend execution of other threads;
- While all other threads are paused, we:
  - "archive" the column which was type-bumped;
  - update the global types array with the new column parse types;
- After that, the parallel execution resumes from the start of the type-bumped chunk;
- (the process above may occur multiple times with different columns or different chunks);
- In the end, when the final frame is constructed, the columns that were comprised of multiple chunks are rbound together with automatic type promotion.

Closes #1843
Closes #1446
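The archive-and-resume flow described above can be illustrated with a small sketch. This is not the code from this commit: it is a single-threaded, single-column Python model, and the names `PARSERS`, `parse_chunk` and `read_column` are hypothetical. It assumes parse types are ordered from narrowest to widest (int, float, str) and leaves out the ordered section and thread suspension entirely.

```python
# Hypothetical, simplified model of "archive on type-bump, rbind at the end".
# Single column, single thread; not the actual fread implementation.

PARSERS = [int, float, str]  # parse types, ordered from narrowest to widest


def parse_chunk(fields, type_idx):
    """Parse all fields with PARSERS[type_idx].

    Returns (values, None) on success, or (None, new_type_idx) as soon as
    a field does not fit the current type (a "type-bump").
    """
    values = []
    for field in fields:
        try:
            values.append(PARSERS[type_idx](field))
        except ValueError:
            # Find the narrowest wider type that accepts this field.
            # The loop terminates because str accepts any input.
            new_idx = type_idx + 1
            while True:
                try:
                    PARSERS[new_idx](field)
                    return None, new_idx
                except ValueError:
                    new_idx += 1
    return values, None


def read_column(chunks):
    """Read a column chunk by chunk without ever re-reading earlier chunks."""
    type_idx = 0    # stands in for the "global types array" (one column)
    archived = []   # pieces of the column parsed under older types
    current = []    # values parsed under the current type
    i = 0
    while i < len(chunks):
        values, bumped = parse_chunk(chunks[i], type_idx)
        if bumped is None:
            current.extend(values)
            i += 1
        else:
            # Type-bump: archive what we have, update the type, and
            # resume from the start of the same chunk (i is unchanged).
            if current:
                archived.append(current)
            current = []
            type_idx = bumped
    archived.append(current)

    # Final step: rbind all pieces, promoting each to the final type.
    final_type = PARSERS[type_idx]
    column = [final_type(v) for piece in archived for v in piece]
    return column, final_type
```

In this model only the chunk in which the bump occurred is parsed twice; earlier chunks are kept as parsed and promoted when the pieces are joined. Promotion through `str()` here is a simplification that may not preserve the original text of a numeric field.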
Showing 29 changed files with 759 additions and 450 deletions.