[ENH] io: Speedup write_data #3115

Merged 1 commit on Jul 6, 2018


ales-erjavec
Contributor

Issue

Speed up write_data.

Description of changes

Assemble the column formatters beforehand instead of running inline logic
that decides the correct format for every cell value.
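
To illustrate the idea, here is a minimal sketch and not the code from this PR. The names `format_cell_inline`, `make_formatter`, `write_rows` and the column kinds `"number"`, `"category"`, `"text"` are hypothetical and do not correspond to Orange's actual API. The point is that the type dispatch runs once per column rather than once per cell.

```python
import csv
import io
import math


def format_cell_inline(kind, value):
    """Slow path (hypothetical): decide the format again for every cell."""
    if kind == "number":
        return "" if math.isnan(value) else f"{value:g}"
    if kind == "category":
        return "" if value is None else str(value)
    return str(value)


def make_formatter(kind):
    """Fast path (hypothetical): choose the formatting function once per column."""
    if kind == "number":
        return lambda value: "" if math.isnan(value) else f"{value:g}"
    if kind == "category":
        return lambda value: "" if value is None else str(value)
    return str


def write_rows(rows, kinds, out):
    """Write rows as tab-separated text using precomputed column formatters."""
    # The per-column decision happens here, outside the row loop.
    formatters = [make_formatter(kind) for kind in kinds]
    writer = csv.writer(out, delimiter="\t", lineterminator="\n")
    for row in rows:
        writer.writerow([fmt(value) for fmt, value in zip(formatters, row)])


# Illustrative data only; missing values are written as empty fields.
kinds = ["number", "category", "text"]
rows = [
    [1.5, "red", "first"],
    [float("nan"), None, "second"],
]
buffer = io.StringIO()
write_rows(rows, kinds, buffer)
result = buffer.getvalue()
```

With many columns the saving grows, since the inner loop then only calls an already selected function for each cell.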

Includes
  • Code changes
  • Tests
  • Documentation

@ales-erjavec
Contributor Author

In [1]: import Orange
In [2]: adult = Orange.data.Table("adult")
In [3]: %timeit adult.save("test-save-temp.tab")

before:
1.07 s ± 7.55 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
after:
386 ms ± 5.88 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)

@BlazZupan
Contributor

This is a great speed-up. I have tested it on data with over 20,000 features and 100 data instances, where the speed-up was 6-fold.

@BlazZupan BlazZupan merged commit 6dce0df into biolab:master Jul 6, 2018
@ales-erjavec ales-erjavec deleted the io-write-speed branch July 10, 2018 09:32