Tips on speeding up database build times? #1110

@why-does-ie-still-exist there are a few things you can do to improve this.

Ludwig builds a cache of the processed data the first time you run it, specifically to avoid repeating this work. There is an open issue, #1078 (will solve it soon), about a bug that makes it recreate the cache when it is not needed. Once that issue is solved, this should no longer happen (unless you change the preprocessing in your model definition).

Once Ludwig has done the preprocessing, it creates a .hdf5 file and a .json file with the same name as the dataset. If you provide those instead of the CSV as inputs in subsequent runs, you should not pay the preprocessing cost, as those files are the actual c…
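
As a rough illustration of the idea above, here is a minimal sketch of a helper that prefers the cached files when they sit next to the CSV. The function name `pick_ludwig_inputs` is hypothetical, and the assumption that the cache files share the CSV's base name with `.hdf5` and `.json` extensions comes only from the description above; actual file names may differ between Ludwig versions.

```python
from pathlib import Path
from typing import Optional, Tuple


def pick_ludwig_inputs(csv_path: str) -> Tuple[str, Optional[str]]:
    """Return (dataset_path, metadata_path), preferring cached files.

    Hypothetical helper: assumes the cache lives next to the CSV as
    <name>.hdf5 and <name>.json, as described in the answer above.
    """
    csv = Path(csv_path)
    hdf5 = csv.with_suffix(".hdf5")
    meta = csv.with_suffix(".json")

    # Use the cache only when both files exist; the metadata file is
    # needed alongside the processed data.
    if hdf5.is_file() and meta.is_file():
        return str(hdf5), str(meta)

    # Otherwise fall back to the raw CSV and let preprocessing run.
    return str(csv), None
```

The two returned paths would then be handed to Ludwig's training entry point in place of the CSV. The exact argument names for the processed dataset and its metadata have changed across releases, so check the documentation for your installed version.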

Answer selected by w4nderlust
Category
Q&A
2 participants