[BUG] Qualification tool really slow writing to DBFS #64

Open
Tracked by #367
tgravescs opened this issue Jan 19, 2023 · 0 comments
Labels
bug (Something isn't working), core_tools (Scope the core module (scala))

Comments

@tgravescs
Collaborator

Describe the bug
I was running the qualification tool with output written to DBFS, and writing the CSV files took a very long time. Running the tool on 7 batches of 1000 event logs each took about 10 hours. Parsing the event logs was very fast, but writing the summary, exec, and stages CSV files was very slow. I'm not sure whether this is any different with S3; we may want to measure. It might be faster to write to local disk and then copy the files into DBFS or S3.
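
To make the "write locally, then copy" idea concrete, here is a minimal Python sketch. It is not the qualification tool's code (the tool itself is Scala); the function name `write_csvs_then_copy`, the `tables` argument, and the destination directory are all hypothetical, and the destination is assumed to be reachable as an ordinary filesystem path (for example a mounted remote store).

```python
import csv
import shutil
import tempfile
from pathlib import Path


def write_csvs_then_copy(tables, dest_dir):
    """Write each CSV to a local temp dir, then copy the finished file to dest_dir.

    tables:   dict mapping a file name to a list of rows (hypothetical input shape).
    dest_dir: final output directory (assumed to be a normal filesystem path).
    """
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    copied = []
    with tempfile.TemporaryDirectory() as tmp:
        for name, rows in tables.items():
            local_file = Path(tmp) / name
            # All row-by-row writes hit fast local disk.
            with open(local_file, "w", newline="") as f:
                csv.writer(f).writerows(rows)
            # The destination sees one bulk copy per finished file.
            shutil.copy(local_file, dest / name)
            copied.append(dest / name)
    return copied
```

Whether this is actually faster than writing directly depends on how the remote store handles many small writes, which is the thing this issue proposes to measure.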

Investigate further, then either make changes or document best practices.
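
As a starting point for that investigation, a small timing helper could be used to compare a direct write against other approaches. This is only a sketch under assumptions: `write_csv_direct` and `timed` are hypothetical names, and the target path is assumed to behave like a regular file path.

```python
import csv
import time


def write_csv_direct(path, rows):
    """Write rows one at a time straight to the target path."""
    with open(path, "w", newline="") as f:
        writer = csv.writer(f)
        for row in rows:
            writer.writerow(row)


def timed(fn, *args):
    """Return the wall-clock seconds taken by fn(*args)."""
    start = time.perf_counter()
    fn(*args)
    return time.perf_counter() - start
```

Calling `timed(write_csv_direct, path, rows)` once with a local path and once with a DBFS or S3-backed path, using the same rows, would give a rough first comparison.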

@tgravescs added the bug (Something isn't working) and ? - Needs Triage labels Jan 19, 2023
@mattahrens added the core_tools (Scope the core module (scala)) label Jan 20, 2023