Provide more helpful error messages to the GitHub user #146

Closed
LilithHafner opened this issue Dec 3, 2022 · 8 comments · Fixed by #154

Comments

@LilithHafner (Contributor) commented Dec 3, 2022

Edit by @maleadt:

Nanosoldier currently replies with a pretty unhelpful message when anything goes wrong: JuliaLang/julia#47788 (comment). This behavior was introduced in #114, because the error logging code had inadvertently logged environmental details, including AWS tokens. That's really bad, so we ripped out the functionality that reports errors back to the user.

Some of that functionality should be brought back, or a safe version of it at least. For example, we should be able to safely report when a failure happened (during parsing of the invocation, during test execution, etc).
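
For concreteness, here is a minimal sketch of what such a safe report could look like (the names JobPhase, SAFE_MESSAGES, and safe_reply below are illustrative, not Nanosoldier's actual API): the bot picks one of a few pre-written messages keyed on the phase that failed, and never interpolates exception text or environment state into the reply.

# Illustrative sketch only, not Nanosoldier's actual code.
@enum JobPhase ParsingInvocation PreparingBuild RunningTests UploadingResults

const SAFE_MESSAGES = Dict(
    ParsingInvocation => "Your job failed while parsing the trigger comment; please check the invocation syntax.",
    PreparingBuild    => "Your job failed while building Julia or preparing the benchmark environment.",
    RunningTests      => "Your job failed while executing the requested tests or benchmarks.",
    UploadingResults  => "Your job failed while uploading the results.",
)

# Only a phase value chosen by our own code ever reaches the reply comment;
# exception messages and environment state are never interpolated, so nothing
# sensitive (AWS tokens, etc.) can leak.
safe_reply(phase::JobPhase) = SAFE_MESSAGES[phase] * " An admin can check the detailed logs."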

@maleadt (Member) commented Dec 3, 2022

There is an error message? It clearly says your job failed 🙂 We intentionally removed additional details, see #114, because otherwise we risk leaking environmental details into the reply comment (as has happened before). Easiest solution is to have an admin take a look.

Of course, I'm inferring that this is what you're complaining about. It doesn't hurt to include a word or two about what you're actually filing an issue about.

@maleadt closed this as completed Dec 3, 2022
@LilithHafner (Contributor, Author)

Is there documentation anywhere on what to do when you receive the message "Your job failed."? How can I figure out why the job failed? How can I get it to not fail? It seems to me that having jobs fail without explanation is an issue.

@maleadt (Member) commented Dec 3, 2022

> How can I figure out why the job failed?

You can't; that's what I linked to above. If the job fails, an admin should be pinged to investigate (that ping is what's missing from the comment here). We could probably improve this, but the case is so rare that I don't think it's worth the development effort.

@maleadt (Member) commented Dec 3, 2022

> We could probably improve this, but the case is so rare that I don't think it's worth the development effort.

Might be worth keeping the issue open though, in case anybody wants to help out.

@maleadt reopened this Dec 3, 2022
@maleadt changed the title from "runtests job failed without error message" to "Provide more helpful error messages to the GitHub user" on Dec 3, 2022
@LilithHafner (Contributor, Author)

One workaround for folks without access to the nanosoldier machines is to use the CI at BaseBenchmarks.jl, which provides nice stack traces (I suspect that the errors fixed here are at the root of my use case).

@vtjnash (Member) commented Dec 7, 2022

Fwiw, I think the driver script segfaulted, rather than just the tests failing on nanosoldier. The stacktrace generated on nanosoldier seemed pretty useless (I have posted it somewhere else).

@LilithHafner (Contributor, Author)

Yes. The bug is an out-of-bounds array access inside an @inbounds block. The job needs to run with --checkbounds=yes to get a usable stacktrace.
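
To make that concrete, here is a minimal standalone illustration of the failure mode (not the actual BaseBenchmarks code): @inbounds removes the bounds check, so the bad access is undefined behavior by default, while --checkbounds=yes forces the checks back on and produces a normal BoundsError with a stacktrace.

# demo.jl -- illustrative only
function unsafe_sum(v::Vector{Float64}, n::Integer)
    s = 0.0
    @inbounds for i in 1:n   # caller may pass n > length(v); @inbounds skips the check
        s += v[i]            # out-of-bounds read: undefined behavior by default
    end
    return s
end

unsafe_sum(rand(3), 10)

# julia demo.jl                    -> garbage result or a segfault, no useful trace
# julia --checkbounds=yes demo.jl  -> BoundsError with a usable stacktrace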

@vtjnash (Member) commented Dec 7, 2022

This line is throwing errors, which is preventing us from finishing cleanup and uploading the intended logs:

run(sudo(`$cset shield -e -- sudo -n -u $(cfg.user) -- $(shscriptpath)`))

While we should avoid looking for the comparison data if that line fails (and abort the run), we also should not throw a Julia exception there when the script fails, since that bypasses all cleanup.

It shouldn't be possible for any secret data to leak into our log files, since the user task now runs as a separate unprivileged user, so we should try hard to always upload them.
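
A minimal sketch of that change, reusing sudo, cset, cfg, and shscriptpath from the snippet above (assuming sudo returns a Cmd; upload_logs and cleanup are hypothetical placeholders for the existing cleanup steps): wrap the command in ignorestatus so a non-zero exit doesn't throw, check the exit status to decide whether to abort before looking for comparison data, and keep log upload and cleanup in a finally block so they run regardless.

script_ok = false
try
    # ignorestatus: run() no longer throws when the script exits non-zero
    proc = run(ignorestatus(sudo(`$cset shield -e -- sudo -n -u $(cfg.user) -- $(shscriptpath)`)))
    script_ok = success(proc)
finally
    # always runs, even if the script failed or something above threw
    upload_logs(cfg)   # hypothetical helper; logs contain no secrets, so always upload
    cleanup(cfg)       # hypothetical helper; tear down the shielded cpuset, temp dirs, etc.
end

script_ok || error("benchmark script failed; skipping comparison, see uploaded logs")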
