-
Notifications
You must be signed in to change notification settings - Fork 25
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add stack size analysis to CI #459
base: main
Are you sure you want to change the base?
Conversation
This patch adds a CI job that prints the stack sizes for the functions in the stable NK3 firmware as well as the necessary tooling. Fixes: #313
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This will not include the functions that perform dynamic stack allocations. I don't think rust code would have any but it might be interesting thing to add the list of functions where the stack size is not available. Not sure if this can be obtained just by listing the function in the binary and finding the ones not mentioned in the stack-sizes section.
max/total/average/median stack size
n-th percentile stack size
sum of top-k or top-n-% functions
Maybe having a high limit, reporting all function with a stack size higher than 2kB? Or maybe report any "new" function higher than 2kB.
I'm not sure I understand the interest of the sum. For me it only makes sense for a specific stacktrace.
The sum is not useful for functional changes, but I think it could be used to analyze refactoring. We saw that minor changes that look like a no-op can have drastic effects on stack usage (e. g. trussed-dev/iso7816#20). And sometimes, inlining decisions change during refactoring. Both should be observable via the sum. Of course a change would then require manual investigation, but it could potentially be a first indicator. Tracking individual functions would also be an option, but it requires us to keep much more state to perform the comparison against, and non-trivial changes can introduce a lot of noise. |
Ok I see the appeal of the sum, but the issue I have with it is that if the sum of code-paths that cannot happen at the same time increases it's not really relevant. I also prefer the idea of tracking functions individually, but tracking it across commits would be much harder (and without even taking into accounts generics/name-mangling). I think on the other hand that tracking the count of function that have stack usage higher than some threshold is great, and wouldn't be that complex. We could manually set the limit, ensuring that any new stack-intensive function is detected and intentional. No need to keep state in the CI itself, but we can detect the introduction of any outlier. |
For me this would just be informational. It is then up to the author to decide if that change is expected for the current PR or if it is something to investigate. Tracking the count above a threshold sounds interesting. But I would still like to combine this with some quantitative metric – maybe average of functions above the threshold? |
fn main() { | ||
let mut stdin = io::stdin().lock(); | ||
let mut buffer = Vec::new(); | ||
stdin.read_to_end(&mut buffer).unwrap(); | ||
let functions = analyze_executable(&buffer).unwrap(); | ||
let mut sorted: Vec<_> = functions.defined.into_values().collect(); | ||
sorted.sort_by_key(|f| f.stack().unwrap_or(0)); | ||
for f in sorted.into_iter().rev() { | ||
if let Some(stack) = f.stack() { | ||
print!("{:6}", stack); | ||
} else { | ||
print!("None"); | ||
} | ||
print!(" {:6} ", f.size()); | ||
for (i, name) in f.names().into_iter().enumerate() { | ||
if i > 0 { | ||
print!(", "); | ||
} | ||
print!("{}", demangle::demangle(name)); | ||
} | ||
println!(); | ||
} | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
After demangling there is a lot of duplication. Maybe adding some deduplication would be nice.
Deduplication would keep the max of the stack usage, sum the binary size, and show how many time the given function is duplicated.
2424 64 trussed::client::FutureResult<T,C>::poll
2424 64 trussed::client::FutureResult<T,C>::poll
2424 96 trussed::client::FutureResult<T,C>::poll
2424 64 trussed::client::FutureResult<T,C>::poll
2424 64 trussed::client::FutureResult<T,C>::poll
2424 64 trussed::client::FutureResult<T,C>::poll
2424 64 trussed::client::FutureResult<T,C>::poll
2416 68 trussed::client::FutureResult<T,C>::poll
2416 58 trussed::client::FutureResult<T,C>::poll
2416 58 trussed::client::FutureResult<T,C>::poll
2416 58 trussed::client::FutureResult<T,C>::poll
2416 58 trussed::client::FutureResult<T,C>::poll
2416 80 trussed::client::FutureResult<T,C>::poll
2416 44 heapless_bytes::Bytes<_>::from_slice
2408 50 trussed::client::FutureResult<T,C>::poll
2408 54 trussed::client::FutureResult<T,C>::poll
2408 54 trussed::client::FutureResult<T,C>::poll
2408 50 trussed::client::FutureResult<T,C>::poll
2408 50 trussed::client::FutureResult<T,C>::poll
I tend to prefer to see the top 10 percentiles and then deciles over averages. |
This patch adds a CI job that prints the stack sizes for the functions in the stable NK3 firmware as well as the necessary tooling.
Fixes: #313
Currently, this just prints the stack sizes. We should try to transform this into some metrics, but I’m not sure what would be useful:
Ideally we would have the max stack usage per call path but AFAIR we did not get useful results with the existing tooling.