[doc] Documentation for report identification #3070

Merged
3 changes: 3 additions & 0 deletions analyzer/codechecker_analyzer/cmd/analyze.py
@@ -101,6 +101,9 @@ def get_argparser_ctor_args():
hash will not be changed so easily for example on code indentation or when a
checker is renamed.

For more information see:
https://github.com/Ericsson/codechecker/blob/master/docs/analyzer/report_identification.md

Exit status
------------------------------------------------
0 - Successful analysis and no new reports
3 changes: 3 additions & 0 deletions analyzer/codechecker_analyzer/cmd/check.py
@@ -98,6 +98,9 @@ def get_argparser_ctor_args():
hash will not be changed so easily for example on code indentation or when a
checker is renamed.

For more information see:
https://github.com/Ericsson/codechecker/blob/master/docs/analyzer/report_identification.md

Exit status
------------------------------------------------
0 - Successful analysis and no new reports
90 changes: 66 additions & 24 deletions docs/analyzer/report_identification.md
@@ -1,26 +1,68 @@
# Analyzer report identification

Unique report identifiers are required to compare analysis results.
If the analyzers (`Clang Static Analyzer`, `Clang Tidy`) do not generate
a unique report identifier, CodeChecker tries to generate one.

It is recommended to use the identifier generated by the analyzers
because they can include the semantic context of the result better into the
unique id.

## Experimental context-free hash generation method

This method is used for Clang Tidy reports.
The current hash generation algorithm in CodeChecker uses this information to
generate the unique id.
__NOTE: As the main diagnostic section the last element from the bug path is used!__

Unique id content:

* `file_name` from the main diag section.
* `checker message`
* `line content` from the source file if can be read up
* `column numbers` from the main diag section
* whitespaces are removed from the source lines used
for hash generation to make the hash independent of the
indentation
Unique report identifiers are required to compare analysis results. If an
analyzer (`Clang Static Analyzer`, `Clang Tidy`) does not generate a unique
report identifier itself (`Clang Tidy`, for example, does not), CodeChecker
tries to generate one.

It is recommended to use the identifier generated by the analyzers because the
analyzers can better incorporate the semantic context of a result into the
unique id.

**!!! WARNING !!!**: the same hash method should be used consistently for a
product. Mixing hash methods can cause a lot of confusion when results are
compared or when other features that rely on report hashes are used.

## Hash generation methods

### Context sensitive
This is the default hash method for Clang Static Analyzer. The hashes are
calculated based on the following information:
- **signature** of the enclosing function declaration, type declaration or
namespace.
- **content of the line** where the bug is.
- **checker name** (e.g.: *core.DivideZero*, *core.NullDereference*, etc.).
- **position** (column) within the line.

### Context insensitive
This is the default hash method for *Clang Tidy* reports. The hashes are
calculated based on the following information:

- **file name** (e.g.: *main.cpp*, *lib.cpp*, etc.).
- **checker name** (e.g.: *bugprone-infinite-loop*,
*clang-diagnostic-uninitialized* etc.).
- **checker message** (e.g.: *variable 'ptr' is uninitialized when used here*,
etc.).
- **content of the line** where the bug is, if the source file can be read.
- **range column numbers** where the bug is.
- **range column numbers** of the bug steps.
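
The list above can be illustrated with a minimal sketch. The function name
`context_insensitive_hash`, the `|||` separator and the use of MD5 are
assumptions made for illustration only; the description above does not state
how CodeChecker combines or digests the fields.

```python
import hashlib


def context_insensitive_hash(file_name, checker_name, message,
                             line_content, bug_range, step_ranges):
    """Hypothetical helper: digest the fields listed above into one id."""
    fields = [file_name, checker_name, message, line_content,
              str(bug_range[0]), str(bug_range[1])]
    # Column ranges of the bug steps are appended in bug path order.
    for start, end in step_ranges:
        fields.extend([str(start), str(end)])
    # Assumption: separator and digest algorithm are illustrative choices.
    return hashlib.md5("|||".join(fields).encode("utf-8")).hexdigest()


report_id = context_insensitive_hash(
    "main.cpp", "clang-diagnostic-uninitialized",
    "variable 'ptr' is uninitialized when used here",
    "  return *ptr;", (11, 14), [(3, 10), (11, 14)])
print(report_id)
```

Because the line content is used as-is in this sketch, re-indenting the line or
changing the checker message produces a different id.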

### Context free
With this method the hash is less likely to change, for example when the code
is re-indented or when a checker is renamed.

You can override the default hash calculation method for reports by using the
`--report-hash` option of the `CodeChecker analyze` or the `CodeChecker check`
commands.
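
As a rough illustration of passing the option, the sketch below only builds
and prints the command line instead of running it. The helper name
`build_analyze_command` and the paths `compile_commands.json` and `./reports`
are placeholders, not names taken from this document.

```python
import shlex

# Methods named in this document for the --report-hash option.
SUPPORTED_METHODS = ("context-free", "context-free-v2")


def build_analyze_command(compile_db, output_dir,
                          report_hash="context-free-v2"):
    """Hypothetical helper: assemble a `CodeChecker analyze` invocation."""
    if report_hash not in SUPPORTED_METHODS:
        raise ValueError(f"unsupported report hash method: {report_hash}")
    return ["CodeChecker", "analyze", compile_db,
            "-o", output_dir, "--report-hash", report_hash]


print(shlex.join(build_analyze_command("compile_commands.json", "./reports")))
# prints: CodeChecker analyze compile_commands.json -o ./reports --report-hash context-free-v2
```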

The following calculation methods are supported:
- [context-free](#context-free)
- [context-free-v2](#context-free-v2)

#### context-free
- In case of `Clang Tidy` the [Context insensitive](#context-insensitive) hash
method is used.
- In case of `Clang Static Analyzer` the [context-free-v2](#context-free-v2)
hash method is used.

**Important**: due to a bug, the default hash was generated for `Clang Tidy`
instead of the context free hash. This behavior is kept only for backward
compatibility. Use the [context-free-v2](#context-free-v2) hash method
instead.

#### context-free-v2
The hashes are calculated based on the following information:
- **file name** (e.g.: *main.cpp*, *lib.cpp*, etc.).
- **checker name** (e.g.: *bugprone-infinite-loop*, *core.DivideZero* etc.).
- **content of the line** where the bug is, if the source file can be read. All
whitespace is removed from the source content.
- **range column numbers** where the bug is.
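
A minimal sketch of the fields listed above, showing why removing whitespace
helps the id survive re-indentation. The helper name `context_free_v2_hash`,
the `|||` separator and the MD5 digest are illustrative assumptions. Shifting
the columns by the amount of leading whitespace is also an assumption of this
sketch, made so that the column range stays stable when a line is re-indented;
the description above does not say how columns are handled.

```python
import hashlib


def context_free_v2_hash(file_name, checker_name, line_content, col_range):
    """Hypothetical helper: digest the context-free-v2 fields listed above."""
    # Assumption: make columns relative to the first non-whitespace character.
    indent = len(line_content) - len(line_content.lstrip())
    start, end = (col - indent for col in col_range)
    # Remove all whitespace from the source line before hashing.
    stripped = "".join(line_content.split())
    fields = [file_name, checker_name, stripped, str(start), str(end)]
    # Assumption: separator and digest algorithm are illustrative choices.
    return hashlib.md5("|||".join(fields).encode("utf-8")).hexdigest()


original = context_free_v2_hash(
    "main.cpp", "core.DivideZero", "  return a / b;", (10, 14))
reindented = context_free_v2_hash(
    "main.cpp", "core.DivideZero", "        return a / b;", (16, 20))
assert original == reindented  # same id after re-indentation
```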
3 changes: 3 additions & 0 deletions docs/analyzer/user_guide.md
@@ -439,6 +439,9 @@ OUR RECOMMENDATION: we recommend you to use 'context-free-v2' hash because the
hash will not be changed so easily for example on code indentation or when a
checker is renamed.

For more information see:
https://github.com/Ericsson/codechecker/blob/master/docs/analyzer/report_identification.md

Exit status
------------------------------------------------
0 - Successful analysis and no new reports
3 changes: 3 additions & 0 deletions docs/usage.md
@@ -565,6 +565,9 @@ Each report has a unique (hash) identifier generated from checker name
and the location of the finding: column number, textual content of the line,
enclosing scope of the bug location (function signature, class, namespace).

You can find more information on how these hashes are calculated
[here](analyzer/report_identification.md).

## Listing and Counting Reports <a name="listing-reports"></a>

See a more detailed description in the [analyzer report identification