[report-converter] Support sarif format and gcc analyzer
Fixes #1797. Based on a commit authored by @csordasmarton. Credit goes
to him!

We've long wanted to support sarif
(https://sarifweb.azurewebsites.net/), and finally, this is the first
real step towards it!

This patch can both parse and export to sarif.

My intent is that the code is self-explanatory (because I explained
things in the code!), but there are two things I'd like to highlight:

1. I struggled a LOT with mypy, which led me to express a few things
   in a rather cumbersome manner. I left comments around these parts.
2. I copied all example tests from https://github.com/microsoft/sarif-tutorials/
   to tools/report-converter/tests/unit/parser/sarif/sarif_test_files/.
   These examples come with an MIT licence, which I also copied over.
csordasmarton authored and Szelethus committed Oct 10, 2023
1 parent 6d7351e commit 1d56191
Showing 64 changed files with 3,621 additions and 80 deletions.
1 change: 1 addition & 0 deletions analyzer/requirements.txt
@@ -2,4 +2,5 @@ lxml==4.9.2
portalocker==2.2.1
psutil==5.8.0
PyYAML==6.0.1
sarif-tools==1.0.0
mypy_extensions==0.4.3
1 change: 1 addition & 0 deletions analyzer/requirements_py/dev/requirements.txt
@@ -8,3 +8,4 @@ mkdocs==1.2.3
PyYAML==6.0.1
mypy_extensions==0.4.3
coverage==5.5.0
sarif-tools==1.0.0
2 changes: 1 addition & 1 deletion docs/README.md
@@ -193,7 +193,7 @@ For details see
Useful tools that can also be used outside CodeChecker.

* [Build Logger (to generate JSON Compilation Database from your builds)](/analyzer/tools/build-logger/README.md)
* [Plist to HTML converter (to generate HTML files from the given plist files)](/docs/tools/report-converter.md#plist-to-html-tool)
* [Plist/Sarif to HTML converter (to generate HTML files from the given plist or sarif files)](/docs/tools/report-converter.md#plistsarif-to-html-tool)
* [Report Converter Tool (to convert analysis results from other analyzers to the codechecker report directory format)](/docs/tools/report-converter.md)
* [Translation Unit Collector (to collect source files of a translation unit or to get source files which depend on the given header files)](/docs/tools/tu_collector.md)
* [Report Hash generator (to generate unique hash identifiers for reports)](/docs/tools/report-converter.md#report-hash-generation-module)
1 change: 1 addition & 0 deletions docs/supported_code_analyzers.md
@@ -21,6 +21,7 @@ CodeChecker result directory which can be stored to a CodeChecker server.
| | [Kernel-Doc](/docs/tools/report-converter.md#kernel-doc) ||
| | [Sparse](/docs/tools/report-converter.md#sparse) ||
| | [cpplint](/docs/tools/report-converter.md#cpplint) ||
| | [GNU GCC Static Analyzer](/docs/tools/report-converter.md#gnu-gcc-static-analyzer) ||
| **C#** | [Roslynator.DotNet.Cli](/docs/tools/report-converter.md#roslynatordotnetcli) ||
| **Java** | [FindBugs](http://findbugs.sourceforge.net/) ||
| | [SpotBugs](/docs/tools/report-converter.md#spotbugs) ||
70 changes: 52 additions & 18 deletions docs/tools/report-converter.md
@@ -14,6 +14,7 @@ a CodeChecker server.
* [Thread Sanitizer](#thread-sanitizer)
* [Leak Sanitizer](#leak-sanitizer)
* [Cppcheck](#cppcheck)
* [GNU GCC Static Analyzer](#gnu-gcc-static-analyzer)
* [Spotbugs](#spotbugs)
* [Facebook Infer](#facebook-infer)
* [ESLint](#eslint)
@@ -29,7 +30,7 @@ a CodeChecker server.
* [Sparse](#sparse)
* [cpplint](#cpplint)
* [Roslynator.DotNet.Cli](#roslynatordotnetcli)
* [Plist to html tool](#plist-to-html-tool)
* [Plist/Sarif to html tool](#plistsarif-to-html-tool)
* [Usage](#usage-1)
* [Report hash generation module](#report-hash-generation-module)
* [Generate path sensitive report hash](#generate-path-sensitive-report-hash)
@@ -53,33 +54,37 @@ make package
<summary><i>$ <b>report-converter --help</b> (click to expand)</i></summary>

```
usage: report-converter [-h] -o OUTPUT_DIR -t TYPE [--meta [META [META ...]]]
[--filename FILENAME] [-c] [-v]
file
usage: report-converter [-h] -o OUTPUT_DIR -t TYPE [-e EXPORT]
[--meta [META ...]] [--filename FILENAME] [-c] [-v]
input [input ...]
Creates a CodeChecker report directory from the given code analyzer output
which can be stored to a CodeChecker web server.
positional arguments:
file Code analyzer output result file which will be parsed
and used to generate a CodeChecker report directory.
input Code analyzer output result files or directories which
will be parsed and used to generate a CodeChecker
report directory.
optional arguments:
options:
-h, --help show this help message and exit
-o OUTPUT_DIR, --output OUTPUT_DIR
This directory will be used to generate CodeChecker
report directory files.
-t TYPE, --type TYPE Specify the format of the code analyzer output.
Currently supported output types are: asan, clang-
tidy, coccinelle, cppcheck, cpplint, eslint,
fbinfer, golint, kernel-doc, lsan, mdl, msan,
pyflakes, pylint, roslynator, smatch, sparse, sphinx, spotbugs,
tsan, tslint, ubsan.
--meta [META [META ...]]
Metadata information which will be stored alongside
the run when the created report directory will be
stored to a running CodeChecker server. It has the
following format: key=value. Valid key values are:
Currently supported output types are: asan, clang-tidy,
coccinelle, cppcheck, cpplint, eslint, fbinfer, gcc,
golint, kernel-doc, lsan, mdl, msan, pyflakes, pylint,
roslynator, smatch, sparse, sphinx, spotbugs, tsan,
tslint, ubsan.
-e EXPORT, --export EXPORT
Specify the export format of the converted reports.
Currently supported export types are: .plist, .sarif.
(default: plist)
--meta [META ...] Metadata information which will be stored alongside the
run when the created report directory will be stored to
a running CodeChecker server. It has the following
format: key=value. Valid key values are:
analyzer_command, analyzer_version. (default: None)
--filename FILENAME This option can be used to override the default plist
file name output of this tool. This tool can produce
@@ -106,6 +111,7 @@ Supported analyzers:
cpplint - cpplint, https://github.com/cpplint/cpplint
eslint - ESLint, https://eslint.org/
fbinfer - Facebook Infer, https://fbinfer.com
gcc - GNU Compiler Collection Static Analyzer, https://gcc.gnu.org/wiki/StaticAnalyzer
golint - Golint, https://github.com/golang/lint
kernel-doc - Kernel-Doc, https://github.com/torvalds/linux/blob/master/scripts/kernel-doc
lsan - LeakSanitizer, https://clang.llvm.org/docs/LeakSanitizer.html
@@ -254,6 +260,34 @@ CppCheck: `analysis statistics`, `analysis duration`, `cppcheck command` etc.
For more information about logging checkout the log section in the
[user guide](/docs/usage.md).

### [GNU GCC Static Analyzer](https://gcc.gnu.org/wiki/StaticAnalyzer)

This project introduces a static analysis pass for GCC that can diagnose
various kinds of problems in C/C++ code at compile-time (e.g. double-free,
use-after-free, etc).

The analyzer runs as an IPA pass on the gimple SSA representation. It
associates state machines with data, with transitions at certain statements
and edges. It finds "interesting" interprocedural paths through the user's
code, in which bogus state transitions happen.

GCC 13.0.0 and later versions can emit diagnostics in the sarif format, which
report-converter can parse. Earlier versions only supported a json output,
which report-converter doesn't support.

You can enable the GNU GCC Static Analyzer and the sarif output with the
following flags:
```sh
# Compile and analyze my_file.cpp.
g++ -fanalyzer -fdiagnostics-format=sarif-file my_file.cpp

# GCC created a new file, my_file.cpp.sarif.
report-converter -t gcc -o ./gcc_reports my_file.cpp.sarif

# Store the gcc reports with CodeChecker.
CodeChecker store ./gcc_reports -n gcc_reports
```
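
To sanity-check what GCC wrote before handing it to report-converter, the sarif file can be inspected with a few lines of Python. The sketch below is not part of report-converter. It only assumes the standard SARIF 2.1.0 layout (`runs[].results[]` with `ruleId`, `message.text` and `locations[].physicalLocation`); the helper name `summarize_sarif` and the sample document are made up for illustration.

```python
import json
from typing import Any, Dict, List, Tuple


def summarize_sarif(doc: Dict[str, Any]) -> List[Tuple[str, str, int, str]]:
    """Return (file, rule id, line, message) for every result in the log."""
    summary = []
    for run in doc.get("runs", []):
        for result in run.get("results", []):
            message = result.get("message", {}).get("text", "")
            rule_id = result.get("ruleId", "")
            # A result may carry zero or more locations; report each one.
            for location in result.get("locations", []):
                physical = location.get("physicalLocation", {})
                uri = physical.get("artifactLocation", {}).get("uri", "")
                line = physical.get("region", {}).get("startLine", 0)
                summary.append((uri, rule_id, line, message))
    return summary


# Hypothetical, hand-written sample; real GCC output has many more fields.
SAMPLE = json.loads("""
{
  "version": "2.1.0",
  "runs": [{
    "tool": {"driver": {"name": "GNU C++17"}},
    "results": [{
      "ruleId": "-Wanalyzer-double-free",
      "message": {"text": "double-'free' of 'ptr'"},
      "locations": [{
        "physicalLocation": {
          "artifactLocation": {"uri": "my_file.cpp"},
          "region": {"startLine": 12, "startColumn": 5}
        }
      }]
    }]
  }]
}
""")

for uri, rule_id, line, message in summarize_sarif(SAMPLE):
    print(f"{uri}:{line}: [{rule_id}] {message}")
```

For a real file, replace `SAMPLE` with `json.load(open("my_file.cpp.sarif", encoding="utf-8"))`.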

### [Spotbugs](https://spotbugs.github.io/)
[Spotbugs](https://spotbugs.github.io/) is a static analysis tool for `Java`
code.
@@ -618,7 +652,7 @@ report-converter -t roslynator -o ./codechecker_roslynator_reports ./sample.xml
CodeChecker store ./codechecker_roslynator_reports -n roslynator
```

## Plist to html tool
## Plist/Sarif to html tool
`plist-to-html` is a python tool which parses and creates HTML files from one
or more `.plist` result files.

@@ -0,0 +1,7 @@
# -------------------------------------------------------------------------
#
# Part of the CodeChecker project, under the Apache License v2.0 with
# LLVM Exceptions. See LICENSE for license information.
# SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
#
# -------------------------------------------------------------------------
@@ -0,0 +1,34 @@
# -------------------------------------------------------------------------
#
# Part of the CodeChecker project, under the Apache License v2.0 with
# LLVM Exceptions. See LICENSE for license information.
# SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
#
# -------------------------------------------------------------------------

import logging
from typing import List

from codechecker_report_converter.report import Report
from codechecker_report_converter.report.parser import sarif

from ..analyzer_result import AnalyzerResultBase


LOG = logging.getLogger('report-converter')


class AnalyzerResult(AnalyzerResultBase):
    """ Transform analyzer result of the GCC Static Analyzer. """

    TOOL_NAME = 'gcc'
    NAME = 'GNU Compiler Collection Static Analyzer'
    URL = 'https://gcc.gnu.org/wiki/StaticAnalyzer'

    def __init__(self):
        super(AnalyzerResult, self).__init__()

    def get_reports(self, result_file_path: str) -> List[Report]:
        """ Get reports from the given analyzer result file. """

        return sarif.Parser().get_reports(result_file_path)
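
The new module is thin because all format knowledge lives in the shared sarif parser; the converter class only contributes metadata and delegation. A self-contained sketch of that pattern, together with the `TOOL_NAME` lookup that `cli.py` builds in `supported_converters`, could look like this. `StubReport`, `StubSarifParser` and `GccResult` are hypothetical stand-ins, not CodeChecker classes.

```python
from typing import Dict, List, Type


class StubReport:
    """Stand-in for a parsed report; only keeps the source path."""
    def __init__(self, path: str):
        self.path = path


class StubSarifParser:
    """Stand-in for a shared sarif parser."""
    def get_reports(self, result_file_path: str) -> List[StubReport]:
        # A real parser would read the file; the stub just echoes the path.
        return [StubReport(result_file_path)]


class GccResult:
    """Converter that carries metadata and delegates parsing."""
    TOOL_NAME = 'gcc'

    def get_reports(self, result_file_path: str) -> List[StubReport]:
        return StubSarifParser().get_reports(result_file_path)


# Map the value given to `-t` onto the converter class.
supported_converters: Dict[str, Type[GccResult]] = {
    GccResult.TOOL_NAME: GccResult,
}

converter = supported_converters['gcc']()
reports = converter.get_reports('my_file.cpp.sarif')
print(len(reports), reports[0].path)
```

Any other analyzer that emits sarif could reuse the same parser by adding one more small class with its own `TOOL_NAME`.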
10 changes: 5 additions & 5 deletions tools/report-converter/codechecker_report_converter/cli.py
@@ -32,7 +32,7 @@


from codechecker_report_converter.report.report_file import \
SUPPORTED_ANALYZER_EXTENSIONS
SUPPORTED_ANALYZER_TYPES, SUPPORTED_ANALYZER_EXTENSIONS
from codechecker_report_converter.report.parser import plist

LOG = logging.getLogger('report-converter')
@@ -73,7 +73,7 @@ class RawDescriptionDefaultHelpFormatter(
analyzer_result = getattr(module, "AnalyzerResult")
supported_converters[analyzer_result.TOOL_NAME] = analyzer_result
except ModuleNotFoundError:
pass
raise


supported_metadata_keys = ["analyzer_command", "analyzer_version"]
@@ -192,12 +192,12 @@ def __add_arguments_to_parser(parser):
type=str,
dest='export',
metavar='EXPORT',
choices=SUPPORTED_ANALYZER_EXTENSIONS,
choices=SUPPORTED_ANALYZER_TYPES,
default=plist.EXTENSION,
help="Specify the export format of the converted "
"reports. Currently supported export types "
"are: " + ', '.join(sorted(
SUPPORTED_ANALYZER_EXTENSIONS)) + ".")
"are: " +
', '.join(SUPPORTED_ANALYZER_EXTENSIONS) + ".")

parser.add_argument('--meta',
nargs='*',
@@ -9,11 +9,14 @@
Base parser class to parse analyzer result files.
"""

import json
import logging
import os

from abc import ABCMeta, abstractmethod
from typing import Any, Dict, List, Optional
from typing import Any, Dict, List, Optional, Tuple

from codechecker_report_converter import __title__, __version__
from codechecker_report_converter.report import File, Report
from codechecker_report_converter.report.checker_labels import CheckerLabels
from codechecker_report_converter.report.hash import HashType
@@ -22,6 +25,53 @@
LOG = logging.getLogger('report-converter')


def load_json(path: str):
    """
    Load the contents of the given file as a JSON and return its value,
    or an empty dict if the file can't be loaded.
    """

    ret = {}
    try:
        with open(path, 'r', encoding='utf-8', errors='ignore') as handle:
            ret = json.load(handle)
    except IOError as ex:
        LOG.warning("Failed to open json file: %s", path)
        LOG.warning(ex)
    except OSError as ex:
        LOG.warning("Failed to open json file: %s", path)
        LOG.warning(ex)
    except ValueError as ex:
        LOG.warning("%s is not a valid json file.", path)
        LOG.warning(ex)
    except TypeError as ex:
        LOG.warning('Failed to process json file: %s', path)
        LOG.warning(ex)

    return ret


def get_tool_info() -> Tuple[str, str]:
    """ Get tool info.
    If this was called through CodeChecker, this function will return
    CodeChecker information, otherwise this tool (report-converter)
    information.
    """
    data_files_dir_path = os.environ.get('CC_DATA_FILES_DIR')
    if data_files_dir_path:
        analyzer_version_file_path = os.path.join(
            data_files_dir_path, 'config', 'analyzer_version.json')
        if os.path.exists(analyzer_version_file_path):
            data = load_json(analyzer_version_file_path)
            version = data.get('version')
            if version:
                return 'CodeChecker', f"{version['major']}." \
                    f"{version['minor']}.{version['revision']}"

    return __title__, __version__
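
The version lookup in `get_tool_info` boils down to turning a `version` mapping into a dotted string and falling back when it is missing. A minimal sketch of that step, assuming `analyzer_version.json` carries the `major`/`minor`/`revision` keys read above; the helper `format_tool_info`, its fallback values and the sample version numbers are hypothetical.

```python
from typing import Any, Dict, Tuple


def format_tool_info(
    data: Dict[str, Any],
    fallback: Tuple[str, str] = ('report-converter', '0.0.0')
) -> Tuple[str, str]:
    """Return ('CodeChecker', 'X.Y.Z') if version data exists, else fallback."""
    version = data.get('version')
    if version:
        return 'CodeChecker', \
            f"{version['major']}.{version['minor']}.{version['revision']}"
    # An empty dict (what a failed JSON load yields) ends up here.
    return fallback


print(format_tool_info({'version': {'major': 6, 'minor': 23, 'revision': 0}}))
print(format_tool_info({}))
```

Because a failed load yields an empty dict, the fallback branch also covers unreadable or malformed version files.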


class AnalyzerInfo:
""" Hold information about the analyzer. """
def __init__(self, name: str):
@@ -10,7 +10,6 @@
"""

import importlib
import json
import logging
import os
import plistlib
@@ -27,12 +26,11 @@
else:
from mypy_extensions import TypedDict

from codechecker_report_converter import __title__, __version__
from codechecker_report_converter.report import BugPathEvent, \
BugPathPosition, File, get_or_create_file, MacroExpansion, Range, Report
from codechecker_report_converter.report.hash import get_report_hash, HashType
from codechecker_report_converter.report.parser.base import AnalyzerInfo, \
BaseParser
BaseParser, get_tool_info


LOG = logging.getLogger('report-converter')
@@ -422,58 +420,13 @@ def __get_macro_expansions(

return macro_expansions

def __load_json(self, path: str):
"""
Load the contents of the given file as a JSON and return it's value,
or default if the file can't be loaded.
"""

ret = {}
try:
with open(path, 'r', encoding='utf-8', errors='ignore') as handle:
ret = json.load(handle)
except IOError as ex:
LOG.warning("Failed to open json file: %s", path)
LOG.warning(ex)
except OSError as ex:
LOG.warning("Failed to open json file: %s", path)
LOG.warning(ex)
except ValueError as ex:
LOG.warning("%s is not a valid json file.", path)
LOG.warning(ex)
except TypeError as ex:
LOG.warning('Failed to process json file: %s', path)
LOG.warning(ex)

return ret

def __get_tool_info(self) -> Tuple[str, str]:
""" Get tool info.
If this was called through CodeChecker, this function will return
CodeChecker information, otherwise this tool (report-converter)
information.
"""
data_files_dir_path = os.environ.get('CC_DATA_FILES_DIR')
if data_files_dir_path:
analyzer_version_file_path = os.path.join(
data_files_dir_path, 'config', 'analyzer_version.json')
if os.path.exists(analyzer_version_file_path):
data = self.__load_json(analyzer_version_file_path)
version = data.get('version')
if version:
return 'CodeChecker', f"{version['major']}." \
f"{version['minor']}.{version['revision']}"

return __title__, __version__

def convert(
self,
reports: List[Report],
analyzer_info: Optional[AnalyzerInfo] = None
):
""" Converts the given reports. """
tool_name, tool_version = self.__get_tool_info()
tool_name, tool_version = get_tool_info()

data: Dict[str, Any] = {
'files': [],