Skip to content

Commit

Permalink
.pytool/Plugin/LineEndingCheck: Fix scanning and other changes
Browse files Browse the repository at this point in the history
Makes the following changes:

1. Recursively discovers files in packages instead of just checking
   the top-level package directory.

   The pattern should recursively find files relative to the package
   which accomplished with `**` in this change.

2. Differentiates between binary and text files, skipping line
   ending checks in binary files.

3. Verifies a path discovered is a file. Directories may rarely
   contain a dot in the name appearing in the path result list.

4. For files with issues, print the relative file path to the
   workspace. This is less than verbose than absolute, unique
   enough to find the file, and recognized for shortcut clicking
   in IDEs like VS Code.

5. For files with issues, print the relative file path to the
   file in POSIX format (with forward slashes). This makes copy/
   paste easy to tools like unixtodos and is equally recognized
   as a valid file path in IDEs like VS Code on Windows.

Signed-off-by: Michael Kubacki <michael.kubacki@microsoft.com>

.pytool/Plugin/LineEndingCheck: Add git ignore support

Adds support to ignore git ignored files and files under submodules
since those are either addressed in their respective repo's CI or
from an external project.

Some optimizations are also made to improve performance.

As the comment in the code states, these changes call the system git
command until GitPython is more widely used in the edk2 tool modules.

Signed-off-by: Michael Kubacki <michael.kubacki@microsoft.com>

Plugin/LineEndingCheck: Add autocrlf warning (#260)

LineEndingCheck's ReadMe.md recommends that autocrlf=false, but
developers may not have configured their system with this setting. This
change generates a warning message when it determines that the autocrlf
does not match the recommended setting.

For each item, place an "x" in between `[` and `]` if true. Example:
`[x]`.
_(you can also check items in the GitHub UI)_

- [ ] Impacts functionality?
- **Functionality** - Does the change ultimately impact how firmware
functions?
- Examples: Add a new library, publish a new PPI, update an algorithm,
...
- [ ] Impacts security?
- **Security** - Does the change have a direct security impact on an
application,
    flow, or firmware?
  - Examples: Crypto algorithm change, buffer overflow fix, parameter
    validation improvement, ...
- [ ] Breaking change?
- **Breaking change** - Will anyone consuming this change experience a
break
    in build or boot behavior?
- Examples: Add a new library class, move a module to a different repo,
call
    a function in a new library class in a pre-existing module, ...
- [ ] Includes tests?
  - **Tests** - Does the change include any explicit test code?
  - Examples: Unit tests, integration tests, robot tests, ...
- [ ] Includes documentation?
- **Documentation** - Does the change contain explicit documentation
additions
    outside direct code modifications (and comments)?
- Examples: Update readme file, add feature readme file, link to
documentation
    on an a separate Web page, ...

Testing by configuring a system with `git config --global core.autocrlf
true` and running a CI build to verify message is displayed, then by
configuring `git config --global core.autocrlf false` and running a CI
build to verify warning message is not displayed.
Tested when no core.autocrlf is is set (deleting from config file).

n/a
  • Loading branch information
makubacki authored and kenlautner committed May 9, 2023
1 parent c78e233 commit a2555f1
Showing 1 changed file with 172 additions and 17 deletions.
189 changes: 172 additions & 17 deletions .pytool/Plugin/LineEndingCheck/LineEndingCheck.py
Original file line number Diff line number Diff line change
Expand Up @@ -7,8 +7,10 @@
##

import glob
from io import StringIO
import logging
import os
import shutil
from pathlib import Path
from typing import Any, Callable, Dict, List, Tuple

Expand All @@ -20,6 +22,9 @@
from edk2toollib.gitignore_parser import parse_gitignore_lines
from edk2toollib.log.junit_report_format import JunitReportTestCase
from edk2toollib.uefi.edk2.path_utilities import Edk2Path
from edk2toollib.utility_functions import RunCmd
from git import Repo


PLUGIN_NAME = "LineEndingCheck"

Expand All @@ -32,10 +37,23 @@

ALLOWED_LINE_ENDING = b'\r\n'

#
# Based on a solution for binary file detection presented in
# https://stackoverflow.com/a/7392391.
#
_TEXT_CHARS = bytearray(
{7, 8, 9, 10, 12, 13, 27} | set(range(0x20, 0x100)) - {0x7f})


def _is_binary_string(_bytes: bytes) -> bool:
return bool(_bytes.translate(None, _TEXT_CHARS))


class LineEndingCheckBadLineEnding(Exception):
pass

class LineEndingCheckGitIgnoreFileException(Exception):
pass

class LineEndingCheck(ICiBuildPlugin):
"""
Expand Down Expand Up @@ -68,6 +86,112 @@ def GetTestName(self, packagename: str, environment: VarDict) -> Tuple:
return ("Check line endings in " + packagename, packagename +
"." + PLUGIN_NAME)

# Note: This function access git via the command line
#
# function to check and warn if git config reports that
# autocrlf is configured to TRUE
def _check_autocrlf(self):
r = Repo(".")
try:
result = r.config_reader().get_value("core", "autocrlf")
if result:
logging.warning(f"git config core.autocrlf is set to {result} "
f"recommended setting is false "
f"git config --global core.autocrlf false")
except Exception:
logging.warning(f"git config core.autocrlf is not set "
f"recommended setting is false "
f"git config --global core.autocrlf false")
return

# Note: This function currently accesses git via the git command to prevent
# introducing a new Python git module dependency in mu_basecore
# on gitpython.
#
# After gitpython is adopted by edk2-pytool-extensions, this
# implementation can be updated to use the gitpython interface.
def _get_git_ignored_paths(self) -> List[Path]:
""""
Gets paths ignored by git.
Returns:
List[str]: A list of file absolute path strings to all files
ignored in this git repository.
If git is not found, an empty list will be returned.
"""
if not shutil.which("git"):
logging.warn(
"Git is not found on this system. Git submodule paths will "
"not be considered.")
return []

outstream_buffer = StringIO()
exit_code = RunCmd("git", "ls-files --other",
workingdir=self._abs_workspace_path,
outstream=outstream_buffer,
logging_level=logging.NOTSET)
if (exit_code != 0):
raise LineEndingCheckGitIgnoreFileException(
f"An error occurred reading git ignore settings. This will "
f"prevent LineEndingCheck from running against the expected "
f"set of files.")

# Note: This will potentially be a large list, but at least sorted
rel_paths = outstream_buffer.getvalue().strip().splitlines()
abs_paths = []
for path in rel_paths:
abs_paths.append(Path(
os.path.normpath(os.path.join(self._abs_workspace_path, path))))
return abs_paths

# Note: This function currently accesses git via the git command to prevent
# introducing a new Python git module dependency in mu_basecore
# on gitpython.
#
# After gitpython is adopted by edk2-pytool-extensions, this
# implementation can be updated to use the gitpython interface.
def _get_git_submodule_paths(self) -> List[Path]:
"""
Gets submodule paths recognized by git.
Returns:
List[str]: A list of directory absolute path strings to the root
of each submodule in the workspace repository.
If git is not found, an empty list will be returned.
"""
if not shutil.which("git"):
logging.warn(
"Git is not found on this system. Git submodule paths will "
"not be considered.")
return []

if os.path.isfile(os.path.join(self._abs_workspace_path, ".gitmodules")):
logging.info(
".gitmodules file found. Excluding submodules in "
"LineEndingCheck.")

outstream_buffer = StringIO()
exit_code = RunCmd("git",
"config --file .gitmodules --get-regexp path",
workingdir=self._abs_workspace_path,
outstream=outstream_buffer,
logging_level=logging.NOTSET)
if (exit_code != 0):
raise LineEndingCheckGitIgnoreFileException(
f".gitmodule file detected but an error occurred reading "
f"the file. Cannot proceed with unknown submodule paths.")

submodule_paths = []
for line in outstream_buffer.getvalue().strip().splitlines():
submodule_paths.append(Path(
os.path.normpath(os.path.join(self._abs_workspace_path, line.split()[1]))))

return submodule_paths
else:
return []

def _get_files_ignored_in_config(self,
pkg_config: Dict[str, List[str]],
base_dir: str) -> Callable[[str], bool]:
Expand Down Expand Up @@ -124,52 +248,83 @@ def RunBuildPlugin(self, package_rel_path: str, edk2_path: Edk2Path,
0 : Ran successfully
-1 : Skipped due to a missing pre-requisite
"""

abs_pkg_path = \
self._check_autocrlf()
self._abs_workspace_path = \
edk2_path.GetAbsolutePathOnThisSystemFromEdk2RelativePath('.')
self._abs_pkg_path = \
edk2_path.GetAbsolutePathOnThisSystemFromEdk2RelativePath(
package_rel_path)

if abs_pkg_path is None:
if self._abs_pkg_path is None:
tc.SetSkipped()
tc.LogStdError(f"Package folder not found {abs_pkg_path}")
tc.LogStdError(f"Package folder not found {self._abs_pkg_path}")
return 0

all_files = [n for n in glob.glob(os.path.join(abs_pkg_path, '*.*'),
recursive=True)]

all_files = [Path(n) for n in glob.glob(
os.path.join(self._abs_pkg_path, '**/*.*'),
recursive=True)]
ignored_files = list(filter(
self._get_files_ignored_in_config(
package_config, abs_pkg_path), all_files))
package_config, self._abs_pkg_path), all_files))
ignored_files = [Path(f) for f in ignored_files]

for file in ignored_files:
if file in all_files:
logging.info(f" File ignored in plugin config file: "
f"{Path(file).name}")
all_files.remove(file)
all_files = list(set(all_files) - set(ignored_files))
if not all_files:
tc.SetSuccess()
return 0

all_files_before_git_removal = set(all_files)
git_ignored_paths = set(self._get_git_ignored_paths() + self._get_git_submodule_paths())
all_files = list(all_files_before_git_removal - git_ignored_paths)
git_ignored_paths = git_ignored_paths - (all_files_before_git_removal - set(all_files))
if not all_files:
tc.SetSuccess()
return 0

git_ignored_paths = {p for p in git_ignored_paths if p.is_dir()}

ignored_files = []
for file in all_files:
for ignored_path in git_ignored_paths:
if Path(file).is_relative_to(ignored_path):
ignored_files.append(file)
break

all_files = list(set(all_files) - set(ignored_files))
if not all_files:
tc.SetSuccess()
return 0

file_count = 0
line_ending_count = dict.fromkeys(LINE_ENDINGS, 0)

for file in all_files:
with open(file, 'rb') as fb:
if file.is_dir():
continue
with open(file.resolve(), 'rb') as fb:
if not fb.readable() or _is_binary_string(fb.read(1024)):
continue
fb.seek(0)

for lineno, line in enumerate(fb):
try:
for e in LINE_ENDINGS:
if line.endswith(e):
line_ending_count[e] += 1

if e is not ALLOWED_LINE_ENDING:
file_name = Path(file).name
file_path = file.relative_to(
Path(self._abs_workspace_path)).as_posix()
file_count += 1

tc.LogStdError(
f"Line ending on Line {lineno} in "
f"{file_name} is not allowed.\nLine "
f"{file_path} is not allowed.\nLine "
f"ending is {e} and should be "
f"{ALLOWED_LINE_ENDING}.")
logging.error(
f"Line ending on Line {lineno} in "
f"{file_name} is not allowed.\nLine "
f"{file_path} is not allowed.\nLine "
f"ending is {e} and should be "
f"{ALLOWED_LINE_ENDING}.")
raise LineEndingCheckBadLineEnding
Expand Down

0 comments on commit a2555f1

Please sign in to comment.