Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Exclude the MISRA Website from CI-CD link verifier checks #91

Merged
merged 6 commits into from
Nov 8, 2023
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension


Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
4 changes: 2 additions & 2 deletions .github/workflows/pr_checks.yml
Original file line number Diff line number Diff line change
Expand Up @@ -160,7 +160,7 @@ jobs:
org: AWS,
branch: main,
run-link-verifier: true,
run-complexity: true,
run-complexity: false,
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Jobs is undergoing changes still
For now skip running build related tests.

run-doxygen: true,
build-flags: -DCMAKE_BUILD_TYPE=Debug -DBUILD_CLONE_SUBMODULES=ON -DCMAKE_C_FLAGS='--coverage -Wall -Wextra -Werror',
coverage-skips: '"\*test\*" "\*CMakeCCompilerId\*" "\*mocks\*" "\*source\*"',
Expand Down Expand Up @@ -210,7 +210,7 @@ jobs:
with:
path: repo/${{ matrix.inputs.repository }}
exclude-dirs: complexity, formatting
exclude-urls: https://dummy-url.com/ota.bin
exclude-urls: https://dummy-url.com/ota.bin, https://s3.region.amazonaws.com/joe-ota
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Jobs has a new URL for its own tests, add to this list.


- name: "Complexity Check: ${{ matrix.inputs.repository }}"
if: matrix.inputs.run-complexity && ( success() || failure() )
Expand Down
13 changes: 10 additions & 3 deletions link-verifier/action.yml
Original file line number Diff line number Diff line change
Expand Up @@ -20,6 +20,7 @@ inputs:
exclude-urls:
description: 'Comma separated list of URLS not to check'
required: false
default: https://www.misra.org.uk/misra-c, https://www.misra.org.uk
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Placing these here just so that if not adding them explicitly it shows up the in action log

user-agent:
description: 'User agent string to use when making http requests.'
required: false
Expand All @@ -30,7 +31,8 @@ runs:
- name: Setup Python for link verifier action
uses: actions/setup-python@v3
with:
python-version: '>=3.7' # Minimum version for urllib v2 (https://urllib3.readthedocs.io/en/latest/v2-migration-guide.html)
# Minimum version for urllib v2 (https://urllib3.readthedocs.io/en/latest/v2-migration-guide.html)
python-version: '>=3.7'

- env:
# The bash escape character is \033
Expand Down Expand Up @@ -75,6 +77,7 @@ runs:
bashFail: \033[31;1mFAILED -
bashEnd: \033[0m
stepName: Check Links in Files
GITHUB_TOKEN: ${{ github.token }}
name: ${{ env.stepName }}
working-directory: ${{ inputs.path }}
shell: bash
Expand All @@ -94,13 +97,17 @@ runs:
args+=" --include-file-types ${file_types}"
fi

# Many of the FreeRTOS-Repos include a link to MISRA's website. This website
# now has a CAPTCHA landing page, as such always exclude it from this check.
touch allowList.txt
echo "https://www.misra.org.uk/misra-c" >> allowList.txt
echo "https://www.misra.org.uk" >> allowList.txt
Copy link
Member Author

@Skptak Skptak Nov 8, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Idea here is to make this fix more "portable" as any repos that are already using the above parameters would need an individual PR

Adding the exclusion of these URLs means that we only need to make this change in this repo, not ever repo


if [ -n "${{ inputs.allowlist-file }}" ]; then
touch allowList.txt
cat ${{ inputs.allowlist-file }} >> allowList.txt
fi

if [[ "${{ inputs.exclude-urls }}" != "" ]]; then
touch allowList.txt
exclude_urls="${{ inputs.exclude-urls }}"
exclude_urls="${exclude_urls//,/ }"
for url in ${exclude_urls[@]}; do echo -e "$url" >> allowList.txt; done
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -3,3 +3,9 @@
# Here's a random link for it to test as well
[verify-links.py](../../verify-links.py)
[CI-CD-Github-Actions](https://github.com/FreeRTOS/CI-CD-Github-Actions)
# Test that it will find this url
https://www.google.com
# Test that it will find this url and drop the slash
https://www.google.com/
# Test that it will find this url by dropping the coma
https://www.google.com,
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Test to make sure we don't try and use the trailing coma as part of the URL, currently an issue with the checker.

23 changes: 16 additions & 7 deletions link-verifier/verify-links.py
Original file line number Diff line number Diff line change
Expand Up @@ -14,7 +14,7 @@
import traceback
from collections import defaultdict

MARKDOWN_SEARCH_TERM = r'\.md$'
MARKDOWN_SEARCH_TERM = r"\.md$"
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This still throws a warning about it not being a valid escape sequence? But not sure what it wants.

# Regex to find a URL
URL_SEARCH_TERM = r'(\b(https?)://[^\s\)\]\\"<>]+[^\s\)\.\]\\"<>])'
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It would make sense to update this regex to exclude the trailing slash, or coma, but I honestly have no idea how this even matches currently.

HTTP_URL_SEARCH_TERM = r'https?://'
Expand Down Expand Up @@ -151,7 +151,11 @@ def identify_broken_links(self, files, verbose):
cprint(f'\t{link}','green')

for link in self.external_links:
is_broken, status_code = test_url(link)
# Remove the trailing slash or trailing coma
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Trailing slash - So that we don't do a duplicate search of <URL>/ and <URL>

Trailing coma - There are a few places I've seen links be put into files like this
/* <COMMENT> <URL>, <MORE COMMENT> */
Where this then breaks the URL checker, since the current regex grabs the coma

if(link[-1] == "/" or link[-1] == ","):
is_broken, status_code = test_url(link[:-1])
else:
is_broken, status_code = test_url(link)
if is_broken:
self.broken_links.append(link)
file_printed = self.print_filename(files[self.name], file_printed)
Expand All @@ -166,7 +170,7 @@ def parse_file(html_file):
return HtmlFile(html_file)

def html_name_from_markdown(filename):
md_pattern = re.compile('\.md', re.IGNORECASE)
md_pattern = re.compile("\.md$", re.IGNORECASE)
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure why this uses this, compared to the global one at the top of the file, but just tried using quotes to see if that helped with the warning

Added the $ to make sure it only looks for files that fully end in .md

return md_pattern.sub('.html', filename)

def create_html(markdown_file):
Expand Down Expand Up @@ -254,7 +258,10 @@ def fetch_issues(repo, issue_type, limit):
if process.returncode == 0:
key = issue_type + 's'
for issue in process.stdout.split():
main_repo_list[repo][key].add(int(issue))
if(issue.isnumeric()):
Copy link
Member Author

@Skptak Skptak Nov 8, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If for some reason the GitHub issues can't be accessed this will throw an error attempting to convert it to an int

Check if we have an actual number first, if we do not that means there was an error reading the actual Issues from the repo. When this occurs it returns an output of:

  Stdout = ['gh:', 'To', 'use', 'GitHub', 'CLI', 'in', 'a', 'GitHub', 'Actions', 'workflow,', 'set', 'the', 'GH_TOKEN', 'environment', 'variable.', 'Example:', 'env:', 'GH_TOKEN:', '${{', 'github.token', '}}']

main_repo_list[repo][key].add(int(issue))
else:
raise TypeError(f"Attempted to cast {issue} as an int. Error fetching GitHub Issues")
return 0
else:
use_gh_cache = False
Expand Down Expand Up @@ -324,7 +331,6 @@ def main():
if args.verbose:
print("Using User-Agent: {}".format(http_headers['User-Agent']))


# If any explicit files are passed, add them to md_file_list.
if args.files is not None:
md_file_list = args.files
Expand All @@ -347,7 +353,7 @@ def main():
if any(file.endswith(file_type) for file_type in args.include_files):
f_path = os.path.join(root, file)
if args.verbose:
print("Processing File: {}".format(f_path))
print("\nProcessing File: {}".format(f_path))
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Add a newline just to help space the log out when doing a verbose run.

with open(f_path, 'r', encoding="utf8", errors='ignore') as f:
# errors='ignore' argument Suppresses UnicodeDecodeError
# when reading invalid UTF-8 characters.
Expand Down Expand Up @@ -393,7 +399,10 @@ def main():
os.remove(f)

for link in link_set:
is_broken, status_code = test_url(link)
if ( ( link[-1] == "/" ) or ( link[-1] == "," ) ):
is_broken, status_code = test_url(link[:-1])
else:
is_broken, status_code = test_url(link)
if is_broken:
broken_links.append(link)
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Intentionally use the version of the URL with the coma or slash for the error list. This way the exact link can be searched for in the source file easily.

print("FILES:", link_to_files[link])
Expand Down
Loading