Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add lint workflow to run pre-commit on all files #545

Merged
merged 18 commits into from
Nov 22, 2022
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension


Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
22 changes: 11 additions & 11 deletions .github/workflows/blossom-ci.yml
Original file line number Diff line number Diff line change
Expand Up @@ -20,18 +20,18 @@ on:
workflow_dispatch:
inputs:
platform:
description: 'runs-on argument'
description: 'runs-on argument'
required: false
args:
description: 'argument'
description: 'argument'
required: false
jobs:
Authorization:
name: Authorization
runs-on: blossom
runs-on: blossom
outputs:
args: ${{ env.args }}

# This job only runs for pull request comments
if: contains( '\
albert17,\
Expand All @@ -54,7 +54,7 @@ jobs:
OPERATION: 'AUTH'
REPO_TOKEN: ${{ secrets.GITHUB_TOKEN }}
REPO_KEY_DATA: ${{ secrets.BLOSSOM_KEY }}

Vulnerability-scan:
name: Vulnerability scan
needs: [Authorization]
Expand All @@ -66,20 +66,20 @@ jobs:
repository: ${{ fromJson(needs.Authorization.outputs.args).repo }}
ref: ${{ fromJson(needs.Authorization.outputs.args).ref }}
lfs: 'true'
# repo specific steps

# repo specific steps
#- name: Setup java
# uses: actions/setup-java@v1
# with:
# java-version: 1.8

# add blackduck properties https://synopsys.atlassian.net/wiki/spaces/INTDOCS/pages/631308372/Methods+for+Configuring+Analysis#Using-a-configuration-file
#- name: Setup blackduck properties
# run: |
# PROJECTS=$(mvn -am dependency:tree | grep maven-dependency-plugin | awk '{ out="com.nvidia:"$(NF-1);print out }' | grep rapids | xargs | sed -e 's/ /,/g')
# echo detect.maven.build.command="-pl=$PROJECTS -am" >> application.properties
# echo detect.maven.included.scopes=compile >> application.properties

- name: Run blossom action
uses: NVIDIA/blossom-action@main
env:
Expand All @@ -89,7 +89,7 @@ jobs:
args1: ${{ fromJson(needs.Authorization.outputs.args).args1 }}
args2: ${{ fromJson(needs.Authorization.outputs.args).args2 }}
args3: ${{ fromJson(needs.Authorization.outputs.args).args3 }}

Job-trigger:
name: Start ci job
needs: [Vulnerability-scan]
Expand All @@ -101,7 +101,7 @@ jobs:
OPERATION: 'START-CI-JOB'
CI_SERVER: ${{ secrets.CI_SERVER }}
REPO_TOKEN: ${{ secrets.GITHUB_TOKEN }}

Upload-Log:
name: Upload log
runs-on: blossom
Expand Down
3 changes: 0 additions & 3 deletions .github/workflows/ci.yml
Original file line number Diff line number Diff line change
Expand Up @@ -30,9 +30,6 @@ jobs:
- name: Install dependencies
run: |
python -m pip install --upgrade pip tox
- name: Lint with flake8, black, isort
run: |
tox -e lint
- name: Checking Manifest
run: |
pip install check-manifest
Expand Down
17 changes: 17 additions & 0 deletions .github/workflows/lint.yml
Original file line number Diff line number Diff line change
@@ -0,0 +1,17 @@
name: lint

on:
pull_request:
push:
branches: [main]

jobs:
pre-commit:
runs-on: ubuntu-latest
steps:
- uses: actions/checkout@v2
- uses: actions/setup-python@v2
with:
cache: 'pip'
cache-dependency-path: '**/**.txt'
- uses: pre-commit/action@v2.0.3
4 changes: 2 additions & 2 deletions .pre-commit-config.yaml
Original file line number Diff line number Diff line change
Expand Up @@ -8,7 +8,7 @@ repos:
rev: 22.8.0
hooks:
- id: black
- repo: https://gitlab.com/pycqa/flake8
- repo: https://github.com/pycqa/flake8
rev: 3.9.2
hooks:
- id: flake8
Expand All @@ -17,7 +17,7 @@ repos:
hooks:
- id: mypy
language_version: python3
args: [--no-strict-optional, --ignore-missing-imports, --show-traceback, --install-types, --non-interactive]
args: [--non-interactive, --install-types]
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the rest of these args have moved to mypy.ini

- repo: https://github.com/codespell-project/codespell
rev: v2.2.1
hooks:
Expand Down
2 changes: 1 addition & 1 deletion merlin_standard_lib/schema/schema.py
Original file line number Diff line number Diff line change
Expand Up @@ -27,7 +27,7 @@
# polyfill cached_property for python <= 3.7 (using lru_cache which was introduced in python3.2)
from functools import lru_cache

cached_property = lambda func: property(lru_cache()(func)) # noqa
cached_property = lambda func: property(lru_cache()(func)) # type: ignore # noqa
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

mypy throws errors about assigning to types here:

merlin_standard_lib/schema/schema.py:30: error: Cannot assign to a type
merlin_standard_lib/schema/schema.py:30: error: Incompatible types in assignment (expression has type "Type[property]", variable has type "Type[cached_property[Any]]")
merlin_standard_lib/schema/schema.py:30: error: Incompatible return value type (got "property", expected "cached_property[_T]")


import betterproto # noqa

Expand Down
5 changes: 4 additions & 1 deletion mypy.ini
Original file line number Diff line number Diff line change
@@ -1,7 +1,10 @@
[mypy]
python_version = 3.7
python_version = 3.8
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Updating python version to 3.8 now that we have a min version of 3.8

warn_unused_configs = True
exclude = versioneer.py
ignore_missing_imports = True
show_traceback = True
strict_optional = False

[mypy-cloudpickle.*]
ignore_missing_imports = True
Expand Down
3 changes: 2 additions & 1 deletion pyproject.toml
Original file line number Diff line number Diff line change
Expand Up @@ -18,4 +18,5 @@ line_length = 100
balanced_wrapping = true
indent = " "
known_third_party = ["cudf", "cupy", "dask", "dask_cuda", "dask_cudf", "numba", "numpy", "pytest", "rmm", "NVTabular"]
skip = ["build",".eggs"]
skip = ["build",".eggs"]

8 changes: 1 addition & 7 deletions requirements/dev.txt
Original file line number Diff line number Diff line change
@@ -1,8 +1,2 @@
check-manifest
black==20.8b1
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These are now configured only in .pre-commit-config.yaml

flake8
isort
bandit
mypy==0.971
codespell
click<8.1.0
bandit
2 changes: 1 addition & 1 deletion setup.cfg
Original file line number Diff line number Diff line change
Expand Up @@ -28,7 +28,7 @@ tag_prefix = v
parentdir_prefix = transformers4rec-

[codespell]
skip = .*pb2.py,./.git,./.github,./.mypy_cache,./dist,./build,./docs/build,.*egg-info.*,versioneer.py,*.csv,*.parquet
skip = .*pb2.py,.git/*,.github/*,./.mypy_cache,./dist,./build,./docs/build,.*egg-info.*,versioneer.py,*.csv,*.parquet
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

./.github wasn't working to exclude this directory. Changed to .github/* and it works to ignore

ignore-words = ./ci/ignore_codespell_words.txt
count =
quiet-level = 3
9 changes: 3 additions & 6 deletions tests/data/test_preprocessing.py
Original file line number Diff line number Diff line change
Expand Up @@ -2,10 +2,7 @@
import pytest

from merlin_standard_lib import ColumnSchema, Schema, Tag
from transformers4rec.data.preprocessing import (
add_item_first_seen_col_to_df,
remove_consecutive_interactions,
)
from transformers4rec.data import preprocessing
from transformers4rec.data.synthetic import (
generate_item_interactions,
synthetic_ecommerce_data_schema,
Expand All @@ -21,7 +18,7 @@ def test_remove_consecutive_interactions():
schema += Schema([ColumnSchema.create_continuous("timestamp", tags=[Tag.SESSION])])

interactions_df = generate_item_interactions(500, schema)
filtered_df = remove_consecutive_interactions(interactions_df.copy())
filtered_df = preprocessing.remove_consecutive_interactions(interactions_df.copy())
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

my was throwing an error about the module not having the attribute with the function import version (not sure I fully understand why). But this seems to work without error:

tests/data/test_preprocessing.py:5: error: Module "transformers4rec.data.preprocessing" has no attribute "add_item_first_seen_col_to_df"  [attr-defined]
tests/data/test_preprocessing.py:5: error: Module "transformers4rec.data.preprocessing" has no attribute "remove_consecutive_interactions"  [attr-defined]


assert len(filtered_df) < len(interactions_df)
assert len(filtered_df) == 499
Expand All @@ -32,7 +29,7 @@ def test_add_item_first_seen_col_to_df():
schema = synthetic_ecommerce_data_schema.remove_by_name("item_recency")
schema += Schema([ColumnSchema.create_continuous("timestamp", tags=[Tag.SESSION])])

df = add_item_first_seen_col_to_df(generate_item_interactions(500, schema))
df = preprocessing.add_item_first_seen_col_to_df(generate_item_interactions(500, schema))

assert len(list(df.columns)) == len(schema) + 1
assert isinstance(df["item_ts_first"], pd.Series)
Expand Down
6 changes: 3 additions & 3 deletions tests/unit/test_notebooks.py
Original file line number Diff line number Diff line change
Expand Up @@ -14,12 +14,12 @@
# limitations under the License.
#

import importlib
import itertools
import json
import os
import subprocess
import sys
from importlib.util import find_spec
from os.path import dirname, realpath

import pytest
Expand All @@ -28,7 +28,7 @@
SESSION_PATH = "examples/getting-started-session-based/"


@pytest.mark.skipif(importlib.util.find_spec("cudf") is None, reason="needs cudf")
@pytest.mark.skipif(find_spec("cudf") is None, reason="needs cudf")
def test_session(tmpdir):
BASE_PATH = os.path.join(dirname(TEST_PATH), SESSION_PATH)
os.environ["INPUT_DATA_DIR"] = "/tmp/data/"
Expand All @@ -37,7 +37,7 @@ def test_session(tmpdir):
_run_notebook(tmpdir, nb_path)

# Run session based
torch = importlib.util.find_spec("torch")
torch = find_spec("torch")
if torch is not None:
os.environ["INPUT_SCHEMA_PATH"] = BASE_PATH + "schema.pb"
nb_path = os.path.join(BASE_PATH, "02-session-based-XLNet-with-PyT.ipynb")
Expand Down
13 changes: 0 additions & 13 deletions tox.ini
Original file line number Diff line number Diff line change
Expand Up @@ -38,19 +38,6 @@ commands =

python -m pytest -rsx --cov-config tests/.coveragerc --cov-report term-missing --cov=. tests

[testenv:lint]
; Runs in: Github Actions
; Runs all lint/code checks and fails the PR if there are errors.
; Install pre-commit-hooks to run these tests during development.
deps = -rrequirements/dev.txt
commands =
python -m pip install --upgrade git+https://github.com/NVIDIA-Merlin/core.git
flake8 transformers4rec tests merlin_standard_lib
black --check --diff transformers4rec tests merlin_standard_lib
isort -c . --skip .tox
mypy transformers4rec --install-types --non-interactive --no-strict-optional --ignore-missing-imports
codespell --skip .tox

[testenv:docs]
; Runs in: Github Actions
; Generates documentation with sphinx. There are other steps in the Github Actions workflow
Expand Down