Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: parse BUILD.bzel to determine whether a commit that only changed BUILD.bazel is a qualified commit #2937

Merged
merged 18 commits into from
Jun 26, 2024
Merged
Show file tree
Hide file tree
Changes from 9 commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
45 changes: 43 additions & 2 deletions library_generation/model/config_change.py
Original file line number Diff line number Diff line change
Expand Up @@ -16,6 +16,8 @@
from enum import Enum
from typing import Optional
from git import Commit, Repo

from library_generation.model.gapic_inputs import parse_build_str
from library_generation.model.generation_config import GenerationConfig
from library_generation.model.library_config import LibraryConfig
from library_generation.utils.utilities import sh_util
Expand Down Expand Up @@ -77,6 +79,7 @@ def get_changed_libraries(self) -> Optional[list[str]]:
Returns a unique, sorted list of library name of changed libraries.
None if there is a repository level change, which means all libraries
in the current_config will be generated.

:return: library names of change libraries.
"""
if ChangeType.REPO_LEVEL_CHANGE in self.change_to_libraries:
Expand Down Expand Up @@ -143,11 +146,22 @@ def __create_qualified_commit(
:return: qualified commits.
"""
libraries = set()
file_change_num = len(commit.stats.files.keys())
for file in commit.stats.files.keys():
if file.endswith("BUILD.bazel"):
continue
versioned_proto_path = find_versioned_proto_path(file)
if versioned_proto_path in proto_paths:
# determine a commit that only contains `BUILD.bazel`
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the if statement this comment refers to will skip the commit if the only affected file in the commit is the BUILD file and doesn't contain relevant changes. Can you please explain that in the comment?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

# separately.
if (
file.endswith("BUILD.bazel")
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why do we use endswith instead of ==? Any special cases we need to handle?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The file is a relative path starting from google, e.g., google/cloud/aiplatform/v1/BUILD.bazel.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Makes sense. A nit improvement, rename file to file_path, I thought it is a file_name that is only BUILD.bazel.

and file_change_num == 1
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What if a commit contains changes to multiple BUILD.bazel files? It is a common use case for multi-module library.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good catch, I need to make some changes when a commit contains multiple BUILD.bazel.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

I changed the logic to qualify commits that adding BUILD.bazel because it's unlikely that a commit added a BUILD.bazel without proto change.

Commits that changed BUILD.bazel will be determined by parsing BUILD.bazel.

and (
not ConfigChange.__qualified_build_change(
commit=commit, build_file_path=file
)
)
):
continue
# Even though a commit usually only changes one
# library, we don't want to miss generating a
# library because the commit may change multiple
Expand All @@ -156,3 +170,30 @@ def __create_qualified_commit(
if len(libraries) == 0:
return None
return QualifiedCommit(commit=commit, libraries=libraries)

@staticmethod
def __qualified_build_change(commit: Commit, build_file_path: str) -> bool:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

readability nit: __is_qualified_build_change may help to quickly understand the check in line 159.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

"""
Checks if the given commit containing a BUILD.bazel change is a
qualified commit.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you add another sentence explaining that the function parses the gapic_inputs from both versions to tell if there is a relevant change?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.


:param commit: a GitHub commit object.
:param build_file_path: the path of the BUILD.bazel
:return: True if the commit is a qualified commit; False otherwise.
"""
versioned_proto_path = find_versioned_proto_path(build_file_path)
build = str((commit.tree / build_file_path).data_stream.read())
parent_commit = commit.parents[0]
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I know we are getting it from commit.parents, but conceptually I guess this is the last/previous commit?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe.

However, there might be multiple commits changed a BUILD.bazel between baseline and current commit.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

there might be multiple commits changed a BUILD.bazel between baseline and current commit

Can you elaborate on this with an example?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you elaborate on this with an example?

I don't have a concrete example but now I understand your original question.

Yes, the parent commit contains the last commit touching this file.

# If BUILD.bazel doesn't exist in the parent commit, i.e., it is added
# in the current commit, `parent_commit.tree / build_file_path` will
# raise KeyError.
# However, there's unlikely that a BUILD.bazel is the only file that
# added in a commit.
parent_build = str((parent_commit.tree / build_file_path).data_stream.read())
inputs = parse_build_str(build, versioned_proto_path)
parent_inputs = parse_build_str(parent_build, versioned_proto_path)
# If the GapicInputs objects parsed from BUILD.bazel (on the given
# commit and its parent) are different, there are changes in fields
# that used in library generation, then the given commit is a
# qualified commit.
return inputs != parent_inputs
23 changes: 20 additions & 3 deletions library_generation/model/gapic_inputs.py
Original file line number Diff line number Diff line change
Expand Up @@ -72,6 +72,20 @@ def __init__(
self.service_yaml = service_yaml
self.include_samples = include_samples

def __eq__(self, other):
if not isinstance(other, GapicInputs):
return False
return (
self.proto_only == other.proto_only
and self.additional_protos == other.additional_protos
and self.transport == other.transport
and self.rest_numeric_enum == other.rest_numeric_enum
and self.gapic_yaml == other.gapic_yaml
and self.service_config == other.service_config
and self.service_yaml == other.service_yaml
and self.include_samples == other.include_samples
)


def parse(
build_path: Path, versioned_path: str, build_file_name: str = "BUILD.bazel"
Expand All @@ -86,16 +100,19 @@ def parse(
"""
with open(f"{build_path}/{build_file_name}") as build:
content = build.read()
return parse_build_str(build_str=content, versioned_path=versioned_path)


def parse_build_str(build_str: str, versioned_path: str) -> GapicInputs:
proto_library_target = re.compile(
proto_library_pattern, re.DOTALL | re.VERBOSE
).findall(content)
).findall(build_str)
additional_protos = ""
if len(proto_library_target) > 0:
additional_protos = __parse_additional_protos(proto_library_target[0])
gapic_target = re.compile(gapic_pattern, re.DOTALL | re.VERBOSE).findall(content)
gapic_target = re.compile(gapic_pattern, re.DOTALL | re.VERBOSE).findall(build_str)
assembly_target = re.compile(assembly_pattern, re.DOTALL | re.VERBOSE).findall(
content
build_str
)
include_samples = "false"
if len(assembly_target) > 0:
Expand Down
50 changes: 50 additions & 0 deletions library_generation/test/model/config_change_unit_tests.py
Original file line number Diff line number Diff line change
Expand Up @@ -208,6 +208,29 @@ def test_get_qualified_commits_success(self):
qualified_commits[2].commit.hexsha,
)

def test_get_qualified_commits_build_only_commit_returns_a_commit(self):
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's not clear to me that what this test is testing just based on the name, I guess this is for the basic scenario that a commit contains rest_numeric_enums changes in BUILD.bazel?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I changed the test name to test_get_qualified_commits_rest_numeric_enum_change_returns_a_qualified_commit. WDYT?

baseline_commit = "45694d2bad602c52170096072d325aa644d550e5"
latest_commit = "758f0d1217d9c7fe398aa5efb1057ce4b6409e55"
config_change = ConfigChange(
change_to_libraries={},
baseline_config=ConfigChangeTest.__get_a_gen_config(
googleapis_commitish=baseline_commit
),
current_config=ConfigChangeTest.__get_a_gen_config(
googleapis_commitish=latest_commit,
libraries=[
ConfigChangeTest.__get_a_library_config(
library_name="container",
gapic_configs=[GapicConfig(proto_path="google/container/v1")],
)
],
),
)
# one commit between latest_commit and baseline_commit which only
# changed BUILD.bazel.
# this commit changed `rest_numeric_enums`.
self.assertTrue(len(config_change.get_qualified_commits()) == 1)

def test_get_qualified_commits_build_only_commit_returns_empty_list(self):
baseline_commit = "bdda0174f68a738518ec311e05e6fd9bbe19cd78"
latest_commit = "c9a5050ef225b0011603e1109cf53ab1de0a8e53"
Expand All @@ -228,8 +251,35 @@ def test_get_qualified_commits_build_only_commit_returns_empty_list(self):
)
# one commit between latest_commit and baseline_commit which only
# changed BUILD.bazel.
# this commit didn't change fields used in library generation.
self.assertTrue(len(config_change.get_qualified_commits()) == 0)

def test_get_qualified_commits_new_client_commit_returns_a_commit(self):
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same question here, the test name is not clear. Also since we are using real commits for the tests, it makes the problem worse that it's not easy for developers to see the test data, but that's a different problem to solve.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I changed the test name to test_get_qualified_commits_add_build_file_returns_a_qualified_commit. WDYT?

baseline_commit = "d007ca1b3cc820651530d44d5388533047ae1414"
latest_commit = "05d889e7dfe087fc2ddc9de9579f01d4e1c2f35e"
config_change = ConfigChange(
change_to_libraries={},
baseline_config=ConfigChangeTest.__get_a_gen_config(
googleapis_commitish=baseline_commit
),
current_config=ConfigChangeTest.__get_a_gen_config(
googleapis_commitish=latest_commit,
libraries=[
ConfigChangeTest.__get_a_library_config(
library_name="cloudcontrolspartner",
gapic_configs=[
GapicConfig(
proto_path="google/cloud/cloudcontrolspartner/v1"
)
],
)
],
),
)
# one commit between latest_commit and baseline_commit which added
# google/cloud/cloudcontrolspartner/v1.
self.assertTrue(len(config_change.get_qualified_commits()) == 1)

@staticmethod
def __get_a_gen_config(
googleapis_commitish="", libraries: list[LibraryConfig] = None
Expand Down
Loading