Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Close #26: Return all open PRs instead of filtering by date #205

Open
wants to merge 1 commit into
base: master
Choose a base branch
from

Conversation

ctreatma
Copy link

This resource can inadvertently miss Pull Requests due to out-of-order
commits across PRs. If PR#2 is opened after PR#1, but the head commit
of PR#2 is older than the head commit of PR#1, the resource will not
include PR#2 in the list of new versions provided to Concourse.

Rather than attempt to find a different way of tracking which PRs are
"new" given an input version, we can remove the date-based filtering
so that this resource returns all open PRs that match all explicitly-configured
filters. Concourse can deduplicate versions based on metadata, which
means that it will only trigger new jobs for versions that Concourse
hasn't seen before.

This makes it easier for teams to use this resource to track PRs in
Concourse, since they no longer have to ensure that a PR has a later
head commit than all currently-opened PRs in order to notify Concourse
that their PR exists.

@ctreatma ctreatma requested a review from a team as a code owner May 26, 2020 17:52
@ctreatma ctreatma force-pushed the feature/remove_date_filter branch from b6a5ad6 to 7baf06c Compare June 3, 2020 15:25
@ctreatma ctreatma force-pushed the feature/remove_date_filter branch from 7baf06c to 6831b22 Compare August 4, 2020 19:26
@rickardl
Copy link
Contributor

rickardl commented Oct 20, 2020

@ctreatma I agree on the premise, to delegate this to Concourse, but is there any performance impact doing this, would you be able to provide some form of baseline of how this change might affect people? Also, we changed the behavior recently, in the current master, we instead look at if the PR itself has changed, instead of the most recent Commit. Does that impact the decision you made to remove the check?

If you could provide a resource with your implementation, that I can use, I might be able to get those metrics myself.

@ctreatma
Copy link
Author

I haven't run performance tests on this change yet. We ran a couple pipelines with this implementation to confirm that it worked as expected, but given that the resource at the time was limited to discovering open PRs and that we tend to have 10 or fewer PRs open on our repos at any given time, we weren't too concerned about the performance impact.

With the new change to use PR updated date and allow tracking merged & closed PRs, it's possible a resource configuration could return a large list of versions for Concourse to reconcile. I'll work on rebasing my change and will see if I can get it published in a place you can access.

@rickardl
Copy link
Contributor

@ctreatma I would appreciate that, if anyone else looking into this as something they'd like to test out, please feel free to pitch in.

Optionally, we could release this change as a feature flag, since it's small code change, but with a larger impact.

@rickardl
Copy link
Contributor

rickardl commented Oct 21, 2020

@ctreatma I spoke to @itsdalmo and we'd love to see this included in v0.22.0, so I'll keep a spot open for it.

@ctreatma
Copy link
Author

@rickardl I pushed a build with this feature to Docker Hub as ctreatma/github-pr-resource:no-date-filter.

I'm a bit out of my wheelhouse with performance testing a Concourse resource but happy to dig in. Could you provide some guidance on what metrics would be useful for measuring the impact of this change and how I can collect them from a Concourse instance? I'm figuring I'll run an instance locally for testing, but not sure if that's feasible or useful in this case.

@ctreatma ctreatma force-pushed the feature/remove_date_filter branch 2 times, most recently from 396b186 to fc19cb4 Compare December 4, 2020 21:38
@isaacsanders
Copy link
Contributor

I think this might help me. I would be a fan of merging this.

This resource can inadvertently miss Pull Requests due to out-of-order
commits across PRs.  If PR#2 is opened after PR#1, but the head commit
of PR#2 is older than the head commit of PR#1, the resource will not
include PR#2 in the list of new versions provided to Concourse.

Rather than attempt to find a different way of tracking which PRs are
"new" given an input version, we can remove the date-based filtering
and return all open PRs.  Concourse can deduplicate versions based on
metadata, which means that we will only trigger new jobs for versions
that Concourse hasn't seen before.

This makes it easier for teams to use this resource to track PRs in
Concourse, since they no longer have to ensure that a PR has a later
head commit than all currently-opened PRs in order to notify Concourse
that their PR exists.
@ctreatma ctreatma force-pushed the feature/remove_date_filter branch from fc19cb4 to b2493fb Compare February 9, 2021 20:00
@Benjamintf1
Copy link

I like this pr too!

@ctreatma
Copy link
Author

ctreatma commented May 6, 2021

@rickardl I haven't had a chance to figure out local performance testing of this change, but we've been using it for a couple months now with no issues thus far. If I put this change behind a flag so that users have to opt in to it, would that enable us to get it merged in for others to use?

ctreatma referenced this pull request in ctreatma/github-pr-resource Jun 8, 2021
This supersedes #205.

This resource can inadvertently miss Pull Requests due to out-of-order
commits across PRs. If PR#2 is opened after PR#1, but the head commit
of PR#2 is older than the head commit of PR#1, the resource will not
include PR#2 in the list of new versions provided to Concourse.

In #205, I removed the date filter entirely.  This ensures that the PR
resource will find all PRs that match the explicitly-configured filters.
While Concourse can detect and ignore duplicate versions, it has to run
a database query for every version returned by a `check`, so removing
the date filter entirely would increase load on a Concourse database.
(That said, I'm not sure whether this increased load is a particular
concern, and other resources don't seem to make much effort to avoid
returning duplicate versions from a `check`.)

To avoid that extra load on a Concourse database, this change instead
replaces the filter by commit date in `check.go` with a filter by updated
date in the GraphQL query to list pull requests.  This should reduce the
number of duplicate versions returned by a `check` while still allowing
the PR resource to detect PRs with out-of-order head commits.
@Pasupathi-Rajamanickam
Copy link

I like this PR, many of my PRs are not getting triggered, looking at the code, this could be the only condition that can stop my PRs not getting triggered. Please move this forward or, if I can make this changes somewhere and use it in concourse would love to do it.

ctreatma referenced this pull request in ctreatma/github-pr-resource Aug 23, 2021
This supersedes #205.

This resource can inadvertently miss Pull Requests due to out-of-order
commits across PRs. If PR#2 is opened after PR#1, but the head commit
of PR#2 is older than the head commit of PR#1, the resource will not
include PR#2 in the list of new versions provided to Concourse.

In #205, I removed the date filter entirely.  This ensures that the PR
resource will find all PRs that match the explicitly-configured filters.
While Concourse can detect and ignore duplicate versions, it has to run
a database query for every version returned by a `check`, so removing
the date filter entirely would increase load on a Concourse database.
(That said, I'm not sure whether this increased load is a particular
concern, and other resources don't seem to make much effort to avoid
returning duplicate versions from a `check`.)

To avoid that extra load on a Concourse database, this change instead
replaces the filter by commit date in `check.go` with a filter by updated
date in the GraphQL query to list pull requests.  This should reduce the
number of duplicate versions returned by a `check` while still allowing
the PR resource to detect PRs with out-of-order head commits.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants