[Feature]: Developer can wait for the deletion of an org to be complete #2605

Closed
gcapizzi opened this issue Jun 14, 2023 · 2 comments
@gcapizzi
Contributor

gcapizzi commented Jun 14, 2023

Background

As a developer
I want to know when an org has finished deleting
So that I can plan other operations accordingly

Dev Notes

  • We may want to add timeout logic that checks the deletion timestamp on the object marked for deletion.
  • Other jobs default to a 15-minute timeout, but strictly speaking the top-level delete jobs do not get marked as failed in CF for VMs.
  • We're thinking about adding the FAILED state here, along with a short message about checking for lingering resources such as service binding projections.
  • https://v3-apidocs.cloudfoundry.org/version/3.139.0/index.html#jobs
  • As with other jobs, we suspect this will not require a new CRD to track the job; the required information can be encoded in the URL instead.
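
The idea of encoding the job's information in the URL, rather than in a new CRD, could be sketched roughly as follows. This is only an illustration: the `jobRef` type and `parseJobGUID` function are hypothetical names, and the only thing taken from this issue is the `<operation>~<resource-guid>` shape of the job GUID.

```go
package main

import (
	"fmt"
	"strings"
)

// jobRef is a hypothetical type holding the two parts encoded in a job GUID.
type jobRef struct {
	Operation    string // e.g. "org.delete"
	ResourceGUID string // e.g. the GUID of the org being deleted
}

// parseJobGUID splits "<operation>~<resource-guid>" at the first "~".
// It reports false when the separator or either part is missing.
func parseJobGUID(jobGUID string) (jobRef, bool) {
	operation, resourceGUID, found := strings.Cut(jobGUID, "~")
	if !found || operation == "" || resourceGUID == "" {
		return jobRef{}, false
	}
	return jobRef{Operation: operation, ResourceGUID: resourceGUID}, true
}

func main() {
	ref, ok := parseJobGUID("org.delete~cf-org-1234")
	fmt.Println(ref.Operation, ref.ResourceGUID, ok) // org.delete cf-org-1234 true
}
```

Because everything needed to look up the org is in the GUID itself, no extra object has to be stored to answer a `GET /v3/jobs/...` request.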

Acceptance Criteria

GIVEN I have an org
WHEN I delete the org
AND I follow the redirect to the deletion job
THEN I see that the job looks like this:

{
  "guid": "org.delete~<the-org-guid>",
  "operation": "org.delete",
  "state": "PROCESSING",
  "links": {
    "self": {
      "href": "https://api.example.org/v3/jobs/org.delete~<the-org-guid>"
    }
  },
  "errors": [],
  "warnings": []
}

GIVEN I have waited long enough for the org deletion to successfully complete
WHEN I get the deletion job
THEN I see it looks like this:

{
  "guid": "org.delete~<the-org-guid>",
  "operation": "org.delete",
  "state": "COMPLETE",
  "links": {
    "self": {
      "href": "https://api.example.org/v3/jobs/org.delete~<the-org-guid>"
    }
  },
  "errors": [],
  "warnings": []
}

GIVEN I have waited long enough for the org deletion to time out
WHEN I get the deletion job
THEN I see it looks roughly like this:

{
  "guid": "org.delete~<the-org-guid>",
  "operation": "org.delete",
  "state": "FAILED",
  "links": {
    "self": {
      "href": "https://api.example.org/v3/jobs/org.delete~<the-org-guid>"
    }
  },
  "errors": [
    {
      "code": 10008,
      "title": "CF-UnprocessableEntity",
      "detail": "something went wrong, check for lingering resources in cforg"
    }
  ],
  "warnings": [
    {
      "detail": "warning! warning!"
    }
  ]
}

GIVEN the org doesn't exist
WHEN I get the deletion job
THEN I still see a COMPLETE response like the above

GIVEN I have no permissions to get the org
WHEN I get the deletion job
THEN I still see a COMPLETE response like the above

GIVEN an org that hasn't been deleted
WHEN I get the deletion job
THEN I get a 404 Not Found
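
As a rough sketch of how these criteria could map onto a single decision, assuming the job state is derived only from whether the org still exists and from its deletion timestamp: the `orgRecord` type, the `deletionJobState` function, and the fixed example clock are all hypothetical, and the permission case is deliberately left out.

```go
package main

import (
	"fmt"
	"time"
)

const (
	stateProcessing = "PROCESSING"
	stateComplete   = "COMPLETE"
	stateFailed     = "FAILED"
)

// exampleTimeout mirrors the 15 minute default mentioned in the dev notes.
var exampleTimeout = 15 * time.Minute

// exampleNow is a fixed clock value used only by the usage example.
var exampleNow = time.Date(2023, time.June, 28, 17, 0, 0, 0, time.UTC)

// orgRecord is a hypothetical, minimal view of the stored org.
type orgRecord struct {
	DeletionTimestamp *time.Time // nil until a deletion has been requested
}

// deletionJobState returns the job state for an org deletion.
// found=false stands for the "404 Not Found" criterion.
func deletionJobState(org *orgRecord, now time.Time, timeout time.Duration) (state string, found bool) {
	switch {
	case org == nil:
		// The org no longer exists: the deletion is treated as complete.
		return stateComplete, true
	case org.DeletionTimestamp == nil:
		// The org exists and was never deleted: there is no such job.
		return "", false
	case now.Sub(*org.DeletionTimestamp) > timeout:
		// The deletion has been pending for longer than the timeout.
		return stateFailed, true
	default:
		return stateProcessing, true
	}
}

func main() {
	requested := exampleNow.Add(-20 * time.Minute)
	state, found := deletionJobState(&orgRecord{DeletionTimestamp: &requested}, exampleNow, exampleTimeout)
	fmt.Println(state, found) // FAILED true
}
```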

@gcapizzi gcapizzi added the Jobs label Jun 14, 2023
@gcapizzi gcapizzi added this to the v0.9 milestone Jun 14, 2023
acosta11 pushed a commit that referenced this issue Jun 27, 2023
[#2605]

- Add check for org deletion timeout with message
- Calculate status for deletion based on presence of CRD and its timestamp
acosta11 added a commit that referenced this issue Jun 27, 2023
[#2605]

- Add check for org deletion timeout with message
- Calculate status for deletion based on presence of CRD and its timestamp

Co-authored-by: Dave Walter <walterda@vmware.com>
@gcapizzi gcapizzi modified the milestones: v0.9, v0.8 Jun 28, 2023
davewalter added a commit that referenced this issue Jun 28, 2023
- Add check for org deletion timeout with message
- Calculate status for deletion based on presence of CRD and its timestamp

[#2605]

Co-authored-by: Dave Walter <walterda@vmware.com>
@matt-royal
Member

I did a round of acceptance on this change, and it did not behave as expected. I used the cf-admin user to create an org, but the cf delete-org command failed due to the job not being found. Here's the command output:

$ CF_TRACE=1 cf delete-org o
REQUEST: [2023-06-28T17:05:45-07:00]
GET /whoami HTTP/1.1
Host: localhost
Accept: application/json
Authorization: [PRIVATE DATA HIDDEN]
Content-Type: application/json
User-Agent: cf/8.5.0+73aa161.2022-09-12 (go1.18.5; arm64 darwin)
[application/json Content Hidden]

RESPONSE: [2023-06-28T17:05:45-07:00]
HTTP/1.1 200 OK
Content-Type: application/json
Date: Thu, 29 Jun 2023 00:05:45 GMT
Server: envoy
Vary: Accept-Encoding
X-Correlation-Id: 63770957-dbec-47bc-8e44-38a6ea003e62
X-Envoy-Upstream-Service-Time: 4
{
  "kind": "User",
  "name": "cf-admin"
}


Really delete the org o, including its spaces, apps, service instances, routes, private domains and space-scoped service brokers? [yN]: y
Deleting org o as cf-admin...
REQUEST: [2023-06-28T17:05:48-07:00]
GET /v3/organizations?names=o HTTP/1.1
Host: localhost
Accept: application/json
Authorization: [PRIVATE DATA HIDDEN]
Content-Type: application/json
User-Agent: cf/8.5.0+73aa161.2022-09-12 (go1.18.5; arm64 darwin)
[application/json Content Hidden]

RESPONSE: [2023-06-28T17:05:48-07:00]
HTTP/1.1 200 OK
Content-Type: application/json
Date: Thu, 29 Jun 2023 00:05:48 GMT
Server: envoy
Vary: Accept-Encoding
X-Correlation-Id: a7488a88-92cd-43f7-874f-9c5b40f957a3
X-Envoy-Upstream-Service-Time: 19
{
  "pagination": {
    "first": {
      "href": "https://localhost/v3/organizations?names=o"
    },
    "last": {
      "href": "https://localhost/v3/organizations?names=o"
    },
    "next": null,
    "previous": null,
    "total_pages": 1,
    "total_results": 1
  },
  "resources": [
    {
      "created_at": "2023-06-28T23:28:13Z",
      "guid": "cf-org-4ccbcbde-d012-4e0e-81e4-c48ccda5ea36",
      "links": {
        "self": {
          "href": "https://localhost/v3/organizations/cf-org-4ccbcbde-d012-4e0e-81e4-c48ccda5ea36"
        }
      },
      "metadata": {
        "annotations": {
          "korifi.cloudfoundry.org/creation-version": "v9999.99.99-local.dev"
        },
        "labels": {}
      },
      "name": "o",
      "relationships": {},
      "suspended": false,
      "updated_at": "2023-06-28T23:28:13Z"
    }
  ]
}


REQUEST: [2023-06-28T17:05:48-07:00]
DELETE /v3/organizations/cf-org-4ccbcbde-d012-4e0e-81e4-c48ccda5ea36 HTTP/1.1
Host: localhost
Accept: application/json
Authorization: [PRIVATE DATA HIDDEN]
Content-Type: application/json
User-Agent: cf/8.5.0+73aa161.2022-09-12 (go1.18.5; arm64 darwin)
[application/json Content Hidden]

RESPONSE: [2023-06-28T17:05:48-07:00]
HTTP/1.1 202 Accepted
Content-Length: 0
Date: Thu, 29 Jun 2023 00:05:48 GMT
Location: https://localhost/v3/jobs/org.delete~cf-org-4ccbcbde-d012-4e0e-81e4-c48ccda5ea36
Server: envoy
X-Correlation-Id: 24d31142-90c2-46fe-9728-a0d5b8a0ce67
X-Envoy-Upstream-Service-Time: 26

REQUEST: [2023-06-28T17:05:48-07:00]
GET /v3/jobs/org.delete~cf-org-4ccbcbde-d012-4e0e-81e4-c48ccda5ea36 HTTP/1.1
Host: localhost
Accept: application/json
Authorization: [PRIVATE DATA HIDDEN]
Content-Type: application/json
User-Agent: cf/8.5.0+73aa161.2022-09-12 (go1.18.5; arm64 darwin)
[application/json Content Hidden]

RESPONSE: [2023-06-28T17:05:48-07:00]
HTTP/1.1 404 Not Found
Content-Type: application/json
Date: Thu, 29 Jun 2023 00:05:48 GMT
Server: envoy
Vary: Accept-Encoding
X-Correlation-Id: 97696f43-bc13-4cb1-b7d9-7f07bd730273
X-Envoy-Upstream-Service-Time: 29
{
  "errors": [
    {
      "code": 10010,
      "detail": "Job not found. Ensure it exists and you have access to it.",
      "title": "CF-ResourceNotFound"
    }
  ]
}


Job not found. Ensure it exists and you have access to it.
FAILED
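
For context, the trace above shows the client following the `Location` header and then giving up as soon as the job request returned an error. A generic polling loop with that behaviour might look roughly like this. It is a sketch only, not the cf CLI's actual implementation: `pollJob`, `errStillProcessing`, and the `fetch` callback are hypothetical names.

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

// errStillProcessing is returned when the job never reaches a terminal state.
var errStillProcessing = errors.New("job still processing after max attempts")

// pollJob calls fetch until the job reports COMPLETE or FAILED.
// Any error from fetch (for example a 404 on the job URL) aborts the poll,
// which is the situation shown in the trace.
func pollJob(fetch func() (string, error), interval time.Duration, maxAttempts int) (string, error) {
	for attempt := 0; attempt < maxAttempts; attempt++ {
		state, err := fetch()
		if err != nil {
			return "", err
		}
		if state == "COMPLETE" || state == "FAILED" {
			return state, nil
		}
		time.Sleep(interval)
	}
	return "", errStillProcessing
}

func main() {
	// Simulated job responses; a real client would issue GET requests instead.
	states := []string{"PROCESSING", "PROCESSING", "COMPLETE"}
	i := 0
	fetch := func() (string, error) {
		s := states[i]
		i++
		return s, nil
	}
	state, err := pollJob(fetch, 10*time.Millisecond, 5)
	fmt.Println(state, err) // COMPLETE <nil>
}
```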

davewalter added a commit that referenced this issue Jun 29, 2023
[#2605]

Co-authored-by: Dave Walter <walterda@vmware.com>
davewalter added a commit that referenced this issue Jun 30, 2023
[#2605]

Co-authored-by: Dave Walter <walterda@vmware.com>
davewalter added a commit that referenced this issue Jun 30, 2023
- The cached client would sometimes return a resource without the
  deletion timestamp when the job endpoint was called immediately after
  deleting the resource.
- Consolidate CFSpaceRepository and SpaceRepository interfaces.

[#2604]
[#2605]

Co-authored-by: Dave Walter <walterda@vmware.com>
davewalter added a commit that referenced this issue Jul 3, 2023
- Add a retry loop to wait for the deletion timestamp to propagate.
- Added a GetOrgForDeletion function to the org repository to avoid
  falsely returning a not found error when the role binding is deleted
  from an org during deletion.
- Consolidate CFSpaceRepository and SpaceRepository interfaces.

[#2604]
[#2605]

Co-authored-by: Dave Walter <walterda@vmware.com>
Co-authored-by: Julian Hjortshoj <hjortshojj@vmware.com>
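
The retry loop mentioned in this commit message is not shown in the thread. A minimal sketch of the general idea, assuming a lookup that may briefly return a stale object without the deletion timestamp, could look like this; `waitForDeletionTimestamp` and its `get` callback are hypothetical names, not the code from the commit.

```go
package main

import (
	"fmt"
	"time"
)

// waitForDeletionTimestamp calls get until it reports that the deletion
// timestamp is set, an error occurs, or the attempts are used up.
func waitForDeletionTimestamp(get func() (deleting bool, err error), attempts int, delay time.Duration) (bool, error) {
	for i := 0; i < attempts; i++ {
		deleting, err := get()
		if err != nil {
			return false, err
		}
		if deleting {
			return true, nil
		}
		// Only sleep when another attempt will follow.
		if i < attempts-1 {
			time.Sleep(delay)
		}
	}
	return false, nil
}

func main() {
	// Simulate a cache that returns a stale object on the first two reads.
	calls := 0
	get := func() (bool, error) {
		calls++
		return calls >= 3, nil
	}
	deleting, err := waitForDeletionTimestamp(get, 5, 10*time.Millisecond)
	fmt.Println(deleting, err, calls) // true <nil> 3
}
```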
davewalter added a commit that referenced this issue Jul 5, 2023
[#2605]

Co-authored-by: Dave Walter <walterda@vmware.com>
davewalter added a commit that referenced this issue Jul 5, 2023
- Add a retry loop to wait for the deletion timestamp to propagate.
- Added a GetOrgForDeletion function to the org repository to avoid
  falsely returning a not found error when the role binding is deleted
  from an org during deletion.
- Consolidate CFSpaceRepository and SpaceRepository interfaces.

[#2604]
[#2605]

Co-authored-by: Dave Walter <walterda@vmware.com>
Co-authored-by: Julian Hjortshoj <hjortshojj@vmware.com>
davewalter added a commit that referenced this issue Jul 6, 2023
- Add a retry loop to wait for the deletion timestamp to propagate.
- Added a GetOrgUnfiltered function to the org repository to avoid
  falsely returning a not found error when the role binding is deleted
  from an org during deletion.
- Consolidate CFSpaceRepository and SpaceRepository interfaces.

[#2604]
[#2605]

Co-authored-by: Dave Walter <walterda@vmware.com>
Co-authored-by: Julian Hjortshoj <hjortshojj@vmware.com>
davewalter added a commit that referenced this issue Jul 6, 2023
[#2605]

Co-authored-by: Dave Walter <walterda@vmware.com>
@matt-royal
Member

I just confirmed that all but one of the Acceptance Criteria are met.

When the deletion was in progress, I got this response:

{
  "created_at": "",
  "errors": [],
  "guid": "org.delete~cf-org-4eda0dac-4878-40b1-9d6c-f62f6b1e1909",
  "links": {
    "self": {
      "href": "https://localhost/v3/jobs/org.delete~cf-org-4eda0dac-4878-40b1-9d6c-f62f6b1e1909"
    }
  },
  "operation": "org.delete",
  "state": "PROCESSING",
  "updated_at": "",
  "warnings": null
}

When the org was successfully deleted, I got this response:

{
  "created_at": "",
  "errors": [],
  "guid": "org.delete~cf-org-f8e2b0c7-7854-40c8-abcf-9f317b1f2926",
  "links": {
    "self": {
      "href": "https://localhost/v3/jobs/org.delete~cf-org-f8e2b0c7-7854-40c8-abcf-9f317b1f2926"
    }
  },
  "operation": "org.delete",
  "state": "COMPLETE",
  "updated_at": "",
  "warnings": null
}

When I added a finalizer to the CFOrg so that it would never delete, I eventually got this response:

{
  "created_at": "",
  "errors": [
    {
      "code": 10008,
      "detail": "Org deletion timed out, check the remaining \"cf-org-4eda0dac-4878-40b1-9d6c-f62f6b1e1909\" resource",
      "title": "CF-UnprocessableEntity"
    }
  ],
  "guid": "org.delete~cf-org-4eda0dac-4878-40b1-9d6c-f62f6b1e1909",
  "links": {
    "self": {
      "href": "https://localhost/v3/jobs/org.delete~cf-org-4eda0dac-4878-40b1-9d6c-f62f6b1e1909"
    }
  },
  "operation": "org.delete",
  "state": "FAILED",
  "updated_at": "",
  "warnings": null
}

When I queried the job status for a non-existent org, I got this response:

{
  "created_at": "",
  "errors": [],
  "guid": "org.delete~no-such-org",
  "links": {
    "self": {
      "href": "https://localhost/v3/jobs/org.delete~no-such-org"
    }
  },
  "operation": "org.delete",
  "state": "COMPLETE",
  "updated_at": "",
  "warnings": null
}

The failed acceptance criterion is that when I queried the job status for an org that I don't have access to, I got this response:

{
  "errors": [
    {
      "code": 10010,
      "detail": "Job not found. Ensure it exists and you have access to it.",
      "title": "CF-ResourceNotFound"
    }
  ]
}

What I should have gotten was the "COMPLETE" response. I'll put this story back in the queue for the issue to be addressed.

matt-royal added a commit that referenced this issue Jul 17, 2023
Before this change, the job endpoint would say that an org was deleted
as soon as the CFOrg finalizer removed their role binding from the Org
namespace. We now fetch the CFOrg directly to determine if it's done
deleting. If it isn't deleting, then we will use GetOrg to determine if
the user has access to the org

[#2605]