Add Support for Case Deletion #33831

Open
wants to merge 48 commits into base: master
Changes from 8 commits

48 commits
3570ec2
Add support for deleting individual cases (dry run only)
minhaminha Oct 25, 2023
580b5a1
Modify UI based on feedback (list structure, extra deletion confirmat…
minhaminha Nov 3, 2023
556d299
More UI changes based on feedback (list reorganization, using list-gr…
minhaminha Nov 17, 2023
5f1c4c4
Reconstruct form name if xmlns_to_name returns the xmlns
minhaminha Nov 21, 2023
707da7d
Add support for deleting cases and forms (no more dry run) + add tests
minhaminha Dec 1, 2023
8847642
Add line in task to delete old enough soft deleted cases + change wai…
minhaminha Dec 4, 2023
52b9443
Merge branch 'master' into ml/case-deletion
minhaminha Dec 4, 2023
870f8ca
add es_test decorator
minhaminha Dec 4, 2023
f78631c
Make small changes based on PR feedback
minhaminha Dec 5, 2023
d96c8a9
Refactor get_cases_and_forms_for_deletion in 3 separate functions + a…
minhaminha Dec 6, 2023
4e36269
Use attrs classes instead of complex dictionaries to pass around disp…
minhaminha Dec 8, 2023
3481e4f
Further break down get_case_and_display_data
minhaminha Dec 13, 2023
588b377
Disable hard deletion of eligible data
minhaminha Dec 13, 2023
e91c974
Undo delete task changes
minhaminha Dec 14, 2023
53852ac
Integrate case deletion into form deletion if the form had created an…
minhaminha Jan 25, 2024
f40be4b
Add more tests for form deletion, new case deletion util functions
minhaminha Jan 25, 2024
3c00173
Various refactors based on PR feedback
minhaminha Feb 8, 2024
fe480b3
Refactor tests + add more tests
minhaminha Feb 9, 2024
0f5e737
Refactor main case walk function into two smaller class methods, upda…
minhaminha Feb 13, 2024
2ee8572
Remove case hard deletion method (in favor of eventual tombstoning me…
minhaminha Feb 13, 2024
52c42bb
Merge branch 'master' into ml/case-deletion
minhaminha Feb 15, 2024
98dd353
small nit changes
minhaminha Feb 23, 2024
86f5deb
Make walk_through_case_forms always returns a dict
minhaminha Feb 23, 2024
18fd6ba
Add in memory caching for forms and case blocks
minhaminha Feb 26, 2024
9c65a10
Make sure forms are actually cached in TempFormCache, remove domain a…
minhaminha Mar 4, 2024
83872d4
Text changes + fix escaping issue
minhaminha Mar 5, 2024
76c6760
Merge branch 'master' into ml/case-deletion
minhaminha Mar 7, 2024
e5b25f8
Make small changes (remove manual escaping, add more tests about the …
minhaminha Mar 11, 2024
09ad4ff
Add TempCaseCache and tests, add cleanup to tests that create forms a…
minhaminha Mar 12, 2024
b93a8cb
Simplify test to save only once
minhaminha Mar 12, 2024
cf79b54
Merge branch 'master' into ml/case-deletion
minhaminha Mar 22, 2024
d3197eb
Fix bug caused by returned form list if there is no create form
minhaminha Mar 22, 2024
53f3877
Fix bug that raises a 404 when getting apps without an app_id
minhaminha Mar 25, 2024
1e49ef8
Minor text and code change suggestions
minhaminha Mar 25, 2024
e1fc391
Fix 404 raising bug when trying to fetch deleted forms + small refact…
minhaminha Mar 26, 2024
e5a2f9b
Fix bug that prevents bulk imports from able to delete
minhaminha Mar 26, 2024
9538140
Fix form order for large deletions
minhaminha Mar 28, 2024
55d63a3
Make form and case soft deletion all or nothing + prevent same forms …
minhaminha Mar 29, 2024
37d3f06
Minor fix
minhaminha Mar 29, 2024
23f27df
Small typo fix
minhaminha Mar 29, 2024
2565d07
Small typo fix part 2
minhaminha Apr 1, 2024
3a9d827
Reverse sorted list so latest form is archived first
minhaminha Apr 1, 2024
04bdb7b
Add test covering bulk form archive error handling
minhaminha Apr 2, 2024
1b1b86d
Typo fix in success message
minhaminha Apr 3, 2024
1b4b297
Make soft_delete_cases_and_forms a celery task (needs more work)
minhaminha Apr 3, 2024
c9b879e
Merge branch 'master' into ml/case-deletion
minhaminha Apr 3, 2024
b062be8
Merge branch 'master' into ml/case-deletion
gherceg May 28, 2024
bfa16b7
Merge branch 'master' into ml/case-deletion
minhaminha Jun 25, 2024
6 changes: 5 additions & 1 deletion corehq/apps/cleanup/tasks.py
@@ -16,7 +16,8 @@
from corehq.apps.cleanup.utils import get_cutoff_date_for_data_deletion
from corehq.apps.domain.models import Domain
from corehq.apps.hqwebapp.tasks import mail_admins_async
-from corehq.form_processor.models import XFormInstance
+from corehq.form_processor.models import CommCareCase, XFormInstance


UNDEFINED_XMLNS_LOG_DIR = settings.LOG_HOME

@@ -34,10 +35,13 @@ def permanently_delete_eligible_data(dry_run=False):
    dry_run_tag = '[DRY RUN] ' if dry_run else ''
    cutoff_date = get_cutoff_date_for_data_deletion()
    form_counts = XFormInstance.objects.hard_delete_forms_before_cutoff(cutoff_date, dry_run=dry_run)
    case_counts = CommCareCase.objects.hard_delete_cases_before_cutoff(cutoff_date, dry_run=dry_run)

    logger.info(f"{dry_run_tag}'permanently_delete_eligible_data' ran with the following results:\n")
    for table, count in form_counts.items():
        logger.info(f"{dry_run_tag}{count} {table} objects were deleted.")
    for table, count in case_counts.items():
        logger.info(f"{dry_run_tag}{count} {table} objects were deleted.")


@periodic_task(run_every=crontab(minute=0, hour=0), queue=getattr(settings, 'CELERY_PERIODIC_QUEUE', 'celery'))
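
The dry-run pattern in `permanently_delete_eligible_data` can be sketched in isolation. This is a minimal illustration, not the HQ implementation: `hard_delete_before_cutoff`, the `records` list, and `RETENTION_DAYS` are hypothetical, and the 90-day window is an assumption taken from the `soft_delete_cases_and_forms` docstring later in this diff.

```python
from collections import Counter
from datetime import datetime, timedelta

RETENTION_DAYS = 90  # assumption: the 90-day window mentioned in this PR


def hard_delete_before_cutoff(records, cutoff, dry_run=False):
    """Count (and, unless dry_run, remove) records soft deleted before `cutoff`.

    `records` is a hypothetical list of dicts with 'table' and 'deleted_on' keys.
    Returns {table: count}, similar in shape to the counts the task logs.
    """
    eligible = [
        r for r in records
        if r['deleted_on'] is not None and r['deleted_on'] < cutoff
    ]
    counts = Counter(r['table'] for r in eligible)
    if not dry_run:
        for record in eligible:
            records.remove(record)
    return dict(counts)


now = datetime(2024, 1, 1)
cutoff = now - timedelta(days=RETENTION_DAYS)
records = [
    {'table': 'CommCareCase', 'deleted_on': datetime(2023, 6, 1)},
    {'table': 'CommCareCase', 'deleted_on': datetime(2023, 12, 20)},
    {'table': 'XFormInstance', 'deleted_on': None},
]
dry_counts = hard_delete_before_cutoff(records, cutoff, dry_run=True)
remaining_after_dry_run = len(records)
real_counts = hard_delete_before_cutoff(records, cutoff)
remaining_after_delete = len(records)
```

A dry run reports the same counts as a real run but leaves the data untouched, which is what makes the `[DRY RUN]` log lines comparable with the real ones.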
209 changes: 208 additions & 1 deletion corehq/apps/reports/standard/cases/case_data.py
@@ -16,6 +16,7 @@
)
from django.utils.decorators import method_decorator
from django.utils.html import format_html
from django.utils.safestring import mark_safe
from django.utils.translation import get_language
from django.utils.translation import gettext as _
from django.utils.translation import gettext_lazy
@@ -24,6 +25,7 @@
from django_prbac.utils import has_privilege
from memoized import memoized

from casexml.apps.case import const
from casexml.apps.case.cleanup import close_case, rebuild_case_from_forms
from casexml.apps.case.mock import CaseBlock
from casexml.apps.case.templatetags.case_tags import case_inline_display
@@ -42,7 +44,7 @@
from corehq.apps.accounting.utils import domain_has_privilege
from corehq.apps.analytics.tasks import track_workflow
from corehq.apps.app_manager.const import USERCASE_TYPE
-from corehq.apps.app_manager.dbaccessors import get_latest_app_ids_and_versions
+from corehq.apps.app_manager.dbaccessors import get_latest_app_ids_and_versions, get_app
from corehq.apps.data_dictionary.models import CaseProperty
from corehq.apps.data_dictionary.util import is_case_type_deprecated
from corehq.apps.domain.decorators import login_and_domain_required
@@ -68,6 +70,7 @@
from corehq.apps.reports.display import xmlns_to_name
from corehq.apps.reports.view_helpers import case_hierarchy_context
from corehq.apps.reports.views import (
archive_form,
DATE_FORMAT,
BaseProjectReportSectionView,
get_data_cleaning_updates,
@@ -76,6 +79,7 @@
from corehq.apps.users.models import HqPermissions
from corehq.form_processor.exceptions import CaseNotFound
from corehq.form_processor.interfaces.dbaccessors import LedgerAccessors
from corehq.form_processor.interfaces.processor import FormProcessorInterface
from corehq.form_processor.models import (
CommCareCase,
UserRequestedRebuild,
@@ -555,6 +559,209 @@ def close_case_view(request, domain, case_id):
return HttpResponseRedirect(reverse('case_data', args=[domain, case_id]))


MAX_CASE_COUNT = 10
MAX_SUBCASE_DEPTH = 3


@location_safe
def get_cases_and_forms_for_deletion(request, domain, case_id):
delete_cases = [] # list of cases to be soft deleted
delete_forms = [] # list of forms to be soft deleted

# For formatting the list of cases/submission forms
cases = {} # structured like: {case: {form: {case touched by form: actions taken by form}}}

Contributor

perhaps a dataclass or similar data structure could be used to give this better structure.

Contributor Author

I'm not really familiar with using dataclasses but can they be unwrapped like a dict in an html template? The sole use of this dictionary is to display the case/form information in the exact order it's stored in so I'm not sure if converting it to a dataclass would really help.

Member

@minhaminha I think something like a namedtuple would be helpful, here is an example:

Results = namedtuple('Results', ['restored', 'not_found', 'not_deleted'])

You can then call ._asdict() on your namedtuple.

Contributor
@millerdev Dec 7, 2023

> I'm not really familiar with using dataclasses but can they be unwrapped like a dict in an html template?

dataclass and attrs objects (a richer form of the same) can both be converted to dicts, although it should not be necessary. Just pass the object(s) out to the template and reference properties normally.

Member
@biyeun Dec 8, 2023

@millerdev I always found namedtuple nice due to the shorter notation for a quick dataclass-esque object for simple use cases like this. Curious where your preference for an attrs object comes in vs a namedtuple? Usually I would use the former if I wanted optional fields and methods, but namedtuple for very quick named attributes that were always required.

Contributor

I find dataclasses and attrs to support more natural class syntax than namedtuple. There is tons more flexibility such as field defaults like factory=list, it's trivial to add methods, etc. Also, something is sticking in my mind that namedtuple has a performance overhead penalty, but I could be wrong about that. SO also points out that:

> When you compare named tuples, the names are ignored: two named tuples are equal if they contain the same values in the same order, even if they have different class names or field names

Contributor

One other thing is namedtuple doesn't support default values which is often useful.

Also, I find most IDEs quite bad at backlinking to namedtuple definitions since they are just a variable and have no static class definition.
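
The equality behaviour and the default-factory point raised in this thread can be illustrated with a short sketch. It uses only the standard library, with `dataclasses` standing in for `attrs`; the `Point`, `Size`, `PointClass`, `SizeClass`, and `FormDisplay` names are hypothetical and do not come from the PR.

```python
from collections import namedtuple
from dataclasses import dataclass, field

# Two unrelated namedtuple types with different field names.
Point = namedtuple('Point', ['x', 'y'])
Size = namedtuple('Size', ['width', 'height'])

# namedtuples compare as plain tuples, so type and field names are ignored.
tuples_equal = Point(1, 2) == Size(1, 2)


@dataclass
class PointClass:
    x: int
    y: int


@dataclass
class SizeClass:
    width: int
    height: int


# Generated dataclass equality only matches instances of the same class.
classes_equal = PointClass(1, 2) == SizeClass(1, 2)


@dataclass
class FormDisplay:
    """Hypothetical display record with a mutable default and a method."""
    name: str
    actions: list = field(default_factory=list)

    def actions_label(self):
        return ', '.join(self.actions)


form = FormDisplay(name='Registration')
form.actions.append('create')
print(tuples_equal, classes_equal, form.actions_label())
```

Each `FormDisplay` gets its own `actions` list, which is the behaviour the `factory=list` remark refers to in `attrs`.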

case_names = {} # {case_id: case_name}
form_names = {} # {form_ids: form name}
reopened_cases = {}
affected_cases = {}

update_actions = [const.CASE_ACTION_INDEX,
const.CASE_ACTION_UPDATE,
const.CASE_ACTION_ATTACHMENT,
const.CASE_ACTION_COMMTRACK,
const.CASE_ACTION_REBUILD]

def get_forms_for_deletion_from_case(case, subcase_count):
delete_cases.append(case.case_id)
if len(delete_cases) > MAX_CASE_COUNT or subcase_count >= MAX_SUBCASE_DEPTH:
raise ValueError("Too many cases to delete")
if case.case_id not in cases:
cases[case.case_id] = {}
case_names[case.case_id] = case.name
if len(case_names) == 1: # only add primary label to first/main case
case_names[case.case_id] += ' <span class="label label-default">primary case</span>'

Contributor

Could this be a boolean rather than raw HTML?

Similar to my comment above - perhaps a data structure would be better than a string.

Member

second this! raw HTML in python should always be avoided
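
One way to read the two suggestions above is to carry a boolean on a small record and build any markup only at render time. The sketch below assumes a hypothetical `CaseDisplay` class and `render_case_label` helper, and uses the standard library's `html.escape` so it runs without Django; in the view itself the flag would more likely be checked in the template, or the label built with `format_html`, which this module already imports.

```python
from dataclasses import dataclass
from html import escape


@dataclass
class CaseDisplay:
    """Hypothetical display record: data only, no markup."""
    case_id: str
    name: str
    is_primary: bool = False


def render_case_label(case):
    """Build the label at render time, escaping the user-supplied name."""
    label = escape(case.name)
    if case.is_primary:
        label += ' <span class="label label-default">primary case</span>'
    return label


cases = [
    CaseDisplay('c1', 'Mother <b>case</b>', is_primary=True),
    CaseDisplay('c2', 'Child case'),
]
labels = [render_case_label(case) for case in cases]
```

Keeping the name and the flag separate also means the case name is never concatenated with trusted markup before it has been escaped.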

case_xforms = case.xform_ids

for form_id in case_xforms:
if form_id not in delete_forms:
delete_forms.insert(0, form_id)
form_object = XFormInstance.objects.get_form(form_id, domain)
if form_id not in form_names:
form_names[form_id] = xmlns_to_name(domain, form_object.xmlns, form_object.app_id)
if form_names[form_id] == form_object.xmlns:
form_name = [
get_app(domain, form_object.app_id).name or "[Unknown App]",
"[Unknown Module]",
form_object.name or "[Unknown Form]"
]
form_names[form_id] = ' > '.join(form_name)
case_db = FormProcessorInterface(domain).casedb_cache(
domain=domain,
load_src="process_stock",
)
touched_cases = FormProcessorInterface(domain).get_cases_from_forms(case_db, [form_object])

case_actions = {}
for touched_id in touched_cases:
case_object = safely_get_case(request, domain, touched_id)
actions = list(touched_cases[touched_id].actions)
if touched_id == case.case_id:
case_actions['current'] = actions
elif touched_id not in delete_cases:
if touched_id not in case_actions:
case_actions[case_object.name] = actions
if const.CASE_ACTION_CREATE in actions and touched_id != case.case_id:
subcase_count += 1
get_forms_for_deletion_from_case(case_object, subcase_count)
subcase_count -= 1
if const.CASE_ACTION_CLOSE in actions:
reopened_cases[touched_id] = form_id
case_names[touched_id] = case_object.name
if any(action in actions for action in update_actions):
if touched_id not in affected_cases:
affected_cases[touched_id] = {}
affected_cases[touched_id][form_id] = ', '.join(actions)
case_names[touched_id] = case_object.name
cases[case.case_id][form_id] = case_actions
return

case_instance = safely_get_case(request, domain, case_id)
subcase_count = 0
try:
get_forms_for_deletion_from_case(case_instance, subcase_count)
except ValueError:
messages.error(request, _("Deleting this case would delete too many related cases. "
"Please delete some of this case's subcases before attempting "
"to delete this case."))
return {}, True

def get_case_link(caseid):
url = reverse('case_data', args=[domain, caseid])
return mark_safe('<a href="{}"> {} </a>'.format(url, case_names[caseid]))

def get_form_link(formid):
url = reverse('render_form_data', args=[domain, formid])
return mark_safe('<a href="{}"> {} </a>'.format(url, form_names[formid]))

prepared_cases = {}
for case in cases:
prepared_forms = {}
for form in cases[case]:
form_link = get_form_link(form)
prepared_forms[form_link] = {}
for case_action in cases[case][form]:
prepared_forms[form_link][case_action] = ', '.join(cases[case][form][case_action])
prepared_cases[get_case_link(case)] = prepared_forms

prepared_reopened_cases = {}
for case in reopened_cases:
prepared_reopened_cases[get_case_link(case)] = get_form_link(reopened_cases[case])

prepared_affected_cases = {}
for case in affected_cases:
if case in delete_cases:
continue
prepared_forms = {}
for form in affected_cases[case]:
prepared_forms[get_form_link(form)] = affected_cases[case][form]
prepared_affected_cases[get_case_link(case)] = prepared_forms

Member

to tie in mentions of a data structure. perhaps something like

AffectedCase = namedtuple('AffectedCase', ['name', 'url', 'affected_forms'])
AffectedForm = namedtuple('AffectedForm', ['name', 'url', 'actions'])

and then in the template, you can call on the properties by iterating through a list of structured items rather than key, value pairs from affected_cases.items

Contributor
@millerdev Dec 7, 2023

Except use attrs rather than namedtuple

from attrs import define, field

@define
class AffectedCase:
    name = field()
    url = field()
    affected_forms = field()

...
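
A rough sketch of how such structured records might be filled in and walked, in the way a template loop would. It uses standard-library `dataclasses` as a stand-in for `attrs`; the `summarize` helper, the sample data, and the placeholder URLs are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class AffectedForm:
    name: str
    url: str
    actions: list = field(default_factory=list)


@dataclass
class AffectedCase:
    name: str
    url: str
    affected_forms: list = field(default_factory=list)


def summarize(affected_cases):
    """Flatten the structure the way nested template loops would walk it."""
    lines = []
    for case in affected_cases:
        for form in case.affected_forms:
            lines.append(f"{case.name}: {form.name} ({', '.join(form.actions)})")
    return lines


# Hypothetical sample data; the URLs are placeholders.
affected = [
    AffectedCase(
        name='Household 12',
        url='/cases/abc/',
        affected_forms=[AffectedForm('Follow-up', '/forms/f1/', ['update', 'index'])],
    ),
]
summary = summarize(affected)
```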


return {
'main_case_name': case_instance.name,
'delete_dict': prepared_cases,
'affected_cases': prepared_affected_cases,
'reopened_cases': prepared_reopened_cases,
'case_delete_list': delete_cases,
'form_delete_list': delete_forms,
}, False


@location_safe
class DeleteCaseView(BaseProjectReportSectionView):
    urlname = 'soft_delete_case_view'
    page_title = gettext_lazy('Delete Case and Related Forms')
    template_name = 'reports/reportdata/case_delete.html'
    delete_dict = {}

    @method_decorator(require_case_view_permission)
    def dispatch(self, request, *args, **kwargs):
        self.delete_dict, redirect = get_cases_and_forms_for_deletion(request, self.domain, self.case_id)

Contributor

How long does this take to run? Wondering if it would make sense to offload it to Celery.

Contributor Author

From my testing, It doesn’t take long at all. There’s also a limit on how many cases it can look at/return so the lag shouldn’t really be noticeable.
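
The limit mentioned in this reply can be sketched as a bounded depth-first walk. The constants match `MAX_CASE_COUNT` and `MAX_SUBCASE_DEPTH` from the diff, but `collect_cases` and the `children` mapping are hypothetical simplifications: the PR discovers subcases through each case's forms rather than through a ready-made mapping.

```python
MAX_CASE_COUNT = 10
MAX_SUBCASE_DEPTH = 3


def collect_cases(case_id, children, collected=None, depth=0):
    """Depth-first walk over subcases, aborting once either limit is hit.

    `children` is a hypothetical {case_id: [subcase ids]} mapping that
    stands in for the form and case lookups done in the PR.
    """
    if collected is None:
        collected = []
    collected.append(case_id)
    if len(collected) > MAX_CASE_COUNT or depth >= MAX_SUBCASE_DEPTH:
        raise ValueError("Too many cases to delete")
    for child_id in children.get(case_id, []):
        if child_id not in collected:
            collect_cases(child_id, children, collected, depth + 1)
    return collected


small_tree = {'parent': ['child_a', 'child_b'], 'child_a': ['grandchild']}
small_result = collect_cases('parent', small_tree)

deep_chain = {'c0': ['c1'], 'c1': ['c2'], 'c2': ['c3']}
try:
    collect_cases('c0', deep_chain)
    deep_aborted = False
except ValueError:
    deep_aborted = True
```

Because the walk stops as soon as the eleventh case or the third level of nesting is reached, the work done per request is bounded regardless of how large the case hierarchy is.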

        if redirect:
            return HttpResponseRedirect(reverse('case_data', args=[self.domain, self.case_id]))
        return super(DeleteCaseView, self).dispatch(request, *args, **kwargs)

    @property
    def case_id(self):
        return self.kwargs['case_id']

    @property
    def domain(self):
        return self.kwargs['domain']

    @property
    def page_url(self):
        return reverse(self.urlname, args=(self.domain, self.case_id))

    @property
    def page_context(self):
        context = {
            "case_id": self.case_id,
        }
        context.update(self.delete_dict)
        return context

    def post(self, request, *args, **kwargs):
        if request.POST.get('input') != self.delete_dict['main_case_name']:
            messages.error(request, "Incorrect name. Please enter the case name as shown into the textbox.")
            return HttpResponseRedirect(self.page_url)
        msg, error = soft_delete_cases_and_forms(request, self.domain, self.delete_dict['case_delete_list'],
                                                 self.delete_dict['form_delete_list'])
        if error:
            messages.error(request, msg, extra_tags='html')
            return HttpResponseRedirect(reverse('case_data', args=[self.domain, self.case_id]))
        else:
            msg = self.delete_dict['main_case_name'] + msg
            messages.success(request, msg)
            return HttpResponseRedirect(reverse('project_report_dispatcher',
                                                args=(self.domain, 'submit_history')))


@location_safe
@require_permission(HqPermissions.edit_data)
def soft_delete_cases_and_forms(request, domain, case_delete_list, form_delete_list):
    """
    Archiving the form that created the case will automatically "unmake" the case, but won't delete
    the case from the database. This deletion will happen 90 days from the deletion date by an
    automated deletion task.
    """
    error = False
    msg = ", its related subcases and submission forms were deleted successfully."
    for form in form_delete_list:
        if archive_form(request, domain, form, is_case_delete=True):
            form_instance = XFormInstance.objects.get_form(form, domain)
            form_instance.soft_delete()
        else:
            # I'm fairly certain this will never enter here but this is just in case something does go wrong
            error = True
            msg = "The form {} could not be deleted. Please try manually archiving, then deleting the form, " \
                  "before trying to delete this case again.".format(form)
            break
    if not error:
        CommCareCase.objects.soft_delete_cases(domain, list(case_delete_list))

    return msg, error


@location_safe
@require_case_view_permission
@require_permission(HqPermissions.edit_data)
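
The control flow of `soft_delete_cases_and_forms` above (archive each form, soft delete it, stop at the first failure, and only then soft delete the cases) can be sketched with plain callables. `soft_delete_all` and its three callable parameters are hypothetical stand-ins, not HQ functions.

```python
def soft_delete_all(form_ids, case_ids, archive_form, soft_delete_form, soft_delete_cases):
    """Archive and soft delete each form; touch cases only if every form succeeded.

    The three callables are hypothetical stand-ins for the HQ helpers.
    Returns (deleted_form_ids, failed_form_id_or_None).
    """
    deleted = []
    for form_id in form_ids:
        if not archive_form(form_id):
            # Stop at the first failure; forms handled so far stay deleted.
            return deleted, form_id
        soft_delete_form(form_id)
        deleted.append(form_id)
    soft_delete_cases(list(case_ids))
    return deleted, None


deleted_cases = []
result = soft_delete_all(
    form_ids=['f1', 'f2', 'f3'],
    case_ids=['c1'],
    archive_form=lambda form_id: form_id != 'f2',  # simulate a failure on f2
    soft_delete_form=lambda form_id: None,
    soft_delete_cases=deleted_cases.extend,
)
```

Note that in this shape a failure part-way through leaves the earlier forms deleted while the cases remain, so the operation is not all-or-nothing.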
@@ -285,6 +285,12 @@ <h1>{% blocktrans %}Section: {{ section_id }}{% endblocktrans %}</h1>
</button>
</form>
{% endif %}
{% if not is_usercase %}
<a class="btn btn-danger pull-left" href="{% url 'soft_delete_case_view' domain case_id %}">
<i class="fa fa-trash"></i>
{% trans 'Delete Case' %}
</a>
{% endif %}
</div>
</div>
{% endif %}