Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Change Case and User Ownership (Script) #34588

Open
wants to merge 70 commits into
base: master
Choose a base branch
from
Open
Show file tree
Hide file tree
Changes from all commits
Commits
Show all changes
70 commits
Select commit Hold shift + click to select a range
3cc7f6f
function for changing user location
zandre-eng May 9, 2024
006a266
function for changing case ownership
zandre-eng May 9, 2024
ff5cdae
bulk save users
zandre-eng May 9, 2024
b47f557
update owner id for cases
zandre-eng May 9, 2024
3d045ad
catch error retrieving non commcare user
zandre-eng May 9, 2024
2b80d2e
fix func name overshadowing import
zandre-eng May 9, 2024
c35c65d
minor refactoring
zandre-eng May 9, 2024
975cd7f
remove test code
zandre-eng May 10, 2024
bf3b264
refactor all funcs into classes
zandre-eng May 10, 2024
d777a43
add todos
zandre-eng May 10, 2024
4a9d929
fix logging func writing empty lines if no msg
zandre-eng May 10, 2024
51734c4
func to load all relevant case ids to file
zandre-eng May 10, 2024
a6d54c5
create sqlite db manager class + remove text logging
zandre-eng May 17, 2024
d436a14
refactor user class to use db manager
zandre-eng May 17, 2024
3aaa66f
refactor case class to use db manager
zandre-eng May 17, 2024
25e153b
custom table names for case/user updater
zandre-eng May 17, 2024
c87c4b8
correctly flatten ids
zandre-eng May 17, 2024
192b6ee
correctly create tuple
zandre-eng May 17, 2024
ce348c4
wrap user doc into user obj
zandre-eng May 17, 2024
0f092e1
fix incorrect col ref
zandre-eng May 21, 2024
1760f47
retrieve user id
zandre-eng May 21, 2024
c10d783
define var outside of conditional
zandre-eng May 21, 2024
01c4f7c
check for same location before querying loc
zandre-eng May 21, 2024
559dfad
set case updater to only reference a single case type
zandre-eng May 21, 2024
a000ce2
minor logging refactor
zandre-eng May 21, 2024
5888d2f
lint
mkangia May 31, 2024
c817a30
move update to individual commands
mkangia Jun 2, 2024
e4e7b5a
fetch cases from one shard at once
mkangia Jun 2, 2024
4cb9da3
add warning
mkangia Jun 2, 2024
949d47c
reduce class variables
mkangia Jun 3, 2024
8f050f7
fetch only active mobile workers
mkangia Jun 3, 2024
7875743
keep it simple and have chunk size only
mkangia Jun 3, 2024
aba0953
reorder
mkangia Jun 3, 2024
3e6ce37
minor naming update
mkangia Jun 3, 2024
f6a371a
remove redundant method
mkangia Jun 3, 2024
fabe95a
exit with message
mkangia Jun 3, 2024
9288792
fix naming
mkangia Jun 3, 2024
e1e7c63
store updated id as well
mkangia Jun 3, 2024
136cae3
load owner updates from user updates
mkangia Jun 3, 2024
cae7288
setup fresh migration command with approach
mkangia Jun 5, 2024
26498e9
add structure for new migration
mkangia Jun 5, 2024
81f1849
find child location with name
mkangia Jun 5, 2024
16bcbb8
find users at location
mkangia Jun 5, 2024
886d3d2
find case ids to update
mkangia Jun 5, 2024
5645fc7
update cases and users
mkangia Jun 5, 2024
9d053f8
add logging
mkangia Jun 5, 2024
768994c
log update completion for each user
mkangia Jun 5, 2024
bc7708a
update one case type at a time
mkangia Jun 5, 2024
9625ab1
respect dry run option
mkangia Jun 5, 2024
ceb0992
add option to update for only one village at once
mkangia Jun 5, 2024
5bf510b
log errors separately
mkangia Jun 5, 2024
80dc9f7
correct file mode for append
mkangia Jun 5, 2024
6c16dee
remove all redundant commands
mkangia Jun 5, 2024
26e3e0d
nit: added a couple of logs
ajeety4 Jun 20, 2024
e11aa9f
fix - location codes are lowercase
ajeety4 Jun 20, 2024
ff44777
fix - use get_user_data() instead of user_data
ajeety4 Jun 20, 2024
c4c23bc
fix-case block as text
ajeety4 Jun 20, 2024
80fc2c7
nit: logs
ajeety4 Jun 20, 2024
130b843
add custom.benin to installed apps
ajeety4 Jun 24, 2024
3154c6f
minor updates: execution time, device id and logs
ajeety4 Jun 24, 2024
c7e2cdb
unset exsiting location
ajeety4 Jun 27, 2024
cb455b5
nit: use . instead of : in file name for scp support
ajeety4 Jun 27, 2024
0e160f1
nit: avoid printing progress bar for 0 records
ajeety4 Jul 1, 2024
9fc9cc1
use python logging
ajeety4 Jul 1, 2024
ed4e5e8
adds option to run in celery
ajeety4 Jul 1, 2024
2098d59
move script location
zandre-eng Jul 4, 2024
1ca177a
update queue + remove unused script file
zandre-eng Jul 4, 2024
d799f67
optional fetch villages by commune
zandre-eng Jul 8, 2024
2056184
additional user stats in logging
zandre-eng Jul 10, 2024
e7ab359
Comment out assertion
Charl1996 Jul 24, 2024
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
Empty file added custom/benin/__init__.py
Empty file.
Empty file.
Empty file.
Original file line number Diff line number Diff line change
@@ -0,0 +1,234 @@
import math
import time
import logging

from django.core.management.base import BaseCommand

from casexml.apps.case.mock import CaseBlock
from dimagi.utils.chunked import chunked

from corehq.apps.es.cases import CaseES
from corehq.apps.hqcase.utils import submit_case_blocks
from corehq.apps.locations.dbaccessors import get_users_by_location_id
from corehq.apps.locations.models import SQLLocation
from corehq.apps.users.user_data import prime_user_data_caches
from corehq.util.log import with_progress_bar
from custom.benin.tasks import process_updates_for_village_async

LOCATION_TYPE_VILLAGE = "village"
LOCATION_TYPE_RC = "rc"
ajeety4 marked this conversation as resolved.
Show resolved Hide resolved

logger = logging.getLogger("custom_benin_script")


class Command(BaseCommand):
help = 'Migrate benin project\'s users and their cases to new rc level locations'

def add_arguments(self, parser):
super(Command, self).add_arguments(parser)

parser.add_argument('domain')
parser.add_argument(
'--village_id'
)
parser.add_argument(
'--commune_id'
)
parser.add_argument(
'--dry_run',
action='store_true',
default=False,
help="A dry run to only share the updates that would happen",
)
parser.add_argument(
'--run_in_celery',
action='store_true',
default=False,
help="Will spawn a celery task for each village",
)

def handle(self, domain, **options):
"""
Steps:
1. Fetch all villages (location type: Village).
2. For each village:
1. Fetch all users assigned to the village with usertype 'rc'
2. For each user
1. find the corresponding RC under the village with name same as user's user data in rc_number.
Log error if no matching RC, and move to next user
2. if RC present
1. Find all OPEN cases (case_type: menage, membre)
1. owned by village
2. opened_by the user (Use CaseES.opened_by)
2. Find all OPEN cases (case_type: seance_educative, fiche_pointage)
1. opened_by the user (Use CaseES.opened_by)
2. Why are we updating it though? They are already owned by users. They aren't many though
3. Update all cases
1. Update owner to be the corresponding RC location
4. Update users location to corresponding RC location only after cases to enable
retry on this update in case of any intermittent failures
"""
dry_run = options['dry_run']
village_id = options['village_id']
commune_id = options['commune_id']
run_in_celery = options['run_in_celery']

# assert not village_id and commune_id, "Provide either village ID or commune ID, cannot use both"

start_time = time.time()
logger.info("Started processing of script")
if village_id:
villages = SQLLocation.active_objects.get_locations([village_id])
elif commune_id:
villages = _find_child_locations(
domain=domain,
location_id=commune_id,
location_type_code=LOCATION_TYPE_VILLAGE
)
else:
villages = _find_locations(domain=domain, location_type_code=LOCATION_TYPE_VILLAGE)
logger.info(f"Total number of villages found: {len(villages)}")

for village in villages:
if run_in_celery:
process_updates_for_village_async.delay(domain, village.id, dry_run)
else:
process_updates_for_village(domain, village.id, dry_run)

if run_in_celery:
logger.info("Celery tasks queued for all villages.")
else:
logger.info(f"Processing completed. Total execution time: {(time.time() - start_time):.2f}s")


def process_updates_for_village(domain, village_id, dry_run):
village = SQLLocation.active_objects.get(pk=village_id)
logger.info(f"Starting updates for village {village.name}")
users = _find_rc_users_at_location(domain, village)
logger.info(f"Total number of users in village {village.name}: {len(users)}")
for user in users:
user_rc_number = user.get_user_data(domain).get('rc_number')
user_user_type = user.get_user_data(domain).get('usertype')
user_stats_str = ' '.join([
'Active' if user.is_active else 'Deactivated',
user_rc_number if user_rc_number else 'RC:None',
user_user_type if user_user_type else 'Type:None'
])
if user_rc_number:
try:
new_user_rc_location = _find_child_location_with_name(
parent_location=village,
location_name=user_rc_number
)
except MultipleMatchingLocationsFound:
logger.error(f"[{user_stats_str}] Multiple matching locations found for user "
f"{user.username}:{user.user_id} with rc number {user_rc_number}")
else:
if new_user_rc_location:
_update_cases(domain=domain, user=user, current_owner_id=village.location_id,
new_owner_id=new_user_rc_location.location_id,
dry_run=dry_run)
_update_users_location(user=user, existing_location=village,
new_location=new_user_rc_location, dry_run=dry_run)
logger.info(f"[{user_stats_str}] User {user.username}:{user.user_id} updates completed.")
else:
logger.error(f"[{user_stats_str}] User {user.username}:{user.user_id} rc "
f"{user_rc_number} location not found")
else:
logger.error(f"[{user_stats_str}] User {user.username}:{user.user_id} missing rc number")
logger.info(f"Updates for village {village.name} processed.")


def _find_locations(domain, location_type_code):
return SQLLocation.active_objects.filter(
domain=domain,
location_type__code=location_type_code
)

def _find_child_locations(domain, location_id, location_type_code):

Check failure on line 148 in custom/benin/management/commands/migrate_users_and_their_cases_to_new_rc_level.py

View workflow job for this annotation

GitHub Actions / Flake8

custom/benin/management/commands/migrate_users_and_their_cases_to_new_rc_level.py#L148

Expected 2 blank lines, found 1 (E302)
loc = SQLLocation.active_objects.get(domain=domain, location_id=location_id)
return loc.get_descendants().filter(location_type__code=location_type_code)

def _find_rc_users_at_location(domain, location):

Check failure on line 152 in custom/benin/management/commands/migrate_users_and_their_cases_to_new_rc_level.py

View workflow job for this annotation

GitHub Actions / Flake8

custom/benin/management/commands/migrate_users_and_their_cases_to_new_rc_level.py#L152

Expected 2 blank lines, found 1 (E302)
# return users with usertype as 'rc'
users = _find_users_at_location(domain, location)
users = prime_user_data_caches(users, domain)

return [
user
for user in users
if user.get_user_data(domain).get('usertype') == 'rc'
]


def _find_users_at_location(domain, location):
return get_users_by_location_id(domain, location.location_id)


def _find_child_location_with_name(parent_location, location_name):
# find location under parent location that has the name location_name
locations = parent_location.get_descendants().filter(
name=location_name
)
if not locations:
return None
if len(locations) == 1:
return locations[0]
if len(locations) > 1:
raise MultipleMatchingLocationsFound


def _update_cases(domain, user, current_owner_id, new_owner_id, dry_run):
case_types = ['menage', 'membre', 'seance_educative', 'fiche_pointage']
for case_type in case_types:
case_ids = _find_case_ids(case_type=case_type, owner_id=current_owner_id, opened_by_user_id=user.user_id)

logger.info(f"Updating {len(case_ids)} {case_type} cases for user {user.username}")
if case_ids:
for case_ids in with_progress_bar(
chunked(case_ids, 100),
length=math.ceil(len(case_ids) / 100),
oneline=False
):
_update_case_owners(domain, case_ids, new_owner_id, dry_run)


def _update_case_owners(domain, case_ids, owner_id, dry_run):
case_blocks = []
for case_id in case_ids:
case_blocks.append(
CaseBlock(
create=False,
case_id=case_id,
owner_id=owner_id
).as_text()
)
if not dry_run:
submit_case_blocks(
case_blocks=case_blocks,
domain=domain,
device_id=__name__
)


def _find_case_ids(case_type, owner_id, opened_by_user_id):
# find ids for open cases of a case type owned and opened by specific user
return (
CaseES()
.case_type(case_type)
.owner(owner_id)
.opened_by(opened_by_user_id)
.is_closed(False)
zandre-eng marked this conversation as resolved.
Show resolved Hide resolved
.get_ids()
)


def _update_users_location(user, existing_location, new_location, dry_run):
if not dry_run:
user.set_location(new_location)
user.unset_location_by_id(existing_location.location_id)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

oh this was needed?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, based on the discussion with Adam, it was decided to have only the rc location for the mobile worker.
(Side note: It also helps issues (location ambiguity) with Case Sharing to have only 1 location that own cases)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

okay, good to not do this if existing_location is same as new_location, Not sure if that is possible or not.

Copy link
Contributor

@ajeety4 ajeety4 Jul 1, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

new_location will always be different from existing_location for this script because it is child location of the existing location.

logger.info(f"User {user.username}:{user.user_id} location updated to {new_location.location_id}")


class MultipleMatchingLocationsFound(Exception):
pass
8 changes: 8 additions & 0 deletions custom/benin/tasks.py
Original file line number Diff line number Diff line change
@@ -0,0 +1,8 @@
from corehq.apps.celery import task


@task(queue='soltech_custom_script_queue', ignore_result=True)
def process_updates_for_village_async(domain, village, dry_run):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

celery is great approach. Just one thing to note is that once things are in celery, you won't have any visibility/control on the updates as you lose the logging and once the tasks are queued up you don't know when they would get picked up.

Just good to consider that before you finalize the approach.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Noted. That is a valid concern.
I have added a new logger (file) for the script however I am not sure if it work when tasks are run across multiple celery machines.

from custom.benin.management.commands.migrate_users_and_their_cases_to_new_rc_level import \
process_updates_for_village
process_updates_for_village(domain, village, dry_run)
12 changes: 12 additions & 0 deletions settings.py
Original file line number Diff line number Diff line change
Expand Up @@ -405,6 +405,7 @@
'custom.hmhb',

'custom.ccqa',
'custom.benin',

'corehq.extensions.app_config.ExtensionAppConfig', # this should be last in the list
)
Expand Down Expand Up @@ -1410,6 +1411,12 @@ def _pkce_required(client_id):
'maxBytes': 10 * 1024 * 1024,
'backupCount': 20
},
'custom_benin_script': {
'level': 'DEBUG',
'class': 'logging.handlers.RotatingFileHandler',
'formatter': 'simple',
'filename': "%s/%s" % (FILEPATH, "custom_benin_case_updates.log"),
},
},
'root': {
'level': 'INFO',
Expand Down Expand Up @@ -1505,6 +1512,11 @@ def _pkce_required(client_id):
'handlers': ['file'],
'level': 'INFO',
'propagate': False,
},
'custom_benin_script': {
'handlers': ['custom_benin_script', 'console'],
'level': 'DEBUG',
'propagate': False,
}
}
}
Expand Down
Loading