fix: make sure to include areas with disputed regions #74

Draft
wants to merge 1 commit into
base: main

Conversation

12rambau
Owner

@12rambau 12rambau commented Dec 3, 2024

Fix #73

Instead of loading only the identified ISO-3 code, I now naively discover all codes from the GeoJSON. I'm concerned about the row ordering and still need to verify that this does not introduce an artificial offset. It does not slow down the computation much.


@12rambau 12rambau marked this pull request as draft December 3, 2024 10:11
@julianwid

I think that by concatenating the DataFrames retrieved for each ISO in isos you create a row-order mismatch between level_gdf and complete_df, right? In complete_df the Z0X rows only appear at the end, but in level_gdf they may sit at any row index.
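To make the concern concrete, here is a minimal sketch with toy data. Everything in it is an assumption for illustration: the frames are plain pandas DataFrames standing in for level_gdf and complete_df, Z01 is a placeholder for one of the Z0X codes, and the names are invented. It only shows that copying names by row position misaligns once the disputed code is concatenated last.

```python
import pandas as pd

# Toy stand-in for level_gdf: the disputed code (Z01) sits between two countries.
level_df = pd.DataFrame({
    "GID_0": ["CHN", "Z01", "IND"],
    "NAME_0": ["china", "disputedArea", "india"],  # invented camelCase names
})

# Toy stand-in for the per-ISO name tables (hypothetical content).
per_iso = {
    "CHN": pd.DataFrame({"GID_0": ["CHN"], "NAME_0": ["China"]}),
    "IND": pd.DataFrame({"GID_0": ["IND"], "NAME_0": ["India"]}),
    "Z01": pd.DataFrame({"GID_0": ["Z01"], "NAME_0": ["Disputed Area"]}),
}

# Assumption: the disputed code ends up last in the list of ISO codes.
isos = ["CHN", "IND", "Z01"]
complete_df = pd.concat([per_iso[iso] for iso in isos], ignore_index=True)

# Copying names by row position ignores the GID_0 keys entirely.
by_position = level_df.assign(NAME_0=complete_df["NAME_0"].to_numpy())

# Rows where the key in level_df differs from the key at the same position
# in complete_df are the rows that received the wrong name.
mismatched = level_df["GID_0"].to_numpy() != complete_df["GID_0"].to_numpy()
print(by_position)
print(mismatched)
```

In this toy case only the first row keeps the right name; the Z01 row is labelled with the name of the row that happens to share its position.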

@12rambau
Owner Author

12rambau commented Dec 3, 2024

That's what I thought. In that case I can't avoid the reordering.

@julianwid

julianwid commented Dec 4, 2024

How about this:

isos = level_gdf.GID_0.dropna().unique()

# workaround for the wrong naming convention in the geojson files
# https://gis.stackexchange.com/questions/467848/how-to-get-back-spaces-in-administrative-names-in-gadm-4-1
# it should disappear in the next version of GADM
# we are forced to retrieve all the names from the df (sourced from .gpkg) to replace the ones from
# the geojson, which are all in camelCase

df_list = [Names(admin=iso, content_level=content_level, complete=True) for iso in isos]
complete_df = pd.concat(df_list)
# GID columns to merge on; they (should) match exactly
shared_cols = [f"GID_{i}" for i in range(int(content_level) + 1)]
# Camel-case columns to drop
drop_cols = [f"NAME_{i}" for i in range(int(content_level) + 1)]
gdf = pd.merge(
    level_gdf.drop(drop_cols, axis=1),
    complete_df,
    how='inner',
    on=shared_cols,
)

return gdf

In my hands, this works.
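For anyone who wants to double-check the key-based approach, the sketch below reproduces it on toy data. The frames, the GID values, the names, and the `value` column are all invented, and plain DataFrames stand in for level_gdf and the output of Names. It is not the snippet above verbatim: it swaps in `how="left"` together with `indicator=True` and `validate="one_to_one"` (standard `pd.merge` parameters), so that rows without a match are surfaced instead of silently dropped.

```python
import pandas as pd

content_level = 1  # hypothetical level for the toy data

# Toy stand-in for level_gdf, with camelCase names and the disputed row in the middle.
level_df = pd.DataFrame({
    "GID_0": ["CHN", "Z01", "IND"],
    "GID_1": ["CHN.1_1", "Z01.1_1", "IND.1_1"],
    "NAME_0": ["china", "disputedArea", "india"],
    "NAME_1": ["someProvince", "someRegion", "someState"],
    "value": [10, 20, 30],
})

# Toy stand-in for complete_df, with the disputed row concatenated last.
complete_df = pd.DataFrame({
    "GID_0": ["CHN", "IND", "Z01"],
    "GID_1": ["CHN.1_1", "IND.1_1", "Z01.1_1"],
    "NAME_0": ["China", "India", "Disputed Area"],
    "NAME_1": ["Some Province", "Some State", "Some Region"],
})

shared_cols = [f"GID_{i}" for i in range(content_level + 1)]
drop_cols = [f"NAME_{i}" for i in range(content_level + 1)]

merged = pd.merge(
    level_df.drop(columns=drop_cols),
    complete_df,
    how="left",             # keep every row of level_df, in its original order
    on=shared_cols,
    validate="one_to_one",  # raise if a GID combination is duplicated on either side
    indicator=True,         # adds a "_merge" column: "both" or "left_only"
)

# Rows of level_df that found no name in complete_df (expected to be empty).
unmatched = merged[merged["_merge"] != "both"]
print(merged)
```

Because the names are attached by GID key and not by position, the order of the concatenated per-ISO frames no longer matters, and an empty `unmatched` frame confirms that no area was lost.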

Successfully merging this pull request may close these issues.

ADM0 boundaries for china and india