Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

GeoPackage does not specify a SQLITE_MAX_COLUMN value in the standards document #685

Open
KRyden opened this issue Aug 5, 2024 · 1 comment

Comments

@KRyden
Copy link

KRyden commented Aug 5, 2024

The GeoPackage standard does not appear to specify a MAX_COLUMN value for the maximum number of columns in a table. This is something that should be added to section 1.1 of the document where base definitions of the SQLite container are defined.

SQLite defines a "SQLITE_MAX_COLUMN" value which is defaulted to 2000. From the SQLite documentation:

“The default setting for SQLITE_MAX_COLUMN is 2000. You can change it at compile time to values as large as 32767. On the other hand, many experienced database designers will argue that a well-normalized database will never need more than 100 columns in a table.”

Why is this an issue for GeoPackage? If everybody builds their SQLite install with SQLITE_MAX_COLUMN of 2000, and tells their users that's the limit, there is no interoperability problem due to too many columns - nobody will have GeoPackages with more than 2000 columns. If somebody says "I can fix your column limit problem" and bumps SQLITE_MAX_COLUMN to a higher value, then we'll see GeoPackages with some number of columns greater than the default of 2000, and the those instances of SQLite built with the default 2000 value will not be able to select or operate on the tables with greater column counts.

So, what should we do?

  1. I think we need to settle on a value for SQLITE_MAX_COLUMN for conforming GeoPackages. I would recommend the SQLite Default of 2000 because that's already in wide use in the field.

  2. We should put out a call to GeoPackage implementers to see if anybody is changing this value in their implementations. This will help us understand if there's a wider issue waiting to surface in the field, or if some larger value for the GeoPackage definition of SQLITE_MAX_COLUMN should be considered.

  3. We need to document the GeoPackage recommendation (or requirement if we chose to make it a hard limit) for SQLITE_MAX_COLUMN and describe how changing this value will impact the interoperability of GeoPackages.

  4. We need to check the SITE testing to see what it does with GeoPackages with more than 2000 columns - it probably raises an undefined SQL parse error - but we haven't tested this yet.

  5. Implementers may want to check how their software reacts to too many columns for their SQLite instance to see if the error message raised is informative enough for the user to understand what's up with their GeoPackage.

We did not run across this issue as a result of a GeoPackage from another supplier. This surfaced when one of our users tried to export data from a source that supported more than 2000 columns in a table, and the underlying SQLITE_MAX_COLUMN limit was exceeded. On further review - the concerns discussed above about needing to define column count limits in GeoPackage to ensure interoperability surfaced - had we simply increased SQLITE_MAX_COLUMN to accommodate this users dataset, it's highly likely others would have been unable to use it.

@rouault
Copy link
Contributor

rouault commented Aug 5, 2024

+1 to recommend/mandate 2000 columns max for interoperability (at least while this remains the default of SQLite3)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants