Add back users.nearby column as a dummy #2543
Comments
Well I was going to reply to your other comment but as you've decided to duplicate that with this ticket I'll reply here instead... My opinion is that we can't be held hostage by a package like osmosis that is essentially dead. Loading data into an API database is not something most people ever have any need to do so I don't see this as an urgent problem to solve. Making changes to the user table in the production database is also a bit of a pain so I try and avoid doing unnecessary changes there. |
Oh and you missed |
I agree it’s a niche need, but it’s worth making a special effort to support people modifying OSM website code and using DB extracts to try out their changes. We want more people to be able to contribute here and leaving Osmosis incompatible with the website API DB is going to be a major missing stair for this user population. |
The thing is that osmosis loading has never worked very well and tends to just lead to frustrated users which is why I try and discourage it. |
Sorry to hear you've been unsuccessful! It remains one of the only mainstream API DB write paths that I’m aware of, so it’s likely that we are stuck with it. |
Rewriting this part of osmosis based on libosmium could be a good GSoC project maybe? Needs someone with good C++ fu, though. |
Could be interesting! That, or providing a supported piece of code inside this repository to write to the API DB. |
We should also evaluate whether the new flex backend in osm2pgsql could support this. Then it would be a rather simple task of writing some Lua code. |
Loading data into an API database is relatively easy if the database is empty - creating user records is the hardest thing to deal with. Where it gets really problematic, and where things usually fell apart for people, is when they try and load data into a database that isn't empty... |
Makes sense, thanks! We’re still going to have a pretty core compatibility problem here in the meantime. What do we think about re-introducing |
You yourself described it as "niche" a few hours ago and now it has escalated to a "core compatibility problem" for some reason? I have already explained why I don't particularly want to revert this. I'm also not keen on adding a way to load data to this repository. I have no objection to a third party tool but adding it here will require us to maintain something that we're never going to use. |
My recommendation for the time being would be to introduce an additional column for the import via plain old PostgreSQL command line tools:
That would allow you to run osmosis as is, and avoids any additional trouble for production. Also, we don't need any Rails db migration. Trying to keep this simple. We could add this as a hint in https://github.com/openstreetmap/openstreetmap-website/blob/34dd2293db85e28b7e5df0889b0b778a685306bb/CONFIGURE.md#populating-the-database, mentioning that osmosis doesn't reflect the latest db schema changes. |
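A minimal sketch of the workaround described above, written in Python only to make the order of the steps explicit. It is not the original command listing from this comment. The database name `openstreetmap`, the extract file name, the `integer` column type, and the helper name `workaround_commands` are all assumptions for illustration, and the Osmosis arguments are abbreviated (connection options such as host, user and password are left at their defaults).

```python
import shlex


def workaround_commands(database="openstreetmap",
                        extract="extract.osm.pbf",
                        column_type="integer"):
    """Return the three shell commands of the workaround, in order.

    Nothing is executed here; the caller decides how to run them.
    All default values are assumptions, not taken from the thread.
    """
    # Step 1: temporarily add the column that old Osmosis releases expect.
    add_column = ["psql", "-d", database, "-c",
                  f"ALTER TABLE users ADD COLUMN nearby {column_type};"]
    # Step 2: run the unmodified Osmosis import with the --write-apidb task.
    run_osmosis = ["osmosis", "--read-pbf", extract,
                   "--write-apidb", f"database={database}",
                   "validateSchemaVersion=no"]
    # Step 3: drop the dummy column again so the schema matches the website.
    drop_column = ["psql", "-d", database, "-c",
                   "ALTER TABLE users DROP COLUMN nearby;"]
    return [add_column, run_osmosis, drop_column]


if __name__ == "__main__":
    # Print the commands for review instead of running them.
    for argv in workaround_commands():
        print(shlex.join(argv))
```

Printing the commands rather than executing them keeps the sketch safe to try; the column only needs to exist for the duration of the import, so no Rails migration is involved.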
Sorry, I should have been clearer: this concern is core to a specific niche that we claim to care a lot about. There aren’t a lot of people who will encounter this issue as potential contributors to OSM-website but we want it to be painless for 100% of those people who do. |
I have actually hit exactly the same problem just now. The trick by @mmd-osm with a temporary addition of the missing column before invocation of It is very sad that A working method to import into the DB via a command line interface is important for OSM contributors (and for the OSM project in the long run); let me describe my collaboration use case. I need the ability to import data in OSM format into a private instance of the OSM API database that I've set up. I plan to share a bunch of vector features obtained from a CC0-licensed source with a number of contributors. On my private OSM API instance, these several people can collaborate on improving these features before uploading them to the public OSM.org database. I am essentially recreating WFS-like functionality in the form of an OSM API instance. Using the same JOSM editor and two alternating API URLs, a user does the following steps.
I hope my explanation is not overly complicated to grasp (it's a bit late here to think clearly...) There are a few remaining technicalities to solve (like negative IDs for "new" primitives etc.), but nothing unsolvable. |
You can describe the complicated things you are doing as much as you like, but the scope of this project is to develop the www.openstreetmap.org web site. Some people choose to take the code and run their own clones of www.openstreetmap.org and we do our best to help them where we can but at the end of the day the primary goal takes priority. Trying to do bidirectional data migration between the main database and a mirror is definitely way out of scope and not something we would want to encourage at all. |
Probably an issue with colliding user ids. The documentation I mentioned assumes that you're only creating users after running osmosis. It even explicitly states: "After installing but before creating any users or data, import an extract with Osmosis and the --write-apidb task." What you describe sounds quite involved. You might want to check out RapiD editor for conflation: once a user copies over an element from an external vector source and uploads it to OSM, that feature would be automatically removed from the external data source without any need for user intervention. @migurski might be a good contact to talk to about this ;) |
@tomhughes, thanks for replying!
Surely I understand the main goal here and value your contributions to it.
Data migration here is actually unidirectional:
What happens in the reverse direction is a cleanup to prevent double work. Essentially, primitives are "moved", not "copied" from private API to public API. I go through all these troubles to try a new technique of collaboration over an open data source. Consider this in the context of the OSM project philosophy:
A single person can do the inspection locally in JOSM: the editor does support the notion of layers. However, typical data imports are large enough to be worth splitting among many people. So, a distributed second data "layer" is needed for them to collaborate. Traditionally, this is done via ad-hoc measures like a shared folder with a bunch of smaller OSM files for everyone to grab, inspect and upload. But shared folders and other filesystem-based collaboration workflows start to suck as soon as the amount of data grows. That is why databases were invented, and this is what I am trying to use here. |
Ah, silly me. Thanks for pointing that out! Anyway, it was a minor hurdle compared to other stuff I had to deal with, given how n00b I am with Ruby and Postgres. I am still surprised I managed to bring the API server up, and it works!
Yes, this is exactly what I plan to achieve. Having the "removal" phase automated is what I currently lack. I thought that RapiD was primarily meant to import AI-traced roads, but maybe I should give it a deeper look. Thanks! |
@grigory-rechistov I think we've wandered a long way from the topic here. It's enough to know that you've used the osmosis apidb import, and that your use-case (as explained already by @tomhughes ) isn't one that we explicitly support. Beyond that, I'm afraid the rest of the details of your setup and future plans are off-topic. |
I agree with all the above statements, which is frustrating since there's no obvious way forward. My magic wand (if I had one) would conjure up someone to spin an osmosis 0.47.1 release with just the patch in openstreetmap/osmosis#54 applied and then we can move on. I really don't want to add an unused column to our database just to satisfy a broken client that should never have been using this column in the first place. The fact that it is unmaintained suggests that the correct approach is going to be to either find a new maintainer or find replacement software. And osmosis has never really worked very well for this anyway, due to both conflicts with existing data, and leaving sequences broken, and all that kind of stuff. Given the existing need to run "fixup" SQL commands for the sequences when running osmosis, I think we should just update the documentation to add in the additional SQL commands (along with clearer warnings about the downsides, like messing up dbs with existing geodata, user accounts or both) and move on.
The problem is that adding additional data to the db is simple only when the db is already empty. Any other situation becomes complex to the point of being a significant task. I suspect you need the user to decide what to do when e.g. node_id 10,003 already exists (overwrite it? or create a new node, and rewrite all references in the current import), or when user_id = 2 is already in the database - use this user for all the related data in the import, or create a new user for stuff coming from the import, and so on and so on. All these decisions have big implications for what the user is trying to do, when anyone wants to use it beyond just filling an empty db. So this suggests a non-trivial piece of software with many options, tests and complexity, that's not really suitable as a script in this repo. |
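The sequence "fixup" step mentioned above can be sketched roughly as follows. This is an illustration rather than the documented command list: the sequence, table and column names in `SEQUENCES` and the helper name `sequence_fixups` are assumptions and should be checked against the actual schema. `setval` itself is a standard PostgreSQL function.

```python
# Hypothetical mapping: sequence name -> (table, id column).
# Verify these names against the real schema before using them.
SEQUENCES = {
    "users_id_seq": ("users", "id"),
    "changesets_id_seq": ("changesets", "id"),
    "current_nodes_id_seq": ("current_nodes", "id"),
}


def sequence_fixups(sequences=SEQUENCES):
    """Build SQL that moves each sequence to the highest id in its table."""
    statements = []
    for seq, (table, column) in sequences.items():
        # COALESCE keeps setval valid on an empty table, where MAX() is NULL;
        # the next generated id would then be 2.
        statements.append(
            f"SELECT setval('{seq}', "
            f"(SELECT COALESCE(MAX({column}), 1) FROM {table}));"
        )
    return statements


if __name__ == "__main__":
    # Print the statements so they can be reviewed and piped into psql.
    print("\n".join(sequence_fixups()))
```

This only repairs the sequences after an import into an empty database. It does nothing about the harder cases described above, such as colliding node or user ids in a database that already holds data.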
Bummer, but okay. What do you think would be required to release an Osmosis 0.47.1? I’m not current on the relationship that we may have had with distributors to get past versions out. |
Last sign of life was in October 2018: https://lists.openstreetmap.org/pipermail/osmosis-dev/2018-October/001847.html "If anybody wants to take on release management duties, let me know, and I'll help them get access to some of the key infrastructure such as the dev server for hosting distribution binaries, Maven Central artefact upload process, and GitHub git repository." Nobody volunteered, meaning it's dead now. |
I've done this and couldn't figure out the error "Unable to insert user with id = -1 into the database." when importing Geofabrik data into a local OSM website database. |
Please try again with a Geofabrik extract that includes valid metadata (called "internal" on their website). You need to log on with your OSM user account to download this file. By the way, this has really nothing to do with the issue here, please create a new issue next time. |
It was an issue with the version. I just installed the latest one and everything ran smoothly. |
As of version 0.47.1, Osmosis now correctly omits the |
We removed users.nearby in #2439, but this new API DB schema breaks the write capability of Osmosis. I opened a PR on Osmosis in openstreetmap/osmosis#54, where @simonpoole pointed out that Osmosis is no longer maintained. @mmd-osm adds that some other process could be developed to populate an API DB, but there’s no one true path at this time.

Can we add back a dummy users.nearby column to support the current best import path until a new one is decided upon? This issue duplicates parts of the discussion in #2449, which has been closed.