Flat Loops Shouldn't Break Opposing Edge Index #2385
Conversation
@@ -138,7 +138,8 @@ void ConstructEdges(const OSMData& osmdata,
   auto way_node = *way_nodes[current_way_node_index];
   const auto way = *ways[way_node.way_index];
   const auto first_way_node_index = current_way_node_index;
-  const auto last_way_node_index = first_way_node_index + way.node_count() - 1;
+  const auto last_way_node_index =
+      first_way_node_index + way.node_count() - way_node.way_shape_node_index - 1;
Because we can visit a way twice now (when it doubles back), we need to account for not starting at the 0th node of a way.
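A quick worked example of that index arithmetic, as a standalone sketch (the numbers are invented for illustration, this is not code from the PR):

```cpp
#include <cassert>
#include <cstdint>

int main() {
  // Illustrative numbers only: a way with 5 shape nodes that we re-enter at its
  // 3rd shape node, i.e. way_shape_node_index == 2.
  const uint64_t first_way_node_index = 100;  // wherever this way starts in the sequence
  const uint64_t node_count = 5;
  const uint64_t way_shape_node_index = 2;

  // New formula: only the remaining shape nodes (2, 3 and 4) are ahead of us.
  const uint64_t last_way_node_index =
      first_way_node_index + node_count - way_shape_node_index - 1;
  assert(last_way_node_index == 102);

  // The old formula assumed we always started at shape node 0 and would overshoot.
  const uint64_t old_last = first_way_node_index + node_count - 1;
  assert(old_last == 104);
}
```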
@@ -159,7 +160,7 @@ void ConstructEdges(const OSMData& osmdata,
   // Remember this edge starts here
   Edge prev_edge = Edge{0};
   Edge edge = Edge::make_edge(way_node.way_index, current_way_node_index, way);
-  edge.attributes.way_begin = true;
+  edge.attributes.way_begin = way_node.way_shape_node_index == 0;
We could go either way on this. @gknisely, what is preferable? I know this has implications for lane stuff.
Without testing, I am not sure.
  size += 1;
  // remember what edge this node will end, its complicated by the fact that we delay adding the
  // edge until the next iteration of the loop, ie once the edge becomes prev_edge
  uint32_t end_of = static_cast<uint32_t>(edges.size() + prev_edge.is_valid());
I tried to make this easier to understand. Let me know if the comment makes sense.
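To sanity-check that bookkeeping, here is a small standalone sketch; the `prev_edge_is_valid` flag stands in for `prev_edge.is_valid()` and the numbers are invented:

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

int main() {
  // Pretend 10 edges have already been pushed onto `edges`.
  std::vector<int> edges(10);

  // Case 1: prev_edge is a real, finished edge still waiting to be pushed.
  // It will go on first (index 10), so the edge ending at this node lands at 11.
  bool prev_edge_is_valid = true;
  uint32_t end_of = static_cast<uint32_t>(edges.size() + prev_edge_is_valid);
  assert(end_of == 11);

  // Case 2: prev_edge is the initial placeholder with no shape points; it never
  // gets pushed, so the current edge will land directly at index 10.
  prev_edge_is_valid = false;
  end_of = static_cast<uint32_t>(edges.size() + prev_edge_is_valid);
  assert(end_of == 10);
}
```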
@@ -200,27 +201,40 @@ void ConstructEdges(const OSMData& osmdata,
         prev_edge.attributes.way_prior = true;
       }

-      if (!edge.attributes.way_begin)
+      // We should add the previous edge now that we know its done
+      if (prev_edge.is_valid())
Again, I tried to make this clearer: if we have a prev_edge that is finished and waiting to get pushed on, we do so now because we are about to overwrite it, i.e. queue up the next one.
        way_node = next_way_node;
        ++current_way_node_index;
        doubled_back = current_way_node_index != last_way_node_index;
      }
The block above is where we end up seeing, and deciding to skip over, doubled-back sections of a way.
      // backed over itself and we need to skip it
      if (current_way_node_index == last_way_node_index || doubled_back) {
        edges.push_back(prev_edge);
        current_way_node_index += !doubled_back;
This is the trick: if we doubled back, we want to treat it as if we are processing a new way starting where the doubling back ends (i.e. splits off again). In the case where we are doubling back, the index has already been fast-forwarded (by the loop above) to the right node, so we don't need to increment it. In the other case, where we want the next way, we do need to increment.
src/mjolnir/pbfgraphparser.cc (Outdated)
  OSMNode osm_node{node};
  auto inserted = loop_nodes_.insert(std::make_pair(node, i));
  osm_node.duplicate_ =
      !inserted.second || (i != 0 && i != nodes.size() - 1 && nodes[i - 1] == nodes[i + 1]);
When parsing ways, we mark any node that a way references as "duplicate" if it was already referenced by the way once or more before, OR if it is a turn-around (the point at which the way doubles back).
Nit for readability: can you pull out (i != 0 && i != nodes.size() - 1 && nodes[i - 1] == nodes[i + 1]) into a variable and give it a meaningful name?
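One way that extraction could look, using the variables from the snippet above (the name `is_turn_around_node` is just a placeholder suggestion, not something from the PR):

```cpp
// Possible refactor of the condition; the variable name is a placeholder.
const bool is_turn_around_node =
    i != 0 && i != nodes.size() - 1 && nodes[i - 1] == nodes[i + 1];
osm_node.duplicate_ = !inserted.second || is_turn_around_node;
```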
src/mjolnir/pbfgraphparser.cc (Outdated)
@@ -288,6 +288,8 @@ struct graph_callback : public OSMPBF::Callback {
   sequence<OSMWayNode>::iterator element = (*way_nodes_)[current_way_node_index_];
   while (current_way_node_index_ < way_nodes_->size() &&
          (way_node = element = (*way_nodes_)[current_way_node_index_]).node.osmid_ == osmid) {
+    // we need to keep the duplicate flag that way parsing set
+    n.duplicate_ = way_node.node.duplicate_;
Because the way parser was only setting the node id for the nodes it referenced, you can see below that we just overwrite the whole thing with what we parsed from the node's tags. However, now that the way callback also writes this single bit saying it's a duplicate, we need to copy that forward before we overwrite the node data.
Currently we also track any ways that are loops in a file that is output when building tiles.
/**
 * @return true if the edge has at least 1 shape point
 */
bool is_valid() const {
  return attributes.llcount > 0;
}
In the loop where we create edges in the graph, we only add an edge in the next iteration of the loop. Because of this there is always a previous edge that we are tracking, but we only want to add it if it's not the initial placeholder (i.e. not a real edge). This little function helps make that easier to see: when you initialize an edge without actual data, you end up with an edge that has no shape (no nodes associated with it) yet, so the function uses that to decide whether the edge is valid or not.
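A small self-contained sketch of that pattern; the `Edge` struct here is a minimal stand-in for the real one, just to show how the sentinel behaves:

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Minimal stand-in for the real Edge struct (not Valhalla code).
struct Edge {
  struct { uint32_t llcount = 0; } attributes;
  bool is_valid() const { return attributes.llcount > 0; }
};

int main() {
  std::vector<Edge> edges;
  Edge prev_edge{};                  // initial bookkeeping edge: no shape points yet
  assert(!prev_edge.is_valid());     // so the first iteration will not push it

  prev_edge.attributes.llcount = 2;  // once shape points are added to a finished edge...
  if (prev_edge.is_valid()) {
    edges.push_back(prev_edge);      // ...it really gets pushed
  }
  assert(edges.size() == 1);
}
```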
@@ -1828,18 +1830,6 @@ struct graph_callback : public OSMPBF::Callback {
     bss_nodes_.reset(bss_nodes);
   }

-  // Output list of wayids that have loops
-  void output_loops() {
I removed this because it's no longer useful now that the code successfully handles loops. We can let other projects more suited to data cleanup handle calling out degeneracies in the data.
Should we pluck out a few previously loop ways and verify we can route on them successfully now?
LGTM, just a readability nit and a question.
  check_opposing(map);
}

TEST(loops, split_lolli) {
Love these test case names 😍
void check_opposing(const gurka::map& map) {
  // then we test the invariant that any edge should be the opposing edge of its opposing edge
  // edge == opposing(opposing(edge))
I'm wondering if we should also include this assertion somewhere in data processing to catch bad data in development. What do you think?
We kind of do that already! It's here: https://github.com/valhalla/valhalla/blob/master/src/mjolnir/graphvalidator.cc#L164-L179
It basically says: if I'm assigning an opposing index to an edge that already got one, that means there were duplicates.
But that's only logging a debug statement to the console, right? If violating the invariant edge == opposing(opposing(edge)) leads to invalid memory access when querying the service, it seems like we should consider that a fatal error at data processing time.
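For illustration only, a hard failure at build time could look something like the sketch below; the helper name, parameter and message are hypothetical and are not the existing graphvalidator code:

```cpp
#include <cstdint>
#include <stdexcept>
#include <string>

// Hypothetical helper: called when an edge would receive a second opposing index,
// i.e. the edge == opposing(opposing(edge)) invariant is about to be violated.
void fail_on_duplicate_opposing(uint64_t way_id) {
  throw std::runtime_error("duplicate opposing edge while validating way " +
                           std::to_string(way_id) + "; aborting tile build");
}
```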
This looks like a major change. We need to RAD test this in NA and EU.
I have to harden the tests on this one: https://www.openstreetmap.org/way/759927840#map=19/51.35764/4.63997. The updated logic does not handle the case where two nodes have been seen before but weren't adjacent, which means we lose those edges. Specifically, we need to track unique pairs of nodes to know whether or not they should be marked as duplicated.
      nodes[i + 1] == nodes[inserted.first->second - 1];
  bool unflattening = i > 0 && inserted.first->second < nodes.size() - 1 &&
                      nodes[i - 1] == nodes[inserted.first->second + 1];
  osm_node.flat_loop_ = flattening || unflattening;
In a previous revision of this PR, this got triggered whenever we found the point at which a double-back occurred or when a node was not unique. It turns out we need to allow nodes to be non-unique in the case of legitimate self-intersections, so instead we harden this to just the case where the ingoing and outgoing paths at a non-unique node are the same (i.e. it is a flat loop). The cool thing is that this also works for the node where the double-back occurs: even though it's not duplicated, it has the same ingoing and outgoing path. In this case, "path" means neighboring node ids.
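A small self-contained walk-through of those checks (node ids are invented, and the guards on `flattening` are assumed to mirror the `unflattening` guards shown in the snippet above):

```cpp
#include <cassert>
#include <vector>

int main() {
  // Flat loop: way shape A B C B D, encoded as small integers.
  std::vector<int> nodes = {1 /*A*/, 2 /*B*/, 3 /*C*/, 2 /*B*/, 4 /*D*/};
  size_t i = 3;            // second visit of B
  size_t first_visit = 1;  // where B was first seen (inserted.first->second)

  bool flattening = i + 1 < nodes.size() && first_visit > 0 &&
                    nodes[i + 1] == nodes[first_visit - 1];    // D == A -> false
  bool unflattening = i > 0 && first_visit < nodes.size() - 1 &&
                      nodes[i - 1] == nodes[first_visit + 1];  // C == C -> true
  assert(!flattening && unflattening);  // the second B gets flat_loop_ = true

  // A legitimate self-intersection (figure eight) A B C D B E also repeats B,
  // but both checks fail, so it is left alone.
  nodes = {1, 2, 3, 4, 2, 5};
  i = 4;
  first_visit = 1;
  flattening = i + 1 < nodes.size() && first_visit > 0 &&
               nodes[i + 1] == nodes[first_visit - 1];         // E == A -> false
  unflattening = i > 0 && first_visit < nodes.size() - 1 &&
                 nodes[i - 1] == nodes[first_visit + 1];       // D == C -> false
  assert(!flattening && !unflattening);
}
```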
Code looks good, approved pending RAD.
@gknisely can I get another look? I RAD'd the planet. Because of the time it takes for the planet to complete, the planet file used between runs was different. I used a test request file with 14k auto routes in it. Of those there were 166 differences. I manually reviewed 50 of them, and they all turned out to be data edits that either changed the shape slightly (so simple diffing failed) or changed connectivity slightly (so a detour was taken).
Loops strike again! Here is another problem of handling a data edge case (we seem to be hitting a lot of these lately). In this case we have a "flat loop": https://www.openstreetmap.org/way/747414277#map=19/42.56012/-70.96623 where you can see the way goes from node 6993574477 to 6993574481 and back to 6993574477. The tile building code should then force a graph node at node 6993574481, which should mean two separate pairs of edges: one pair from 6993574477 to 6993574481 and another pair from 6993574481 back to 6993574477. The last step in tile building should then link up the opposing edge indices between the two pairs of edges between those nodes. However, something is going wrong here: for some reason the code can't differentiate between the two pairs and ends up creating edges whose opposing edges don't match. Specifically, one invariant in the code should be that edge == opposing(opposing(edge)). This way breaks that, and the failing test confirms it, but I still need to figure out why the code can't differentiate. It may be the case that we will have to collapse flat loops as a special case in the parsing logic.

So, in the graph validator we hook up edges with their opposing edges by matching them on their begin and end nodes and their properties. In these cases the code sees two potential matching opposing edges and always takes the last one, so two edges end up picking the same opposing edge and sharing it. We could potentially devise a way to make the tie-breaker scenario work (pick the one that wasn't already picked), but instead what I did was make it so that we don't create duplicate edges in the first place.
but I still need to figure out why it cant differentiate. It may be the case that we will have to collapse flat loops as a special case in the parsing logic.So yeah in graph validator we go hook up edges with their opposing edges by matching them based on their begin and end nodes and their properties. In this case these cases the code will see two potential matching opposing edges and it always takes the last one. Because of this two edges end up picking the same opposing edge and sharing it. We could potentially devise a way to make the tie braker scenario work (pick the one that wasnt already picked) but instead what I did was make it so that we dont duplicate edges in the first place.