Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

commit order must consider non-nullability of join columns as prioritised over nullable join columns: Take 2 #7180

Closed
wants to merge 14 commits into from
Closed

commit order must consider non-nullability of join columns as prioritised over nullable join columns: Take 2 #7180

wants to merge 14 commits into from

Conversation

bendavies
Copy link

@bendavies bendavies commented Apr 8, 2018

This is a resubmission of #6533, which attempted to fix #6499 (also #7006)

Note that that I consulted Marco on slack before doing this.

I don't want to step on anyones toes, but the original PR has seemed to have gone stale and come to a halt. Maybe the original submitter has lost interest/moved on (his fork appears to be gone) or whatever (not judging), but regardless, I'd like to pick this back again and try and get it over the line.

I've maintained the original commits from the PR as well as some of Marco's styling fixes.

#6533 was merged and then reverted (#6533 (comment)).
The Tests reveal the problem, CascadeRemoveOrderTest. If anyone has any idea how this can be fixed before I start poking around, please give me a heads up.

$calc->addDependency(
$targetClass->name,
$class->name,
(int) (($joinColumns['nullable'] ?? true) === false)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think I wrote this ternary, and honestly I should be ashamed :-\

This should be split.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

afraid so ;).

I can split it out, but feel free to push to my branch if you have a preferred style.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

@Ocramius
Copy link
Member

Ocramius commented Apr 9, 2018

Restarted zeh build

@bendavies
Copy link
Author

taking a look at the commit order of the entities in CascadeRemoveOrderTest, they are reversed with this patch on the remove -> flush.

Copy link
Member

@lcobucci lcobucci left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just some small issues, which won't really solve the mystery...

/** @Id @Column(type="integer") @GeneratedValue */
public $id;

/** @ManyToOne(targetEntity="DDC6499A") */
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

DDC6499A::class

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fixed

use Doctrine\Tests\OrmFunctionalTestCase;

/**
* @group #6499
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This shouldn't contain the #

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fixed

$calc->addDependency(
$targetClass->name,
$class->name,
(int) $joinColumnsNotNullable
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd say that $joinColumnsNullable === false ? 1 : 0 is already readable enough, but its just my machine brain talking 😂

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yeah, that seems fine.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fixed

/** @Id @Column(type="integer") @GeneratedValue */
public $id;

/** @ManyToOne(targetEntity="DDC6499A") */
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

DDC6499A::class

@@ -1255,8 +1255,14 @@ private function getCommitOrder(array $entityChangeSet = null)
}

$joinColumns = reset($assoc['joinColumns']);
$joinColumnsNullable = $joinColumns['nullable'] ?? true;
Copy link
Member

@lcobucci lcobucci Apr 9, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Your test and CascadeRemoveOrderEntityG are kind of mutually exclusive... yours expects that the lack of a @ORM\JoinColumn annotation will lead to 0 as weight and the existing one expects it to be 1. However both scenarios seem to be valid...

@guilhermeblanco what do you think about this?

Copy link
Author

@bendavies bendavies Apr 9, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the original thinking is in the description of #6533

Copy link
Author

@bendavies bendavies Apr 9, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Taking a look (confirming your comment), the patch introduces a difference to the weight for when nullable is not defined. old weight would be 1, new is 0.

https://3v4l.org/ksRoj

so the fix seems to correct the bug. a not defined nullable should be created as true.
why this breaks cascade removals is obvious, but how do we fix it?

Copy link
Author

@bendavies bendavies Apr 9, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

btw, the original flawed (?) nullable check was introduced by @guilhermeblanco here: #1570, which was fixing a cascade remove bug!

use Doctrine\Tests\OrmFunctionalTestCase;

/**
* @group #6499
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It should be @group 6499 instead

@bendavies
Copy link
Author

bendavies commented Apr 10, 2018

Ok, I've taken a deeper look at why CascadeRemoveOrderTest is failing, and it is as follows:

  1. CascadeRemoveOrderEntityO has a nullable FK to CascadeRemoveOrderEntityG
  2. CascadeRemoveOrderEntityG has a nullable FK to CascadeRemoveOrderEntityO

This results in the CommitOrderCalculator having a node list of
CascadeRemoveOrderEntityO -> CascadeRemoveOrderEntityG, weight 0
CascadeRemoveOrderEntityG -> CascadeRemoveOrderEntityO, weight 0.

i.e this graph as 2 valid commit orders. It just so happens that with the fix in this PR, the "wrong" one is chosen, but we have no way of knowing it's the wrong/right one.

It seems that CascadeRemoveOrderTest only passes currently on 2.6 by depending on the bug that a non defined nullable is treated as nullable=false.
Indeed, if you add @JoinColumn(nullable=true) to CascadeRemoveOrderEntityG::$ownerO, (essentially changing nothing), CascadeRemoveOrderTest will fail on 2.6/master.

So it seems we need to do something about the test. the fix in this PR seems valid.

Guidance appreciated @Ocramius @lcobucci @guilhermeblanco

@arnaud-lb
Copy link
Contributor

I believe that the changes proposed in the PR are not enough to fix the problem, unfortunately.

With this PR applied, the code at https://gist.github.com/arnaud-lb/2ac4a3007bba08ca5f8fa78cfc353cf5 will generate the following graph:

<?php

$calc = new CommitOrderCalculator();

$calc->addNode('A', 'A');
$calc->addNode('B', 'B');
$calc->addNode('C', 'C');

// B referenced by A
$calc->addDependency('B', 'A', 1);

// A referenced by B (nullable)
$calc->addDependency('A', 'B', 0);

// A referenced by C
$calc->addDependency('A', 'C', 1);

The output of $calc->sort() is C, B, A. Since C has a non-nullable reference to A, trying to insert C will not work.

The problems seems to arise from the fact that we are trying to sort a cyclic graph topologically. When the execution reaches a state that shouldn't happen in the orignal algorithm (case self::IN_PROGRESS), the implementation doesn't abort, and nodes may be added to the sorted node list prematurely.

Ignoring the nullable references when building the graph seems to creates an acyclic graph in my case (Is a cycle of non-null references possible?), and the commit order is correct (on a quite complex schema). The order may not be optimal, though, and I don't know if it works for entity removal.

@bendavies
Copy link
Author

thanks for that @arnaud-lb.
I'll add failing test cases to this PR.

@lcobucci
Copy link
Member

@bendavies I think that we might consider to have a more complicated logic to calculate the weight of the dependencies. I mean, the CommitOrderCalculator is flexible enough to deal with other values (not only 0 and 1) and we might be able to solve this issue by increasing the weight based on certain criteria, like:

$weight = 0;

if ($joinColumnIsNullable) {
    ++$weight;
}

if ($isSomethingElse) {
    ++$weight;
}

// ...

return $weight;

@bendavies
Copy link
Author

bendavies commented Apr 12, 2018

hah, i managed to hack in a fix for @arnaud-lb test/problem.

@bendavies
Copy link
Author

bendavies commented Apr 12, 2018

@lcobucci I'm thinking that CascadeRemoveOrderTest is unsolvable by the CommitOrderCalculator.
CascadeRemoveOrderEntityO has a nullable FK to CascadeRemoveOrderEntityG
CascadeRemoveOrderEntityG has a nullable FK to CascadeRemoveOrderEntityO

O's FK could be populated, which means we need to delete O first, then G.
G's FK could be populated, which means we need to delete G first, then O.
Both O's and G's FK could be populated, in which case neither can be deleted first - It must be a 2 phase flush.

EDIT: I've just seen that CascadeRemoveOrderEntityO has onDelete="SET NULL" on it's FK. Could the CommitOrderCalculator take that into account? However, if it didn't have onDelete="SET NULL", the problem goes back to being unsolvable?

The only think i can think of (but i'm not very familiar with UOW) is to add scheduleExtraUpdate support for nullable foreign keys when removing entities, so you first null the FK, then remove the entity. I'm sure someone more experienced can tell me why that's a stupid idea.

@arnaud-lb
Copy link
Contributor

hah, i managed to hack in a fix for @arnaud-lb test/problem.

Awesome! Will it handle cycles involving more than 2 nodes, though?

@bendavies
Copy link
Author

bendavies commented Apr 13, 2018 via email

@arnaud-lb
Copy link
Contributor

Sure, here is a new failing test with a 3 nodes cycle: arnaud-lb@c7facac

@bendavies
Copy link
Author

Thanks @arnaud-ld. What is the correct commit order for that, to save me trying to work it out?

@arnaud-lb
Copy link
Contributor

F, E, D, G would work as a commit order. Removing the addDependency() call when $joinColumnsNullable is true yields this order. However, I believe that doing this would cause more scheduleExtraUpdate than necessary with real world entities.

@Ocramius
Copy link
Member

@bendavies I think that we might consider to have a more complicated logic to calculate the weight of the dependencies. I mean, the CommitOrderCalculator is flexible enough to deal with other values (not only 0 and 1)

@lcobucci that opens a can of worms IMO, as we start defining weights without having a clear idea of what the weights actually are.

It must be a 2 phase flush.

These are scenarios in which the ORM should probably kick in at metadata validation time... Would be interesting to see how far we can go with defining the commit order before runtime: possibly a completely different project?

EDIT: I've just seen that CascadeRemoveOrderEntityO has onDelete="SET NULL" on it's FK. Could the CommitOrderCalculator take that into account? However, if it didn't have onDelete="SET NULL", the problem goes back to being unsolvable?

In short: no. Not all RDBMSs in use support onDelete or foreign keys in general. To take a simple one: MySQL's onDelete is already limited to 16 cascades, so it can't really be safely relied upon. We can't rely on that for now, sorry.

@bendavies
Copy link
Author

@Ocramius thanks for that. Do you have any opinion on CascadeRemoveOrderTest specifically? I still don't think the commit order calculator can determine the correct order for the classes in that test from class metadata alone, as it stands.

@Ocramius
Copy link
Member

Do you have any opinion on CascadeRemoveOrderTest specifically?

It seems like the test was faulty due to missing explicit nullable=true there. Not much to do there: we simply unveiled a hidden issue here. I don't think it should be changed for now, as users might be affected.

@Ocramius Ocramius added the Bug label Apr 13, 2018
@bendavies
Copy link
Author

bendavies commented Apr 14, 2018

I've taken in @arnaud-lb failing tests and added the same scenario to the commit order calculator.
The graph is this:
image
It's obvious what the path should be from looking at it: F, E, D, G. The commit order calculator isn't currently clever enough to work this out.

@bendavies
Copy link
Author

bendavies commented Apr 14, 2018

As above, at the moment the CommitOrderCalculator is too naive to work out some quite simple graphs.
I have a POC that fixes the above issues and improves the CommitOrderCalculator so that it solves the above graph and more complex ones.

Basically:

  1. Find the Minimum spanning tree of the Graph.
  2. Perform a Topological Sort of this graph to get your commit order.

Edit: I'm not sure this is appropriate actually. MST is for unidirected graphs while ours is directed.

However, there is a slight complication in that the CommitOrderCalculator could contain multiple unconnected graphs. MST requires a connected graph to perform on.
So before performing MST and sort, the distinct graphs need to be identified, and the above steps performed on each graph.

The sorts of these unconnected graphs can then be returned in any order as they are not dependent.

The above is a not a trivial undertaking so I don't want pursue it if someone thinks it's a bad idea or can see a really simple fix to the current impl that I've overlooked.

If @guilhermeblanco could provide some feedback before I try and take this any further, that would be great, since he implemented the CommitOrderCalculator in its current form.

@lcobucci
Copy link
Member

lcobucci commented Apr 14, 2018

The commit order calculator isn't currently clever enough to work this out.

@bendavies @Ocramius the commit order calculator knows only about the weight, it wasn't designed to know about the association mapping:

https://github.com/doctrine/doctrine2/blob/77e3e5c96c1beec7b28443c5b59145eeadbc0baf/lib/Doctrine/ORM/Internal/CommitOrderCalculator.php#L143-L148

It may indeed open a can of worms, but if we have issues with the logic which is sending data to the commit order calculator we shouldn't simply change the class (CommitOrderCalculator) but rather that logic.

@bendavies
Copy link
Author

@lcobucci i think the data provided to the calculator is fine. This graph is simple but isn't solved correctly: #7180 (comment)

@arnaud-lb
Copy link
Contributor

What's the status of this bug ? @bendavies you seemed to be on the right path.

@bendavies
Copy link
Author

My solution wasn't good because MST is for unidirected graphs while ours is directed.

I was waiting for @guilhermeblanco's feedback/help, really.

@bendavies
Copy link
Author

maybe fixed by #7260

@bendavies
Copy link
Author

rebased after #7260 was merged

@bendavies
Copy link
Author

bendavies commented Oct 10, 2018

thought i'd take a look at how hibernate handles this. They do it a bit differently.

Doctrine deduces table insert ordering from class metadata, while hibernate orders the inserted entities themselves.

@ndench
Copy link

ndench commented Jul 19, 2019

@bendavies how did you go with this?

@bendavies
Copy link
Author

bendavies commented Jul 19, 2019

no further sorry. I think the proper fix would be to copy Hibernate.

Doctrine deduces table insert ordering from class metadata, while hibernate orders the inserted entities themselves.

@mpdude
Copy link
Contributor

mpdude commented Feb 27, 2023

Still remember what you did 5 years ago? If so, maybe you can help to review #10547.

mpdude pushed a commit to mpdude/doctrine2 that referenced this pull request Feb 27, 2023
mpdude added a commit to mpdude/doctrine2 that referenced this pull request Feb 27, 2023
mpdude added a commit to mpdude/doctrine2 that referenced this pull request Feb 27, 2023
mpdude pushed a commit to mpdude/doctrine2 that referenced this pull request Feb 28, 2023
mpdude added a commit to mpdude/doctrine2 that referenced this pull request Feb 28, 2023
mpdude added a commit to mpdude/doctrine2 that referenced this pull request Feb 28, 2023
@bendavies bendavies closed this by deleting the head repository Mar 3, 2023
mpdude pushed a commit to mpdude/doctrine2 that referenced this pull request Mar 11, 2023
mpdude added a commit to mpdude/doctrine2 that referenced this pull request Mar 11, 2023
mpdude added a commit to mpdude/doctrine2 that referenced this pull request Mar 11, 2023
mpdude pushed a commit to mpdude/doctrine2 that referenced this pull request Mar 11, 2023
mpdude added a commit to mpdude/doctrine2 that referenced this pull request Mar 11, 2023
mpdude added a commit to mpdude/doctrine2 that referenced this pull request Mar 11, 2023
mpdude added a commit to mpdude/doctrine2 that referenced this pull request Jun 2, 2023
Tests suggested in doctrine#7180 (comment) and doctrine#7180 (comment) by @arnaud-lb.

Co-authored-by: Arnaud Le Blanc <arnaud.lb@gmail.com>
greg0ire added a commit that referenced this pull request Jun 5, 2023
Add test to show #7180 has been fixed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants