Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Creation order parent-children inversed #10864

Closed
GregoireHebert opened this issue Aug 1, 2023 · 50 comments
Closed

Creation order parent-children inversed #10864

GregoireHebert opened this issue Aug 1, 2023 · 50 comments

Comments

@GregoireHebert
Copy link

GregoireHebert commented Aug 1, 2023

BC Break Report

Q A
BC Break yes
Version 2.16.0

Summary

Record order creation has changed.
Probably following this : #10547

Previous behavior

Creation of an entity an its child.

{
      "_links": {
        "self": {
          "href": "/max_depth_eager_dummies/1"
        },
        "child": {
          "href": "/max_depth_eager_dummies/2"
        }
      },
      "_embedded": {
        "child": {
          "_links": {
            "self": {
              "href": "/max_depth_eager_dummies/2"
            }
          },
          "id": 2,
          "name": "level 2"
        }
      },
      "id": 1,
      "name": "level 1"
    }

Current behavior

Now the child is created before the parent.

        {
          "_links": {
              "self": {
                  "href": "/max_depth_eager_dummies/2"
              },
              "child": {
                  "href": "/max_depth_eager_dummies/1"
              }
          },
          "_embedded": {
              "child": {
                  "_links": {
                      "self": {
                          "href": "/max_depth_eager_dummies/1"
                      }
                  },
                  "id": 1,
                  "name": "level 2"
              }
          },
          "id": 2,
          "name": "level 1"
      }

How to reproduce

Get the last version of API Platform and run this scenario:
https://github.com/api-platform/core/actions/runs/5727605051/job/15520391811?pr=5675#step:9:94

Downgrading to 2.15.5 fixes the problem.

@GregoireHebert GregoireHebert changed the title Creation order breaks BC Break - Creation order parent-children inversed Aug 1, 2023
@mpdude
Copy link
Contributor

mpdude commented Aug 1, 2023

Can you show me the code where the entities are defined, created and persisted?

@greg0ire
Copy link
Member

greg0ire commented Aug 1, 2023

Can you please state the obvious and say why this should be considered a breaking change in your opinion? Right now the only difference I can spot on my phone is a difference in ids. A diff might help.

@dunglas
Copy link
Contributor

dunglas commented Aug 1, 2023

@mpdude the entity definition: https://github.com/api-platform/core/blob/3.1/tests/Fixtures/TestBundle/Entity/MaxDepthEagerDummy.php
In this test, persistence is handled "automatically" by the Doctrine bridge of API Platform: https://github.com/api-platform/core/blob/3.1/src/Doctrine/Common/State/PersistProcessor.php#L102-L107

@greg0ire this can be considered a BC break because upgrading to a new minor version of the ORM changes the generation order of autoincremented primary keys. It's totally possible and even likely that client applications (JS or mobile apps) or third-party apps that access the same DB as the PHP app rely on the order (stable order is one of the characteristics expected from sequences).

Of course, we can update our test suite to not rely on the ID, but this change will likely affect end users too.

@greg0ire
Copy link
Member

greg0ire commented Aug 1, 2023

If that's just it, then I'd say we can close this. The point of auto increment is IMO to let the db avoid collisions, not to guarantee an order. I agree that this might break a lot of code but there is a price to pay for all the goodness brought by recent changes I'm afraid. Unless we documented a specific order somewhere, I'd say that it would be wrong to assume any order.

@greg0ire
Copy link
Member

greg0ire commented Aug 1, 2023

I think we could accept PRs that would restore the previous order for your particular case where both orders seem to work though. Provided the code is not too ugly, and as long as we are not removing any tests, it should be fine.

@mpdude
Copy link
Contributor

mpdude commented Aug 1, 2023

I agree that the test suite is (implicitly) making assumptions about the commit order by working with fixed IDs. So will be many test suites, and also our own ORM tests do it in many places.

When I wrote #10547, I tried to maintain order as much as possible, otherwise it would have broken our own tests as well.

So, why is this case different and why did our own tests not have this problem?

@mpdude
Copy link
Contributor

mpdude commented Aug 1, 2023

Maybe one idea:

When the entities are all of the same class and use the same table, pre-#10547 would not consider them independently. It would process all entities for a particular table in the order they were given to persist().

Are all associations in the test suite in question nullable? If so, it might be that entities were always processed in the order you passed them to persist() and foreign key relationships were filled in at a later time with extra UPDATE queries.

The new commit order computation will honor these FKs within the same table and insert referred-to entities first, to avoid extra (unnecessary) UPDATE queries.

Just guessing, though.

@dunglas
Copy link
Contributor

dunglas commented Aug 1, 2023

Postgres seems to make this behavior configurable, but by default, results are guaranteed to be ordered: https://www.postgresql.org/docs/current/sql-createsequence.html

MySQL also seems to care about order:

When you insert any other value into an AUTO_INCREMENT column, the column is set to that value and the sequence is reset so that the next automatically generated value follows sequentially from the largest column value.

InnoDB:

auto-increment values are assigned in a predictable and repeatable order for a given sequence of SQL statements

MyISAM:

For MyISAM tables, you can specify AUTO_INCREMENT on a secondary column in a multiple-column index. In this case, the generated value for the AUTO_INCREMENT column is calculated as MAX(auto_increment_column) + 1 WHERE prefix=given-prefix. This is useful when you want to put data into ordered groups.

SQLite explicitly doesn't guarantee order (but in this case, they do): https://www.sqlite.org/autoinc.html

I agree that this might break a lot of code but there is a price to pay for all the goodness brought by recent changes I'm afraid.

Shouldn't these kinds of changes be merged in 3.0? Changing the order of PKs in minor releases can create a mess at the data level and is probably unexpected by most end-users.

@greg0ire
Copy link
Member

greg0ire commented Aug 1, 2023

But the id's are still sequential, are they not? What really changes is the order.

All that order guarantee is assuming the ORM also guarantees the order in which it will perform the inserts. The ORM is not merely an abstraction layer over the db, that would be the DBAL. In the ORM, you define a graph, and then flush it. The ORM does not have a concept of insert in it's public API. So I'd say the only thing broken thing here are wrong assumptions. Regardless of someone manages to restore the original order, I'd say code that assumes one should be rewritten.

@GregoireHebert
Copy link
Author

To give more insights, the persist call is made with one object.

ApiPlatform\Tests\Fixtures\TestBundle\Entity\MaxDepthEagerDummy {#25040
  -id: null
  +name: "level 1"
  +child: ApiPlatform\Tests\Fixtures\TestBundle\Entity\MaxDepthEagerDummy {#25031
    -id: null
    +name: "level 2"
    +child: null
  }
}

With this instruction, I see two ways. Having the parent being created before the child, then attach them, or having the child created then in one go, create the parent and attach the child. Both can happen for different reasons. cardinalities, performance, reducing computations... And that's fair. I don't want to remove from @mpdude the great work he did :)
I simply signal that this breaks our CI expectations, and might alter other projects.

The entity causing the problem is this one: https://github.com/api-platform/core/blob/b8ca845a6988b2634a7459519d1657c2ecf063fc/tests/Fixtures/TestBundle/Entity/MaxDepthEagerDummy.php#L29

Maybe our cases should be fixed, but I want to be sure.

@dunglas
Copy link
Contributor

dunglas commented Aug 1, 2023

Isn't this class part of the public API? https://github.com/doctrine/orm/blob/2.16.x/lib/Doctrine/ORM/UnitOfWork.php#L78-L86

The UnitOfWork is responsible for tracking changes to objects during an "object-level" transaction and for writing out changes to the database in the correct order.

(emphasis mine)

In the same class (but not part of the public documentation):

Perform entity insertions first, so that all new entities have their rows in the database and can be referred to by foreign keys.

Also, the unit of work has public methods related to the concept of inserts: https://github.com/doctrine/orm/blob/2.16.x/lib/Doctrine/ORM/UnitOfWork.php#L1424

To clarify, I don't argue that this change is not needed/logic (I don't know). But as it may have a huge impact on production systems, it should at least be documented, and probably not included in a minor version.

@greg0ire
Copy link
Member

greg0ire commented Aug 1, 2023

If we merge something to restore the original order, we should call it a favor, not a fix IMO, and it should be reverted after merging up to 3.0.x. A possible next step would be to write a test case so that the unwanted behavior is easy to reproduce.

@greg0ire
Copy link
Member

greg0ire commented Aug 1, 2023

I think by "the correct order" what is meant is an order that does not lead to an error. Before this change, the uow precisely fails to produce the correct order.

As for the uow, it does expose a concept of inserts. I don't know if there was a change with those APIs directly, although it's probable. I don't think that's the most advertised API of the ORM, and don't recall ever using it directly myself. Again, a test case might help save our time in case this might be easy to fix. I love philosophical discussions such as this one, but I feel we might be arguing longer than it takes to write a "fix" here.

@dunglas
Copy link
Contributor

dunglas commented Aug 1, 2023

Again, a test case might help save our time in case this might be easy to fix.

I'll try to write one tomorrow. In the meantime, running ours is straightforward:

git clone https://github.com/api-platform/core
cd core
composer install --prefer-source # so Doctrine can be hacked directly from the vendors
./vendor/bin/behat features/hal/max_depth.feature

@GregoireHebert
Copy link
Author

GregoireHebert commented Aug 1, 2023

I just took the time to create a reproducer:
GregoireHebert@d3ea5a9

1) Doctrine\Tests\ORM\Functional\GH10864Test::testInsertionOrderIsParentThenChildren
Parent should be inserted first but his Id is 2
Failed asserting that 2 matches expected 1.

@haroldiedema
Copy link

We are using nelmio/alice in combination with a SQLite driver to functionally test our repositories & entities. This update broke our testing system due to the creation order becoming non-deterministic. We'll have to pin to <2.16.0 as well until this issue is resolved.

@derrabus
Copy link
Member

derrabus commented Aug 2, 2023

creation order becoming non-deterministic

I'm pretty sure that the order is still deterministic. It's simply a different one than in 2.15.

That being said, the ORM does neither define nor guarantee an order in which inserts happen. By letting the ORM persist an entity graph, you delegate the control over how this happens to the ORM.

The algorithm was changed in order to fix cases in which an entity graph could not be persisted at all because of a bad order of inserts and updates. To a certain degree, we the maintainers and contributors must be able to fine-tune the order of DML statements.

If we consider this a BC break, I need something more than "my test suite breaks because auto-increment IDs have a different order now".

@mpdude
Copy link
Contributor

mpdude commented Aug 2, 2023

From https://www.doctrine-project.org/projects/doctrine-orm/en/2.15/reference/working-with-objects.html#persisting-entities:

Invoking the persist method on an entity does NOT cause an immediate SQL INSERT to be issued on the database. Doctrine applies a strategy called transactional write-behind, which means that it will delay most SQL commands until EntityManager#flush() is invoked which will then issue all necessary SQL statements to synchronize your objects with the database in the most efficient way and a single, short transaction, taking care of maintaining referential integrity.

There was never an assurance about a particular order, or that entities would be INSERTed in the order in which they were passed to persist().

In the case discussed here, the commit order computation previously missed to check the association between two entities of the same class. It only got away with it because an extra UPDATE query could be used later on to fill in a missing ID. For example with non-nullable associations, this would have failed. The new commit order in this case is more efficient, since it uses two INSERTs only and does not require the foreign key to be nullable.

As tedious as it may be, probably the most resilient strategy for test suites would be not to make assumptions about how auto-increment IDs will be assigned and write tests based on the IDs actually used; or, use application-provided IDs for full control.

Alternatively, you might try to call flush() multiple times if you need to guarantee that one entity is persisted before the other.

grossmannmartin added a commit to shopsys/shopsys that referenced this issue Aug 2, 2023
- order item creation depends on the order of autoincrement ids
- the way in which order the inserts are executed changed in 2.16.0
- see doctrine/orm#10547 and doctrine/orm#10864
@haroldiedema
Copy link

haroldiedema commented Aug 2, 2023

I attempted to address this issue by manually assigning IDs to each fixture object (instances of entities) and using the AssignedGenerator as IdGenerator instead of the default IdentityGenerator. This approach did help in controlling the order of IDs, but unfortunately somehow broke the relationships between objects, at least from the perspective of the repository (see below).

Alice, in this context, generates a collection of pre-hydrated entities, all lacking an ID. Our setup involves an in-memory SQLite driver, and before each individual unit test, we flush all the pre-hydrated entities to memory. I've undertaken the following actions to rectify the situation; however, this has led to even more perplexing problems. I've summarized the steps concisely for clarity:

  1. (Initial iteration) Monitoring IDs per entity (table name) and incrementing them by one whenever an object of that type is encountered. Subsequently, I assigned the automatically generated ID to the id property of the entity, utilizing reflection due to the absence of a dedicated setter method.
  2. (Second iteration) Invoking the persist() function on all objects (with assigned IDs at this stage).
  3. Calling the flush() operation.

The problem arises when we invoke the findOneBy method on a repository, using other entity objects as criteria. This results in completely unrelated outcomes that deviate from the original fixture set. Given that ID generation is exclusively managed by Doctrine rather than nelmio/alice, my suspicion is that this is a Doctrine-related issue.

I'm uncertain if this issue was already there since we never had to manually assign ID's, or came to light with the recent changes.

@mpdude
Copy link
Contributor

mpdude commented Aug 2, 2023

@haroldiedema are you saying that findOneBy does not work for entities with application-assigned IDs? If so, could you please open a dedicated issue for that and provide a test that reproduces the issue?

@haroldiedema
Copy link

haroldiedema commented Aug 3, 2023

@mpdude It seems that way yes. I'll try to see if I can make a somewhat condensed/dedicated repro.

Update: I've created an isolated testcase but can't seem to reproduce the problem with doctrine alone. Apologies for the fuzz.

@beberlei
Copy link
Member

beberlei commented Aug 3, 2023

The order of how entities are persisted in the UnitOfWork is an implementation detail, maybe caused by changes to the commit order calculator that we did (i think). I would not consider it a BC Break.

@derrabus
Copy link
Member

derrabus commented Aug 10, 2023

Creating entities in a loop is now unpredictable,

I would challenge that assessment. If you create and persist entities in a loop, they'll all be created when you flush the EM. That's the contract of the ORM and that contract has not changed. Which DML statements the ORM uses and in which order it executes them has never been part of the contract.

Basically first() is now useless because we can't assume anything about the order.

If you don't order the collection, you cannot make assumptions on which one is the first element, correct.

@mpdude
Copy link
Contributor

mpdude commented Aug 10, 2023

To clarify and/or re-state, there is no "unpredictable" or "random" part in the commit order computation. If e. g. your test suite creates and persist()s the same set of entities repeatedly, the commit order will be the same every time.

What was fixed in 2.16.0 is that the commit order computation now considers associations between entities in more detail. This was necessary to fix a long list of issues where the computation was not precise enough previously. That may lead to a different ordering of INSERT operations when there are associations between entities.

Also, I would assume that there are no changes when you insert a series of entities that have no associations among each other. I designed the commit order computation in #10547 to honor the ordering of persist() method calls when no associations require to deviate from it.

In fact, the ORM test suite also contains a lot of tests where it creates and persists entities and assumes that the IDs will be assigned in the order of persist() calls. Those tests still work.

@Dallas62
Copy link

@mpdude Is the change also impact the deletion order ?

I encounter now an issue (Integrity constraint violation: 1451 Cannot delete or update a parent row: a foreign key constraint fails), that wasn't present before (no data-structure change or in the code).

I will try to create a reproduction but It's not totally clear for me what is the root cause expect a change in the deletion order.

@mpdude
Copy link
Contributor

mpdude commented Aug 13, 2023

@Dallas62 yes, that was necessary to fix at least #9192 and #10348

@derrabus
Copy link
Member

@Dallas62 Do you use any foreign keys that are not created by the ORM? If that's not the case, can you create a reproducer and open a new bug for that?

@Dallas62
Copy link

@derrabus All the database is handled / generated by the ORM. I will look to create a reproducer ASAP.

@Dallas62
Copy link

Dallas62 commented Aug 15, 2023

I'm still not able to reproduce it yet on fresh repository. I feel like the cascade parameter is ignored but I don't understand why. (Cascade is configured at orm level, not database level even if supported)

@Dallas62
Copy link

@derrabus @mpdude I'm now able to reproduce the issue in a "simple" reproducer:
https://github.com/Dallas62/doctrine-orm-reproducer-deletion-issue

First thought was caused by the nested structure, but looks like the OneToOne is leading to this issue. I cannot reproduce in 2.15.5, but in 2.16.1 I have this issue:

 bin/console reproducer:delete_data                   
[critical] Error thrown while running command "reproducer:delete_data". Message: "An exception occurred while executing a query: SQLSTATE[23000]: Integrity constraint violation: 1451 Cannot delete or update a parent row: a foreign key constraint fails (`reproducer`.`user_rooms`, CONSTRAINT `FK_9E63E1CEA76ED395` FOREIGN KEY (`user_id`) REFERENCES `users` (`id`))"

In ExceptionConverter.php line 56:
                                                                                                                                                                                                                                                       
  An exception occurred while executing a query: SQLSTATE[23000]: Integrity constraint violation: 1451 Cannot delete or update a parent row: a foreign key constraint fails (`reproducer`.`user_rooms`, CONSTRAINT `FK_9E63E1CEA76ED395` FOREIGN KEY   
  (`user_id`) REFERENCES `users` (`id`))                                                                                                                                                                                                               
                                                                                                                                                                                                                                                       

In Exception.php line 28:
                                                                                                                                                                                                                                              
  SQLSTATE[23000]: Integrity constraint violation: 1451 Cannot delete or update a parent row: a foreign key constraint fails (`reproducer`.`user_rooms`, CONSTRAINT `FK_9E63E1CEA76ED395` FOREIGN KEY (`user_id`) REFERENCES `users` (`id`))  
                                                                                                                                                                                                                                              

In Statement.php line 121:
                                                                                                                                                                                                                                              
  SQLSTATE[23000]: Integrity constraint violation: 1451 Cannot delete or update a parent row: a foreign key constraint fails (`reproducer`.`user_rooms`, CONSTRAINT `FK_9E63E1CEA76ED395` FOREIGN KEY (`user_id`) REFERENCES `users` (`id`))  
                                                                                                                                                                                                                                              

@mpdude
Copy link
Contributor

mpdude commented Aug 15, 2023

@Dallas62 What is user_rooms? I cannot find that in the repo you mentioned.

Update Let's continue discussion about this delete ordering topic in #10912

@Dallas62
Copy link

@mpdude Forgot to push 😩

Will push later...

It's like House in the actual repository, and Owner as a OneToOne relationship.

@Dallas62
Copy link

Now it's ok

@adrienfr
Copy link

We are using nelmio/alice in combination with a SQLite driver to functionally test our repositories & entities. This update broke our testing system due to the creation order becoming non-deterministic. We'll have to pin to <2.16.0 as well until this issue is resolved.

Same here, using nelmio/alice & theofidry/AliceDataFixtures and this update broke many tests relying on fixtures being persisted in the same order they are in config files.

@greg0ire
Copy link
Member

@adrienfr , in case you're posting this to get us to change our minds about this, I think it's been made pretty clear already that we won't. In case you are posting this to get help on how to setup fixtures the right way, your question has to little details for us to help you, and also this is not the right place to ask support questions.

@adrienfr
Copy link

@adrienfr , in case you're posting this to get us to change our minds about this, I think it's been made pretty clear already that we won't. In case you are posting this to get help on how to setup fixtures the right way, your question has to little details for us to help you, and also this is not the right place to ask support questions.

Hi @greg0ire, I was just giving some feedback that @haroldiedema is not the only having issues with these fixtures bundles.
I’m not asking for you to change your mind, these changes were made to improve current issues so that’s a nice step forward to me and it should not be deleted.

Having said that, in view of the number of people who have reported the change in behavior and the resulting effects, I'm wondering what's best to do: try to make a PR that, while keeping the improvements, attempts to restore the previous order of persists or, as you said above, do the people concerned a favor and move it to 3.0.X.

It's an open question, in any case modifications will have to be made by the people concerned, short or medium term.

@greg0ire
Copy link
Member

greg0ire commented Aug 16, 2023

I know you're not the first suggesting to move this to 3.0.x, but I don't think it would actually be doing people a favor. Without a backwards-compatibility layer on 2.0.x, it would mean people migrating to 3.0.x would have to change code while doing so, when ideally they should be able to do so before even touching the version constraint for doctrine/orm. Plus, it prevents people suffering from the issues here from getting a fix, all for the sake of being nice to people who thought we were guaranteeing any order. If we want to get everyone in this thread on 3.0.x, it's IMO better to have them do several medium efforts rather than a single, gigantic one. Ideally, they should address all deprecations in as many deploys as they want, and then be able to move to 3.0.x without changing anything.

attempts to restore the previous order of persists or

Such a PR would be welcome I suppose, as I said before, but if in the end, it results in extra code just for BC reasons, we should probably remove it in 3.0.x, and add a BC layer so that people can migrate early on. Worse, it would not encourage people to migrate to solve this properly.

The real solution that should be pursued IMO should not be to adapt your fixtures to the new order, but to make it independent from that order. We changed the order once, and reserve the right to change it again.

mpdude added a commit to mpdude/doctrine2 that referenced this issue Aug 16, 2023
In doctrine#10864 (comment) and subsequent comments, @Dallas62 reported that since the upgrade to 2.16.1, they have an issue with the order of `DELETE` statements. The order of `DELETES` leads to a foreign key constraint violation.

Since the report does not exactly match the original issue topic in doctrine#10864, I am creating this one here so we can track it individually.

In order to reproduce, a demo Symfony application was provided in https://github.com/Dallas62/doctrine-orm-reproducer-deletion-issue, but obviously we cannot include that in the ORM test suite.

I have picked the two entities I deemed relevant, but as of now, this is not sufficient to reproduce the bug.
@k0ka
Copy link

k0ka commented Oct 11, 2023

My opinion is this is a Breaking Change. However, it looks like different people are talking about abstract things meaning different problems.

I created a simple failing test: https://github.com/k0ka/doctrine-orm/blob/e43f8bcf8c34d1da5b7a916e534850e0482b211a/tests/Doctrine/Tests/ORM/Functional/Ticket/GH10864Test.php

It doesn't have any circular dependencies or something complex. Just a Parent and 2 Children. In this test the first child will have id=2, and the first one id=1. But if I move $this->_em->persist($parent); to line 27 (just after creating the parent), children ids will be correct.

So moving the parent persist() in this simple case will change the order of inserted children. Even if it doesn't contradict the documentation directly, it's clearly counterintuitive and wrong from the common point of view.

@derrabus
Copy link
Member

children ids will be correct.

By what definition of "correct"? Where does the ORM guarantee any particular order in which records are persisted? You basically delegate calculating the best order to the ORM and this is what it has always done.

@k0ka
Copy link

k0ka commented Oct 11, 2023

The correct order is the one in which I'm adding them.

I'm adding 2 identical children. As the id field is auto_increment they the first child must have lesser id than the second child. After 2.16 the order of these children ids is arbitrary. Which means it's the same as UUID primary keys. So Doctrine essentially stopped supporting auto_increment PKs.

If I want the same order I'm adding children, I must create another field in the database order and manually manage its incrementation.

@greg0ire
Copy link
Member

greg0ire commented Oct 11, 2023

"The correct order" as you call it is not documented, because there isn't one. The order always has been arbitrary, and we reserve the right to change it again, arbitrarily. You should basically never assert the id of an entity in a test. Doctrine supports and uses auto_increment PKs, which allow it to avoid collisions. There is no gap in the sequence of entities unless your remove an entity. If you need a specific order, do the inserts yourself with the DBAL API. As soon as you use the ORM, you are delegating the insertion (otherwise persist() would have been called insert()).

@k0ka

This comment was marked as abuse.

@greg0ire
Copy link
Member

Ah? It looks like the discussion is getting slightly heated. Which is too heated. Let's do everyone a favor and lock this.

@greg0ire greg0ire closed this as not planned Won't fix, can't repro, duplicate, stale Oct 11, 2023
@doctrine doctrine locked as too heated and limited conversation to collaborators Oct 11, 2023
@greg0ire
Copy link
Member

greg0ire commented Nov 11, 2023

Hey everyone 👋

Somebody was kind enough to write a PR about this: #11058

If this is of interest to you , please let us know in the PR and we might merge it.

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

No branches or pull requests