-
-
Notifications
You must be signed in to change notification settings - Fork 2k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[BUG]: Save, saves related data also (not changed) #16343
Comments
More details here and description... lets say we have 2 Models: Companies & Contractors. Contractors connected Companies with belongsTo relation (alias Company). Somewhere in controller action: $contractor = \Models\Contractors::findFirst();
$company = $contractor->getCompany(); // Get related company
...
$contractor->save(); ... this will invoke ->save method on $contractor but and ->save method in company (i added dump in beforeValidation method on Company model). PhalconPHP 5.2.* Not sure is it a feature or a bug... anyway. I want to help somehow. |
This is a bug. It appeared in 4.1 branch. To be precise now we are running 4.1.2. There is no such behavior in php7.4-phalcon_4.0.4-908. We even fixed this package in our Docker but now we had to update and got a lot issues. Lets assume we have a code
Queries will be:
On the one hand let it be, just additional update. But if we changed spendings_categories_id in spendings to another the second update will wrongly update a row in spendings_categories. For example you had spendings_categories: There is workaround with switches off relation update
Do not hesitate to contact me if you need more details or tests. |
TLDR: this happens in collectRelatedToSave() Starts in the Model::__get
if you fetch anything it does not get set in the dirtyRelated, I think the ideas was to only execute presaverelated() and postsaverelated() if hasRelated() and collectRelatedToSave() Issue is in the collectRelatedToSave()
so the getRelated() will set this->related[lower_property] = related Right now it's a feature because probably there "is" a bug if we A->B->C->setSomething(value); I can think of 3 ways to fix this:
FYI to trigger an infinite loop ATM you just have to
I am working currently on this in my repo while trying to add FirstLevelCache... not easy without breaking a few things, sorry. |
Forgot to mention, my proposal for extra Executed States if not optional they might be a breaking change on model before and prepare Events with business logic. |
This is weird. We use Phalcon 4.1.3 and have this bug too. But in 4.1.3 in the protected function collectRelatedToSave(): array
{
$related = $this->related;
$dirtyRelated = $this->dirtyRelated;
foreach ($related as $name => $record) {
if (isset($dirtyRelated[$name]) === true) {
continue;
}
if ($record instanceof ModelInterface === false) {
continue;
}
// We mimic the same behavior like without snapshots.
$changed = true;
if ($record instanceof Model === true) {
$changed = $record->hasChanged();
}
if ($changed === true || $record->getDirtyState() !== self::DIRTY_STATE_PERSISTENT) {
$dirtyRelated[$name] = $record;
}
}
return $dirtyRelated;
} It completely resolved the issue! We use snapshots, else |
It's a quick fix if you are only trying to save the "first ring" or first level of relations, if you go beyond that it is an issue. If you have a second level or a second ring, A->B->C and changed only C and trigger A->save(); All ok here A checks for B hasChanged, nothing is changed on B so move on. Problem is we only changed C and it will never be saved unless you call C->save somewhere in your code. It's a quick fix for simple level 1 relations, anything above that is broken. |
We never used C level, because of the way The solution to your case, would be, to call the same method on every protected function collectRelatedToSave(): array
{
$related = $this->related;
$dirtyRelated = $this->dirtyRelated;
foreach ($related as $name => $record) {
if (isset($dirtyRelated[$name]) === true) {
continue;
}
if ($record instanceof ModelInterface === false) {
continue;
}
$changed = true;
if ($record instanceof Model === true) {
$hasDirtyRelated = $record->collectRelatedToSave();
if (\count($hasDirtyRelated) > 0) {
$record->save();
}
$changed = $record->hasChanged();
}
if ($changed === true || $record->getDirtyState() !== self::DIRTY_STATE_PERSISTENT) {
$dirtyRelated[$name] = $record;
}
}
return $dirtyRelated;
} That should solve |
@noone-silent no you can't do that! you have prepare, before and after that must me run on it's order of type of relation, that will trigger prepareSave and beforeSave no matter what type of relation there is, it is a breaking change, massive one. |
@noone-silent the only way to fix it correctly is with state flags and first level cache (thread cache). This is something that Java Hibernate ORM has by default and you can't disable it. https://vladmihalcea.com/jpa-hibernate-first-level-cache/ https://vladmihalcea.com/the-anatomy-of-hibernate-dirty-checking/ I wish it was an easier problem to solve, but it's not, all magic comes with a price, ORM magic comes with a heavy price of data flow complexity, consistency complexity and a lot of cycles to ensure that everything checks out. |
To be honest I think prepareSave, while it has good intentions of enabling making a lot of changes just before the model gets saved, it is a bad practice if you rely on it for data consistently. I have a very complex project that relations can go to 5 or 6 layers deep, don't like it but it's the way the customer needs it. We could however have a dirty cache inside a thread cache that when a save gets triggered it runs normally then checks the cache for unsaved models, this adds more cycles and checks every time a model gets changed or save, and might even be unwanted behaviour, I don't recommend it like this. So my proposal is dirty_state execution flags to avoid loops, resultset with added logic to append/add and persist models with a clear way for dev to program to make sure that it's get triggered by save(), and avoid magic has much has possible. I am working on it has we speak, it's slow progress because I am trying not to break previous behaviour, i.e. saving hasMany that is an array instead of a resultset object, keeping collectRelatedToSave function. |
@rudiservo Is this related to this comment: #15148 (comment) Because we are having massive performance issues in our application since the Phalcon 5 update :) |
If you find a solution, please patch it into phalcon 4 too. People still use it |
I have also replicated this issue in our application. We have installed ClockWork that allows us to see all database calls made in a call. I ran this code: public function phalconAction()
{
/** @var EventTicket $ticket */
$ticket = EventTicket::query()->andWhere('uuid = "93b006a8-188b-4c2b-bd3c-64f8c6f3101e"')->execute()->getFirst();
$orderItem = $ticket->orderItem;
$orderStatus = $orderItem->order->status;
if ($orderStatus === OrderStatus::COMPLETE) {
$ticket->status = EventTicketStatus::RESERVED;
$ticket->save();
}
} That results in the following ClockWork result: The EventTicket Query is not on the screenshot, but you can see that 2 UPDATE queries are done for Order & OrderItem. The data in the UPDATE query is not different from what is currently in the database. The example code I ran is not too bad, but in our production application this creates a massive performance issue. |
At first glance yes, do you have dynamic update? Just to know if it influences it. |
@rudiservo Pleas explain what you mean with dynamic updates? The code example I provided in my previous comment shows what the issue is :) |
It checks the model for changed fields before making and update to the database. The increased load you are seeing is in the PHP or the database?
|
Our database is the biggest issue. But that's understandable looking at my example. (#16343 (comment)) Every model select triggers a database update as well. That explains why our database has been having a hard time since the Phalcon 5 update. |
@larsverp well dynamic updates will save writes to your DB, also they will not issue an update for all the fields of the model, only the ones that have changed, making the update query smaller. Test it and give your feedback on how it went. The solution for this without using dynamic updates, is to forget about "dirty state" on relations has you have to always check the relations for changes and just use the dirty state only for changed fields. |
@rudiservo Alright thanks I will do some testing with the Dynamic update. Just to be on the same line, the issue we're trying to solve is about Phalcon marking a model as dirty while no fields are updated , right? Is there a possibility for us to disable Phalcon setting a Dirty state on a model? I'd rather check it myself while it's not working. |
@rudiservo We already seem to use DynamicUpdates so this doesn't do anything for us. It still tries to update the entire model in the DB even no fields have been changed. |
@rudiservo I think I was able to write a workaround for our application. I found the following code in the CPhalcon repo: protected function collectRelatedToSave() -> array
{
var name, record;
array related, dirtyRelated;
/**
* Load previously queried related records
*/
let related = this->related;
/**
* Load unsaved related records
*/
let dirtyRelated = this->dirtyRelated;
for name, record in related {
if isset dirtyRelated[name] {
continue;
}
if typeof record !== "object" || !(record instanceof ModelInterface) {
continue;
}
record->setDirtyState(self::DIRTY_STATE_TRANSIENT);
let dirtyRelated[name] = record;
}
return dirtyRelated;
} Source: https://github.com/phalcon/cphalcon/blob/5.0.x/phalcon/Mvc/Model.zep#L1126 In our application all models extend from the ModelBase class (Which extends from the Phalcon Model class) in the ModelBase class I overwrote the collectRelatedToSave function as follows: public function collectRelatedToSave(): array
{
$related = $this->related;
$dirtyRelated = $this->dirtyRelated;
foreach ($related as $name => $record) {
if (isset($dirtyRelated[$name])) {
continue;
}
if (!is_object($record) || !($record instanceof ModelInterface)) {
continue;
}
if (empty($record->getChangedFields()) && empty($record->collectRelatedToSave())) {//<------ This is new
continue;
}
$record->setDirtyState(1);
$dirtyRelated[$name] = $record;
}
return $dirtyRelated;
} What's your opinion? It seems to work for our application as far as I've tested it. |
@larsverp it's a quick fix, probably not ideal, it may have a big performance impact, also needs to pass the tests. |
@larsverp Will only work if Dynamic update is set, so you have to put it in the doLowUpdate function. i.e. "4" and "4.0" are not exactly the same for PHP, a lot of dev's do not cast the value on set, so this needs to be done. So the getChangedFields, might not get all changed fields, the function needs a revamp and doLowUpdate needs a revamp to. The weird part is where you say you have dynamicUpdates, and still saves the model, it should not unless you have an issue in your snapshots, or the type of data you have in your database and model. Check this to see if there is something different about your model so we can do something about it. |
@borisdelev do you have dynamicUpdate set? |
Nope, didnt set dynamic update. Its a little project. |
@rudiservo Thanks for the comment. I'll do some research to find out if this is the case for us. :) |
But we have this already? Like I said, the default should be |
@noone-silent I can make those changes, but that is not for me to decide, project maintainers have to decide if it is a bug, then it is the default, otherwise it could be a breaking change and that would require a new major version. FYI Checking for dynamic updates costs a lot, I do think it should be the default, but if you want performance the developers should "tag" the model has dirty, that way it would only do the DU if it really is dirty. Magic != performance. |
This has been one of the most painful areas of the framework to be honest. This particular discussion goes back to even v3. When to update? That is the question. Personally, I don't let the ORM update things for me, I do the updates where I need them to, wrapping the whole operation in a transaction. But that is just me. I like the idea from @noone-silent. We can go with that and set the behavior for updates once and for all. |
@niden DynamicUpdates and snapshots by default or mandatory? FYI, this is just quick fix, still I my advice would be to only run save on relations that we set i.e. $a->b = $b (WIP), I will create an issue for this. |
I would say default. Developers can then change the behavior if they need to. |
Fix #16343, DynamicUpdate is now enabled system wide
Resolved in #16400 Thank you @rudiservo |
Does it really fix it? This is just the dynamic update enabled by default. The check in |
@noone-silent Yes. The collectRelatedToSave will call the relation but the save will not trigger a SQL update. |
Then this issue is not fixed. This issue is about saving a model will also save all related models. Not about making a config setting as default. The config setting as default is just a side effect needed for this ticket. @rudiservo You complicate things too much here. There have been made many supposed fixes for this already, all using the dynamic update setting. You have concerns about performance, but performance is not the main aspect someone uses ORM. We use ORM because of convenience. If we need pure performance or have hundreds or thousands of queries, we should act like @niden does and do transactions directly, or if we only work like this, don't use ORM at all. Nobody is using 200 relations and all these relations also use relations of the hundreds. Yes, then performance and even resources would be a problem. But not in those simple use cases we have here. |
@noone-silent it's not fixed? I can not really just bluntly check for changed fields on the first level of relations because the ORM does not know how deep in the chain relation anyone made a change to one model. You got $A->B->C, a change was made in C but you save A, C does not get saved, how do you fix it honestly?
Thank you I guess? Yeah I do that. Simple != Easy
Do not get this the wrong way, but think about it from the maintainers point of view, is your statement a fact, personal observation or a guess? I would love to make changes on what I think I need and no one else is using, I can't propose those changes without a really good justification. BTW sometimes I use 20 relations with 5 relations with 3 more relations and 1 more at the end, why? Bureaucracy and complex business logic that the customer needs, that if I where to manually do it in SQL statements it would cost me more time to develop without any real gains in performance, I did try, and to make matters worse sometimes the customer can be unforgiving in deadlines. Look if you have any idea how to fix it, do a PR like I do, the maintainers would very much appreciate that, it's actually quite fun and a good learning experience. You can look at Doctrine, Hibernate and Propel for ideas and how they solved their problems, it's a lot of code but it's fun to read. Also would advice you to read this blog post from a Zig maintainer, it is quite insightful. |
I want that save only is saving stuff that needs to be saved. Like before. This is what this issue is about. The performance impact now is much bigger than any change of checking the chain of responsibility will ever create!
But you have to. How else you want to check the chain? And why would it break the code. The code is breaking right now, because it does not work like people used it to be or expect it to be. You check the first level and give it down the relation chain and in the end you have all models you need to save. $A->B->C->D->...->Z would be solved with this.
You do not stop on the first level. You give it down the relation chain.
It was not an attack or anything. Don't get me wrong.
I fixed my code with this. Every model is now calling
Yes, I do that too. We have a billing and invoicing system. A lot of relations there.
The fix of it has been proposed many times here already. People implement it already by themself, because this issue is a breaking one. If your code was saving in 250ms before it is now up to 2s. That is breaking. I will look into it. |
@noone-silent It's not working has before because someone submitted an issue, v4 broke behaviour that was on v3 and prior, and I've using Phalcon since v1 and honestly skipped v4. Here is the issue. DynamicUpdate is better than getChangedFields, look at the code! In fact I don't even know if getChangedFields is reliable if you compare the code, I have to recheck what is going on.
We can't have it both ways, save all relations and not save all relations without any input from the dev that triggers either. What you are proposing is a breaking change, that means a v6, and honestly if we are to make a v6 this is definitely not the way to fix this because it is also flawed. In Example, the way Doctrine 2 works is opt-out with a readOnly flag, that does not mean it wont go rampant all the way on all relations, yes it's heavy! You have to opt-in opt-out, it can not be none and things just work out of magic. I on the other hand think that it should be opt-in, you flag that a relation chain has changes, readOnly flag is if you only want to check relations and skip any update checks, more work for the dev, but far more control over the dataflow. IMO this is the better solution, it's really good for performance and I stand by it, if you didn't care about performance this would not be an issue for out. This is what I have been trying to explain to you and why I needed to know what was your expectation. This change has to be well thought for ramifications, it has to pass the tests and create new ones, it's a ton of work. Now look, the old behaviour has flaws, the current behaviour has flaws, honestly both are bad, either we can agree on a new behaviour that really fixes the issue once and for all and hopefully the maintainers agree to be the default behaviour, otherwise, it's a setting, not that bad IMO. You can check my proposal, the relations get flagged with changes WHEN you set them or by another type of opt-in behaviour.
I didn't take it like one, you just come out like you don't seem to understand the ramifications and caveats of a complex system like the ORM, I understand you want it fixed for your use case and it's frustrating, but the team can't do it without screwing up someone else, concessions have to be made to make this work otherwise it's not a reliable ORM and we can't please both sides with magic.
You really didn't understand, if you don't return B, B never calls C, C never calls D... Z, it wouldn't honestly, I don't how it did before without some opt-in or an overload function to make it work.
That is what is doing right now, it did not do before! Your solution of not returning that model in the collectRelatedToSave, actually breaks that, think about it! if you don't change B and set it, B never calls C.
It's a BAD FIX, IF you are not setting the relation or trigger save() on relations in the middle of the model, specially the belongsTo and hasOne, you are going to get unsaved data and a lot of issues, either way, IT'S A BAD UNRELIABLE FIX and it will come back to bite you, I've been there! I will propose my opt-in solution anyway, it probably has to be a new issue because it's going to be a long conversations with a lot decisions and it's needed for the management of the project, also the maintainers have to accept it. |
I have interesting situation.... @niden can u check out that? A help!
In that case, save will be executed in relation too. I mean object $contractor will be save and object $company too. Why is that?
Originally posted by @borisdelev in #16336 (comment)
The text was updated successfully, but these errors were encountered: