-
-
Notifications
You must be signed in to change notification settings - Fork 4.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Bug 6160: Deadlock removed #12315
Bug 6160: Deadlock removed #12315
Conversation
lib/private/DB/Adapter.php
Outdated
@@ -110,7 +141,7 @@ public function insertIfNotExist($table, $input, array $compare = null) { | |||
} | |||
$query = substr($query, 0, -5); | |||
$query .= ' HAVING COUNT(*) = 0'; | |||
|
|||
*/ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could you remove the whole comment - that's what we have version control for. 😉
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah ok, thanks for the hint. Lack of me beeing new to github. Will do this if we agree if the changes can be done that way. Currently i don't know someone can do it better.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Code itself makes a lot of sense and seem to work as stuff that uses those methods (group creation, app config setting, file upload,...) still work fine. 👍
Well this just leaves more time for concurrent queries to break it between the select and the insert. Exactly the reason why the method was created and used in first place. |
Can't we just wrap it in an transaction? |
@nickvergessen @rullzer What do you think about #6899 (comment) ? If you say that this is okay, I would port it by tomorrow so it will be in beta 1. |
yep, this is why it was first meant to be a version for testing and to see if it solves the problem in general. As for the aspect that there could be a concurrent access doing it that way: In almost any cases only the select is done, while entry is already there. You don't need lock for this. The update is only done once: when a file is created. I don't now how often the same file (of the same user) is created in the same moment twice, but i currently cannot think about the constellation this happens. You could use transaction + locking for this, but that would mean a lot of overhead for a each select. I would expect performance impacts then. I wouldn't do this, while even in worst case you could get duplicate entries and that's it. So looking at the changes the second time it even improves performance compared to the original a lot while most time file is already there. In this case my changes are only doing selects (that can be speed up). But for the original Update...Select MySql wants to copy the table in temporary table first - each time says the docs. And this not only if the file is created - but for every time the function is used. So finally original query bring dbms in trouble causing deadlocks and duplicate values and is inefficient either way each time the function is used. I would definitely change it somehow. At last: I have not used query builder, because i don't know what it is. I used text editor (kwrite). But i am new and willing to learn. Can you point me to some explanations to query builder to use for nc? Nevermind what i am talking to the things above: Very good work you are doing with NC, keep it up! |
server/lib/private/Files/Cache/Cache.php Line 271 in a661f04
oc_filecache .
|
why was the deadlock not caused by itself? for me it seemed so very much as i were the only user that time doing no other things with it.
I saw the function beeing used also in other context, not sure whether there is always an index on the table. Even if so, i wouldn't let the dbms run into execptions first if i could prevent them. So how do we proceed? As for now i have not seen any negative test results/aspects further, but at least three Issues beeing confirmed to be solved by it and i guess around 10 Issues will be solved further. I would rate this beeing mayor, as users are quiet upset seeing that many errors. All those errors are gone with that merge. Would you agree merging it if i remove the comments as @MorrisJobke suggested? |
Sure. I'm not against your pr but I think its the wrong way 🙈 |
I can fully understand this one, because now the time is higher, because it needs to go back to PHP instead of being just a sub select, which has less time, but still the option to cause this. I guess we need to catch an exception here. There is just no further way to eliminate this except in an atomic SQL function, which is not present in all DBs. So the approach would be to keep this method how it is currently, because theoretically it makes things worse:
Or do I miss something here? I don't know why it makes something better, but the obvious approach is to additionally to this method call also wrap the method call into a try catch with the exception that is expected (most likely a UniqueConstraintException), but unfortunately you cannot know if this is the case for all queries or should we just do this by default inside the function? |
That is what I wanted to describe above - there is no guarantee (except a single SQL statement itself which is currently not available on all supported platforms) to do this. So triggering and catching the exception is unfortunately the only way to go for now. |
I guess if we focus on the stuff in |
that is exactly my thought too |
Removed old Stuff
year, don't get me wrong, i really understand the bad feeling that in theory it worse and thats why i would really also prefer better ways. But as for now i did not see the other approaches beeing better. Again i can only say that causing exeptions for me is really a wrong way imho. And the theoretical timeslot between updates is really very much theory. Think of when this will happen and how often you will see it - same file to be created at the same time from different users? And what would happen then? An exception 👍 Or am i missing anything? So in practical for me it has even speed up NC while the complexity of update..select at dmbs level to find out, that file is there in 99% is high dbms load. Well, its up to you to decide. At my work i need to do those nasty decisions every day ;) |
It is actually not that theoretical - just look into the linked tickets. And if we are talking about short time slots, then look into this bug here: owncloud/core#24052 That one took me a long time to debug with on a high volume instance. It was basically this: if($this->isCached($cacheKey)) {
# here the cached values TTL ended and it got not cached anymore
$this->getCached($cacheKey); # 💥
} Keep in mind that the TTL (time to live) was IIRC 12h. And this one happened on the instance quite often (a handful of times per day). The time between those two entries is far less than a fraction of a millisecond and it hit on time for a 12h timeout to hit this exact spot. Never say: "it is super unlikely" - at some point it will bite you no matter how unlikely it is. So better be safe here and catch the exception to be really bullet proof and not only theoretically safe. 😉 |
* caused by a concurrect insert that happens between the INSERT and it's sub-SELECT which was there to actually avoid it within insertIfNotExists - sub selects are not atomic and allow this * see also #12315 Avoids an error that has this stack trace: ``` Doctrine\DBAL\Exception\UniqueConstraintViolationException An exception occurred while executing 'INSERT INTO "oc_file_locks" ("key","lock","ttl") SELECT ?,?,? FROM "oc_file_locks" WHERE "key" = ? HAVING COUNT(*) = 0' with params ["files/737d52477f1fb583a5bfe5eb33e820da", 1, 1524066628, "files/737d52477f1fb583a5bfe5eb33e820da"]: SQLSTATE[23505]: Unique violation: 7 ERROR: duplicate key value violates unique constraint "lock_key_index" DETAIL: Key (key)=(files/737d52477f1fb583a5bfe5eb33e820da) already exists. #0 3rdparty/doctrine/dbal/lib/Doctrine/DBAL/DBALException.php(128): Doctrine\DBAL\Driver\AbstractPostgreSQLDriver->convertException('An exception oc...', Object(Doctrine\DBAL\Driver\PDOException)) #1 3rdparty/doctrine/dbal/lib/Doctrine/DBAL/Connection.php(1015): Doctrine\DBAL\DBALException::driverExceptionDuringQuery(Object(Doctrine\DBAL\Driver\PDOPgSql\Driver), Object(Doctrine\DBAL\Driver\PDOException), 'INSERT INTO "oc...', Array) #2 lib/private/DB/Connection.php(213): Doctrine\DBAL\Connection->executeUpdate('INSERT INTO "oc...', Array, Array) #3 lib/private/DB/Adapter.php(114): OC\DB\Connection->executeUpdate('INSERT INTO "oc...', Array)\n#4 lib/private/DB/Connection.php(251): OC\DB\Adapter->insertIfNotExist('*PREFIX*file_lo...', Array, Array)\n#5 lib/private/Lock/DBLockingProvider.php(118): OC\DB\Connection->insertIfNotExist('*PREFIX*file_lo...', Array, Array)\n#6 lib/private/Lock/DBLockingProvider.php(163): OC\Lock\DBLockingProvider->initLockField('files/737d52477...', 1) #7 lib/private/Files/Storage/Common.php(704): OC\Lock\DBLockingProvider->acquireLock('files/737d52477...', 1) #8 lib/private/Files/View.php(1931): OC\Files\Storage\Common->acquireLock('files/Documents', 1, Object(OC\Lock\DBLockingProvider)) #9 lib/private/Files/View.php(2041): OC\Files\View->lockPath('/*******/files/...', 1, false) #10 lib/private/Files/Node/Node.php(360): OC\Files\View->lockFile('/*******/files/...', 1) #11 apps/files_sharing/lib/Controller/ShareAPIController.php(928): OC\Files\Node\Node->lock(1) #12 apps/files_sharing/lib/Controller/ShareAPIController.php(589): OCA\Files_Sharing\Controller\ShareAPIController->lock(Object(OC\Files\Node\Folder)) #13 [internal function]: OCA\Files_Sharing\Controller\ShareAPIController->getShares('true', 'false', 'false', Object(OC\Files\Node\Folder), 'false') #14 lib/private/AppFramework/Http/Dispatcher.php(160): call_user_func_array(Array, Array) #15 lib/private/AppFramework/Http/Dispatcher.php(90): OC\AppFramework\Http\Dispatcher->executeController(Object(OCA\Files_Sharing\Controller\ShareAPIController), 'getShares') #16 lib/private/AppFramework/App.php(114): OC\AppFramework\Http\Dispatcher->dispatch(Object(OCA\Files_Sharing\Controller\ShareAPIController), 'getShares') #17 lib/private/AppFramework/Routing/RouteActionHandler.php(47): OC\AppFramework\App::main('OCA\\Files_Shari...', 'getShares', Object(OC\AppFramework\DependencyInjection\DIContainer), Array) #18 [internal function]: OC\AppFramework\Routing\RouteActionHandler->__invoke(Array) #19 lib/private/Route/Router.php(299): call_user_func(Object(OC\AppFramework\Routing\RouteActionHandler), Array) #20 ocs/v1.php(78): OC\Route\Router->match('/ocsapp/apps/fi...') #21 ocs/v2.php(23): require_once('/usr/share/weba...') #22 {main} ```` Signed-off-by: Morris Jobke <hey@morrisjobke.de>
I started with the catching approach - first one is the issue in the locking: #12366 |
List to have a look at:
|
* caused by a concurrect insert that happens between the INSERT and it's sub-SELECT which was there to actually avoid it within insertIfNotExists - sub selects are not atomic and allow this * see also #12315 Avoids an error that has this stack trace: ``` Doctrine\DBAL\Exception\UniqueConstraintViolationException: An exception occurred while executing 'INSERT INTO "oc_filecache" ("mimepart","mimetype","mtime","size","etag","storage_mtime","permissions","checksum","path_hash","path","parent","name","storage") SELECT ?,?,?,?,?,?,?,?,?,?,?,?,? FROM "oc_filecache" WHERE "storage" = ? AND "path_hash" = ? HAVING COUNT(*) = 0' with params [1, 2, 1502911047, -1, "59949a47cabae", 1502911047, 31, "", "fb66dca5f27af6f15c1d1d81e6f8d28b", "files_trashbin", 1742774, "files_trashbin", 136, 136, "fb66dca5f27af6f15c1d1d81e6f8d28b"]: SQLSTATE[23505]: Unique violation: 7 ERROR: duplicate key value violates unique constraint "fs_storage_path_hash" DETAIL: Key (storage, path_hash)=(136, fb66dca5f27af6f15c1d1d81e6f8d28b) already exists. 3rdparty/doctrine/dbal/lib/Doctrine/DBAL/DBALException.php - line 128: Doctrine\DBAL\Driver\AbstractPostgreSQLDriver->convertException('An exception oc...', Object(Doctrine\DBAL\Driver\PDOException)) 3rdparty/doctrine/dbal/lib/Doctrine/DBAL/Connection.php - line 1015: Doctrine\DBAL\DBALException driverExceptionDuringQuery(Object(Doctrine\DBAL\Driver\PDOPgSql\Driver), Object(Doctrine\DBAL\Driver\PDOException), 'INSERT INTO "oc...', Array) lib/private/DB/Connection.php - line 213: Doctrine\DBAL\Connection->executeUpdate('INSERT INTO "oc...', Array, Array) lib/private/DB/Adapter.php - line 114: OC\DB\Connection->executeUpdate('INSERT INTO "oc...', Array) lib/private/DB/Connection.php - line 251: OC\DB\Adapter->insertIfNotExist('*PREFIX*filecac...', Array, Array) lib/private/Files/Cache/Cache.php - line 273: OC\DB\Connection->insertIfNotExist('*PREFIX*filecac...', Array, Array) lib/private/Files/Cache/Cache.php - line 230: OC\Files\Cache\Cache->insert('files_trashbin', Array) lib/private/Files/Cache/Scanner.php - line 279: OC\Files\Cache\Cache->put('files_trashbin', Array) lib/private/Files/Cache/Scanner.php - line 216: OC\Files\Cache\Scanner->addToCache('files_trashbin', Array, -1) lib/private/Files/Cache/Scanner.php - line 175: OC\Files\Cache\Scanner->scanFile('files_trashbin') lib/private/Files/Cache/Scanner.php - line 322: OC\Files\Cache\Scanner->scanFile('files_trashbin/...', 3, -1, NULL, false) lib/private/Files/Cache/Updater.php - line 124: OC\Files\Cache\Scanner->scan('files_trashbin/...', false, 3, false) lib/private/Files/View.php - line 321: OC\Files\Cache\Updater->update('files_trashbin/...', 1502911047) lib/private/Files/View.php - line 1151: OC\Files\View->writeUpdate(Object(OCA\Files_Trashbin\Storage), 'files_trashbin/...') lib/private/Files/View.php - line 269: OC\Files\View->basicOperation('mkdir', '/files_trashbin...', Array) apps/files_trashbin/lib/Trashbin.php - line 154: OC\Files\View->mkdir('files_trashbin/...') apps/files_trashbin/lib/Trashbin.php - line 225: OCA\Files_Trashbin\Trashbin setUpTrash('klas') apps/files_trashbin/lib/Storage.php - line 247: OCA\Files_Trashbin\Trashbin move2trash('Photos', false) apps/files_trashbin/lib/Storage.php - line 189: OCA\Files_Trashbin\Storage->doDelete('files/Photos', 'rmdir') lib/private/Files/View.php - line 1136: OCA\Files_Trashbin\Storage->rmdir('files/Photos') lib/private/Files/View.php - line 348: OC\Files\View->basicOperation('rmdir', '/Photos', Array) apps/dav/lib/Connector/Sabre/Directory.php - line 303: OC\Files\View->rmdir('/Photos') 3rdparty/sabre/dav/lib/DAV/Tree.php - line 179: OCA\DAV\Connector\Sabre\Directory->delete() 3rdparty/sabre/dav/lib/DAV/CorePlugin.php - line 287: Sabre\DAV\Tree->delete('Photos') [internal function] Sabre\DAV\CorePlugin->httpDelete(Object(Sabre\HTTP\Request), Object(Sabre\HTTP\Response)) 3rdparty/sabre/event/lib/EventEmitterTrait.php - line 105: call_user_func_array(Array, Array) 3rdparty/sabre/dav/lib/DAV/Server.php - line 479: Sabre\Event\EventEmitter->emit('method DELETE', Array) 3rdparty/sabre/dav/lib/DAV/Server.php - line 254: Sabre\DAV\Server->invokeMethod(Object(Sabre\HTTP\Request), Object(Sabre\HTTP\Response)) apps/dav/appinfo/v1/webdav.php - line 71: Sabre\DAV\Server->exec() remote.php - line 162: require_once('/var/www/nextcl...') {main} ``` Signed-off-by: Morris Jobke <hey@morrisjobke.de>
I moved this list to #12369 |
Let's go for #12371. This is basically what @blizzz @nickvergessen and @danielkesselberg suggested as well. |
@obel1x Thanks for your contribution anyways. It was a good start and helped me to understand the underlying problem. 👍 You could also check out the other issues - maybe some that are suited for people that are not too familiar with the code base: https://github.com/nextcloud/server/issues?q=is%3Aopen+is%3Aissue+label%3A%22good+first+issue%22 |
@nickvergessen @blizzz It's possible that we still need this to remove the deadlocks that could also occur, because we have a sub query. 🤔 |
This is a workaround to avoid Deadlocks and duplicate Keys in
UPDATE SELECT
- Statement as described in #6160 and maybe the cause for #9305 #12228 #6899 #6187