When using the deletion pacer, we kick off the deletion in a goroutine, here. This can result in many goroutines being spawned, only to be put to sleep by the pacer, here.
Here's an example summary I saw, which showed 500+ goroutines all sleeping, waiting their turn to delete the files from their respective compactions. This accounted for ~5% of the total goroutine count of this process:
When using a deletion pacer, rather than spawning a goroutine per compaction, we could consider having a single goroutine that is responsible for deletion of files. When a compaction completes, it puts the files that need to be deleted in a queue (guarded by some mutex). Then a second goroutine could pull from this queue in order to satisfy the deletion pacing constraints.
Note that when running without a pacer, this problem isn't apparent, as the files are deleted synchronously as part of the compaction.
Was the high goroutine count present with and without encryption-at-rest? cockroachdb/cockroach#98051 has the potential to slow Remove FS calls and exacerbate the issue.
> When a compaction completes, it puts the files that need to be deleted in a queue (guarded by some mutex). Then a second goroutine could pull from this queue in order to satisfy the deletion pacing constraints.
We already have this queue, versionSet.obsoleteTables. We're not using it in the way proposed in this issue, though.