-
Notifications
You must be signed in to change notification settings - Fork 511
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
obj: handle ENOMEM during vector allocations #5537
base: master
Are you sure you want to change the base?
Conversation
@@ -1013,8 +1019,16 @@ pmemobj_tx_commit(void) | |||
operation_start(tx->lane->external); | |||
|
|||
struct user_buffer_def *userbuf; | |||
VEC_FOREACH_BY_PTR(userbuf, &tx->redo_userbufs) | |||
operation_add_user_buffer(tx->lane->external, userbuf); | |||
struct operation_context *ctx = tx->lane->external; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Like I indicated earlier in the issue, I don't think we need to check for this error here. Instead, we simply need to ensure that the operation_add_user_buffer
never fails (like you've already done in tx_construct_user_buffer
).
To have the FATAL/ASSERT that you added you can change the operation_add_user_buffer
to return an error (-1) if VEC_PUSH_BACK(&ctx->next, buffer_offset);
fails. This would let you avoid making all the memops structs public.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
not sure to fully follow your reasoning, but anyway I will try to push a new commit based on what I think I have understood...
src/libpmemobj/memops.h
Outdated
@@ -39,7 +40,53 @@ struct user_buffer_def { | |||
size_t size; | |||
}; | |||
|
|||
struct operation_context; | |||
enum operation_state { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It would be better if we kept the structs private and instead relied on some inter-module API if necessary.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ok....
src/libpmemobj/tx.c
Outdated
@@ -1013,8 +1015,12 @@ pmemobj_tx_commit(void) | |||
operation_start(tx->lane->external); | |||
|
|||
struct user_buffer_def *userbuf; | |||
struct operation_context *ctx = tx->lane->external; | |||
uint64_t buf_off; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
buf_off is unusued.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oops, sorry to have missed that
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: 0 of 5 files reviewed, 6 unresolved discussions (waiting on @bfaccini and @pbalcer)
-- commits
line 20 at r3:
you can squash all the commits into a single one, please
src/libpmemobj/memops.h
line 2 at r3 (raw file):
/* SPDX-License-Identifier: BSD-3-Clause */ /* Copyright 2016-2020, Intel Corporation */
Since you're using an Intel e-mail address for your commits, you have to update all dates to -2023
.
E.g., in this file: 2016-2023
src/libpmemobj/tx.c
line 714 at r3 (raw file):
userbuf.size - TX_INTENT_LOG_BUFFER_OVERHEAD; } else { if (operation_add_user_buffer(ctx, &userbuf) == -1);
is this ;
expected here?
Ok
Ok
Oops, sorry about the typo :-( All these changes will be in next commit. |
When missing, Handle ENOMEM during vector allocations to avoid later crashing. Ref: pmem/issues#5515 Signed-off-by: Bruno Faccini <bruno.faccini@intel.com>
6b9e943
to
be023c9
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewed 5 of 5 files at r4, all commit messages.
Reviewable status: all files reviewed, 4 unresolved discussions (waiting on @bfaccini and @pbalcer)
src/libpmemobj/memops.c
line 618 at r4 (raw file):
if (VEC_PUSH_BACK(&ctx->next, buffer_offset) != 0) return -1;
just wondering, should we clean something up in this case? I can see we're persisting something above 😉
Well, I thought this kind of concern could araise when I started to work on how to fix this, as usual for error paths... |
@bfaccini are you going to fix this PR to pass tests? |
Looks like doing so affects VEC_PUSH_BACK() execution, surprisingly for non-debug build only, preventing next vector allocation... Signed-off-by: Bruno Faccini <bruno.faccini@intel.com>
Looks like embeding VEC_PUSH_BACK() in ASSERT() affects VEC_PUSH_BACK() execution, surprisingly for non-debug build only, preventing next vector allocation... |
Signed-off-by: Bruno Faccini <bruno.faccini@intel.com>
Well, now that I have fixed the original errors/SEGVs during "./RUNTESTS -b nondebug obj_action -s TEST0", I now get new ones/asserts/aborts, but unfortunately I am unable to reproduce any of them locally :-( |
testing return value from VEC_PUSH_BACK() equal to 0 for error is not good !!... Ref: pmem/issues#5515 Signed-off-by: Bruno Faccini <bruno.faccini@intel.com>
Now that I have found that these asserts/aborts were due to the "wrong/reverse test" in my previous changes, some workflows are canceled and one has failed, but I am not able to find any reason in the logs... Can somebody help ?? |
??? |
Codecov Report
@@ Coverage Diff @@
## master #5537 +/- ##
==========================================
+ Coverage 71.62% 72.18% +0.56%
==========================================
Files 162 146 -16
Lines 24256 22655 -1601
Branches 0 3778 +3778
==========================================
- Hits 17373 16354 -1019
+ Misses 6883 6301 -582 |
I restarted the CI and most things pass, and the things that fail seem unrelated:
Can you rebase to see if this goes away? |
Unfortunately, this is a known issue (platform setup issue), it is not related to your PR. We are working on it. |
But should I rebase finally ? |
No, everything seems fine. LGTM overall. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewed 2 of 2 files at r7, all commit messages.
Reviewable status: all files reviewed, 4 unresolved discussions (waiting on @bfaccini and @pbalcer)
src/libpmemobj/memops.c
line 618 at r4 (raw file):
@wlemkows, can you please take a look at this piece of code...?
I allow myself to quote bfaccini's words (it was out of this issues' thread):
Well, I thought this kind of concern could araise when I started to work on how to fix this, as usual for error paths...
The problem is that I don't know what exactly does/provisions pmemops_persist() nor I have been able to identify which method/operation can revert its effects...
Also, and strangely (!!), I only found one error path after pmemops_persist() has been called (other place do not show any error path at all !!...), in src/libpmemobj/obj.c:obj_runtime_init(), and there is no clean-up there too !!
When missing, Handle ENOMEM during vector allocations to avoid later crashing.
Ref: pmem/issues #5515
Signed-off-by: Bruno Faccini bruno.faccini@intel.com
This change is