Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

crash in s_free(sh) in sdsMakeRoomFor function during redis pipeline call in hiredis-0.14.0 #921

Closed
sg893052 opened this issue Feb 2, 2021 · 7 comments

Comments

@sg893052
Copy link

sg893052 commented Feb 2, 2021

Hi,
Hitting a crash in s_free(sh) in sdsMakeRoomFor function during redis pipeline call in hiredis-0.14.0/sds.c
Could someone please help? Thanks!

Backtrace:
#0 0x00007fb465978fff in raise () from /lib/x86_64-linux-gnu/libc.so.6
#1 0x00007fb46597a42a in abort () from /lib/x86_64-linux-gnu/libc.so.6
#2 0x00007fb4659b6c00 in ?? () from /lib/x86_64-linux-gnu/libc.so.6
#3 0x00007fb4659bcfc6 in ?? () from /lib/x86_64-linux-gnu/libc.so.6
#4 0x00007fb4659bd80e in ?? () from /lib/x86_64-linux-gnu/libc.so.6
#5 0x00007fb466836462 in sdsMakeRoomFor (s=0x55f230687f05 "",
addlen=addlen@entry=234) at sds.c:230
#6 0x00007fb466836dac in sdscatlen (s=, t=0x7ffda19f00a0,
len=234) at sds.c:379
#7 0x00007fb46683acd4 in redisReaderFeed (r=0x55f23068c600,
buf=buf@entry=0x7ffda19f00a0 "*18\r\n$5\r\nindex\r\n$2\r\n37\r\n$5\r\nlanes\r\n$2\r\n44\r\n$11\r\ndescription\r\n$0\r\n\r
\n$23\r\noverride_unreliable_los\r\n$3\r\noff\r\n$12\r\nvalid_speeds\r\n$10\r\n10000,1000\r\n$3\r\nmtu\r\n$4\r\n9100\r\n$5\r
nalias\r\n$7\r\nEth1/37\r\n$12\r\nadmin_stat"...,
len=) at read.c:537
#8 0x00007fb466835734 in redisBufferRead (c=0x55f23068c340) at hiredis.c:800
#9 0x00007fb466835b90 in redisGetReply (c=0x55f23068c340,
reply=reply@entry=0x7ffda19f41d0) at hiredis.c:874
#10 0x00007fb4665b4d17 in swss::RedisReply::RedisReply (this=0x7ffda19f41d0,
db=0x55f23068b250, command=...) at redisreply.cpp:34
#11 0x00007fb4665b4f12 in swss::RedisReply::RedisReply (this=0x7ffda19f41d0,
db=, command=..., expectedType=2) at redisreply.cpp:47
#12 0x00007fb4665c4b20 in swss::RedisPipeline::push (expectedType=2,
command=..., this=0x55f23068a9d0) at ../common/redispipeline.h:83
#13 swss::Table::get (this=this@entry=0x55f22fe91cc0 <sync+1312>,
key="Ethernet36", values=std::vector of length 0, capacity 0)
at table.cpp:81

@michael-grunder
Copy link
Collaborator

Hi @sg893052 it's hard to say without code I can use to replicate the crash.

That said, v0.14.0 is very old and there have been several possible crashes fixed since then. Are you able to test with a newer version of hiredis?

@sg893052
Copy link
Author

sg893052 commented Feb 8, 2021

Hi Michael,
Sorry for the delayed response.
The current situation demands me to address this issue in the hiredis version(0.14.0) itself due to time limit.

Upon further investigation with the core file, there seem to be a corruption in sds header flag value s[-1] which is "(rediscontext->reader->buf)-1".
s[-1] value was set as 5 which is a not a valid value, resulting in crash in sdsMakeRoomFor() while freeing an invalid pointer.
In fact, s[-1] should have ideally been one of below sds header types.
#define SDS_TYPE_5 0
#define SDS_TYPE_8 1
#define SDS_TYPE_16 2
#define SDS_TYPE_32 3
#define SDS_TYPE_64 4
#define SDS_TYPE_MASK 7

Do you see a safer way to handle this in hiredis/sds code ?

Debug:

(gdb) frame 5
#5 0x00007fb466836462 in sdsMakeRoomFor (s=0x55f230687f05 "",
addlen=addlen@entry=234) at sds.c:230
230 sds.c: No such file or directory.
(gdb) p s[-1]
$31 = 5 '\005'
(gdb)

(gdb) p oldtype
$8 = 5 '\005'

sdsHdrSize(char type) returns 0 as oldtype was invalid value 5

s_free(sh) crashes

Also,
I was able to simulate the crash with the hiredis_test program available in hiredis package by corrupting s[-1] with 5 in gdb and proceeding further.
*** Error in `/home/admin/hiredis-test': free(): invalid pointer: 0x0000000000610663 ***

code:

sds sdsMakeRoomFor(sds s, size_t addlen) {
void *sh, *newsh;
size_t avail = sdsavail(s);
size_t len, newlen;
char type, oldtype = s[-1] & SDS_TYPE_MASK; ===> oldtype value returned as 5
int hdrlen;

/* Return ASAP if there is enough space left. */
if (avail >= addlen) return s;

len = sdslen(s);
sh = (char*)s-sdsHdrSize(oldtype);   ======> sdsHdrSize(char type) returns 0 as oldtype was invalid value 5
newlen = (len+addlen);
if (newlen < SDS_MAX_PREALLOC)
    newlen *= 2;
else
    newlen += SDS_MAX_PREALLOC;

type = sdsReqType(newlen);

/* Don't use type 5: the user is appending to the string and type 5 is
 * not able to remember empty space, so sdsMakeRoomFor() must be called
 * at every appending operation. */
if (type == SDS_TYPE_5) type = SDS_TYPE_8;

hdrlen = sdsHdrSize(type);
if (oldtype==type) {
    newsh = s_realloc(sh, hdrlen+newlen+1);
    if (newsh == NULL) return NULL;
    s = (char*)newsh+hdrlen;
} else {
    /* Since the header size changes, need to move the string forward,
     * and can't use realloc */
    newsh = s_malloc(hdrlen+newlen+1);
    if (newsh == NULL) return NULL;
    memcpy((char*)newsh+hdrlen, s, len+1);
    s_free(sh);    ========================================> place of crash  -> free of invalid pointer
    s = (char*)newsh+hdrlen;
    s[-1] = type;
    sdssetlen(s, len);
}
sdssetalloc(s, newlen);
return s;

}

@michael-grunder
Copy link
Collaborator

Hi @sg893052 I'm not sure I totally follow. Obviously corrupting the sdshdr could cause a crash, but the question I would have is how is the header becoming corrupted?

Have you attempted to run your program through valgrind or clang's sanitizers to catch the corruption?

@sg893052
Copy link
Author

sg893052 commented Feb 9, 2021

Hi Michael, The crash is not reproducible. I was able to find the memory corruption(sdshdr) from the core file debugging. Attempting the valgrind run for the application in parallel. I was hoping if we could up with a protection fix in sds side to check it's validity and safely return to the application complaining of the corruption instead of crashing.

@michael-grunder
Copy link
Collaborator

michael-grunder commented Feb 9, 2021

The issue with checking for corruption inside of SDS is that it doesn't actually solve your problem and could even make it worse. What you really want to do is catch the corruption when it happens.

I mentioned valgrind and ASAN but you might also give rr a try. With rr you would only need to catch the bug once and then replay it as many times as you want.

I've used it to track down really difficult to reproduce concurrency bugs by automating the runs and doing them in parallel. If your application is multi-threaded you should also look into running rr with the chaos mode enabled which will mess with the OS scheduler at a low level.

Good luck!

@sg893052
Copy link
Author

Sure, Thanks for the inputs!

@michael-grunder
Copy link
Collaborator

Closing old issue

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants