Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[BUG] CSV log is not null-terminated #223

Closed
gmarkall opened this issue Dec 27, 2019 · 10 comments
Closed

[BUG] CSV log is not null-terminated #223

gmarkall opened this issue Dec 27, 2019 · 10 comments
Labels
bug Something isn't working

Comments

@gmarkall
Copy link
Contributor

Describe the bug
The CSV log placed in buffer by rmmGetLog(char *buffer, size_t buffer_size) is not null-terminated, but is probably expected to be (e.g. the Cython code for the Python interface seems to be written as though it is null-terminated.

Steps/Code to reproduce bug
I don't have a small and reliable reproducer, but I note that generally printing the CSV log from Python in the context of larger programs results in exceptions such as:

Exception ignored in: <finalize object at 0x7f205c5a5220; dead>
Traceback (most recent call last):
  File "/home/nfs/gmarkall/numbadev/numba/numba/utils.py", line 678, in __call__
    return info.func(*info.args, **(info.kwargs or {}))
  File "/home/nfs/gmarkall/numbadev/rmm/python/rmm/rmm.py", line 471, in finalizer
    print(csv_log())
  File "/home/nfs/gmarkall/numbadev/rmm/python/rmm/rmm.py", line 144, in csv_log
    return librmm.rmm_csv_log()
  File "rmm/_lib/lib.pyx", line 157, in rmm._lib.lib.rmm_csv_log
UnicodeDecodeError: 'utf-8' codec can't decode byte 0xb1 in position 1562: invalid start byte

(Note that the traceback shows rmm.py because I added a print of the CSV log for each allocation and deallocation so I could keep an eye on what was going on during execution). Other times, when there is no exception, trailing gibberish appears at the end of the log:

Event Type,Device ID,Address,Stream,Size (bytes),Free Memory,Total Memory,Current Allocs,Start,End,Elapsed,Location
Alloc,0,0x7f1f83400000,0,80,0,0,1,11.1588,11.1604,0.00163166,/home/nfs/gmarkall/numbadev/numba/numba/cuda/cudadrv/driver.py:689
Alloc,0,0x7f1f83400200,0,80,0,0,2,11.1625,11.1625,3.0472e-05,/home/nfs/gmarkall/numbadev/numba/numba/cuda/cudadrv/driver.py:689
Free,0,0x7f1f83400000,0,0,0,0,1,11.1658,11.1658,1.627e-05,/home/nfs/gmarkall/numbadev/numba/numba/utils.py:678
Free,0,0x7f1f83400200,0,0,0,0,0,11.1659,11.1671,0.00122874,/home/nfs/gmarkall/numbadev/numba/numba/utils.py:678
Alloc,0,0x7f1f83400000,0,128,0,0,1,11.2621,11.2635,0.00144595,/home/nfs/gmarkall/numbadev/numba/numba/cuda/cudadrv/driver.py:689
Free,0,0x7f1f83400000,0,0,0,0,0,11.2651,11.2658,0.000690474,/home/nfs/gmarkall/numbadev/numba/numba/utils.py:678
Alloc,0,0x7f1f83400000,0,128,0,0,1,11.3738,11.3747,0.000891601,/home/nfs/gmarkall/numbadev/numba/numba/cuda/cudadrv/driver.py:689
Free,0,0x7f1f83400000,0,0,0,0,0,11.3765,11.3771,0.000616803,/home/nfs/gmarkall/numbadev/numba/numba/utils.py:678
q�

(q� on the last line)

Expected behavior
One possibility is that the log should have a null terminated byte at the end of it. However, I'm not sure that's the best fix that can be applied here - getting the log also requires generating it twice - once to determine its length for the caller to allocate a buffer, and again for the log to be written into the buffer.

Perhaps a better solution is for rmmGetLog to accept a stringstream and write the log to it instead, for example.

@gmarkall gmarkall added the bug Something isn't working label Dec 27, 2019
gmarkall added a commit to gmarkall/rmm that referenced this issue Dec 27, 2019
The CSV log is not null-terminated, so a slice is needed to
correctly assign it to a Python bytes object (see Issue rapidsai#223).
@gmarkall
Copy link
Contributor Author

#224 is a quick fix to make the Python side work correctly with the existing interface - if a maintainer could offer some guidance as to whether/how the interface should change (e.g. to accept an ostringstream, to null-terminate the log on the C++ side, etc.) I'd be happy to implement the suggested change and make a new PR for it.

@gmarkall
Copy link
Contributor Author

I forgot when I wrote the original bug description that I did write a failing test which can be used to reproduce the issue:

diff --git a/tests/memory_tests.cpp b/tests/memory_tests.cpp
index 87964e3..1b2b5f4 100644
--- a/tests/memory_tests.cpp
+++ b/tests/memory_tests.cpp
@@ -220,6 +220,33 @@ TYPED_TEST(MemoryManagerTest, GetInfo) {
     ASSERT_GE(totalBefore, 0);
 }
 
+TYPED_TEST(MemoryManagerTest, GetLog) {
+    size_t logSize = rmmLogSize();
+    char* buffer = (char*)malloc(sizeof(char) * logSize);
+    rmmGetLog(buffer, logSize);
+
+    // Check that the log string is null-terminated
+    ASSERT_EQ(buffer[logSize - 1], '\0');
+
+    // Reference of the column headers we expect to see at the beginning of the
+    // output
+    const char* columnHeaders = \
+        "Event Type,Device ID,Address,Stream,Size (bytes),Free Memory," \
+        "Total Memory,Current Allocs,Start,End,Elapsed,Location\n";
+
+    // First check that there are at least as many characters in the buffer as
+    // there are in the reference column headers
+    ASSERT_GE(logSize, strlen(columnHeaders + 1));
+
+    // For comparison with the reference column headers, we copy the same number
+    // of bytes from the buffer as there are in the reference column headers
+    char* headerBuffer = (char*)malloc(sizeof(char) * strlen(columnHeaders + 1));
+    strncpy(headerBuffer, buffer, strlen(columnHeaders));
+
+    // Check that the column headers are as expected
+    ASSERT_STREQ(headerBuffer, columnHeaders);
+}
+
 TYPED_TEST(MemoryManagerTest, AllocationOffset) {
     char *a = nullptr, *b = nullptr;
     ptrdiff_t offset = -1;

This test also checks that the output at least looks like the start of a CSV log, as a basic sanity check.

@gmarkall
Copy link
Contributor Author

gmarkall commented Jan 3, 2020

A small update - now that #224 is merged, the original comments about reproduction using the CSV log in Python no longer apply - it will now produce correct output. However, the GetLog test from the previous comment will still fail.

@jakirkham
Copy link
Member

Yeah I think this is because C++ strings are not null terminated. So this is expected behavior.

That said, it should have been handled better at the Python-level. Thanks for doing that.

Have made a slight update on your change with PR ( #278 ), which just dumps the data directly into a bytes object. This avoids any further concerns about whether strings are null terminated as this doesn't play a role.

@jakirkham
Copy link
Member

@harrism, does what I said above sound accurate to you? If so, should we close this?

@harrism
Copy link
Member

harrism commented Feb 14, 2020

We're going to scrap the entire current log functionality and adopt a third party logging library (TBD).

@jakirkham
Copy link
Member

What do you have in mind? Glog?

@harrism
Copy link
Member

harrism commented Feb 14, 2020

Or g3log or spdlog.

@jakirkham
Copy link
Member

xref: #295

@harrism
Copy link
Member

harrism commented Oct 19, 2020

Closing as this is stale (logging has been rewritten and none of the APIs discussed above exist anymore). Please comment if this is still an issue @gmarkall

@harrism harrism closed this as completed Oct 19, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working
Projects
None yet
Development

No branches or pull requests

3 participants