-
Notifications
You must be signed in to change notification settings - Fork 564
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Buffer overflow in cursor create_name_map #931
Buffer overflow in cursor create_name_map #931
Conversation
cc @mkleehammer @keitherskine This is a very similar buffer overflow to the one fixed in #881. We encountered it with Simba Spark ODBC driver. |
PyErr_NoMemory(); | ||
goto done; | ||
} | ||
goto retry; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think goto is justified here to not repeat that logic...
nameLen = cchName + 1; | ||
if (!pyodbc_realloc((BYTE**) &szName, (nameLen + 1) * sizeof(ODBCCHAR))) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Like in my previous PR, I prefer to do some defensive double +1 here to rather have a too big buffer, and space for an extra character (like a termination character) than a too small one.
I am curious to learn what databases will let you use such long column names, and what demand there is for that capability in the first place -- 300 already seems very generous. I can understand error messages could be extremely verbose, but a column name doesn't seem like it would make sense being so long. SQL Server has a maximum of 128. The first paragraph of my comment is 308 characters long. |
@v-chojas Apache Spark allows it. Simba ODBC driver for Spark allows it. |
@v-chojas the tests I added show that MS Access, Informix, SQLLite and SQL DW also allow it. |
MS Access has a limit of 64, and SQL DW is based on SQL Server, so I would expect the same limit applies there. My research shows that Informix also has 128, and SQLite is the only one that claims "no limit". |
Function create_name_map used a static buffer `szName[300]` for the column name in call to `SQLDescribeColW`. That function would truncate the column name to the size of the provided buffer, but return the actual length of the column name in the `&cchName` return parameter. That `cchName` output was used to compute `cbName` parameter for the call to `TextBufferToObject(enc, szName, cbName)`. When the actual column name was longer than the buffer, cbName was to big, and was causing a buffer overflow, resulting in crashes, invalid data, or errors. Fix this by making szName a dynamic buffer. This buffer overflow is very similar to the one fixed in mkleehammer#881. I check through the rest of the code base, and I don't see more places with static buffers that could have a similar issue. Upstream PR: mkleehammer#931
@v-chojas The tests added here suggest that they are nevertheless happy to create a table with column names 500 and 600 characters long, to query it, and to return such long column names. |
I don't know how you tested, but I have access to a DW instance and...
|
tests3/sqldwtests.py
Outdated
def test_long_column_name(self): | ||
"ensure super long column names are handled correctly." | ||
c1 = 'abcdefghij' * 50 | ||
c2 = 'klmnopqrst' * 60 | ||
self.cursor = self.cnxn.cursor() | ||
|
||
self.cursor.execute("create table t1({} int, {} int)".format(c1, c2)) | ||
self.cursor.execute("select * from t1") | ||
|
||
names = [ t[0] for t in self.cursor.description ] | ||
names.sort() | ||
|
||
self.assertEqual(names, [ c1, c2 ]) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@v-chojas I don't have access to a SQL DW instance, so I depended on the CI...
Oh, I get it. https://app.travis-ci.com/github/mkleehammer/pyodbc/jobs/524518887 The CI actually runs only SQL Server, PG SQL and MySql tests...
I removed tests for Access, Informix, SQL DW
Therefore, it seems it can't be tested with the existing set of tests. Thanks @v-chojas for checking all these systems. |
Because I couldn't test it with other systems, I adapted python3 test suites to run on Spark, and added the test there.
|
Thanks for all the work you put into this! |
Sorry everyone, I somehow merged this without the rebase. I thought I configured Github to require that or make it the default. |
Function create_name_map used a static buffer
szName[300]
for the column name in call toSQLDescribeColW
. That function would truncate the column name to the size of the provided buffer, but return the actual length of the column name in the&cchName
return parameter.That
cchName
output was used to computecbName
parameter for the call toTextBufferToObject(enc, szName, cbName)
. When the actual column name was longer than the buffer, cbName was to big, and was causing a buffer overflow, resulting in crashes, invalid data, or errors.Fix this by making szName a dynamic buffer. This buffer overflow is very similar to the one fixed in #881. I check through the rest of the code base, and I don't see more places with static buffers that could have a similar issue.
Other systems do not support such long column names.
Because I couldn't test it with other systems, I adapted python3 test suites to run on Spark, and added the test there.