BUG: DataFrame.shift with axis=1 shifts object columns to the next column with object dtype #26929
Comments
Haven't read through everything but is this example fully minimal? At a glance, it seems overly complex to demonstrate the issue. http://matthewrocklin.com/blog/work/2018/02/28/minimal-bug-reports |
@TomAugspurger I have minimized the example as much as possible, but the problem requires a detailed description, which is why it seems complex. The code generates a dataframe from a string, which is why it looks long; in reality, the first 15 lines just assign a dataset to a variable so that it can easily be copied and pasted. I have looked through it and believe every piece of information there is necessary to demonstrate and reproduce the issue. |
OK. Let us know if you figure anything out. I don't plan to look at this any more closely in the near term. |
@TomAugspurger I made some changes to the issue to make it more concise. I have been testing this for the past couple of days and have not been able to find a solution. |
In general, it looks like shifting with axis=1 moves the data in each object column to the next column that has object dtype, rather than to the adjacent column.
|
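A minimal sketch for checking the behaviour described above on a given pandas installation. The frame and its column names are made up for illustration (they are not the reporter's data), and no expected output is shown because the result depends on the pandas version in use.

```python
import pandas as pd

# Hypothetical mixed-dtype frame: int, object, float, object columns.
df = pd.DataFrame(
    {
        "a": [1, 2],
        "b": ["x", "y"],
        "c": [3.0, 4.0],
        "d": ["p", "q"],
    }
)

# Ask pandas to move every value one column to the right.
shifted = df.shift(1, axis=1)

# Print the version alongside the result so the report is reproducible.
print(pd.__version__)
print(df.dtypes)
print(shifted)
```

If the values from "b" land in "d" instead of "c", the installation shows the behaviour reported here; if they land in "c", it does not.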
I have found a workaround for this problem while the bug is being fixed, in case anyone has a similar issue. I convert all the data in the fields to strings and then perform the shift; afterwards I convert the data back into a CSV string and read that string into a dataframe again, so that the columns regain their previous data types.
|
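The workaround above could be sketched roughly as follows. The helper name shift_via_strings and the sample frame are hypothetical, and the sketch assumes the index does not need to be preserved (it is dropped when writing the CSV) and that type inference by read_csv is acceptable for restoring the dtypes.

```python
import io

import pandas as pd


def shift_via_strings(df, periods=1):
    """Hypothetical helper: shift along axis=1 after casting everything to str."""
    # With a single uniform dtype, every column is shifted the same way.
    as_text = df.astype(str)
    shifted = as_text.shift(periods, axis=1)
    # Round-trip through CSV text so that read_csv re-infers the column dtypes.
    csv_text = shifted.to_csv(index=False)
    return pd.read_csv(io.StringIO(csv_text))


# Made-up sample data, not the data from the report.
df = pd.DataFrame(
    {"a": [1, 2], "b": ["x", "y"], "c": [3.0, 4.0], "d": ["p", "q"]}
)
restored = shift_via_strings(df)
print(restored)
print(restored.dtypes)
```

Because the dtypes are inferred again from text, values that only look numeric (for example identifiers with leading zeros) may come back with a different type than they started with.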
A string-dtype dataframe does not respond to axis:
df = pd.DataFrame([['2', '3', '4'], ['K', 'L', 'M']], dtype='string')
df.shift(1, axis=1)
Out:
      0     1     2
0  <NA>  <NA>  <NA>
1     2     3     4
Example workaround (works for any positive shift):
shift_positive = 1
df.iloc[:, shift_positive:] = df.slice_shift(shift_positive, axis=1)
df.iloc[:, :shift_positive] = pd.NA
df
Out:
      0  1  2
0  <NA>  2  3
1  <NA>  K  L |
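DataFrame.slice_shift was deprecated in later pandas releases and is not available in recent ones, so the workaround above may not run as written. A position-based sketch that avoids it is shown here; the helper name shift_columns_right is hypothetical, and the approach (copying values through an object array) is an assumption of this sketch, not the commenter's method.

```python
import numpy as np
import pandas as pd


def shift_columns_right(df, n):
    """Hypothetical helper: move values n columns to the right, filling with pd.NA."""
    if n <= 0:
        raise ValueError("n must be a positive integer")
    n = min(n, df.shape[1])
    values = df.to_numpy(dtype=object)
    # Start from an all-missing object array, then copy the surviving columns.
    out = np.empty(values.shape, dtype=object)
    out[:] = pd.NA
    out[:, n:] = values[:, : values.shape[1] - n]
    return pd.DataFrame(out, index=df.index, columns=df.columns)


df = pd.DataFrame([["2", "3", "4"], ["K", "L", "M"]], dtype="string")
# The helper returns object columns, so cast back to the original dtype.
result = shift_columns_right(df, 1).astype("string")
print(result)
```

The helper returns a new frame instead of assigning into df, which keeps the original data available for comparison.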
closed by #35578 |
Code Sample
Problem description
I am trying to use the DataFrame.shift() function to move certain rows of data into their correct columns, and the DataFrame.shift() function is doing some weird things: it creates an empty column of null(s) and moves one of the columns to the end of the dataframe.
Screenshots
Original data:
Output after execution of the code
As seen above, the data that was originally in column 10 has been moved to column 15 for some reason. Also, a column of null values has been created in column 6.
I expect the data to be moved to the right by one step, that is, each column should receive the data from the column to its left. The current behaviour is confusing given what the documentation says this function should do.
My expected output:
Output of pd.show_versions()
INSTALLED VERSIONS
commit: None
python: 2.7.14.final.0
python-bits: 32
OS: Windows