group_by broken #23

Closed
jotwin opened this issue Aug 31, 2020 · 9 comments

jotwin commented Aug 31, 2020

I haven't been able to use anything with group_by in it since upgrading pandas to 1.1.0+.

pd.DataFrame({'a':range(10)}) >> count('a==0')
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-4-8092b63f45c6> in <module>
----> 1 pd.DataFrame({'a':range(10)}) >> count('a==0')

/usr/local/lib/python3.7/site-packages/plydata/operators.py in __rrshift__(self, other)
    122         self.data = other
    123         func = get_verb_function(self.data, self.__class__.__name__)
--> 124         return func(self)
    125 
    126     def __call__(self, data):

/usr/local/lib/python3.7/site-packages/plydata/dataframe/helpers.py in count(verb)
     63     verb.add_ = True
     64     verb.data = group_by(verb)
---> 65     data = tally(verb)
     66 
     67     # Restore original groups

/usr/local/lib/python3.7/site-packages/plydata/dataframe/helpers.py in tally(verb)
     49 
     50     verb.expressions = [Expression(stmt, 'n')]
---> 51     data = summarize(verb)
     52     if verb.sort:
     53         data = data.sort_values(by='n', ascending=False)

/usr/local/lib/python3.7/site-packages/plydata/dataframe/one_table.py in summarize(verb)
    169             verb,
    170             keep_index=False,
--> 171             keep_groups=False).process()
    172     return data
    173 

/usr/local/lib/python3.7/site-packages/plydata/dataframe/common.py in process(self)
    217         gdfs = self._get_group_dataframes()
    218         egdfs = self._evaluate_expressions(gdfs)
--> 219         edata = self._concat(egdfs)
    220         return edata
    221 

/usr/local/lib/python3.7/site-packages/plydata/dataframe/common.py in _concat(self, egdfs)
    307             Evaluated data
    308         """
--> 309         egdfs = list(egdfs)
    310         edata = pd.concat(egdfs, axis=0, ignore_index=False, copy=False)
    311 

/usr/local/lib/python3.7/site-packages/plydata/dataframe/common.py in <genexpr>(.0)
    264             Result dataframes for each group
    265         """
--> 266         return (self._evaluate_group_dataframe(gdf) for gdf in gdfs)
    267 
    268     def _evaluate_group_dataframe(self, gdf):

/usr/local/lib/python3.7/site-packages/plydata/dataframe/common.py in _evaluate_group_dataframe(self, gdf)
    290             else:
    291                 _create_column(data, expr.column, value)
--> 292         data = _add_group_columns(data, gdf)
    293         return data
    294 

/usr/local/lib/python3.7/site-packages/plydata/dataframe/common.py in _add_group_columns(data, gdf)
     57     n = len(data)
     58     if isinstance(gdf, GroupedDataFrame):
---> 59         for i, col in enumerate(gdf.plydata_groups):
     60             if col not in data:
     61                 group_values = [gdf[col].iloc[0]] * n

TypeError: 'NoneType' object is not iterable

has2k1 (Owner) commented Aug 31, 2020

Yes, pandas v1.1.0 broke grouping in plydata. There is a PR at pandas-dev/pandas#35688 to fix the issue, but it has not been merged yet.
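
For readers following the traceback: the last frame iterates over `gdf.plydata_groups`, and the error message indicates that the attribute was `None` at that point. The snippet below is only a minimal sketch of that failure shape in plain Python. The helper names `iterate_unguarded` and `group_columns` are hypothetical and are not part of plydata or pandas. A guard like this would merely turn the crash into "no groups"; it would not restore the lost grouping, which per this thread had to be fixed in pandas itself.

```python
def iterate_unguarded(plydata_groups):
    """Mimic the failing loop shape: enumerate() over the attribute directly."""
    return [(i, col) for i, col in enumerate(plydata_groups)]


def group_columns(plydata_groups):
    """Hypothetical guard: treat a missing (None) value as 'no groups'."""
    if plydata_groups is None:
        return []
    return list(plydata_groups)


# Reproduce the error message from the traceback without pandas or plydata.
try:
    iterate_unguarded(None)
except TypeError as err:
    error_message = str(err)

print(error_message)
print(group_columns(None))
print(group_columns(["a"]))
```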

has2k1 added 3 commits that referenced this issue Sep 12, 2020

has2k1 (Owner) commented Oct 14, 2020

Fix will ship in Pandas 1.1.4.

antonio-yu commented Nov 3, 2020

> Fix will ship in Pandas 1.1.4.

Thank you for your work on plydata and plotnine.
Does plydata 0.4.2 only support pandas versions below 1.1.0?
My pandas is 1.1.3.
When I tried to install plydata, it collected pandas 1.0.5.

has2k1 (Owner) commented Nov 3, 2020

> Fix will ship in Pandas 1.1.4.
>
> Thank you for your work on plydata and plotnine.
> Does plydata 0.4.2 only support pandas versions below 1.1.0?

Yes, but see below.

> My pandas is 1.1.3.
> When I tried to install plydata, it collected pandas 1.0.5.

I just noticed that pandas 1.1.4 shipped 4 days ago. I will also make a release.

antonio-yu commented

Thanks for your reply.
My friends and I will wait for your new release, then.
We are all glad to see an update after 3 years.

has2k1 (Owner) commented Nov 3, 2020

I spoke too soon; the fix in pandas 1.1.4 was incomplete. I will have to wait for pandas-dev/pandas#37461 to ship. If accepted, it is marked for inclusion in v1.1.5.
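
Until a fixed release is available, a quick check of the installed pandas version can tell whether an environment falls in the range discussed in this thread. The sketch below assumes the affected range is 1.1.0 up to, but not including, 1.1.5; that upper bound is an assumption based on the fix being marked for v1.1.5. The function names `parse_version` and `is_affected` are hypothetical helpers, not part of pandas or plydata.

```python
import re


def parse_version(version):
    """Return the leading numeric (major, minor, patch) of a version string."""
    match = re.match(r"(\d+)\.(\d+)(?:\.(\d+))?", version)
    if match is None:
        raise ValueError(f"unrecognized version string: {version!r}")
    major, minor, patch = match.groups()
    return int(major), int(minor), int(patch or 0)


def is_affected(pandas_version):
    """True if the version lies in the assumed broken range [1.1.0, 1.1.5)."""
    return (1, 1, 0) <= parse_version(pandas_version) < (1, 1, 5)


# Versions mentioned in this thread.
for candidate in ["1.0.5", "1.1.0", "1.1.3", "1.1.4", "1.1.5"]:
    print(candidate, is_affected(candidate))
```

In a real environment the string would come from `pandas.__version__`.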

has2k1 (Owner) commented Nov 4, 2020

/remind me on 30th November 2020. By then, pandas v1.1.5 will have shipped or pandas v1.2.0 will be out.

antonio-yu commented Dec 8, 2020

Hi, pandas v1.1.5 has shipped.
By the way, are there any plans to add a dplyr-style filter function so that filtering rows by regex is more convenient?

has2k1 (Owner) commented Dec 8, 2020

plydata v0.4.3 is out.

has2k1 closed this as completed Dec 8, 2020