Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix #192 warnings for scipy mad func and other deprecated params #194

Merged
merged 3 commits into from
Aug 1, 2022

Conversation

johnarevalo
Copy link
Member

@johnarevalo johnarevalo commented May 4, 2022

Description

Change deprecated methods from external libraries.

What is the nature of your change?

  • Bug fix (fixes an issue).

Checklist

  • I have read the CONTRIBUTING.md guidelines.
  • My code follows the style guidelines of this project.
  • I have performed a self-review of my own code.
  • I have commented my code, particularly in hard-to-understand areas.
  • My changes generate no new warnings.
  • New and existing unit tests pass locally with my changes.
  • I have deleted all non-relevant text in this pull request template.

@johnarevalo johnarevalo linked an issue May 4, 2022 that may be closed by this pull request
@codecov-commenter
Copy link

codecov-commenter commented Jul 8, 2022

Codecov Report

Merging #194 (8ec8829) into master (b4d32d3) will increase coverage by 0.00%.
The diff coverage is 100.00%.

@@           Coverage Diff           @@
##           master     #194   +/-   ##
=======================================
  Coverage   95.51%   95.51%           
=======================================
  Files          53       53           
  Lines        2695     2697    +2     
=======================================
+ Hits         2574     2576    +2     
  Misses        121      121           
Flag Coverage Δ
unittests 95.51% <100.00%> (+<0.01%) ⬆️

Flags with carried forward coverage won't be shown. Click here to find out more.

Impacted Files Coverage Δ
pycytominer/cyto_utils/util.py 98.93% <ø> (ø)
pycytominer/tests/test_cyto_utils/test_modz.py 100.00% <ø> (ø)
pycytominer/tests/test_cyto_utils/test_output.py 100.00% <ø> (ø)
pycytominer/operations/noise_removal.py 90.00% <100.00%> (ø)
pycytominer/operations/transform.py 100.00% <100.00%> (ø)
pycytominer/tests/test_cyto_utils/test_cells.py 100.00% <100.00%> (ø)
...ycytominer/tests/test_operations/test_transform.py 100.00% <100.00%> (ø)

📣 Codecov can now indicate which changes are the most critical in Pull Requests. Learn more

@johnarevalo johnarevalo marked this pull request as ready for review July 8, 2022 20:32
@johnarevalo
Copy link
Member Author

@gwaybio . This PR is ready. 5 warnings still persist. Two are related to np.corrcoef in constant columns, expected (?).

I found the remainder three harder to fix. This method is generating duplicated column names when pd.merge with suffixes.

def merge_single_cells(

@johnarevalo johnarevalo requested a review from gwaybio July 8, 2022 20:48
Copy link
Member

@gwaybio gwaybio left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One inline comment.

The merge error is interesting. In your digging, did you notice any more details about what was going on? Is it safe to ignore this warning (pasted below)? (It's the first time I'm seeing it, to be honest)

 pycytominer/tests/test_cyto_utils/test_cells.py::test_merge_single_cells
  /opt/hostedtoolcache/Python/3.7.13/x64/lib/python3.7/site-packages/pandas/core/frame.py:9203: FutureWarning: Passing 'suffixes' which cause duplicate columns {'ObjectNumber_cytoplasm'} in the result is deprecated and will raise a MergeError in a future version.
    validate=validate,

Given that 5 errors still persist, and @shntnu opened #192, I think he should have a say in how we proceed.

A couple other notes that might help our decision:

  • We're moving away from the SingleCells().merge_single_cells() functionality, and solving this future error might not be necessary (unless the warning is pointing to some logic error! 😬 )
  • We're also moving away from sqlalchemy, and same comment about these other 3 errors.
  • My vote for how to proceed is to merge after addressing comments, but Shantanu, please chime in if you have any other concerns.

pycytominer/operations/transform.py Show resolved Hide resolved
@johnarevalo
Copy link
Member Author

Thanks for this review Greg.

did you notice any more details about what was going on?

IIUC merge_single_cells method expects certain naming convention generated by pandas to address duplicate column names after merging two DataFrames (e.g. add _a and _b suffixes). Future pandas will not resolve duplicate columns issues, but will raise an error instead.

We're moving away from the SingleCells().merge_single_cells() functionality, and solving this future error might not be necessary (unless the warning is pointing to some logic error! 😬 )

I'd expect this too. Removing such method would solve this warning.

@shntnu
Copy link
Member

shntnu commented Jul 11, 2022

  • We're moving away from the SingleCells().merge_single_cells() functionality, and solving this future error might not be necessary (unless the warning is pointing to some logic error! 😬 )

It's hard for me to tell if this is a logic error. We can't exclude the possibility that it is a logic error, but (1) it might just be a bad test fixture (2) it is very likely some edge case.

It's happening here (in four similar calls to merge_single_cells; see details below):

sc_df = sc_df.merge(
self.load_compartment(compartment=right_compartment),
left_on=self.merge_cols + [left_link_col],
right_on=self.merge_cols + [right_link_col],
suffixes=merge_suffix,
)

  • My vote for how to proceed is to merge after addressing comments, but Shantanu, please chime in if you have any other concerns.

I like this plan


I took screenshots of the debugger for 3/4 instances where this warning occurs. Note the duplication of ObjectNumber_cytoplasm in sc_df.columns

The duplication is happening iff left_link_col == "Cytoplasm_Parent_New" and that's happening when called from

new_sc_merge_df = ap_new.merge_single_cells()
and 3 other locations where ap_new.merge_single_cells is called in that file.

image image image

@gwaybio
Copy link
Member

gwaybio commented Aug 1, 2022

Sorry for my delay in getting back to this (it dropped off the radar, to be honest!). Off the radar until today when @axiomcura noted an error in his pycytominer testing (and pycytominer's requirements for scipy =>=1.3). Thanks for your detective work Erik!

It seems that this PR will fix this error, and keep our scipy requirements up-to-date

It seems to me that we're all set to merge this in after Shantanu's 👍 :

My vote for how to proceed is to merge after addressing comments, but Shantanu, please chime in if you have any other concerns.

I like this plan


p.s. - I opened #217 to capture Shantanu's sleuthing in #194 (comment)

@gwaybio gwaybio self-requested a review August 1, 2022 23:06
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Fix warnings
4 participants