util.testing.assert_almost_equal() gives Key Error #11584

Closed
dickster77 opened this issue Nov 12, 2015 · 8 comments
Labels
Error Reporting Incorrect or improved errors from pandas Testing pandas testing functions or related to the test suite
Comments

@dickster77

Test 1 works successfully with a square dataframe

import numpy as np
import pandas as pd

np.random.seed(1)
a = pd.DataFrame(np.random.rand(5, 5))
np.random.seed(1)
b = pd.DataFrame(np.random.rand(5, 5))
b += 0.00001
pd.util.testing.assert_almost_equal(a,b)

Test 2 fails with a dataframe where the number of rows is greater than the number of columns

np.random.seed(1)
a = pd.DataFrame(np.random.rand(6, 5))
np.random.seed(1)
b = pd.DataFrame(np.random.rand(6, 5))
b += 0.00001
pd.util.testing.assert_almost_equal(a,b)

KeyError: 5L
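
One possible reading of the KeyError, offered as an assumption and not confirmed anywhere in this thread: if a helper treats a DataFrame like a generic sequence and looks up position i for every i below len(a), it mixes the row count with column labels. The snippet below only demonstrates that mismatch with plain pandas indexing; it does not reproduce the internals of assert_almost_equal.

```python
import numpy as np
import pandas as pd

# 6 rows, 5 columns, as in Test 2.
df = pd.DataFrame(np.random.rand(6, 5))

n_items = len(df)          # len() counts rows
column_labels = list(df)   # iterating a DataFrame yields its column labels

try:
    df[n_items - 1]        # df[5] is a column lookup, and no column is labelled 5
    lookup_failed = False
except KeyError:
    lookup_failed = True
```

With a square 5 x 5 frame the same lookup stops at df[4], which exists, so Test 1 would not hit the problem under this reading.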

@sinhrks sinhrks added Testing pandas testing functions or related to the test suite Error Reporting Incorrect or improved errors from pandas labels Nov 12, 2015
@sinhrks
Member

sinhrks commented Nov 12, 2015

You should use pd.util.testing.assert_frame_equal(a,b) for DataFrame comparison.

It would be better to add a docstring to assert_almost_equal that describes this.

@dickster77
Author

Sorry sinhrks - I meant to highlight the fact that there are small differences in the dataframes. Hence I specifically need assert_almost_equal()

@sinhrks
Member

sinhrks commented Nov 12, 2015

Did you update the example? Use pd.util.testing.assert_almost_equal(a.values, b.values).

Strangely, assert_frame_equal(a, b, check_less_precise=True) should work for your data by definition, but it raises.

@dickster77
Author

Yes, see the update in the example: b += 0.00001

When I use the .values property to return an ndarray, the assert_almost_equal() test works for me.

However, I want a dataframe test: one that ensures the indexes etc. match, yet allows for small variation in the values.

@sinhrks
Member

sinhrks commented Nov 12, 2015

Currently this should be possible with the check_less_precise option, but it seems the option is not applied in all cases.

More flexible comparison tolerance is being discussed in #10788.
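
For readers on newer releases, here is a minimal sketch of a tolerance-based frame comparison. It assumes pandas 1.1 or later, where assert_frame_equal is importable from pandas.testing and accepts rtol and atol; neither the import path nor those parameters are mentioned in this thread, and the tolerance value is chosen only to fit the example data.

```python
import numpy as np
import pandas as pd
from pandas.testing import assert_frame_equal

np.random.seed(1)
a = pd.DataFrame(np.random.rand(6, 5))
b = a + 0.00001  # same labels, values shifted by 1e-5

# Assumption: pandas >= 1.1. An absolute tolerance of 1e-4 absorbs the shift,
# while index and column labels are still compared.
assert_frame_equal(a, b, atol=1e-4)

# An exact comparison of the same frames is expected to fail.
try:
    assert_frame_equal(a, b, check_exact=True)
    exact_match = True
except AssertionError:
    exact_match = False
```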

@dickster77
Author

Thanks - when should I use .assert_almost_equal() with respect to dataframes?

@sinhrks
Member

sinhrks commented Nov 12, 2015

Let me summarize:

  • assert_almost_equal shouldn't be used for DataFrame
  • assert_frame_equal can compare index and columns. It should always be used for DataFrame comparison.
  • assert_frame_equal(check_less_precise=True) should compare 3 digits after the decimal point (similar to assert_almost_equal). (Your case should be covered by this option, but it does not seem to work properly because of a bug.)

As a workaround until the bug is fixed, you can use assert_almost_equal(a.values, b.values) to compare the DataFrame values (ndarray). To compare the index and the columns, use assert_index_equal on each.
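
The workaround above can be sketched as a small helper. This is an illustration under stated assumptions, not code from pandas: the helper name assert_frame_values_close is hypothetical, numpy.testing.assert_allclose is swapped in for pd.util.testing.assert_almost_equal, assert_index_equal is imported from pandas.testing (the public location in later pandas releases), and the default tolerance of 1e-3 is only a rough stand-in for "3 digits after the decimal point".

```python
import numpy as np
import pandas as pd
from numpy.testing import assert_allclose
from pandas.testing import assert_index_equal


def assert_frame_values_close(left, right, atol=1e-3):
    """Hypothetical helper: labels must match, values may differ by up to atol."""
    # Compare the row labels and the column labels separately.
    assert_index_equal(left.index, right.index)
    assert_index_equal(left.columns, right.columns)
    # Compare the underlying ndarrays with an absolute tolerance only.
    assert_allclose(left.values, right.values, rtol=0, atol=atol)


np.random.seed(1)
a = pd.DataFrame(np.random.rand(6, 5))
b = a + 0.00001

assert_frame_values_close(a, b)  # passes: labels match, shift is below atol
```

Relabelling the index of b, or shifting its values by more than atol, should make the helper raise an AssertionError.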

@jreback
Contributor

jreback commented Nov 12, 2015

See #9457 for the enhancement issue about adding an equivalent of np.allclose to this.

@sinhrks let's repurpose this issue to enable assert_almost_equal to raise a nicer error message (and suggest using one of the assert_*pandasobject*_equal methods).

I suppose we could just call one of these methods directly, no?
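
To make the two options concrete, here is a rough sketch of a dispatching wrapper. It is not the pandas implementation and does not reflect how the issue was eventually resolved; the function name assert_almost_equal_dispatch and the error wording are hypothetical, and the imports assume a pandas release that exposes pandas.testing.

```python
import pandas as pd
from pandas.testing import (
    assert_frame_equal,
    assert_index_equal,
    assert_series_equal,
)


def assert_almost_equal_dispatch(left, right, **kwargs):
    """Hypothetical wrapper: route pandas objects to their dedicated assert."""
    if isinstance(left, pd.DataFrame):
        return assert_frame_equal(left, right, **kwargs)
    if isinstance(left, pd.Series):
        return assert_series_equal(left, right, **kwargs)
    if isinstance(left, pd.Index):
        return assert_index_equal(left, right, **kwargs)
    # Anything else gets an explicit message instead of an obscure KeyError.
    raise TypeError(
        "expected a DataFrame, Series or Index, got %s; "
        "compare plain arrays with a dedicated array helper"
        % type(left).__name__
    )


frame = pd.DataFrame({"x": [1.0, 2.0], "y": [3.0, 4.0]})
assert_almost_equal_dispatch(frame, frame.copy())  # routed to assert_frame_equal
```

The alternative mentioned above, raising an error that names the right assert_*_equal function instead of dispatching, would replace each return with a raise carrying that suggestion.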
