fix the excel reader: hours & header #4404
line = self._next_line()
else:
# if there is just 1 header row to be parsed, this works.
try:
looks like a syntax error here... the try is not aligned with the except clause below
OK, I will fix that.
Sorry, I checked. The code works.
Please check with: https://github.com/timmie/example_code_data
Can you hook up Travis CI? https://github.com/pydata/pandas/wiki/Testing#hook-up-travis-ci Thanks.
Uff, never done this. I will see what I can do in the next few days.
@timmie it's actually really easy to set up. All you need to do is log in to Travis with your GitHub account and follow a few instructions. The upside is that Travis then tests your code against multiple versions of Python and multiple different environments, which can really help iron out bugs.
It's important for these kinds of PRs especially, since we don't want a bunch of failures simply because the underlying lib doesn't exist for, say, Python 3.2.
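The point about an optional library missing on some Python versions can be sketched as follows. This is only an illustration using the standard `unittest` module, not the helper the pandas test suite actually uses; `have_module` and `ExcelReaderTest` are hypothetical names.

```python
import importlib
import unittest


def have_module(name):
    """Return True if the module `name` can be imported in this environment."""
    try:
        importlib.import_module(name)
    except ImportError:
        return False
    return True


class ExcelReaderTest(unittest.TestCase):
    # Hypothetical test case: it is reported as skipped, not failed,
    # on environments where the optional xlrd dependency is missing.
    @unittest.skipUnless(have_module("xlrd"), "xlrd is not installed")
    def test_needs_xlrd(self):
        import xlrd  # only reached when the import is known to work
        self.assertTrue(hasattr(xlrd, "open_workbook"))
```

With a guard like this, a CI runner without xlrd shows a skipped test instead of an import failure.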
OK, I added the service hook in the GitHub settings. What is the next step? Do you also want a test with the data from https://github.com/timmie/example_code_data? Thanks.
I find Travis really useful, TBH. You can kind of throw a bunch of things
we need that locally! i'm working on something to make
git commit --amend -C HEAD
so, now I see a lot of errors for code I never touched:
many of those are related to your change
actually ALL of them are
Hmm. It worked with my files... How can I test this on my machine?
line = self._next_line()
else:
# if there is just 1 header row to be parsed, this works.
line = self.data[header[0]]

this_columns = []
for i, c in enumerate(line):
also why did you change the permissions of this file?
Perhaps from cloning onto a FAT32 drive? I do not know. Usually I work with bzr, which can ignore this.
I assume that this is not what breaks the tests.
No, it's probably not.
Git tracks that information (in a somewhat reduced capacity). I think there are only 2 or 3 different sets of permissions that are possible with git.
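To make the "reduced capacity" concrete: for a regular file git records only whether it is executable (mode 100644 or 100755), and symlinks get 120000. The sketch below maps a path on disk to the mode git would be expected to record; `git_mode` is a hypothetical helper, not part of git or pandas. On filesystems without permission bits such as FAT32, every file can appear executable, which is one plausible source of an accidental mode change; git's `core.fileMode` setting controls whether such differences are noticed.

```python
import os
import stat


def git_mode(path):
    """Return the mode string git would record for a regular file or symlink.

    Sketch only: git keeps just the owner-executable bit for regular files.
    """
    st = os.lstat(path)  # lstat so that symlinks are not followed
    if stat.S_ISLNK(st.st_mode):
        return "120000"
    if st.st_mode & stat.S_IXUSR:
        return "100755"
    return "100644"
```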
on my local I get:
you're probably not in the pandas directory
I am:
try
find . -name 'test_pytables.py'
not sure what the issue is
Sounds good
Ironically I didn't start to understand git until I started contributing more.. Just keep at it :)
what does this mean
The w option only accepts directories
the
I tried on another machine.
Apparently, one would need to compile before that works ;-( Sorry, this is really getting beyond my capabilities and time allowances.
just pull the upstream and then my changes on top.
@timmie why did you revert the doc change?
@jtratner
@timmie Can you add a test case (and, potentially, test data) so that you can reproduce and test the error you noted? That way, we'll be sure not to break this again in future changes.
OK, a test is there and passes.
May I ask for your help? Travis says all but one test job pass:
with python 2.6
https://s3.amazonaws.com/archive.travis-ci.org/jobs/10194623/log.txt Why is xlrd not available there?
Never mind, let me troubleshoot my code a bit.
@timmie - I was at work and couldn't reply. It's pretty simple: not all the
I'll take a look.
@jtratner Thanks.
Timmie, please look at the test case in io/tests/test_parsers around line 1974 (the exact line is in the Travis test failure output). That's what's failing. The test is actually failing on every version of Python (it doesn't cause the one 2.7 runner to fail because that runner only runs tests marked as 'slow'). I'm not totally clear on what exactly you are testing there. For example, testing only the '!=' case, without also testing the case that does produce that output, is dangerous (e.g. the test would still pass if a different error message were thrown). Alternatively, it may be better to test what the entire output should be. Going to look at it a bit more in depth now. Two other things:
If that command complains that upstream doesn't exist, run
Next, you need to rebuild pandas so that everything is up to date. To do that, run:
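The concern about the '!=' assertion can be illustrated with a small sketch. `report` is a hypothetical stand-in for code that prints a verbose message; the message text is an assumption for illustration and is not taken from this PR.

```python
import io
from contextlib import redirect_stdout


def report(n_filled):
    # Hypothetical stand-in for a parser that prints a verbose message.
    print("Filled %d NA values in column a" % n_filled)


def captured(func, *args):
    """Run func(*args) and return everything it wrote to stdout."""
    buf = io.StringIO()
    with redirect_stdout(buf):
        func(*args)
    return buf.getvalue()


output = captured(report, 3)

# Weak check: true for ANY text other than the one forbidden string,
# so it would also be true if an unrelated error message were printed.
weak_ok = output != "Filled 1 NA values in column a\n"

# Stronger check: states exactly what the whole output should be.
strong_ok = output == "Filled 3 NA values in column a\n"
```

The weak check cannot distinguish correct output from an unexpected message, which is why comparing against the full expected output is the safer test.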
I posted the previous comment because these tests ought to be failing on your system too. That said, before I can remove this, please go back and remove all the print statements and extra if clauses you added in for testing. It makes it harder to review your PR with all that extra noise in it. Also, for the excel tests you added, instead of defining separate functions, you should be able to know in advance exactly what should result (e.g. a time delta of 10 minutes or a datetime) and test against that instead. Finally, to get the pandas core team to accept your pull request, you need to make it clear what you are changing. From reviewing the associated PRs it looks like you want to:
Can you lay out how (3) will work? Are you passing some parameter to the function? Finally, does this PR also add some kind of change to error reporting in read_excel? If it does, you also need to document those changes (and how they are enabled or disabled). Thanks!
Sorry, writing on my phone! I meant 'before I can review this', not 'remove this'.
I know. But this test was already there, and I did not know how to modify it. For me, it seems that "verbose=True" does not produce results.
Yes, as I did not understand the full existing test, I tried to disable it with "!=" to see if the others at least pass.
Yes, sure. Thanks for the reminder. Actually, I wanted to provide you with some feedback about the difficulties I encountered as a non-core dev working with this code. It took all of us so much time with this PR. Let's follow up on this after the PR is done.
As above, I am aware that there are leftovers from "experiments". They should be disabled. If the tests go through, I will remove them.
Yes.
Yes/No. This was part of the initial problem, so I wanted to catch that, too.
Yes.
I defined a function to get the header from the data read in: it is called if the current row is equal to the header row.
No, nothing at all. If you are referring to the "verbose=True", this was an existing test I wasn't able to debug.
you mean like hardcode the result?
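Hardcoding in this sense means writing the expected value literally in the test, as suggested in the review (a time delta of 10 minutes or a datetime), rather than computing it with a second helper. A minimal sketch, where `parse_hhmm` is a hypothetical stand-in for whatever reader is under test:

```python
from datetime import timedelta


def parse_hhmm(text):
    """Hypothetical stand-in for the code under test: 'H:MM' -> timedelta."""
    hours, minutes = text.split(":")
    return timedelta(hours=int(hours), minutes=int(minutes))


def test_ten_minutes():
    # The expected result is written out literally, so the test fails
    # loudly if the parsing logic ever changes its behaviour.
    assert parse_hhmm("0:10") == timedelta(minutes=10)
```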
Wow, it passes now! So great. I feel good... Thanks again!
I'm glad to hear that it's working now! (and i'm not a core dev either, I
Nice. So this is also your after-work thingy? Comments are at:
Oh, you just merged and closed it! I am in the middle of trying to clean up the merging & conflict chaos. What shall we do?
@jreback did you merge this? I don't think this should have been merged.
I'm actually unclear if this was merged at all - don't see the commit on pydata/master. Maybe it's a github issue?
@timmie I didn't merge this, I think this is a hash collision, nothing added nor deleted
This is what I merged, bet github confused it 9ea0d44
We should report this to Github as a bug. @timmie sorry that you have to deal with yet another stumbling point here!
I'll give them a report
this happened once before when we were messing with remote branches for
In excel.py there is a fix enabling reading xlsx files with both datemodes (see #4332).
In parsers.py there is the fix for reading the header even if there are additional rows (to be skipped) between the header and the data (see #4340).
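For readers unfamiliar with the two datemodes: Excel workbooks store dates as serial day numbers counted from either a 1900-based or a 1904-based epoch, and xlrd exposes this as datemode 0 or 1. The sketch below shows the conversion in plain Python under those assumptions; `excel_serial_to_datetime` is a hypothetical helper, not the code from this PR. The fractional part of the serial carries the time of day, which is where the hours come from.

```python
from datetime import datetime, timedelta

# Epochs for the two Excel date systems, keyed by xlrd's datemode convention.
_EPOCHS = {
    0: datetime(1899, 12, 30),  # 1900 system (only valid for serials >= 61)
    1: datetime(1904, 1, 1),    # 1904 system
}


def excel_serial_to_datetime(serial, datemode=0):
    """Convert an Excel serial date (days, fraction = time of day)."""
    if datemode not in _EPOCHS:
        raise ValueError("datemode must be 0 (1900) or 1 (1904)")
    if datemode == 0 and serial < 61:
        # Excel's 1900 system contains a fictitious 1900-02-29, so early
        # serials do not map cleanly onto real calendar dates.
        raise ValueError("serials below 61 are ambiguous in the 1900 system")
    return _EPOCHS[datemode] + timedelta(days=serial)
```

The same calendar date differs by 1462 days between the two systems, so reading a 1904-based file with the 1900 epoch shifts every date by about four years.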
I hope this PR is adequate. If not, please let me know.
I can supply the sample file for testing proving that it works. See also:
https://github.com/timmie/example_code_data
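The header situation described in the PR text (extra rows between the header row and the data) can be sketched in plain Python. This is an illustration of the idea only, not the parsers.py change; `split_header_and_data` and its parameters are hypothetical.

```python
def split_header_and_data(rows, header=0, skip_after_header=0):
    """Pick the header row, skip the rows directly below it, return the rest.

    rows: list of row lists as read from a sheet or file.
    header: index of the row holding the column names.
    skip_after_header: number of rows (e.g. a units row) to drop before data.
    """
    columns = list(rows[header])
    data = rows[header + 1 + skip_after_header:]
    return columns, data


rows = [
    ["time", "temp"],   # header row
    ["h:mm", "degC"],   # units row that should not end up in the data
    ["0:10", 21.5],
    ["0:20", 21.7],
]
columns, data = split_header_and_data(rows, header=0, skip_after_header=1)
```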