read_html infers wrong datatype #7032
Comments
I have exactly the same problem and I have to rely on |
The rationale is that it doesn't do anything except convert the result of the parse into strings, which will happen anyway if you have, e.g., numerical columns with strange values in them. |
Same issue here. In the example below, Original html table here.
|
@cpcloud Has
|
No, it's still there. This is actually a bug that slipped through the cracks: the NaTs are wrong and should be fixed. I'll see what I can do over the weekend. |
Only when this behavior is fixed can we consider deprecating infer_types. infer_types was originally there because the original implementation didn't use the CSV parser machinery. Now it does, but date parsing is somehow being forced where it shouldn't be. |
Cool man. |
No problem dude, glad you like! |
@clarkfitzg check out the pr if you want .... fixes this weird date issue. turns out it was because i was "forcing" |
@cpcloud nice! |
As can be seen in the code below, columns 3, 8, 9, and 10 were misinterpreted as datetime objects. Columns 1, 6, and 7 should be integers. How do I force the columns to be interpreted as the proper type? Only columns 2, 4, 5, and 11 appear to have been read properly. I suppose I could pass 'infer_types=False' and do the conversion manually afterwards, but since infer_types is going away, this won't work in the long run.
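For reference, the manual conversion mentioned above might look roughly like the sketch below. It assumes the table has already been parsed with every cell kept as a string. The sample frame, its column names, and the `convert_columns` helper are all hypothetical stand-ins, not part of pandas or of the original report.

```python
import pandas as pd

# Hypothetical stand-in for a parsed HTML table in which every cell
# was kept as a string (no type inference applied).
raw = pd.DataFrame(
    {
        "rank": ["1", "2", "3"],
        "score": ["10-2", "7-5", "3-9"],  # date-like text that must stay text
        "joined": ["2014-01-05", "2014-02-11", "2014-03-20"],
    },
    dtype=str,
)


def convert_columns(frame, int_cols=(), date_cols=()):
    """Hypothetical helper: convert only the named columns, leave the rest alone."""
    out = frame.copy()
    for col in int_cols:
        # errors="raise" makes an unexpected non-numeric cell fail loudly.
        out[col] = pd.to_numeric(out[col], errors="raise").astype("int64")
    for col in date_cols:
        # An explicit format avoids guessing at ambiguous date strings.
        out[col] = pd.to_datetime(out[col], format="%Y-%m-%d")
    return out


typed = convert_columns(raw, int_cols=["rank"], date_cols=["joined"])
print(typed.dtypes)
```

Naming the columns explicitly means that text which merely looks like a date, such as the "score" column here, is never touched.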