to_sql function takes forever to insert in oracle database #14315
What database driver are you using?
I used the cx_Oracle driver to connect. Both databases are on the same machine (I used a Lubuntu virtual machine for this comparison), so connection speed shouldn't be an issue, right?
@addresseerajat Can you have a look at the discussion in #8953?
@jorisvandenbossche: I looked at the solution and tried using a similar approach. The relevant code is as follows:
The above line gives me an error:
My database version is Oracle 11g. However, when I execute the following command, I am able to insert into the database. The only problem is that it takes a lot of time to insert.
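A rough sketch of the kind of plain `to_sql` call being discussed, assuming a SQLAlchemy engine built on cx_Oracle; the connection URL, table name, and DataFrame are placeholders rather than the poster's actual code:

```python
# Sketch only: a plain to_sql call against Oracle (placeholder names, not the original code).
import pandas as pd
from sqlalchemy import create_engine

# Placeholder connection URL; adjust credentials, host, and service name as needed.
engine = create_engine("oracle+cx_oracle://user:password@host:1521/?service_name=ORCL")

df = pd.DataFrame({"name": ["a", "b"], "value": [1, 2]})

# Works, but is reported in this thread to be very slow against Oracle because
# string columns are created as CLOB by default.
df.to_sql("my_table", engine, if_exists="replace", index=False)
```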
Were there any other findings here? I've discovered that pushing data into Oracle using cx_Oracle is painfully slow: 10 rows can take 15 seconds to insert. The server we're using is decent (32 GB of RAM, 8 cores).
I ran into the same problem recently. In the end, I found a way to solve it.
As mentioned by @wuhaochen, I have also run into this problem. For me the issue was that Oracle was creating CLOB columns for all of the string columns of the pandas DataFrame. I sped up the code by explicitly setting the column types via the `dtype` argument so that string columns are created as VARCHAR instead. I think this should be the default behavior of `to_sql`.
Could you provide an example of the VARCHAR conversion? Numbers always insert quickly. Thanks.
Sorry, the correct parameter of `to_sql` is `dtype`.
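A minimal sketch of the VARCHAR workaround, assuming a SQLAlchemy engine named `engine`; the table and column names are illustrative, not from the original thread:

```python
# Sketch of the dtype workaround: map string columns to VARCHAR instead of CLOB.
import pandas as pd
from sqlalchemy import create_engine, types

engine = create_engine("oracle+cx_oracle://user:password@host:1521/?service_name=ORCL")  # placeholder URL

df = pd.DataFrame({"name": ["alice", "bob"], "city": ["NYC", "LA"], "value": [1, 2]})

# Build a dtype mapping: every object (string) column gets a VARCHAR sized to its longest value.
dtype = {
    col: types.VARCHAR(int(df[col].astype(str).str.len().max()))
    for col in df.select_dtypes(include=["object"]).columns
}

# Passing dtype= prevents Oracle from creating CLOB columns, which makes the insert fast.
df.to_sql("my_table", engine, if_exists="replace", index=False, dtype=dtype)
```

On the numbers point: numeric columns already map to native Oracle numeric types, which is why they insert quickly; it is only the object/string columns that fall back to CLOB without an explicit mapping.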
to_sql() is still practically broken when working with Oracle without using the workaround recommended above.
It's not clear that there's a pandas-specific fix for this issue, so going to close.
I am using pandas to do some analysis on an Excel file, and once that analysis is complete, I want to insert the resulting DataFrame into a database. The size of this DataFrame is around 300,000 rows and 27 columns.
I am using the `pd.to_sql` method to insert the DataFrame into the database. When I use a MySQL database, insertion takes around 60-90 seconds. However, when I try to insert the same DataFrame using the same function into an Oracle database, the process takes around 2-3 hours to complete. Relevant code can be found below:
I tried using different `chunksize` values (from 50 to 3000), but the difference in time was only on the order of 10 minutes. Any solution to the above problem?
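The original snippet isn't shown above; a hypothetical sketch of the workflow described (Excel input, roughly 300,000 rows and 27 columns, chunked insert into Oracle) could look like this. The file name, table name, and connection URL are all placeholders:

```python
# Hypothetical reconstruction of the workflow described in the issue; not the original code.
import pandas as pd
from sqlalchemy import create_engine

# Placeholder Oracle connection via cx_Oracle.
oracle_engine = create_engine("oracle+cx_oracle://user:password@host:1521/?service_name=ORCL")

# Analysis result: roughly 300,000 rows and 27 columns.
df = pd.read_excel("analysis_input.xlsx")

# Chunked insert; per the discussion above, varying chunksize between 50 and 3000
# changes the total time by only minutes, since the likely bottleneck is the CLOB
# column type rather than the chunking.
df.to_sql("results", oracle_engine, if_exists="append", index=False, chunksize=1000)
```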