[WIP] Initial work on DBMSProcessor batch entry insertion into ENTRY table #5814
@koppor @tobiasdiez I'm curious what you think of the underlying algorithm I have tried to create. The problem I encountered was that when each entry was inserted into the My solution, still not completely implemented, is to get the generated keys from that batch Is this a valid way of setting the shared ID to local
Your algorithm looks good to me. Another option would be to use the DuplicationChecker, which should be able to identify the entries and thus let you assign the id (at least if the entries are sufficiently different). However, I'm wondering why
We are correct in reading the generated IDs from the database. The reason for generating the unique IDs on the database side is multi-user access. Even though one could generate UUIDs on the client side, this feels wrong (unnecessary key size, ...). I would not use the duplication checker. This also feels hacky. Maybe using the BibTeX keys, and if there isn't a key, the remaining entries could be compared. But still, this feels hacky, since there should be a reliable (and quick) way to read the generated IDs. @NorwayMaple Did you add tests for the ResultSet? I would also rely on https://stackoverflow.com/a/50944078/873282 that the driver returns the generated keys in order. Nevertheless, test cases should exist.
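The approach discussed above (read the generated keys back and pair them with the entries in insertion order) could be sketched roughly as follows. This is only an illustration, not the PR's code: SharedIdAssigner, Entry, readKeys, and assign are hypothetical names, and the sketch assumes the driver returns exactly one key per inserted row, in insertion order, which the thread itself treats as something to verify by tests.

```java
import java.sql.ResultSet;
import java.sql.SQLException;
import java.util.ArrayList;
import java.util.List;

final class SharedIdAssigner {

    /** Hypothetical stand-in for BibEntry; only the shared ID matters here. */
    static final class Entry {
        final String citeKey;
        int sharedId = -1; // -1 means "no shared ID assigned yet"

        Entry(String citeKey) {
            this.citeKey = citeKey;
        }
    }

    /** Collects the first column of every row in the generated-keys result set. */
    static List<Integer> readKeys(ResultSet generatedKeys) throws SQLException {
        List<Integer> keys = new ArrayList<>();
        while (generatedKeys.next()) {
            keys.add(generatedKeys.getInt(1));
        }
        return keys;
    }

    /**
     * Pairs the i-th key with the i-th entry.
     * Assumption: keys arrive in the same order in which the entries were inserted.
     */
    static void assign(List<Entry> entries, List<Integer> keys) {
        if (entries.size() != keys.size()) {
            // A mismatch means the one-key-per-row assumption does not hold.
            throw new IllegalStateException(
                    "Expected " + entries.size() + " keys but got " + keys.size());
        }
        for (int i = 0; i < entries.size(); i++) {
            entries.get(i).sharedId = keys.get(i);
        }
    }
}
```

The size check is there so that a driver returning fewer keys than rows fails loudly instead of silently assigning wrong IDs.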
No tests yet. Would I be better off writing the tests before all this code? On batches, I wasn't using @tobiasdiez I think you are right because multiple users might then have different shared IDs for the same
If you write a test now, it should fail. Then you can experiment with the different options until the test passes. I would suggest that the test generates a bunch of random entries (say 100) and then inserts them. Afterwards you can check that the shared ID is the correct one for all entries.
Okay. So far I've been hard-coding the example entries for the tests. I could do, say, 5, or even 100, but are you thinking I might use some other means to generate the random entries? Also, are we satisfied with making sure the tests pass, given it is technically possible for the JDBC implementations to change at any time and there is no guarantee that they will preserve the ordering?
I guess a for loop from i = 1 to 100 which generates entries à la I would be satisfied if this test passes. If, in the future, the implementation changes, then we get notified by the test, which then fails.
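A test along the lines suggested above might look like the following sketch. It rests on several assumptions: FakeEntryTable is a hypothetical in-memory stand-in for the shared database, Entry is a hypothetical stand-in for BibEntry, and the generated titles are made up. A real test would insert through DBMSProcessor against an actual database and then read the rows back.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

final class BatchInsertCheck {

    /** Hypothetical stand-in for BibEntry. */
    static final class Entry {
        final String title;
        int sharedId = -1; // -1 means "no shared ID assigned yet"

        Entry(String title) {
            this.title = title;
        }
    }

    /** Hypothetical in-memory stand-in for the ENTRY table. */
    static final class FakeEntryTable {
        private final Map<Integer, String> rows = new HashMap<>();
        private int nextId = 1;

        /** Stores every entry under a fresh ID and writes that ID back to the entry. */
        void insertBatch(List<Entry> entries) {
            for (Entry entry : entries) {
                int id = nextId++;
                rows.put(id, entry.title);
                entry.sharedId = id;
            }
        }

        /** Returns the stored title, or null if no row has this ID. */
        String titleOf(int sharedId) {
            return rows.get(sharedId);
        }
    }

    /** Generates entries with distinct titles so that a mixed-up ID is detectable. */
    static List<Entry> generateEntries(int count) {
        List<Entry> entries = new ArrayList<>();
        for (int i = 1; i <= count; i++) {
            entries.add(new Entry("Title " + i));
        }
        return entries;
    }

    /** True only if every entry's shared ID points at the row holding its own title. */
    static boolean allIdsMatch(List<Entry> entries, FakeEntryTable table) {
        for (Entry entry : entries) {
            if (!entry.title.equals(table.titleOf(entry.sharedId))) {
                return false;
            }
        }
        return true;
    }
}
```

Giving each generated entry a distinct field value is what makes the check meaningful: with identical entries, swapped IDs would go unnoticed.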
I am very confused about a bug that appears to be in
@@ -118,7 +118,8 @@ protected void insertIntoEntryTable(List<BibEntry> entries) {
 }
 insertEntryQuery.append(" SELECT * FROM DUAL");
 LOGGER.info(insertEntryQuery.toString());
-try (PreparedStatement preparedEntryStatement = connection.prepareStatement(insertEntryQuery.toString())) {
+try (PreparedStatement preparedEntryStatement = connection.prepareStatement(insertEntryQuery.toString(),
+        new String[]{"SHARED_ID"})) {
I am not sure, but this looks odd to me with the String[]
This just indicates the column whose auto-generated keys we want returned. See https://docs.oracle.com/javase/7/docs/api/java/sql/Connection.html#prepareStatement(java.lang.String,%20int[])
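For reference, java.sql.Connection offers three prepareStatement overloads for requesting generated keys: a flag, column indexes, and column names. The sketch below shows them side by side; GeneratedKeyRequests and its method names are hypothetical, and SHARED_ID is taken from the diff above. Which overload a particular driver honors is driver-specific and not shown here.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.SQLException;
import java.sql.Statement;

final class GeneratedKeyRequests {

    // Taken from the diff: the auto-generated key column of the ENTRY table.
    static final String KEY_COLUMN = "SHARED_ID";

    /** Asks the driver to return whatever it considers the generated keys. */
    static PreparedStatement byFlag(Connection connection, String sql) throws SQLException {
        return connection.prepareStatement(sql, Statement.RETURN_GENERATED_KEYS);
    }

    /** Asks for the key column identified by its index in the target table. */
    static PreparedStatement byColumnIndex(Connection connection, String sql, int columnIndex)
            throws SQLException {
        return connection.prepareStatement(sql, new int[]{columnIndex});
    }

    /** Asks for the key column identified by name, as the diff does. */
    static PreparedStatement byColumnName(Connection connection, String sql) throws SQLException {
        return connection.prepareStatement(sql, new String[]{KEY_COLUMN});
    }
}
```

After executing a statement prepared this way, the keys are read from the ResultSet returned by getGeneratedKeys().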
I looked into Stack Overflow and the Oracle JDBC driver source code, and it looks like the Oracle JDBC driver doesn't support getGeneratedKeys on INSERT ALL statements. So I'll have to find another way of getting the shared IDs.
OMG. This sounds like we should really drop Oracle support. There is no need for Oracle in 2020, is there? Postgres is the way to go, isn't it?
We should IMHO also drop MySQL support, as it does not offer automatic updates on changes on the server.
@NorwayMaple Is it something about a We are planning to release the 5.0 version this weekend. Do you think you can finish this PR by Friday? If not, I will drop Oracle support. It causes too much trouble. I think we should go for https://www.jooq.org/. WDYT?
If you look at the Oracle JDBC source code, the reason Oracle says the SQL is not properly ended is that the JDBC driver secretly inserts a Finally, as a note on Oracle, I should be able to make Oracle work iteratively with one SQL statement per entry (but not one per field). This would mean Oracle wouldn't be optimized like Postgres and MySQL, but it would be an easy and fast fix on my end, and we wouldn't even have to completely drop Oracle. I'm not sure about Friday: I may have full-time work on a temp job starting tomorrow, and an interview on Wednesday. I can try for Saturday afternoon, but I'm in the US East time zone, so that might be Saturday evening or night for you, and I still have to prepare for my interview. However, if you ignore Oracle, I think I can complete the PR much faster, even without www.jooq.org, so there may not be a lot of work left.
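The per-entry fallback described above for Oracle (one INSERT per entry rather than one multi-row statement) might be sketched like this. PerEntryInsert and insertOneByOne are hypothetical names, the TYPE column is an assumption, and the sketch assumes the driver supports getGeneratedKeys for single-row INSERT statements when the key column is requested by name.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.util.ArrayList;
import java.util.List;

final class PerEntryInsert {

    /**
     * Inserts one row per entry type and returns the generated shared IDs
     * in the same order as the input list.
     */
    static List<Integer> insertOneByOne(Connection connection, List<String> entryTypes)
            throws SQLException {
        // Assumed table layout: ENTRY(SHARED_ID auto-generated, TYPE).
        String sql = "INSERT INTO ENTRY (TYPE) VALUES (?)";
        List<Integer> sharedIds = new ArrayList<>();
        try (PreparedStatement statement =
                     connection.prepareStatement(sql, new String[]{"SHARED_ID"})) {
            for (String type : entryTypes) {
                statement.setString(1, type);
                statement.executeUpdate();
                // Each execution inserts a single row, so at most one key is read.
                try (ResultSet keys = statement.getGeneratedKeys()) {
                    if (keys.next()) {
                        sharedIds.add(keys.getInt(1));
                    }
                }
            }
        }
        return sharedIds;
    }
}
```

This costs one round trip per entry, so it trades the batching optimization for not depending on key ordering across a multi-row statement.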
I did the main fixes I wanted to do so they work on Postgres and MySQL. Now I just have to test the Oracle, manually test it in the GUI, fix
@koppor I've done the manual testing and
Thanks for the quick update! The changes look good, and the tests are passing so I'll merge now.
Looks like we are now getting
* upstream/master:
  Try to fix linux pdf opening again (#5945)
  [WIP] Initial work on DBMSProcessor batch entry insertion into ENTRY table (#5814)
  followup fix Fixfetcher (#5948)
  Bump byte-buddy-parent from 1.10.7 to 1.10.8 (#5952)
  Added MenuButtons to IntegrityCheckDialog (#5955)
  Reimplement custom entry types dialog (#5799)
  Bump unirest-java from 3.4.03 to 3.5.00 (#5953)
  MySQL: Allow public key retrieval (#5909)
  Restructure and improve docs for setting up IntelliJ (#5960)
  Change syntax for Oracle multi-row insert SQL statement (#5837)
I am working on making DBMSProcessor insert multiple entries with only a small constant number of queries. It currently does not compile, but the basic ideas are there.
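For databases that accept a multi-row VALUES list (Postgres and MySQL do), inserting many entries with a single statement could be sketched as below. MultiRowInsertSql and build are hypothetical names, and the table and column passed in the usage note are assumptions. Oracle needs a different syntax, such as the INSERT ALL form that appears in the diff.

```java
import java.util.Collections;

final class MultiRowInsertSql {

    /**
     * Builds "INSERT INTO <table> (<column>) VALUES (?), (?), ..." with one
     * placeholder group per row. The identifiers are concatenated unescaped,
     * so they must come from trusted constants, never from user input.
     */
    static String build(String table, String column, int rowCount) {
        if (rowCount < 1) {
            throw new IllegalArgumentException("rowCount must be at least 1");
        }
        String placeholders = String.join(", ", Collections.nCopies(rowCount, "(?)"));
        return "INSERT INTO " + table + " (" + column + ") VALUES " + placeholders;
    }
}
```

For example, build("ENTRY", "TYPE", 3) yields a statement with three placeholder groups, so the number of queries stays constant regardless of how many entries are inserted.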