-
-
Notifications
You must be signed in to change notification settings - Fork 2.6k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Write unmodified entries to bib file in the same format as they were read #391
Conversation
Nice. Hm, what to do when two entries share a line? E.g. |
In that case, the current implementation fails. But that gives me an idea: Isn't |
There is the issue of comments, which potentially can contain |
The |
@oscargus: that would not be a problem. It's more of a problem if an @koppor: This is work in progress. I'll start commenting when I have something closer to finalization. Do we want to parse the file twice? If so, then jbibtex might be an option. Can you tell me if jbibtex writes back an entry in exactly the same format (up to every line feed character) as it was read? @tobiasdiez Thanks, I am aware of that. This is work in progress. I'd first like to arrive at a solution that does the round-trip for a well-structured database. Then, I'll modify it to match incomplete entries. |
Conflicts: src/main/java/net/sf/jabref/model/entry/BibtexEntry.java
Ok, now this works nicely with my large local database that doesn't include analomies. Now comes the hard part: Properly reading and writing files with strange content. This is also the reason why some tests are failing at the moment. I might need to discuss some of these aspects and will ping you in this thread if I am unsure. |
Dammit, I wanted to get some work done today and now I am only coding JabRef... Anyways, this seems to work (and passes the tests)! I ditched my previous parsing logic and rewrote it in integration with the I am now able to store each entry exactly as it is in the file. Each entry serialization contains all text from its end up to the end of previous entry. The main problem with this is that with this logic, the first entry also stores the file header and then the header is serialized twice when rewriting the file. I was able to get around that with some hacking, but there might still be edge cases I am not yet aware of. To sum up, this needs some more manual testing, but should be pretty close. Feedback is welcome :) |
* marks if the complete serialization, which was read from file, should be used. | ||
* Is set to false, if parts of the entry change | ||
*/ | ||
private boolean useCustomSerialization; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would rename this (and the corresponding methods, of course) as it is always used as "if not changed then..." to "useParsedSerialization" or even to "hasChanged".
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good point, I'll do some renaming. On second thought, I will also replace the Deque
with a Stack
, because this is the way in which it is used.
Tested it manually with some entries and it works smoothly! Good job! 👍 |
Should we really introduce this in v3.0 already? I'm not entirely sure it works on all occaison. For instance, I haven't tested it with strings in the bib file so far. |
@matthiasgeiger But that does not really depend on the version information. The version is checked, but also every file written by JabRef 3.0 is marked as problematic. The file links upgrade really depends on whether the file contains fields named Anyways, if you really really want the version parsing, we can keep it. |
okay... I have to confess that I only performed a "find usages" search without truly checking whether the usages are really needed. So feel free to remove those lines ;-) |
…sity to do a file upgrade
I removed the old meta flag and version parsing. Version headers are now simply removed on rewrite. File upgrade functionality ultimately depends on preferences (and on whether JabRef finds errors in the file to begin with). |
Conflicts: src/main/java/net/sf/jabref/model/entry/BibtexString.java
…ot of appending ones
Also save using Eclipse Formatting rules
So, from my point of view this PR is complete and ready to merge. @koppor and I did some more testing with more advanced things like meta-data and it seems fine. I'm not guaranteeing the absence of bugs, but there should be no more obvious mistakes. Somebody else can take a final look and merge this with master. |
Looks good to me. It worked with my large .bib file without any issues. |
@simonharrer Could you check what happens if you change the sort ordering of the file? Change the ordering. If everything is OK, change the ordering and change an entry. 🐇 I it doesn't work, maybe we drop the ordering possibilities of entries within JabRef now or leave that as open issue. |
What do you think is the problem here? When I change the sort ordering, the order of the entries is changed in the file. But nothing else is. And if I also change an entry, this entry is changed as well. I do not get what you imply. |
Write unmodified entries to bib file in the same format as they were read
As discussed in #116, we want to write entries that are not modified during a session back in the exact same format as they were read. This PR is WIP in this direction and not complete yet. I will continue working on it.
To be able to write back an entry in the same fashion as it is read, we need to store it upon reading. I added a field to
BibtexEntry
for storing this and tried to modifyBibtexParser
to store the file content, but to no avail. The current code of that class is pure hell (uses a globalPushBackReader
to read the file in a very confusing way) and this PR is hopefully a step towards its replacement.Instead of modifying the logic in
BibtexParser
, I extend it with additional methods that perform the new functionality on top. Current status:List<String>
of its lines before handing it over to the old parser.List<String>
and stored in the entry. Blank lines following the entry belong to the entry.That means we read the file twice, which is terrible for large files, but as this is WIP...Next step will be to detect when an entry has been changed and the modification of the writing logic.
The handling of newlines currently is not consistent.