Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Escaping of escape symbols in the MetaData #2445

Merged
merged 3 commits into from
Jan 5, 2017

Conversation

lenhard
Copy link
Member

@lenhard lenhard commented Jan 2, 2017

Fixes #2426 and is ready for review.

Old discussion

Attempts to fix #2426.

Is not yet functional, since the issue goes somewhat deeper and affects LaTeX-encoding things in all MetaData, also groups. The problem is line 64 in MetaDataSerializer:

 stringBuilder.append(StringUtil.quote(dataItem, MetaData.SEPARATOR_STRING, MetaData.ESCAPE_CHARACTER)).append(MetaData.SEPARATOR_STRING);

For every MetaData item, all special characters are quoted, including the quoting symbol (\). Thus, if there is a \ in the MetaData, it is quoted during serialization, resulting in \\, which is read as such and requoted on the next serialization, resulting in \\\\ and so on.

Unfortunately, unquoting the text during parsing and quoting it again during writing does not help, since a single \ will simply be lost. E.g.: unquoting \"U would result in "U, which upon quoting anew would stay "U.

The same problem should occur for groups, if I am not mistaken.

I now have the following options to resolve the problem:

  1. Add special handling for content selectors to avoid quoting for them
  2. Change StringUtil.quote to avoid quoting the quoting char (\)
  3. Add another method in StringUtil that avoids quoting the quoting char and use it for MetaData in general.

I am afraid that changing StringUtil.quote has wider implications that I am not aware of., especially relating to groups Hence, there is a need for discussion.

  • Change in CHANGELOG.md described
  • Tests created for changes
  • Manually tested changed features in running JabRef

@lenhard lenhard requested a review from tobiasdiez January 2, 2017 16:17
@tobiasdiez
Copy link
Member

I don't understand why unquoting on parse and quoting on writing does not work.
The input \"U will be written as \\\"U (as in the java string enquoting) and upon parsing be restored as \"U. Of course, a single backslash in the bib file gets lost but JabRef never should write just one backslash but always two.

As for the StringUtil.quote method, I think, you can edit it directly. If I remember correctly, then its purpose is to ensure that strings are properly encoded for writing them in the bib-file (this applies to all kind of metadata - groups are nothing special). Probably, it would be good to move this method actually to the metadata serializer / parser.

@lenhard
Copy link
Member Author

lenhard commented Jan 5, 2017

@tobiasdiez You are correct, of course. If we avoid that JabRef writes a single backslash into the file, then everything works fine. My bad, for not getting this immediately. The current PR should be ready for review.

@lenhard lenhard added the status: ready-for-review Pull Requests that are ready to be reviewed by the maintainers label Jan 5, 2017
@tobiasdiez
Copy link
Member

Build is fine expect a fetcher problem -> Merge

@tobiasdiez tobiasdiez merged commit fa51393 into master Jan 5, 2017
@tobiasdiez tobiasdiez deleted the contentselector-escaping branch January 5, 2017 21:26
Siedlerchr added a commit that referenced this pull request Jan 13, 2017
* upstream/master: (67 commits)
  Medline fix test (#2463)
  Fix conversion of tilde n (#2459)
  ctrl+f selects current query while the searchbar is focusd (#2457)
  incorrect log name of JabRefExecutorService (#2452)
  unregister DateChangeListener in manual update method (#2450)
  Escaping of escape symbols in the MetaData (#2445)
  Fix typo
  Update gradle from 3.2.1 to 3.3
  Use instanceof
  Remove unused import
  Avoid ClassCastException in AutoCompleteListener
  Fix typo in CHANGELOG.md
  Add support for pages in the format 2:1-2:33 (#2440)
  L10N-ru update (#2441)
  Change https to http
  Update DBLP API endpoint
  Revert "Chistmas edition colors"
  Show development information
  Release v3.8.1
  Result of generate-authors.sh
  ...

# Conflicts:
#	CHANGELOG.md
#	src/main/java/net/sf/jabref/gui/date/DatePickerButton.java
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
status: ready-for-review Pull Requests that are ready to be reviewed by the maintainers
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Backslashes are repeatedly escaped on save in Content-Selectors
2 participants