Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fixes making paths of linked files relative (web urls will not be touched anymore) #5879

Merged
merged 3 commits into from
Jan 29, 2020

Conversation

systemoperator
Copy link
Contributor

@systemoperator systemoperator commented Jan 27, 2020

Fixes #5861
Description:

  • Cleaning up entries with "Make paths of linked files relative (if possible)" broke web URLs and resulted in various other issues subsequently. This PR fixes this issue.
    It has been tested for local files and web urls.

  • Change in CHANGELOG.md described (if applicable)

  • Manually tested changed features in running JabRef (always required)

  • Checked documentation: Is the information available and up to date? If not: Issue created at https://github.com/JabRef/user-documentation/issues.

Description:
- Cleanup entries: "Make paths of linked files relative (if possible)" broke web URLs and resulted in various other issues subsequently. This PR fixes this issue.
It has been tested for local files and web urls.
@Siedlerchr
Copy link
Member

Thank you very much for your contribution!
fix looks good!

@systemoperator
Copy link
Contributor Author

Is there still something I need to do or is everything fine, because one check failed?

@Siedlerchr
Copy link
Member

Siedlerchr commented Jan 28, 2020

you can ignore the failing check. It's just the upload of the binaries which sometimes fails.
Our Code Quality Policy requires a review from a second reviewer for external developers before it can be merged.

I think you also should add a Changelog entry since it's at least an error from the beta/previous version

@Siedlerchr Siedlerchr added the status: ready-for-review Pull Requests that are ready to be reviewed by the maintainers label Jan 28, 2020
Copy link
Member

@tobiasdiez tobiasdiez left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for your PR. The change looks good to me and I've only one small remark where an existing method can be reused.

* @param filePath file path to check
* @return <code>true</code>, if the given file path is a web url, <code>false</code> otherwise
*/
private static boolean isWebUrl(String filePath)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The class LinkedFile already contains a method that does this job:

public boolean isOnlineLink() {
return isOnlineLink(link.get());
}

Can you please use it. I would also suggest to add the normalization trim().toLowerCase() there, which I find a nice addition.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Concerning isOnlineLink(): I would recommend removing "toCheck.contains("www.")", because a path to a local file might contain this pattern coincidentally. What do you say?

@calixtus
Copy link
Member

calixtus commented Jan 28, 2020

Would a proper implementation of rfc 3986 be an overkill? Maybe this could be achieved with a proper regex?
https://tools.ietf.org/html/rfc3986

Spotted this
https://stackoverflow.com/questions/30847/regex-to-validate-uris#30858

@systemoperator
Copy link
Contributor Author

systemoperator commented Jan 28, 2020

@calixtus Checking a path as follows could be a simple alternative https://www.geeksforgeeks.org/check-if-url-is-valid-or-not-in-java/ But nevertheless, I also see advantages with the current approach, since it also detects urls as web urls, even if they are potentially broken in some way. This was the reason, why I created it like that. I think the main goal is to identify a web url rather than checking its validity.

@Siedlerchr
Copy link
Member

@systemoperator That should work using new URL() in practice and I think should be sufficient for most cases as an indicator

@systemoperator
Copy link
Contributor Author

systemoperator commented Jan 28, 2020

I e.g. think about the following scenario: Given are broken web urls like:

I think, in this screnario, it is more important to identify whether it is a web url or not, rather than making it more strict, saying checking whether it is a VALID web url, because in the case, that it would be a invalid web url (as e.g. shown above) then in this scenario the url would again get cleaned up and thus it will become even more "destoryed", as it would be interpreted as a local file url again and many other issues would again show up.

On the other hand, if a broken url gets identified as a web url, then it will be processed properly and everything will work out, but when opening the url, it just cannot be loaded.

In my opinion, this method is all about identifying a web url, not validating one. (That's why it is called isOnlineLink() rather than isValidOnlineLink().)

@calixtus
Copy link
Member

calixtus commented Jan 28, 2020

Somewhere I think, you're some kind of right too.

On the other hand, where do we want to decide, what's still a acceptable and what not? What about "file://" (Im aware that normally a file path is printed without this here, but it still could happen). We also had a few months ago someone here using JabRef as database for genetics stuff and not for books (proofing the mutability of JabRef).
Please don't misunderstand me, I get your point and I don't exactly know the right answer...

@systemoperator
Copy link
Contributor Author

Before implementing it, I thought about many of those things. With the current implementation a file link starting with "file://" will not be interpreted as a web url anyway, so this would be fine as well.

@calixtus
Copy link
Member

At least the javadoc should then be adapted... ;-)

@systemoperator
Copy link
Contributor Author

Of course. I would like to hear @tobiasdiez's recommendation about that as well. Then I will implement it as desired. :)

…a file link is an online link; improving isOnlineLink()
@tobiasdiez
Copy link
Member

I think the current implementation is a good compromise. In the ideal world, the URL is valid and a proper validation as @calixtus would be preferable. However, the data quality we get from some fetchers is suboptimal so that it is a good idea to error on the save side (e.g. better identify to many paths as web urls). I suggest to merge now and come back to it later if users complain / provide problematic use cases.

@tobiasdiez tobiasdiez merged commit eb988b6 into JabRef:master Jan 29, 2020
Siedlerchr added a commit that referenced this pull request Jan 30, 2020
* master: (297 commits)
  Replace link to Workspace set-up with new one (#5896)
  Fixes making paths of linked files relative (web urls will not be touched anymore) (#5879)
  Switch to our IntelliJ config (#5881)
  Bump unirest-java from 3.4.00 to 3.4.01 (#5874)
  Bump junit-vintage-engine from 5.5.2 to 5.6.0 (#5875)
  Bump checkstyle from 8.28 to 8.29 (#5876)
  Bump junit-jupiter from 5.5.2 to 5.6.0 (#5877)
  Bump junit-platform-launcher from 1.5.2 to 1.6.0 (#5878)
  Change \ to /
  Bump byte-buddy-parent from 1.10.6 to 1.10.7 (#5873)
  Fix opening pdf with okular in linux (#5253) (#5855)
  Fixed Test
  Refactored constructors, PreferencesService and some minor improvements.
  Remove ampersand escape when writing to bib file (#5869)
  Fix #5862. It was indeed the throttler (at least it is working now for me) (#5868)
  duplicate query parameter removed (#5865)
  New Crowdin translations (#5864)
  Minor refactoring, and changed comment
  Bump antlr4 from 4.7.2 to 4.8-1 (#5852)
  Reintroducing master table index column (#5844)
  ...
koppor pushed a commit that referenced this pull request Jan 30, 2020
…ched anymore) (#5879)

* Fixes #5861
Description:
- Cleanup entries: "Make paths of linked files relative (if possible)" broke web URLs and resulted in various other issues subsequently. This PR fixes this issue.
It has been tested for local files and web urls.

* changelog updated

* refactoring: use existing method isOnlineLink() for checking whether a file link is an online link; improving isOnlineLink()
Siedlerchr added a commit that referenced this pull request Jan 31, 2020
* upstream/master:
  Update development-strategy.md
  Replace link to Workspace set-up with new one (#5896)
  Fixes making paths of linked files relative (web urls will not be touched anymore) (#5879)
  Switch to our IntelliJ config (#5881)
Siedlerchr added a commit that referenced this pull request Jan 31, 2020
…tomizeEntrydlg

* 'customizeEntrydlg' of github.com:JabRef/jabref:
  Replace link to Workspace set-up with new one (#5896)
  Fixes making paths of linked files relative (web urls will not be touched anymore) (#5879)
Siedlerchr added a commit that referenced this pull request Feb 3, 2020
* upstream/master:
  Replace link to Workspace set-up with new one (#5896)
  Fixes making paths of linked files relative (web urls will not be touched anymore) (#5879)
Siedlerchr added a commit that referenced this pull request Feb 3, 2020
* master:
  reference to issue added (#5911)
  Fix properly resolves OrFields of required fields (#5903)
  IDE setup updated and extended (#5901)
  Squashed 'src/main/resources/csl-styles/' changes from e71363e..c531528
  Fixes problems with managing external file types (issue 5846) (#5894)
  Squashed 'src/main/resources/csl-locales/' changes from 41da445acc..4fa753374e
  Squashed 'src/main/resources/csl-styles/' changes from f0c7374..e71363e
  Update development-strategy.md
  Replace link to Workspace set-up with new one (#5896)
  Fixes making paths of linked files relative (web urls will not be touched anymore) (#5879)
Siedlerchr added a commit that referenced this pull request Feb 5, 2020
* Fixes making paths of linked files relative (web urls will not be touched anymore) (#5879)

* Fixes #5861
Description:
- Cleanup entries: "Make paths of linked files relative (if possible)" broke web URLs and resulted in various other issues subsequently. This PR fixes this issue.
It has been tested for local files and web urls.

* changelog updated

* refactoring: use existing method isOnlineLink() for checking whether a file link is an online link; improving isOnlineLink()

* Replace link to Workspace set-up with new one (#5896)

As the current link just pointed to a page with another link, I'd propose to directly put that second link.

* update to 13.0.2

Co-authored-by: systemoperator <3658393+systemoperator@users.noreply.github.com>
Co-authored-by: Thomas F. Duellmann <duelle@users.noreply.github.com>
koppor pushed a commit that referenced this pull request Feb 15, 2022
eb97405 Create frattura-ed-integrita-strutturale-fracture-and-structural-inte… (#5877)
b33ebfc make journal names title case (#5900)
1c7ecf2 Update uclouvain-centre-charles-de-visscher-pour-le-droit-internation… (#5901)
1a2ea12 Create trinity-college-dublin-zoology-botany-environmental-sciences-h… (#5893)
6bd742c Update harvard-anglia-ruskin-university.csl (#5840)
bc8f258 Update chicago-author-date.csl (#5836)
80aded6 Fix missing prefix for URL field (#5894)
b850a0d Update american-society-of-civil-engineers.csl (#5891)
94c1cb3 Update deutsche-gesellschaft-fur-psychologie.csl (#5861)
f66f384 Update universite-cheikh-anta-diop-faculte-de-medecine-de-pharmacie-et-dodontologie.csl (#5863)
82cf786 Update and rename acta-psychiatrica-scandinavica.csl to dependent/acta-psychiatrica-scandinavica.csl (#5879)

git-subtree-dir: buildres/csl/csl-styles
git-subtree-split: eb97405
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
status: ready-for-review Pull Requests that are ready to be reviewed by the maintainers
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Making paths of linked files relative breaks urls
4 participants