File type verification fails when using Pathlib to construct paths and passing that file to pdf2bib.
Expected behavior:
Providing a path to the file using Pathlib should work.
Actual behavior:
The following error is thrown:
  File "/home/myusername/gitrepos/projectname/.devenv/state/venv/lib/python3.11/site-packages/pdf2bib/main.py", line 89, in pdf2bib
    if not (filename.lower()).endswith('.pdf'):
            ^^^^^^^^^^^^^^
AttributeError: 'PosixPath' object has no attribute 'lower'. Did you mean: 'owner'?
Minimal example to reproduce behavior:
[myusername@mycomputer:~/gitrepos/myproject]$ python
Python 3.11.8 (main, Feb 6 2024, 21:21:21) [GCC 13.2.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> from pathlib import Path
>>> mypdf=Path('watching/s41567-024-02510-3.pdf')
>>> import pdf2bib
>>> pdf2bib.pdf2bib(mypdf)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/myusername/gitrepos/projectname/.devenv/state/venv/lib/python3.11/site-packages/pdf2bib/main.py", line 89, in pdf2bib
    if not (filename.lower()).endswith('.pdf'):
            ^^^^^^^^^^^^^^
AttributeError: 'PosixPath' object has no attribute 'lower'. Did you mean: 'owner'?
I'll be working on a workaround in my own code (probably just wrapping my path in str when passing it to pdf2bib), but it may be worth investigating whether a surgical change is due, like changing the above line to
    if not (str(filename).lower()).endswith('.pdf'):
or whether it would be worthwhile to use pathlib instead of os.path in the pdf2bib project. In that case, attributes such as PurePath.suffix could be used for the check (although it would still require a lowercase string comparison, as in your current code).
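A minimal sketch of the suffix-based check suggested above. The helper name `is_pdf` is hypothetical and not part of pdf2bib; it only illustrates how `PurePath.suffix` could stand in for the `endswith` test while accepting both strings and path objects.

```python
from pathlib import Path


def is_pdf(filename):
    """Hypothetical helper (not part of pdf2bib): check for a .pdf extension."""
    # Path() accepts both str and path-like objects, so either input works.
    # suffix keeps the original case, so it is lowercased before comparing.
    return Path(filename).suffix.lower() == ".pdf"


# Usage with the path from the report and with a plain string.
from_path = is_pdf(Path("watching/s41567-024-02510-3.pdf"))
from_str = is_pdf("watching/S41567-024-02510-3.PDF")
```

Unlike the one-line `str(filename)` change, this would be a step toward using pathlib throughout, so it may be a larger change than the project wants.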
Thanks for bringing this up. Yes, in the current implementation the input argument of the function pdf2bib needs to be a string. You are very welcome to submit a PR with your code.
Probably the simplest way would be to add code similar to what you mentioned at the very beginning of the function, in order to convert a (potential) Pathlib object into a string.
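One way this conversion could look, as a sketch only: the function below is a hypothetical stand-in for the start of pdf2bib's entry point, not its actual code. It uses the standard-library `os.fspath`, which returns a `str` unchanged and converts `pathlib` objects via their `__fspath__` method.

```python
import os
from pathlib import Path


def check_pdf_filename(filename):
    """Hypothetical stand-in for the top of the pdf2bib function."""
    # Normalize once at the start so the rest of the function can keep
    # assuming a string. Note that os.fspath passes bytes through as bytes
    # and raises TypeError for objects that are not path-like.
    filename = os.fspath(filename)
    # The existing extension check then works without modification.
    return filename.lower().endswith(".pdf")


ok_path = check_pdf_filename(Path("watching/s41567-024-02510-3.pdf"))
ok_str = check_pdf_filename("watching/s41567-024-02510-3.pdf")
```

Compared with `str(filename)`, `os.fspath` rejects unrelated objects such as integers instead of silently turning them into strings, though either choice would resolve the reported error.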
Additional info: the line being considered is line 89 of pdf2bib/pdf2bib/main.py at commit f86a053.
Documentation for PurePath.suffix, the pathlib attribute suggested for the check: https://docs.python.org/3/library/pathlib.html#pathlib.PurePath.suffix