Adjusting file verification when Pathlib used #15

Closed
Jdogzz opened this issue Jun 7, 2024 · 1 comment · Fixed by #16
Comments

@Jdogzz
Contributor

Jdogzz commented Jun 7, 2024

Issue:

File type verification fails when using Pathlib to construct paths and passing that file to pdf2bib.

Expected behavior:

Providing a path to the file using Pathlib should work.

Actual behavior:

The following error is thrown:

 File "/home/myusername/gitrepos/projectname/.devenv/state/venv/lib/python3.11/site-packages/pdf2bib/main.py", line 89, in pdf2bib
    if not (filename.lower()).endswith('.pdf'):
            ^^^^^^^^^^^^^^
AttributeError: 'PosixPath' object has no attribute 'lower'. Did you mean: 'owner'?

Minimal example to reproduce behavior:

[myusername@mycomputer:~/gitrepos/myproject]$ python
Python 3.11.8 (main, Feb  6 2024, 21:21:21) [GCC 13.2.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> from pathlib import Path
>>> mypdf=Path('watching/s41567-024-02510-3.pdf')
>>> import pdf2bib
>>> pdf2bib.pdf2bib(mypdf)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/myusername/gitrepos/projectname/.devenv/state/venv/lib/python3.11/site-packages/pdf2bib/main.py", line 89, in pdf2bib
    if not (filename.lower()).endswith('.pdf'):
            ^^^^^^^^^^^^^^
AttributeError: 'PosixPath' object has no attribute 'lower'. Did you mean: 'owner'?

Additional info:

Here's a link to the line being considered:

if not (filename.lower()).endswith('.pdf'):

I'll be working on a workaround in my own code (probably just wrapping my path in str when passing it to pdf2bib), but it may be worth investigating whether a surgical change is due, like changing the above line to

if not (str(filename).lower()).endswith('.pdf'):

or if it would be worthwhile to use Pathlib instead of os.path in the pdf2bib project. In that case, there are calls like this that could be used when doing the check (although still requiring a lowercase string comparison like in your current code):

https://docs.python.org/3/library/pathlib.html#pathlib.PurePath.suffix
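For illustration, the suffix-based check could look roughly like the sketch below. The helper name `has_pdf_suffix` is hypothetical and not part of pdf2bib; the only real API used is `pathlib.Path` and its `suffix` attribute. It accepts either a string or a Path object, and still lowercases the suffix before comparing.

```python
from pathlib import Path


def has_pdf_suffix(filename):
    """Hypothetical helper (not part of pdf2bib).

    Returns True if the final suffix of the path is '.pdf',
    compared case-insensitively. Accepts str or Path.
    """
    # Path() accepts both strings and existing Path objects.
    return Path(filename).suffix.lower() == ".pdf"


print(has_pdf_suffix(Path("watching/s41567-024-02510-3.pdf")))
print(has_pdf_suffix("watching/PAPER.PDF"))
print(has_pdf_suffix("watching/notes.txt"))
```

One difference from the string-based `endswith('.pdf')` check: pathlib treats a file whose whole name is `.pdf` as having no suffix, so such a file would be rejected here but accepted by the original comparison.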

@MicheleCotrufo
Owner

Thanks for bringing this up. Yes, the input variable of the function pdf2bib needs to be a string in the current implementation. You are very welcome to submit a PR with your code.

Probably the simplest way would be to add, at the very beginning of the function, code similar to what you mentioned, so that a (potential) Pathlib object is converted into a string.
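A minimal sketch of that idea follows, assuming the conversion happens once, before any string methods are called. The function names `normalize_filename` and `is_pdf` are hypothetical stand-ins and do not reproduce pdf2bib's actual code; `os.fspath` is the standard-library call that returns a `str` unchanged and converts path-like objects such as `Path` to their string form.

```python
import os
from pathlib import Path


def normalize_filename(filename):
    """Hypothetical helper: turn a str or Path into a plain str."""
    # os.fspath returns str input unchanged and calls __fspath__ on
    # path-like objects; it raises TypeError for unsupported types.
    return os.fspath(filename)


def is_pdf(filename):
    """Hypothetical stand-in for the extension check from the traceback."""
    filename = normalize_filename(filename)  # conversion done up front
    return filename.lower().endswith(".pdf")


print(is_pdf(Path("watching/s41567-024-02510-3.pdf")))
print(is_pdf("watching/s41567-024-02510-3.pdf"))
```

Compared with `str(filename)`, `os.fspath` rejects arbitrary objects (for example an integer) instead of silently stringifying them. Note that it passes `bytes` paths through as `bytes`, which this sketch does not handle.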
