Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

✨ Add pathlib.Path interface #43

Merged
merged 1 commit into from
Aug 14, 2023
Merged

Conversation

chrisjsewell
Copy link
Contributor

The Firecrest class is good, but it lacks a simple, programmatic, way to interact with the file system, consistent with how you would interact with a local file system.

This PR adds the FcPath class, which directly mimics the https://docs.python.org/3/library/pathlib.html interface, to provide such an interface. For example:

In [1]: from firecrest.path import FcPath
In [2]: root = FcPath.from_env_variables("cluster", "/home/service-account-firecrest-sample")
In [3]: path = root / "test.txt"
In [4]: path
Out[4]: FcPath('http://localhost:8000/', 'cluster', '/home/service-account-firecrest-sample/test.txt')
In [5]: path.exists()
Out[5]: False
In [6]: path.write_text("Hallo!")
In [7]: path.is_file()
Out[7]: True
In [8]: path.read_text()
Out[8]: 'Hallo!'
In [9]: path.checksum()
Out[9]: '357a57fe73d6c63bb1923e970f79a924db3a3609eb9797547784a94ee38fc3df'
In [10]: list(root.iterdir())
Out[10]: 
[FcPath('http://localhost:8000/', 'cluster', '/home/service-account-firecrest-sample/.bash_logout'),
 FcPath('http://localhost:8000/', 'cluster', '/home/service-account-firecrest-sample/.bash_profile'),
 FcPath('http://localhost:8000/', 'cluster', '/home/service-account-firecrest-sample/.bashrc'),
 FcPath('http://localhost:8000/', 'cluster', '/home/service-account-firecrest-sample/test.txt')]

In order to limit calls to the REST API, I have also added the concept of caching.
For example, when using list_files, the generated paths have the mode cached:

In [11]: for p in root.iterdir():
    ...:     print(p._cache)
    ...: 
_Cache(st_mode=None, lst_mode=33188, link_target='')
_Cache(st_mode=None, lst_mode=33188, link_target='')
_Cache(st_mode=None, lst_mode=33188, link_target='')
_Cache(st_mode=None, lst_mode=33204, link_target='')
_Cache(st_mode=None, lst_mode=33204, link_target='')

You have to specifically enable use of the cache, then methods like is_file, is_dir will use this cached value.
For example, this requires only a single API call:

In [12]: root.cache_enabled = True
In [13]: for p in root.iterdir():
    ...:     print(p.is_file())
    ...: 
True
True
True
True
True

I have also created https://pypi.org/project/virtual_glob/ which can work with this implementation to provide recursive globbing:

In [14]: from virtual_glob import glob
In [15]: root.cache_enabled = True
In [16]: for p in glob(path, "**/job.sh"):
    ...:     print(p.path)

I haven't added any pytest tests in the PR yet, but will do if you agree that this should go in to this package


Note, one pain point at the moment is eth-cscs/firecrest#171, which means you can't currently get the file mode from the /utilities/stat API, and have to use /utilities/ls to workaround it.
Hopefully this will be fixed soon 🤞

@chrisjsewell
Copy link
Contributor Author

Oh another thing I'd note, is that I added a dependency on typing_extensions for python < 3.11, to use https://peps.python.org/pep-0673/

Its not strictly necessary, but it works well with #42

@chrisjsewell
Copy link
Contributor Author

Note, I made some fixes/improvements to this in https://github.com/aiidateam/aiida-firecrest/blob/main/aiida_firecrest/remote_path.py, so should get round to up-streaming them soon

@ekouts
Copy link
Collaborator

ekouts commented Aug 2, 2023

Hi @chrisjsewell , this look good already! Do you have time to push the fixes/improvements upstream and add maybe some small unittests? The most important part is the fixes 😄 If not, I can try to finalize the PR myself and merge it and we can make new PRs for further improvements.

@chrisjsewell
Copy link
Contributor Author

Heya, feel free to have a look at finalizing it yourself. I will try to have a look at some point, but can't promise I will have time right now 😅

@ekouts
Copy link
Collaborator

ekouts commented Aug 7, 2023

No problem, thanks for your efforts anyway! We will try to merge it this week.

Copy link
Collaborator

@ekouts ekouts left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will merge and add fixes/unittests in a separate PR.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants