Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

utils restructured in subfolders and removed unused imports #275

Merged
merged 28 commits into from
Sep 27, 2024

Conversation

allaffa
Copy link
Collaborator

@allaffa allaffa commented Aug 19, 2024

No description provided.

@allaffa allaffa added the enhancement New feature or request label Aug 19, 2024
@allaffa allaffa self-assigned this Aug 19, 2024
@allaffa
Copy link
Collaborator Author

allaffa commented Aug 19, 2024

@pzhanggit @jychoi-hpc
I think I restructured all the imports.
Since this is a big PR, please let us take time to review it.
I do not want to merge it before September 30.
However, I would like to make gradual changes based on your feedback during this time.

@allaffa
Copy link
Collaborator Author

allaffa commented Aug 22, 2024

@RylieWeaver @AdithyaRaman @erdemcaliskan

Please let me know if you need help on learning how to review a PR on GitHub.

Copy link
Collaborator

@erdemcaliskan erdemcaliskan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Finally, we have a utils folder in the main and one more in hydragnn/utils which is confusing.

@@ -0,0 +1,19 @@
from .abstractbasedataset import AbstractBaseDataset
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't see any issues here but everything is repeated since they have the filename import class name which are mostly the same. A small suggestion - this would require a lot of refactoring but using importlib may yield cleaner code here:

import importlib

# List of datasets and their corresponding classes
datasets = [
    ("abstractbasedataset", "AbstractBaseDataset"),
    ("abstractrawdataset", "AbstractRawDataset"),
    ("adiosdataset", ["AdiosDataset", "AdiosWriter"]),
    ("cfgdataset", "CFGDataset"),
    ("distdataset", "DistDataset"),
    ("lsmsdataset", "LSMSDataset"),
    ("pickledataset", ["SimplePickleDataset", "SimplePickleWriter"]),
    ("serializeddataset", ["SerializedDataset", "SerializedWriter"]),
    ("xyzdataset", "XYZDataset"),
]

# Import classes dynamically
for module_name, class_names in datasets:
    module = importlib.import_module(f".{module_name}", package=__name__)
    if isinstance(class_names, list):
        for class_name in class_names:
            globals()[class_name] = getattr(module, class_name)
    else:
        globals()[class_names] = getattr(module, class_names)

# Import functions from compositional_data_splitting
from .compositional_data_splitting import (
    get_keys,
    get_elements_list,
    get_max_graph_size,
    create_dictionary_from_elements_list,
    create_dataset_categories,
    duplicate_unique_data_samples,
    generate_partition,
    compositional_stratified_splitting,
)

So now we can import them in diffent files as shown below:

from datasets import AdiosDataset

adios_dataset = AdiosDataset()

Copy link
Collaborator Author

@allaffa allaffa Sep 3, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@erdemcaliskan
This seems a nice restructuring, but as you said, it also seem to require a substantial restructuring which may deserve a standalone PR to be implemented.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

line 30 should be from hydragnn.utils.datasets.adiosdataset import AdiosWriter, AdiosDataset

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I installed adios2 from conda forge and tried running this on my local but couldn't run it due to below errors:

python train_ising.py
[2024-08-22 12:08:46,496] [WARNING] [real_accelerator.py:162:get_accelerator] Setting accelerator to CPU. If you have GPU or other accelerator, we were unable to detect it.
[2024-08-22 12:08:46,496] [INFO] [real_accelerator.py:203:get_accelerator] Setting ds_accelerator to cpu (auto detect)
Distributed data parallel: gloo master at 127.0.0.1:8889
0: Command: train_ising.py

INFO (rank 0): Adios load
0: Adios reading: /home/ecalisk1/libs/git/HydraGNN/examples/ising_model/./datasets/ising_model_3_1000.bp
Traceback (most recent call last):
File "/home/ecalisk1/libs/git/HydraGNN/examples/ising_model/train_ising.py", line 319, in
trainset = AdiosDataset(fname, "trainset", comm, **opt)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/ecalisk1/libs/git/HydraGNN/hydragnn/utils/datasets/adiosdataset.py", line 382, in init
with ad2.open(self.filename, "r", self.comm) as f:
^^^^^^^^
AttributeError: module 'adios2' has no attribute 'open'
Exception ignored in: <function AdiosDataset.del at 0x7fa9b20ec220>
Traceback (most recent call last):
File "/home/ecalisk1/libs/git/HydraGNN/hydragnn/utils/datasets/adiosdataset.py", line 733, in del
self.f.close()
^^^^^^
AttributeError: 'AdiosDataset' object has no attribute 'f'

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@jychoi-hpc
I remember encountering this error before.
May you confirm that this is due to the version of ADIOS2 installed?

Copy link
Collaborator

@AdithyaRaman AdithyaRaman Oct 3, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@allaffa @jychoi-hpc
The adios2 imports inside other examples too need to be updated to :
hydragnn.utils.datasets.adiosdataset import AdiosWriter, AdiosDataset
as pointed out by @erdemcaliskan.

multidataset, multidataset_hpo, alexandria, lsms are some of the examples which have outdated adios2 imports. I believe this line update has been made only for the ising example

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@allaffa @jychoi-hpc The adios2 imports inside other examples too need to be updated to : hydragnn.utils.datasets.adiosdataset import AdiosWriter, AdiosDataset as pointed out by @erdemcaliskan.

multidataset, multidataset_hpo, alexandria, lsms are some of the examples which have outdated adios2 imports. I believe this line update has been made only for the ising example

@AdithyaRaman PR #290 addresses the fix of importing correctly the adios data classes also for the other examples.

examples/qm7x/train.py Outdated Show resolved Hide resolved
examples/ogb/train_gap.py Outdated Show resolved Hide resolved
@allaffa
Copy link
Collaborator Author

allaffa commented Sep 3, 2024

Finally, we have a utils folder in the main and one more in hydragnn/utils which is confusing.

@erdemcaliskan
I moved the lsms utils into the hydragnn.utils folder

@allaffa
Copy link
Collaborator Author

allaffa commented Sep 17, 2024

@pzhanggit
I just solved very minor conflicts due to PRs recently merged for the GB effort.
Please remember to send me your feedback by September 30.

@allaffa
Copy link
Collaborator Author

allaffa commented Sep 20, 2024

@RylieWeaver @pzhanggit @jychoi-hpc
I rebased this PR after PR#278 has been merged into the main branch.

@RylieWeaver
Copy link
Collaborator

@RylieWeaver @pzhanggit @jychoi-hpc I rebased this PR after PR#278 has been merged into the main branch.

Good, and I see you restructured the imports in the LennardJones example as well. It's hard to prove a negative, but this PR looks good to me in the parts of the codebase that I am familiar with.

@allaffa
Copy link
Collaborator Author

allaffa commented Sep 22, 2024

@pzhanggit @jychoi-hpc

As planned more than a month ago, the PR will be merged at the end of the upcoming business week unless there are reviews that require additional changes.

@allaffa allaffa merged commit 8e07b6e into ORNL:main Sep 27, 2024
2 checks passed
@allaffa allaffa deleted the organize_utils branch September 27, 2024 18:39
RylieWeaver pushed a commit to RylieWeaver/HydraGNN that referenced this pull request Oct 17, 2024
* utils renamed and black formatting applied

* bug fixed solved for tests

* black formatting fixed

* examples corrected

* test_model_loadpred.py fixed

* black formatting fixed

* test_loss_and_activation_functions.py fixed

* black formatting fixed

* reverting inadvertent automated refactoring of dataset forlder into datasets

* reverting inadvertent automated refactoring of dataset forlder into datasets

* reverting inadvertent automated refactoring of dataset forlder into datasets

* reverting inadvertent automated refactoring of dataset forlder into datasets

* reverting inadvertent automated refactoring of hydragnn into hhydragnn package

* reverting inadvertent automated refactoring of dataset forlder into datasets

* reverting inadvertent automated refactoring of dataset forlder into datasets

* reverting inadvertent automated refactoring of dataset forlder into datasets

* git formatting fixed

* Adagrad converted to Adamax

* Additional changes to fix bugs and suggestions from erdem

* imports fixed for LennardJones example

* formatting fixed

* imports in LJ_data.py fixed

* import of graph utils fixed in LJ_data.py

* import of setup.ddp() fixed in LennardJones

* setup_log call fixed

* get_summary_writer call fixed

* additional calls fixed

* black formatting fixedf
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants