Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Optimize job __hash__ and __eq__ checks. #442

Merged
merged 8 commits into from
Dec 29, 2020
9 changes: 9 additions & 0 deletions changelog.txt
Original file line number Diff line number Diff line change
Expand Up @@ -7,6 +7,15 @@ The **signac** package follows `semantic versioning <https://semver.org/>`_.
Version 1
=========

next
----

Changed
+++++++

- Optimized job hash and equality checks (#442).


[1.5.1] -- 2020-12-19
---------------------

Expand Down
6 changes: 4 additions & 2 deletions signac/contrib/job.py
Original file line number Diff line number Diff line change
Expand Up @@ -134,7 +134,7 @@ def id(self):
return self._id

def __hash__(self):
return hash(os.path.realpath(self._wd))
return hash(self.id)

def __str__(self):
"""Return the job's id."""
Expand All @@ -146,7 +146,9 @@ def __repr__(self):
)

def __eq__(self, other):
return hash(self) == hash(other)
return (hash(self) == hash(other)) and (
os.path.realpath(self._wd) == os.path.realpath(other._wd)
)

def workspace(self):
"""Return the job's unique workspace directory.
Expand Down
4 changes: 0 additions & 4 deletions tests/test_project.py
Original file line number Diff line number Diff line change
Expand Up @@ -655,7 +655,6 @@ def test_job_move(self):
job = project_a.open_job(dict(a=0))
job_b = project_b.open_job(dict(a=0))
assert job != job_b
assert hash(job) != hash(job_b)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So this test fails under the new changes since it only failed since the working directories for job and job_b were different

Copy link
Member Author

@bdice bdice Dec 22, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, this fails because of the relaxed definition of the hash function. job == job_b implies that hash(job) == hash(job_b), but the logical inverse (job != job_bhash(job) != hash(job_b)) does not hold. In the Python data model, it is fine for objects to be non-equal and have the same hash.

assert job not in project_a
assert job not in project_b
job.init()
Expand All @@ -665,16 +664,13 @@ def test_job_move(self):
assert job in project_b
assert job not in project_a
assert job == job_b
assert hash(job) == hash(job_b)
bdice marked this conversation as resolved.
Show resolved Hide resolved
with job:
job.document["a"] = 0
with open("hello.txt", "w") as file:
file.write("world!")
job_ = project_b.open_job(job.statepoint())
assert job == job_
assert hash(job) == hash(job_)
assert job_ == job_b
assert hash(job_) == hash(job_b)
assert job_.isfile("hello.txt")
assert job_.document["a"] == 0

Expand Down