Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Cherry-pick 'Refactor virtual site handler' (#1252 and #1284) #1285

Merged
merged 12 commits into from
May 6, 2022

Conversation

SimonBoothroyd
Copy link
Contributor

@SimonBoothroyd SimonBoothroyd commented Apr 25, 2022

Description

This PR cherry-picks the relevant parts of #1252 and #1284 into the biopolymer topology branch. The create_force changes and associated tests are omitted as this code will now live in openff-interchange.

Further, this PR drops all remaining v-site classes as v-sites will now be directly accessed on Interchange objects as they exist on a 'post-parameterisation' topology.

This PR is expected to be associated with one in the openff-interchange repo that moves the new create_force code there.

Remaining to-dos:

  • Ensure that tests from test_parameters highlighted at this link get ported to Interchange
  • Re-enable vsite_showcase notebook and get it to a working state

SimonBoothroyd and others added 2 commits April 25, 2022 13:59
* Refactor virtual site handler

* Attempts to make virtual site handler more resilient through code simplification.
* Virtual sites are now associated with a particular 'parent' atom, rather than
  with a set of atoms. In particular, when checking if a v-site has been assigned
  we now only check the main 'parent' atom associated with the v-site, rather than
  all additional orientation atoms. As an example, if a force field contained a
  bond-charge v-site that matches [O:1]=[C:2] and a monovalent lone pair that matches
  [O:1]=[C:2]-[*:3] in that order, then only the monovalent lone pair will be assigned
  to formaldehyde as the oxygen is the main atom that would be associated with both v-sites,
  and the monovalent lone pair appears later in the hierarchy. This constitutes a behaviour
  change over previous versions.
* checks have been added to enforce that the 'match' keyword complies with the SMIRNOFF
  spec.
* Molecule virtual site classes have been removed
* Sanity checks have been added when matching chemical environments for v-sites that ensure
  the environment looks like one of our expected test cases.
* Fixes di(try)talent lone pairs mixing the :1 and :2 indices.
* Fixes trivalent v-site positioning.
@mattwthompson
Copy link
Member

mattwthompson commented Apr 25, 2022

Will #1284 be included as well?

@SimonBoothroyd SimonBoothroyd changed the title Cherry-pick 'Refactor virtual site handler' (#1252) Cherry-pick 'Refactor virtual site handler' (#1252 and #1284) Apr 25, 2022
@SimonBoothroyd
Copy link
Contributor Author

Will #1284 be included as well?

It will indeed! Updated the PR to reflect this.

@codecov
Copy link

codecov bot commented Apr 25, 2022

Codecov Report

❗ No coverage uploaded for pull request base (topology-biopolymer-refactor@0107544). Click here to learn what that means.
The diff coverage is n/a.

@mattwthompson
Copy link
Member

This would be pretty hard to review, and I'm not really sure where to start. I'd personally be happy giving this an "approval" on the basis that the tests are passing (failures I saw were unrelated) and you're using some form of this in fitting experiments now. The only exception might be examples/virtual_sites/vsite_showcase.ipynb, I don't know the plans there but I'd be okay with removing that, moving it to a deprecated path, etc.

I haven't updated Interchange to use this code yet, but I think it would be the wrong order of operations to use that development as a testing ground for this. I'd like to have this merged more or less as-is, and if I find changes are needed we can figure it out from there.

Does this sound like an okay plan to you (and @j-wags)?

Copy link
Member

@j-wags j-wags left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for leading this @SimonBoothroyd and for engaging @mattwthompson. I could use more time to review this for unintended interactions with the topology refactor. I'm happy to be the only reviewer and take ownership of this PR.

@j-wags j-wags requested review from j-wags and removed request for mattwthompson April 28, 2022 04:22
Copy link
Member

@j-wags j-wags left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Jotting this down because it's not evident in the diffs - There are two super important tests in #1252 that aren't included here, and I think the intent was that they'd get moved over to interchange. They are linked here:

@pytest.mark.parametrize(
"parameter, smiles, input_conformer, " "atoms_to_shuffle, expected_coordinates",
[
(
VirtualSiteMocking.bond_charge_parameter("[Cl:1]-[C:2]"),
"[Cl:1][C:2]([H:3])([H:4])[H:5]",
VirtualSiteMocking.sp3_conformer(),
(0, 1),
numpy.array([[0.0, 3.0, 0.0]]) * unit.angstrom,
),
(
VirtualSiteMocking.bond_charge_parameter("[C:1]#[C:2]"),
"[H:1][C:2]#[C:3][C:4]",
VirtualSiteMocking.sp1_conformer(),
(2, 3),
numpy.array([[-3.0, 0.0, 0.0], [3.0, 0.0, 0.0]]) * unit.angstrom,
),
(
VirtualSiteMocking.monovalent_parameter("[O:1]=[C:2]-[H:3]"),
"[O:1]=[C:2]([H:3])[H:4]",
VirtualSiteMocking.sp2_conformer(),
(0, 1, 2, 3),
(
VirtualSiteMocking.sp2_conformer()[0]
+ numpy.array(
[[1.0, numpy.sqrt(2), 1.0], [1.0, -numpy.sqrt(2), -1.0]]
)
* unit.angstrom
),
),
(
VirtualSiteMocking.divalent_parameter(
"[H:2][O:1][H:3]", match="once", angle=0.0 * unit.degree
),
"[H:2][O:1][H:3]",
VirtualSiteMocking.sp2_conformer()[1:, :],
(0, 1, 2),
numpy.array([[2.0, 0.0, 0.0]]) * unit.angstrom,
),
(
VirtualSiteMocking.divalent_parameter(
"[H:2][O:1][H:3]",
match="all_permutations",
angle=45.0 * unit.degree,
),
"[H:2][O:1][H:3]",
VirtualSiteMocking.sp2_conformer()[1:, :],
(0, 1, 2),
numpy.array(
[
[numpy.sqrt(2), numpy.sqrt(2), 0.0],
[numpy.sqrt(2), -numpy.sqrt(2), 0.0],
]
)
* unit.angstrom,
),
(
VirtualSiteMocking.trivalent_parameter("[N:1]([H:2])([H:3])[H:4]"),
"[N:1]([H:2])([H:3])[H:4]",
VirtualSiteMocking.sp3_conformer()[1:, :],
(0, 1, 2, 3),
numpy.array([[0.0, 2.0, 0.0]]) * unit.angstrom,
),
],
)
def test_v_site_geometry(
self,
parameter: VirtualSiteHandler.VirtualSiteType,
smiles: str,
input_conformer: unit.Quantity,
atoms_to_shuffle: Tuple[int, ...],
expected_coordinates: unit.Quantity,
):
"""An integration test that virtual sites are placed correctly relative to the
parent atoms"""
for atom_permutation in itertools.permutations(atoms_to_shuffle):
molecule = Molecule.from_mapped_smiles(smiles, allow_undefined_stereo=True)
molecule._conformers = [input_conformer]
# We shuffle the ordering of the atoms involved in the v-site orientation
# to ensure that the orientation remains invariant as expected.
shuffled_atom_order = {i: i for i in range(molecule.n_atoms)}
shuffled_atom_order.update(
{
old_index: new_index
for old_index, new_index in zip(atoms_to_shuffle, atom_permutation)
}
)
molecule = molecule.remap(shuffled_atom_order)
output_coordinates = self.generate_v_site_coordinates(
molecule, molecule.conformers[0], parameter
)
assert output_coordinates.shape == expected_coordinates.shape
def sort_coordinates(x: numpy.ndarray) -> numpy.ndarray:
# Sort the rows by first, then second, then third columns as row
# as the order of v-sites is not deterministic.
return x[numpy.lexsort((x[:, 2], x[:, 1], x[:, 0])), :]
assert numpy.allclose(
sort_coordinates(output_coordinates.value_in_unit(unit.angstrom)),
sort_coordinates(expected_coordinates.value_in_unit(unit.angstrom)),
)
_E = unit.elementary_charge
_A = unit.angstrom
_KJ = unit.kilojoule_per_mole
@pytest.mark.parametrize(
"topology, parameters, expected_parameters, expected_n_v_sites",
[
(
Topology.from_molecules(
[
VirtualSiteMocking.chloromethane(reverse=False),
VirtualSiteMocking.chloromethane(reverse=True),
]
),
[VirtualSiteMocking.bond_charge_parameter("[Cl:1][C:2]")],
[
# charge, sigma, epsilon
(0.1 * _E, 10.0 * _A, 0.0 * _KJ),
(0.2 * _E, 10.0 * _A, 0.0 * _KJ),
(0.0 * _E, 10.0 * _A, 0.0 * _KJ),
(0.0 * _E, 10.0 * _A, 0.0 * _KJ),
(0.0 * _E, 10.0 * _A, 0.0 * _KJ),
(0.0 * _E, 10.0 * _A, 0.0 * _KJ),
(0.0 * _E, 10.0 * _A, 0.0 * _KJ),
(0.0 * _E, 10.0 * _A, 0.0 * _KJ),
(0.2 * _E, 10.0 * _A, 0.0 * _KJ),
(0.1 * _E, 10.0 * _A, 0.0 * _KJ),
(-0.3 * _E, 4.0 * _A, 3.0 * _KJ),
(-0.3 * _E, 4.0 * _A, 3.0 * _KJ),
],
2,
),
# Check a complex case of bond charge vsite assignment and overwriting
(
Topology.from_molecules(
[
VirtualSiteMocking.formaldehyde(reverse=False),
VirtualSiteMocking.formaldehyde(reverse=True),
]
),
[
VirtualSiteMocking.bond_charge_parameter("[O:1]=[*:2]"),
VirtualSiteMocking.bond_charge_parameter(
"[O:1]=[C:2]", param_multiple=1.5
),
VirtualSiteMocking.bond_charge_parameter(
"[O:1]=[CX3:2]", param_multiple=2.0, name="EP2"
),
],
[
# charge, sigma, epsilon
(0.35 * _E, 10.0 * _A, 0.0 * _KJ),
(0.7 * _E, 10.0 * _A, 0.0 * _KJ),
(0.0 * _E, 10.0 * _A, 0.0 * _KJ),
(0.0 * _E, 10.0 * _A, 0.0 * _KJ),
(0.0 * _E, 10.0 * _A, 0.0 * _KJ),
(0.0 * _E, 10.0 * _A, 0.0 * _KJ),
(0.7 * _E, 10.0 * _A, 0.0 * _KJ),
(0.35 * _E, 10.0 * _A, 0.0 * _KJ),
(-0.45 * _E, 6.0 * _A, 4.5 * _KJ), # C=O vsite
(-0.6 * _E, 8.0 * _A, 6.0 * _KJ), # CX3=O vsite
(-0.45 * _E, 6.0 * _A, 4.5 * _KJ), # C=O vsite
(-0.6 * _E, 8.0 * _A, 6.0 * _KJ), # CX3=O vsite
],
4,
),
(
Topology.from_molecules(
[
VirtualSiteMocking.formaldehyde(reverse=False),
VirtualSiteMocking.formaldehyde(reverse=True),
]
),
[VirtualSiteMocking.monovalent_parameter("[O:1]=[C:2]-[H:3]")],
[
# charge, sigma, epsilon
(0.2 * _E, 10.0 * _A, 0.0 * _KJ),
(0.4 * _E, 10.0 * _A, 0.0 * _KJ),
(0.3 * _E, 10.0 * _A, 0.0 * _KJ),
(0.3 * _E, 10.0 * _A, 0.0 * _KJ),
(0.3 * _E, 10.0 * _A, 0.0 * _KJ),
(0.3 * _E, 10.0 * _A, 0.0 * _KJ),
(0.4 * _E, 10.0 * _A, 0.0 * _KJ),
(0.2 * _E, 10.0 * _A, 0.0 * _KJ),
(-0.6 * _E, 4.0 * _A, 5.0 * _KJ),
(-0.6 * _E, 4.0 * _A, 5.0 * _KJ),
(-0.6 * _E, 4.0 * _A, 5.0 * _KJ),
(-0.6 * _E, 4.0 * _A, 5.0 * _KJ),
],
4,
),
(
Topology.from_molecules(
[
VirtualSiteMocking.hypochlorous_acid(reverse=False),
VirtualSiteMocking.hypochlorous_acid(reverse=True),
]
),
[
VirtualSiteMocking.divalent_parameter(
"[H:2][O:1][Cl:3]", match="all_permutations"
)
],
[
# charge, sigma, epsilon
(0.2 * _E, 10.0 * _A, 0.0 * _KJ),
(0.1 * _E, 10.0 * _A, 0.0 * _KJ),
(0.3 * _E, 10.0 * _A, 0.0 * _KJ),
(0.3 * _E, 10.0 * _A, 0.0 * _KJ),
(0.1 * _E, 10.0 * _A, 0.0 * _KJ),
(0.2 * _E, 10.0 * _A, 0.0 * _KJ),
(-0.6 * _E, 4.0 * _A, 5.0 * _KJ),
(-0.6 * _E, 4.0 * _A, 5.0 * _KJ),
],
2,
),
(
Topology.from_molecules(
[
VirtualSiteMocking.fake_ammonia(reverse=False),
VirtualSiteMocking.fake_ammonia(reverse=True),
]
),
[VirtualSiteMocking.trivalent_parameter("[N:1]([Br:2])([Cl:3])[H:4]")],
[
# charge, sigma, epsilon
(0.1 * _E, 10.0 * _A, 0.0 * _KJ),
(0.3 * _E, 10.0 * _A, 0.0 * _KJ),
(0.2 * _E, 10.0 * _A, 0.0 * _KJ),
(0.4 * _E, 10.0 * _A, 0.0 * _KJ),
(0.4 * _E, 10.0 * _A, 0.0 * _KJ),
(0.2 * _E, 10.0 * _A, 0.0 * _KJ),
(0.3 * _E, 10.0 * _A, 0.0 * _KJ),
(0.1 * _E, 10.0 * _A, 0.0 * _KJ),
(-1.0 * _E, 5.0 * _A, 6.0 * _KJ),
(-1.0 * _E, 5.0 * _A, 6.0 * _KJ),
],
2,
),
],
)
def test_create_force(
self,
topology: Topology,
parameters: List[VirtualSiteHandler.VirtualSiteType],
expected_parameters: List[Tuple[unit.Quantity, unit.Quantity, unit.Quantity]],
expected_n_v_sites: int,
):
expected_n_total = topology.n_topology_atoms + expected_n_v_sites
# sanity check the test input
assert len(expected_parameters) == expected_n_total
handler = VirtualSiteHandler(version="0.3")
for parameter in parameters:
handler.add_parameter(parameter=parameter)
system = openmm.System()
for _ in range(topology.n_topology_atoms):
system.addParticle(10.0 * unit.amu)
handler.create_force(system, topology)
assert system.getNumParticles() == expected_n_total
assert (
sum(v_site.n_particles for v_site in topology.topology_virtual_sites)
== expected_n_v_sites
)
assert all(
not system.isVirtualSite(i) for i in range(topology.n_topology_atoms)
)
assert all(
system.isVirtualSite(i)
for i in range(topology.n_topology_atoms, expected_n_v_sites)
)
assert system.getNumForces() == 1
force: openmm.NonbondedForce = next(iter(system.getForces()))
total_charge = 0.0 * unit.elementary_charge
for i, (expected_charge, expected_sigma, expected_epsilon) in enumerate(
expected_parameters
):
charge, sigma, epsilon = force.getParticleParameters(i)
# Make sure v-sites are massless.
assert numpy.isclose(
system.getParticleMass(i).value_in_unit(unit.amu), 0.0
) == (False if i < topology.n_topology_atoms else True)
assert numpy.isclose(
expected_charge.value_in_unit(expected_charge.unit),
charge.value_in_unit(expected_charge.unit),
)
assert numpy.isclose(
expected_sigma.value_in_unit(expected_sigma.unit),
sigma.value_in_unit(expected_sigma.unit),
)
assert numpy.isclose(
expected_epsilon.value_in_unit(expected_epsilon.unit),
epsilon.value_in_unit(expected_epsilon.unit),
)
total_charge += charge
expected_total_charge = sum(
molecule.reference_molecule.total_charge.value_in_unit(
unit.elementary_charge
)
for molecule in topology.topology_molecules
)
assert numpy.isclose(
expected_total_charge, total_charge.value_in_unit(unit.elementary_charge)
)

@j-wags
Copy link
Member

j-wags commented May 6, 2022

Ok, and resolving the failing example will also require some coordination with the interchange implementation. I'll work on this with @mattwthompson during our check in tomorrow.

Copy link
Member

@j-wags j-wags left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Alright - @mattwthompson and I just met to discuss how to move forward with this PR. We decided to merge immediately so that we avoid recursive development and have one less degree in the branch-of-a-branch-of-a-brach style development we're doing now. Matt will get vsites implemented in interchange based on current state of this branch.

There are two additional to-do items that need to be resolved before we make the 0.11.0 release, and I've added them as incomplete checklist items in the main text of this PR. Those will get inherited into the topology-biopolymer-refactor PR and marked as blocking so we make sure to resolve them before release.

@j-wags j-wags merged commit 826e937 into topology-biopolymer-refactor May 6, 2022
@j-wags j-wags deleted the cherry-pick-1252 branch May 6, 2022 19:38
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants