-
Notifications
You must be signed in to change notification settings - Fork 92
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Cherry-pick 'Refactor virtual site handler' (#1252 and #1284) #1285
Conversation
* Refactor virtual site handler * Attempts to make virtual site handler more resilient through code simplification. * Virtual sites are now associated with a particular 'parent' atom, rather than with a set of atoms. In particular, when checking if a v-site has been assigned we now only check the main 'parent' atom associated with the v-site, rather than all additional orientation atoms. As an example, if a force field contained a bond-charge v-site that matches [O:1]=[C:2] and a monovalent lone pair that matches [O:1]=[C:2]-[*:3] in that order, then only the monovalent lone pair will be assigned to formaldehyde as the oxygen is the main atom that would be associated with both v-sites, and the monovalent lone pair appears later in the hierarchy. This constitutes a behaviour change over previous versions. * checks have been added to enforce that the 'match' keyword complies with the SMIRNOFF spec. * Molecule virtual site classes have been removed * Sanity checks have been added when matching chemical environments for v-sites that ensure the environment looks like one of our expected test cases. * Fixes di(try)talent lone pairs mixing the :1 and :2 indices. * Fixes trivalent v-site positioning.
for more information, see https://pre-commit.ci
Will #1284 be included as well? |
It will indeed! Updated the PR to reflect this. |
Codecov Report
|
This would be pretty hard to review, and I'm not really sure where to start. I'd personally be happy giving this an "approval" on the basis that the tests are passing (failures I saw were unrelated) and you're using some form of this in fitting experiments now. The only exception might be I haven't updated Interchange to use this code yet, but I think it would be the wrong order of operations to use that development as a testing ground for this. I'd like to have this merged more or less as-is, and if I find changes are needed we can figure it out from there. Does this sound like an okay plan to you (and @j-wags)? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for leading this @SimonBoothroyd and for engaging @mattwthompson. I could use more time to review this for unintended interactions with the topology refactor. I'm happy to be the only reviewer and take ownership of this PR.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Jotting this down because it's not evident in the diffs - There are two super important tests in #1252 that aren't included here, and I think the intent was that they'd get moved over to interchange. They are linked here:
openff-toolkit/openff/toolkit/tests/test_parameters.py
Lines 2445 to 2772 in 9b319c0
@pytest.mark.parametrize( | |
"parameter, smiles, input_conformer, " "atoms_to_shuffle, expected_coordinates", | |
[ | |
( | |
VirtualSiteMocking.bond_charge_parameter("[Cl:1]-[C:2]"), | |
"[Cl:1][C:2]([H:3])([H:4])[H:5]", | |
VirtualSiteMocking.sp3_conformer(), | |
(0, 1), | |
numpy.array([[0.0, 3.0, 0.0]]) * unit.angstrom, | |
), | |
( | |
VirtualSiteMocking.bond_charge_parameter("[C:1]#[C:2]"), | |
"[H:1][C:2]#[C:3][C:4]", | |
VirtualSiteMocking.sp1_conformer(), | |
(2, 3), | |
numpy.array([[-3.0, 0.0, 0.0], [3.0, 0.0, 0.0]]) * unit.angstrom, | |
), | |
( | |
VirtualSiteMocking.monovalent_parameter("[O:1]=[C:2]-[H:3]"), | |
"[O:1]=[C:2]([H:3])[H:4]", | |
VirtualSiteMocking.sp2_conformer(), | |
(0, 1, 2, 3), | |
( | |
VirtualSiteMocking.sp2_conformer()[0] | |
+ numpy.array( | |
[[1.0, numpy.sqrt(2), 1.0], [1.0, -numpy.sqrt(2), -1.0]] | |
) | |
* unit.angstrom | |
), | |
), | |
( | |
VirtualSiteMocking.divalent_parameter( | |
"[H:2][O:1][H:3]", match="once", angle=0.0 * unit.degree | |
), | |
"[H:2][O:1][H:3]", | |
VirtualSiteMocking.sp2_conformer()[1:, :], | |
(0, 1, 2), | |
numpy.array([[2.0, 0.0, 0.0]]) * unit.angstrom, | |
), | |
( | |
VirtualSiteMocking.divalent_parameter( | |
"[H:2][O:1][H:3]", | |
match="all_permutations", | |
angle=45.0 * unit.degree, | |
), | |
"[H:2][O:1][H:3]", | |
VirtualSiteMocking.sp2_conformer()[1:, :], | |
(0, 1, 2), | |
numpy.array( | |
[ | |
[numpy.sqrt(2), numpy.sqrt(2), 0.0], | |
[numpy.sqrt(2), -numpy.sqrt(2), 0.0], | |
] | |
) | |
* unit.angstrom, | |
), | |
( | |
VirtualSiteMocking.trivalent_parameter("[N:1]([H:2])([H:3])[H:4]"), | |
"[N:1]([H:2])([H:3])[H:4]", | |
VirtualSiteMocking.sp3_conformer()[1:, :], | |
(0, 1, 2, 3), | |
numpy.array([[0.0, 2.0, 0.0]]) * unit.angstrom, | |
), | |
], | |
) | |
def test_v_site_geometry( | |
self, | |
parameter: VirtualSiteHandler.VirtualSiteType, | |
smiles: str, | |
input_conformer: unit.Quantity, | |
atoms_to_shuffle: Tuple[int, ...], | |
expected_coordinates: unit.Quantity, | |
): | |
"""An integration test that virtual sites are placed correctly relative to the | |
parent atoms""" | |
for atom_permutation in itertools.permutations(atoms_to_shuffle): | |
molecule = Molecule.from_mapped_smiles(smiles, allow_undefined_stereo=True) | |
molecule._conformers = [input_conformer] | |
# We shuffle the ordering of the atoms involved in the v-site orientation | |
# to ensure that the orientation remains invariant as expected. | |
shuffled_atom_order = {i: i for i in range(molecule.n_atoms)} | |
shuffled_atom_order.update( | |
{ | |
old_index: new_index | |
for old_index, new_index in zip(atoms_to_shuffle, atom_permutation) | |
} | |
) | |
molecule = molecule.remap(shuffled_atom_order) | |
output_coordinates = self.generate_v_site_coordinates( | |
molecule, molecule.conformers[0], parameter | |
) | |
assert output_coordinates.shape == expected_coordinates.shape | |
def sort_coordinates(x: numpy.ndarray) -> numpy.ndarray: | |
# Sort the rows by first, then second, then third columns as row | |
# as the order of v-sites is not deterministic. | |
return x[numpy.lexsort((x[:, 2], x[:, 1], x[:, 0])), :] | |
assert numpy.allclose( | |
sort_coordinates(output_coordinates.value_in_unit(unit.angstrom)), | |
sort_coordinates(expected_coordinates.value_in_unit(unit.angstrom)), | |
) | |
_E = unit.elementary_charge | |
_A = unit.angstrom | |
_KJ = unit.kilojoule_per_mole | |
@pytest.mark.parametrize( | |
"topology, parameters, expected_parameters, expected_n_v_sites", | |
[ | |
( | |
Topology.from_molecules( | |
[ | |
VirtualSiteMocking.chloromethane(reverse=False), | |
VirtualSiteMocking.chloromethane(reverse=True), | |
] | |
), | |
[VirtualSiteMocking.bond_charge_parameter("[Cl:1][C:2]")], | |
[ | |
# charge, sigma, epsilon | |
(0.1 * _E, 10.0 * _A, 0.0 * _KJ), | |
(0.2 * _E, 10.0 * _A, 0.0 * _KJ), | |
(0.0 * _E, 10.0 * _A, 0.0 * _KJ), | |
(0.0 * _E, 10.0 * _A, 0.0 * _KJ), | |
(0.0 * _E, 10.0 * _A, 0.0 * _KJ), | |
(0.0 * _E, 10.0 * _A, 0.0 * _KJ), | |
(0.0 * _E, 10.0 * _A, 0.0 * _KJ), | |
(0.0 * _E, 10.0 * _A, 0.0 * _KJ), | |
(0.2 * _E, 10.0 * _A, 0.0 * _KJ), | |
(0.1 * _E, 10.0 * _A, 0.0 * _KJ), | |
(-0.3 * _E, 4.0 * _A, 3.0 * _KJ), | |
(-0.3 * _E, 4.0 * _A, 3.0 * _KJ), | |
], | |
2, | |
), | |
# Check a complex case of bond charge vsite assignment and overwriting | |
( | |
Topology.from_molecules( | |
[ | |
VirtualSiteMocking.formaldehyde(reverse=False), | |
VirtualSiteMocking.formaldehyde(reverse=True), | |
] | |
), | |
[ | |
VirtualSiteMocking.bond_charge_parameter("[O:1]=[*:2]"), | |
VirtualSiteMocking.bond_charge_parameter( | |
"[O:1]=[C:2]", param_multiple=1.5 | |
), | |
VirtualSiteMocking.bond_charge_parameter( | |
"[O:1]=[CX3:2]", param_multiple=2.0, name="EP2" | |
), | |
], | |
[ | |
# charge, sigma, epsilon | |
(0.35 * _E, 10.0 * _A, 0.0 * _KJ), | |
(0.7 * _E, 10.0 * _A, 0.0 * _KJ), | |
(0.0 * _E, 10.0 * _A, 0.0 * _KJ), | |
(0.0 * _E, 10.0 * _A, 0.0 * _KJ), | |
(0.0 * _E, 10.0 * _A, 0.0 * _KJ), | |
(0.0 * _E, 10.0 * _A, 0.0 * _KJ), | |
(0.7 * _E, 10.0 * _A, 0.0 * _KJ), | |
(0.35 * _E, 10.0 * _A, 0.0 * _KJ), | |
(-0.45 * _E, 6.0 * _A, 4.5 * _KJ), # C=O vsite | |
(-0.6 * _E, 8.0 * _A, 6.0 * _KJ), # CX3=O vsite | |
(-0.45 * _E, 6.0 * _A, 4.5 * _KJ), # C=O vsite | |
(-0.6 * _E, 8.0 * _A, 6.0 * _KJ), # CX3=O vsite | |
], | |
4, | |
), | |
( | |
Topology.from_molecules( | |
[ | |
VirtualSiteMocking.formaldehyde(reverse=False), | |
VirtualSiteMocking.formaldehyde(reverse=True), | |
] | |
), | |
[VirtualSiteMocking.monovalent_parameter("[O:1]=[C:2]-[H:3]")], | |
[ | |
# charge, sigma, epsilon | |
(0.2 * _E, 10.0 * _A, 0.0 * _KJ), | |
(0.4 * _E, 10.0 * _A, 0.0 * _KJ), | |
(0.3 * _E, 10.0 * _A, 0.0 * _KJ), | |
(0.3 * _E, 10.0 * _A, 0.0 * _KJ), | |
(0.3 * _E, 10.0 * _A, 0.0 * _KJ), | |
(0.3 * _E, 10.0 * _A, 0.0 * _KJ), | |
(0.4 * _E, 10.0 * _A, 0.0 * _KJ), | |
(0.2 * _E, 10.0 * _A, 0.0 * _KJ), | |
(-0.6 * _E, 4.0 * _A, 5.0 * _KJ), | |
(-0.6 * _E, 4.0 * _A, 5.0 * _KJ), | |
(-0.6 * _E, 4.0 * _A, 5.0 * _KJ), | |
(-0.6 * _E, 4.0 * _A, 5.0 * _KJ), | |
], | |
4, | |
), | |
( | |
Topology.from_molecules( | |
[ | |
VirtualSiteMocking.hypochlorous_acid(reverse=False), | |
VirtualSiteMocking.hypochlorous_acid(reverse=True), | |
] | |
), | |
[ | |
VirtualSiteMocking.divalent_parameter( | |
"[H:2][O:1][Cl:3]", match="all_permutations" | |
) | |
], | |
[ | |
# charge, sigma, epsilon | |
(0.2 * _E, 10.0 * _A, 0.0 * _KJ), | |
(0.1 * _E, 10.0 * _A, 0.0 * _KJ), | |
(0.3 * _E, 10.0 * _A, 0.0 * _KJ), | |
(0.3 * _E, 10.0 * _A, 0.0 * _KJ), | |
(0.1 * _E, 10.0 * _A, 0.0 * _KJ), | |
(0.2 * _E, 10.0 * _A, 0.0 * _KJ), | |
(-0.6 * _E, 4.0 * _A, 5.0 * _KJ), | |
(-0.6 * _E, 4.0 * _A, 5.0 * _KJ), | |
], | |
2, | |
), | |
( | |
Topology.from_molecules( | |
[ | |
VirtualSiteMocking.fake_ammonia(reverse=False), | |
VirtualSiteMocking.fake_ammonia(reverse=True), | |
] | |
), | |
[VirtualSiteMocking.trivalent_parameter("[N:1]([Br:2])([Cl:3])[H:4]")], | |
[ | |
# charge, sigma, epsilon | |
(0.1 * _E, 10.0 * _A, 0.0 * _KJ), | |
(0.3 * _E, 10.0 * _A, 0.0 * _KJ), | |
(0.2 * _E, 10.0 * _A, 0.0 * _KJ), | |
(0.4 * _E, 10.0 * _A, 0.0 * _KJ), | |
(0.4 * _E, 10.0 * _A, 0.0 * _KJ), | |
(0.2 * _E, 10.0 * _A, 0.0 * _KJ), | |
(0.3 * _E, 10.0 * _A, 0.0 * _KJ), | |
(0.1 * _E, 10.0 * _A, 0.0 * _KJ), | |
(-1.0 * _E, 5.0 * _A, 6.0 * _KJ), | |
(-1.0 * _E, 5.0 * _A, 6.0 * _KJ), | |
], | |
2, | |
), | |
], | |
) | |
def test_create_force( | |
self, | |
topology: Topology, | |
parameters: List[VirtualSiteHandler.VirtualSiteType], | |
expected_parameters: List[Tuple[unit.Quantity, unit.Quantity, unit.Quantity]], | |
expected_n_v_sites: int, | |
): | |
expected_n_total = topology.n_topology_atoms + expected_n_v_sites | |
# sanity check the test input | |
assert len(expected_parameters) == expected_n_total | |
handler = VirtualSiteHandler(version="0.3") | |
for parameter in parameters: | |
handler.add_parameter(parameter=parameter) | |
system = openmm.System() | |
for _ in range(topology.n_topology_atoms): | |
system.addParticle(10.0 * unit.amu) | |
handler.create_force(system, topology) | |
assert system.getNumParticles() == expected_n_total | |
assert ( | |
sum(v_site.n_particles for v_site in topology.topology_virtual_sites) | |
== expected_n_v_sites | |
) | |
assert all( | |
not system.isVirtualSite(i) for i in range(topology.n_topology_atoms) | |
) | |
assert all( | |
system.isVirtualSite(i) | |
for i in range(topology.n_topology_atoms, expected_n_v_sites) | |
) | |
assert system.getNumForces() == 1 | |
force: openmm.NonbondedForce = next(iter(system.getForces())) | |
total_charge = 0.0 * unit.elementary_charge | |
for i, (expected_charge, expected_sigma, expected_epsilon) in enumerate( | |
expected_parameters | |
): | |
charge, sigma, epsilon = force.getParticleParameters(i) | |
# Make sure v-sites are massless. | |
assert numpy.isclose( | |
system.getParticleMass(i).value_in_unit(unit.amu), 0.0 | |
) == (False if i < topology.n_topology_atoms else True) | |
assert numpy.isclose( | |
expected_charge.value_in_unit(expected_charge.unit), | |
charge.value_in_unit(expected_charge.unit), | |
) | |
assert numpy.isclose( | |
expected_sigma.value_in_unit(expected_sigma.unit), | |
sigma.value_in_unit(expected_sigma.unit), | |
) | |
assert numpy.isclose( | |
expected_epsilon.value_in_unit(expected_epsilon.unit), | |
epsilon.value_in_unit(expected_epsilon.unit), | |
) | |
total_charge += charge | |
expected_total_charge = sum( | |
molecule.reference_molecule.total_charge.value_in_unit( | |
unit.elementary_charge | |
) | |
for molecule in topology.topology_molecules | |
) | |
assert numpy.isclose( | |
expected_total_charge, total_charge.value_in_unit(unit.elementary_charge) | |
) |
Ok, and resolving the failing example will also require some coordination with the interchange implementation. I'll work on this with @mattwthompson during our check in tomorrow. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Alright - @mattwthompson and I just met to discuss how to move forward with this PR. We decided to merge immediately so that we avoid recursive development and have one less degree in the branch-of-a-branch-of-a-brach style development we're doing now. Matt will get vsites implemented in interchange based on current state of this branch.
There are two additional to-do items that need to be resolved before we make the 0.11.0 release, and I've added them as incomplete checklist items in the main text of this PR. Those will get inherited into the topology-biopolymer-refactor PR and marked as blocking so we make sure to resolve them before release.
Description
This PR cherry-picks the relevant parts of #1252 and #1284 into the biopolymer topology branch. The
create_force
changes and associated tests are omitted as this code will now live inopenff-interchange
.Further, this PR drops all remaining v-site classes as v-sites will now be directly accessed on
Interchange
objects as they exist on a 'post-parameterisation' topology.This PR is expected to be associated with one in the
openff-interchange
repo that moves the newcreate_force
code there.Remaining to-dos:
test_parameters
highlighted at this link get ported to Interchange