Conversation
Codecov Report✅ All modified and coverable lines are covered by tests. Additional details and impacted files@@ Coverage Diff @@
## main #1206 +/- ##
==========================================
+ Coverage 93.87% 93.89% +0.01%
==========================================
Files 72 72
Lines 6258 6275 +17
==========================================
+ Hits 5875 5892 +17
Misses 383 383 ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
for more information, see https://pre-commit.ci
| match="all_permutations", | ||
| distance="0.8 * angstrom ** 1", | ||
| charge_increment1="0.0 * elementary_charge ** 1", | ||
| charge_increment1="0.123 * elementary_charge ** 1", |
There was a problem hiding this comment.
This needed to be a nonzero value for a nagl test, happy to make a separate fixture or put this in the test directly to avoid possible cross contamination.
| numpy.testing.assert_allclose(expected_charges_unitless, assigned_charges_unitless) | ||
|
|
||
|
|
||
| class TestNAGLChargesErrorHandling: |
There was a problem hiding this comment.
All these tests below here are largely AI-assisted and curated by me, which is to say that I'm not offended at all if you think some are only marginally valuable and deserve deletion.
mattwthompson
left a comment
There was a problem hiding this comment.
This is looking good - I'm really happy that the implementation was relatively simple; since it "quacks" much like other charge assignment methods, the investment in the complexity that already exists in smirnoff/_nonbonded.py is finally paying off in a way.
Since this is such a critical piece of infrastructure, I wan to be more through than normal when adding it in. There are a few tests/categories of that I think are missing:
- interactions with
charge_from_molecules - charge assignment logging
- making sure a system containing a large molecule takes a reasonable amount of time
openff/interchange/_tests/unit_tests/smirnoff/test_nonbonded.py
Outdated
Show resolved
Hide resolved
|
|
||
| hexane_diol.assign_partial_charges("openff-gnn-am1bcc-0.1.0-rc.3.pt") | ||
| # Leave the ToolkitAM1BCC tag in openff-2.1.0 to ensure that the NAGLCharges handler takes precedence | ||
| ff = ForceField("openff-2.1.0.offxml") |
There was a problem hiding this comment.
Is there a reason to use 2.1.0 specifically? There are some other Sage version(s) already in fixtures and re-using those would cut down on some repetitive code
There was a problem hiding this comment.
Nope! I'll remove usages of these where possible and replace with fixtures
|
|
||
| with pytest.raises(MissingPackageError, match="NAGL software isn't present"): | ||
| Interchange.from_smirnoff(force_field=ff, topology=hexane_diol.to_topology()) |
There was a problem hiding this comment.
What would happen if a user did charge_from_molecules=[hexane_diol]?
There was a problem hiding this comment.
That should succeed, and I've added to this test to ensure it does.
openff/interchange/_tests/unit_tests/smirnoff/test_nonbonded.py
Outdated
Show resolved
Hide resolved
| from openff.toolkit.utils.toolkits import GLOBAL_TOOLKIT_REGISTRY | ||
|
|
||
| partial_charge_method = parameter_handler.model_file | ||
| if "NAGL" not in GLOBAL_TOOLKIT_REGISTRY.__repr__(): | ||
| raise MissingPackageError( | ||
| "The force field has a NAGLCharges section, but the NAGL software isn't " | ||
| "present in GLOBAL_TOOLKIT_REGISTRY", | ||
| ) | ||
|
|
There was a problem hiding this comment.
I think this block is doing the toolkit's job for it - what would it look like if this was removed? I have to assume the toolkit handles this error and it would bubble up to the user through Interchange in a legible way
There was a problem hiding this comment.
So... This is masking a nasty side effect of shoving everything through Molecule.assign_partial_charges, which is that the Toolkit treats model names like any other charge method (that is, it doesn't know that the call is supposed to get routed to NAGLToolkitWrapper). If we didn't have this, the error raised here would be:
E ValueError: No registered toolkits can provide the capability "assign_partial_charges" for args "()" and kwargs "{'molecule': Molecule with name '' and SMILES '[H][O][C]([H])([H])[C]([H])([H])[C]([H])([H])[C]([H])([H])[C]([H])([H])[C]([H])([H])[O][H]', 'partial_charge_method': 'openff-gnn-am1bcc-0.1.0-rc.3.pt', 'use_conformers': None, 'strict_n_conformers': False, 'normalize_partial_charges': True, '_cls': <class 'openff.toolkit.topology.molecule.Molecule'>}"
E Available toolkits are: [ToolkitWrapper around The RDKit version 2025.03.4]
E ToolkitWrapper around The RDKit version 2025.03.4 <class 'openff.toolkit.utils.exceptions.ChargeMethodUnavailableError'> : partial_charge_method 'openff-gnn-am1bcc-0.1.0-rc.3.pt' is not available from RDKitToolkitWrapper. Available charge methods are {'mmff94': {}, 'gasteiger': {}}
Which doesn't tell the user the important thing they need to know in this case: they either need to install NAGL or get it in their ToolkitRegistry.
So I'm in favor of leaving this as-is.
There was a problem hiding this comment.
Actually, this made me realize there's a family of edge cases to consider, where NAGL charge assignment fails for various reasons. For example:
- If NAGL just doesn't cover some elements in the molecule, then lower-precedence charge models should be attempted (NOT the current behavior, I'm committing a test for this)
- I wonder if there's already a problem with this sort of fallback if a force field only had ToolkitAM1BCC and CIMH?
- The error message printed when this happens SHOULDN'T just be from the most recent charge method to fail, but instead should include reasons why EACH attempted charge method failed (otherwise the user might just see one error from CIMH when they really want to know why their NAGLCharges tag didn't work)
- If the NAGL model file isn't found and can't be fetched, that should be an immediate failure (this is the case now)
- If the NAGL hash comparison fails, that should be an immediate failure (this is the case now)
There was a problem hiding this comment.
I understand the highlighted code to be trying to do the job of making sure that NAGL is installed/available/etc. before running charge assignment on it. If we did s/NAGL/RDKit/ and s/charge assignment/conformer generation/ would you grant that this wiring is supposed to be handled by the toolkit?
I agree that error message is not useful for a user, but I would rather have the error bubble up in one place and it be handled as gracefully as possible. This may be a sizable block of a try followed by potentially many excepts but that feels safer and more maintainable than trying to handle this particular error here. I have yet to think through all of these failure modes, but I would rather the toolkit tell me NAGL isn't installed, a file isn't found, etc. and report that back to the user here. Probably some of these cases are already not reported to the user well and could be improved.
Put a different way, I'm concerned that this opens the door to Interchange being responsible for making sure charge assignment methods and optional dependencies are properly lined up whereas Molecule is the choke point of all paths.
There was a problem hiding this comment.
If we did s/NAGL/RDKit/ and s/charge assignment/conformer generation/ would you grant that this wiring is supposed to be handled by the toolkit?
Yes.
I agree with this being resolved into a bunch of try/excepts, and trying not to make Interchange responsible for so many details. I'll give that a shot, there's a little magic in ToolkitRegistry.call (the raise_exception_types named argument) that may make this more straightforward.
There was a problem hiding this comment.
Awesome - I hope we can get this smoothness without a ton of effort or complexity here. (If not, a less elegant solution which gets things working okay may be appropriate!)
openff/interchange/_tests/unit_tests/smirnoff/test_nonbonded.py
Outdated
Show resolved
Hide resolved
openff/interchange/_tests/unit_tests/smirnoff/test_nonbonded.py
Outdated
Show resolved
Hide resolved
openff/interchange/_tests/unit_tests/smirnoff/test_nonbonded.py
Outdated
Show resolved
Hide resolved
Co-authored-by: Matt Thompson <mattwthompson@protonmail.com>
…cedence charge model that CAN assign charges
for more information, see https://pre-commit.ci
j-wags
left a comment
There was a problem hiding this comment.
Notes to self:
Tell matt that test_charge_assignment_logging layout is a pain to make changes to
Tell matt that almost all new tests are AI generated at my prompting and while I've reviewed and revised them they were way less work to make than they appear so I'm fine to just trash or consolidate any of them
The pytest.mark.slow decorator doesn't appear to be doing anything (the tests are run unconditionally and pytest raises an error if I pass it --runslow)
Review Codecov diff and see why its so sad https://app.codecov.io/gh/openforcefield/openff-interchange/pull/1206?dropdown=coverage&src=pr&el=h1&utm_medium=referral&utm_source=github&utm_content=checks&utm_campaign=pr+comments&utm_term=openforcefield
mattwthompson
left a comment
There was a problem hiding this comment.
Fantastic work - I found one detail which I think warrants further consideration, but all other comments are minor and non-blocking
| # No error should be raised if using charge_from_molecules | ||
| sage_with_nagl_charges.create_interchange( | ||
| topology=hexane_diol.to_topology(), | ||
| charge_from_molecules=[hexane_diol], | ||
| ) |
There was a problem hiding this comment.
Worth double-checking that the charges are correct (I think re-using code from a similar test would be fine). It's possible that something weird happens when ForceField.create_interchange is only given a registry with one toolkit
| @pytest.mark.xfail( | ||
| reason="charge assignment handler fallback behavior not yet implemented", | ||
| raises=ValueError, | ||
| ) |
There was a problem hiding this comment.
FTR I'm happy with pushing this implementation into the future, it's an edge case which a user ought to not come across without trying funky things
| # Create a very large molecule | ||
| large_molecule = Molecule.from_smiles("C" * 200) # 200-carbon alkane chain | ||
|
|
||
| start_time = time.time() |
There was a problem hiding this comment.
It shouldn't make a difference with 30 seconds as the threshold, and it's already probably "warmed up" from another test, but I wonder if the model is already loaded into memory (frequently a few seconds, compared to <1 seconds runtime) at this step
| def sage_with_nagl_charges(sage): | ||
| sage.get_parameter_handler( |
There was a problem hiding this comment.
| def sage_with_nagl_charges(sage): | |
| sage.get_parameter_handler( | |
| def sage_nagl(sage): | |
| sage.deregister_parameter_handler("ToolkitAM1BCC") |
(blocking) this fixture should drop AM1-BCC from Sage
- IIUC, that would get it to match what's going to be used in production
- It's hard to imagine this breaking tests since AM1BCC should not be sniffed with the tests as written, but I would really like to avoid surprises in releases 2.3.0+ if this breaks anything
test_nagl_charges_precedence_over_am1bccwould break with this change, but I think the behavior (what happens when there are both? then read the title of the test back to oneself) is worth testing, so maybe just addToolkitAM1BCCHandlerback in that test only- I can take or leave the name change, I think there is value in differentiating "Sage which happens to use NAGL changes" from "Sage which really is based on NAGL charges"
There was a problem hiding this comment.
Agree, great catch. Implementing now.
|
Taking the liberty to click "Ready for review" since ... uh, I'm already doing that |
mattwthompson
left a comment
There was a problem hiding this comment.
Changes look good, please nothing surprising seems to have come up.
I looked at the code coverage report and
- The non-test/non-tooling diff is quite small
- The coverage is great
Merge main in, update whatever docs you can think of, and I think we're good?
|
@mattwthompson I've updated the releasehistory and think this is ready to merge. This is a big PR so I'd like to turn it over to you to push the merge button. I propose calling the release 0.4.5, since that saves me a package build (OFFTK is currently depending on interchange =0.4) but no big deal if you'd prefer to change the releasenotes and call it 0.5.0. |
Description
Implements NAGLCharges as currently drafted in SMIRNOFF EP 11.
This interfaces with a PR in the OpenFF Toolkit at openforcefield/openff-toolkit#2048.
The spec change also includes the definition of some new model fetching and verification behavior, which can be found at openforcefield/openff-nagl-models#61, but which isn't required for these tests to pass.