QRef is a plugin for the crystallographic software suite Phenix enabling what in the literature commonly is referred to as "quantum refinement" (QR) in both real and reciprocal space, utilising the (for academic users) free software Orca as the quantum chemistry engine. This first version of QRef using Phenix was implemented under the paradigm "correctness and completeness first, performance and adherence to coding standards later".
Refinement of macromolecules in both real and and reciprocal space relies on previous knowledge (i.e. a Bayesian prior) for the structure. This is usually encoded as a (pseudo-energy) penalty term,
where
In this QR implementation
where index 1 in turn indicates small, but interesting, parts of the structure. Additionally another scaling factor,
implies that
where
The directory modules should be placed under $PHENIX; qref will thus be a new directory under modules, whereas the user should manually overwrite energies.py in modules/cctbx_project/cctbx/geometry_restraints and model.py in modules/cctbx_project/mmtbx/model, respectively, with the version of the file corresponding to their installation of Phenix.
There is a commented out guard clause in energies.py:
# if not os.path.exists('qm.lock') and (os.path.exists('xyz_reciprocal.lock') or os.path.exists('xyz.lock')):
This is the recommended way to use the quantum restraints, as they are not always needed. In order to make this work one has to edit the file $PHENIX/modules/phenix/phenix/refinement/xyz_reciprocal_space.py and import os as well as surround the call to mmtbx.refinement.minimization.lbfgs(...) in the method run_lbfgs in the class run_all with
with open('xyz_reciprocal.lock', 'w'):
pass
and
os.remove('xyz_reciprocal.lock')
Likewise the file $PHENIX/modules/phenix/phenix/refinement/macro_cycle_real_space.py should be edited in a similar manner, i.e. with an added import os as well as surrounding the calls to self.minimization_no_ncs() and self.minimization_ncs() in the method refine_xyz in the class run with
with open('xyz.lock', 'w'):
pass
and
os.remove('xyz.lock')
This implementation of QRef has been verified to work with Phenix versions 1.20.1-4487, 1.21-5207, 1.21.1-5286 as well as 1.21.2-5419.
The scripts in the folder scripts should be placed somewhere accessible by $PATH. The shebang in the scripts might need to be updated to point towards wherever cctbx.python is located.
Orca can be found at orcaforum.kofo.mpg.de - follow their guide for installation. QRef has been verified to work with Orca versions 5.0.4, 6.0.0 and 6.1.0.
The general procedure to set up a quantum refinement job consists of
-
Select QM region(s). Best practises for selecting proper QM region(s) can be found at for example:
-
Build a model.
- The model in the QM regions needs to make chemical sense. This for example means that the QM regions should be protonated as well as being complete.
- The model outside of the QM region (as well as the protonation of the carbon link atom) can be incomplete.
-
phenix.ready_set add_h_to_water=Truecan be useful for this purpose.
-
Prepare restraint files for unknown residues and ligands. The script
qref_prep.pywill tell you if there are any missing restraint files.- This can be achieved using
phenix.ready_setandphenix.elbow.
- This can be achieved using
-
Prepare
syst1files; these files define the QM regions.- If there is only one QM region the default is to look for a file named
syst1by the software. For multiple QM regions the recommended, and default, naming scheme issyst11,syst12, etc. - Which atoms to include in the QM regions is defined using the serial number from the PDB file describing the entire model.
- While setting
sort_atoms = Falsein the input toPhenixshould ensure that the ordering in the input model is preserved, we have encountered instances where this is not adhered to. Thus it is recommended to useiotbx.pdb.sort_atoms(supplied withPhenix) which will give you a new PDB file with the suffix_sorted.pdbwhere the atoms are sorted in the same order as that whichPhenixuses internally. It is recommended to use the_sorted.pdbfile as the input model for refinement, as well as the reference when defining thesyst1files.
- While setting
- The
syst1files allows for multiple atoms or intervals of atoms to be specified on a single line, where,orblankworks as delimiters;-is used to indicate an interval. -
#and!can be used to include comments in thesyst1files. - The second occurence of an atom in the
syst1files will indicate that this is a link atom, i.e. it will be replaced by a hydrogen at the appropriate position in the QM calculation. - Examples are included in the
examplesfolder.
- If there is only one QM region the default is to look for a file named
-
Run
qref_prep.py <model>_sorted.pdbto generateqref.dat(a file containing settings for the QR interface), as well as PDB files describing the QM regions.-
The
junctfactorfile needs to be present in the same directory as whereqref_prep.pyis run. Thejunctfactorfile contain ideal QM distances for the$C_L - H_L$ bonds for the link-atoms. If anotherjunctfactorfile is to be used this can be specified with the-jor--junctfactoroption. -
The theory used for the ideal
$C_L - H_L$ QM distance can be changed with the-lor--ltypeoption. Default is type 12. The options are:- 6: B3LYP/6-31G*
- 7: BP(RI)/6-31G*
- 8: BP(RI)/SVP
- 9: BP(RI)/def2-SV(P)
- 10: PBE(RI)/def2-SVP
- 11: B3LYP(RI)/def2-SV(P)
- 12: TPSS(RI)/def2-SV(P)
- 13: B97-D(RI)/def2-SV(P)
There needs to be a parametrisation in the
junctfactorfile for the bond one intends to cleave; it is recommended that the user inspects thejunctfactorfile to verify that there is support to cleave the intended bond type. In the case parametrisation is lacking another selection for the QM system (and in particular where the link between QM and MM occurs) needs to be made or appropriate parametrisation added to thejunctfactorfile. -
Ideally only the input model is needed as an argument for
qref_prep.py. If there was a need to prepare restraint files for novel residues or ligands in point 3 above,qref_prep.pyneeds to be made aware of these. This can be achieved with the-cor--cifoption. -
The output from
qref_prep.pyshould be$\left(1+2 n_{syst1}\right)$ files as well as recommended selection strings for both real and reciprocal space. Additionally, a warning is given if thesyst1file covers more than one conformation. In the case that all atoms in thesyst1definition belong to the same conformation, this will be indicated by analtlocspecifier.-
qref.dat, which contains the settings for the QR interface. This file can be changed manually and it is a good idea to inspect that the value fororca_binaryis the correct path for the actual Orca binary file (qref_prep.pytries to locate this file automatically but may sometimes fail). -
mm_i_c.pdb, which is the model used to calculate$E_{MM1, i}$ . -
qm_i_h.pdb, which is the model used to calculate$E_{QM1, i}$ .
The output PDB files can, and probably should, be used to inspect that the QM selection is proper.
- Two selection strings are printed on the screen, one for reciprocal space and one for real space. They are intended to be used in regards to which selection of the model to refine when crafting the input to either
phenix.refineorphenix.real_space_refine, see point 7 below.
-
-
Harmonic (bond) distance restraints can be added through the
-rdor--restraint_distanceoption, using the syntaxi atom1_serial atom2_serial desired_distance_in_Å force_constant. Experience has shown that the force constant needs to be$\geq$ 2500 to achieve adherence to the restraint. -
Harmonic (bond) angle restraints can be added through the
-raor--restraint_angleoption, using the syntaxi atom1_serial atom2_serial atom3_serial desired_angle_in_degrees force_constant, whereatom2_serialdefines the angle tip. Experience has shown that the force constant needs to be$\geq$ 10 (?) to achieve adherence to the restraint. -
Symmetry interactions are handled through the
-tor--transformoption, using the syntaxi "<atoms>" R11 R12 R13 R21 R22 R23 R31 R32 R33 t1 t2 t3for each of the desired transforms, where<atoms>follow the same syntax as forsyst1files (do note the usage of quotation marks, i.e.<atoms>should be given as a string).Ris the rotation matrix (in Cartesian coordinates) in row-wise order andtis the translation vector (in Cartesian coordinates). To obtain$R$ and$t$ , find for example the fractional rotation matrix$R_{frac}$ and the fractional translation vector$t_{frac}$ for the symmetry operator of interest (which can be found through using for example Coot), together with the desired fractional unit cell translation vector,$t_{u}$ .$R$ is then calculated as$R = S^{-1}R_{frac}S$ and$t = S^{-1}(t_{frac} + t_{u})$ , where$S$ is the Cartesian to fractional conversion matrix for the space group the crystal belongs to (which can be found in theSCALEnrecords in a PDB file). -
All available options for
qref_prep.pycan be seen through-hor--help.
-
-
The next step is to prepare the input files for Orca. Examples can be found in the
examplesfolder.- The input files should be named
qm_i.inp, whereirefers to the i:th QM region; counting starts at 1. - The level of theory should match the junction type; using the default (type 12) the corresponding input to
Orcathen becomes! TPSS D4 DEF2-SV(P). - Energy and gradient needs to be written to disk. This is achieved through the keyword
! ENGRAD. - To read coordinates from a PDB file (
qm_i_h.pdb) use*pdbfile <charge> <multiplicity> qm_i_h.pdb, where againirefers to the i:th QM region. - Custom settings can also be supplied (see the
examplesfolder).
- The input files should be named
-
At this point it is possible to start a quantum refinement. It is however recommended to first create an empty file named
qm.lock(this disables QRef - additionally, ifqref.datis not present in the folder QRef will not run, i.e. this step can be run before starting to set up the QR job) through for example thetouchcommand, then start a refinement so that an.efffile is obtained (this file contains all thePhenixrefinement settings for the current experimental data and model). A good idea is to rename this file tophenix.paramsor similar, then edit this file and make sure the following options are set:- For reciprocal space refinement (
phenix.refine):refinement.pdb_interpretation.restraints_library.cdl = Falserefinement.pdb_interpretation.restraints_library.mcl = Falserefinement.pdb_interpretation.restraints_library.cis_pro_eh99 = Falserefinement.pdb_interpretation.secondary_structure.enabled = Falserefinement.pdb_interpretation.sort_atoms = Falserefinement.pdb_interpretation.flip_symmetric_amino_acids = Falserefinement.refine.strategy = *individual_sites individual_sites_real_space rigid_body *individual_adp group_adp tls occupancies group_anomalousrefinement.refine.sites.individual = <reciprocal selection string>refinement.hydrogens.refine = *individual riding Autorefinement.hydrogens.real_space_optimize_x_h_orientation = Falserefinement.main.nqh_flips = False
- For real space refinement (
phenix.real_space_refine):refinement.run = *minimization_global rigid_body local_grid_search morphing simulated_annealing adp occupancy nqh_flipspdb_interpretation.restraints_library.cdl = Falsepdb_interpretation.restraints_library.mcl = Falsepdb_interpretation.restraints_library.cis_pro_eh99 = Falsepdb_interpretation.flip_symmetric_amino_acids = Falsepdb_interpretation.sort_atoms = Falsepdb_interpretation.secondary_structure = Falsepdb_interpretation.reference_coordinate_restraints.enabled = Truepdb_interpretation.reference_coordinate_restraints.selection = <real space selection string>pdb_interpretation.reference_coordinate_restraints.sigma = 0.01pdb_interpretation.ramachandran_plot_restraints.enabled = False
- Other options can be set as needed.
- For reciprocal space refinement (
-
To run the quantum refinement job, make sure that the
qm.lockfile has been deleted, then execute eitherphenix.refine phenix.paramsorphenix.real_space_refine phenix.params. If there is a need to restart the job with different settings forPhenix, make sure to delete the filesettings.pickle.
When using ORCA/6.0.0 either manually run (before you submit your job) or add to the beginning of your submit script the exports below:
export OMPI_MCA_btl='^uct,ofi'
export OMPI_MCA_pml='ucx'
export OMPI_MCA_mtl='^ofi'
Add symmetry support for the QM calculations.Done.Add support for distance restraints.Done.Add support for angle restraints.Done.- Refactor code to be OOP.
- Turn QRef into a proper restraint_manager class.
Lundgren, K. J. M., Caldararu, O., Oksanen, E., & Ryde, U. (2024). "Quantum refinement in real and reciprocal space using the Phenix and ORCA software", IUCrJ, 11(6), 921-937.
doi.org/10.1107/S2052252524008406