moldesign.molecules package¶
-
class
moldesign.molecules.
AtomList
(*args, **kwargs)[source]¶ Bases:
list
,moldesign.molecules.atomcollections.AtomContainer
A list of atoms that allows attribute “slicing” - accessing an attribute of the list will return a list of atom attributes.
Example
>>> atomlist.mass == [atom.mass for atom in atomlist.atoms]
>>> getattr(atomlist, attr) == [getattr(atom, attr) for atom in atomlist.atoms]
-
atoms
¶ This is a synonym for self so that AtomContainer methods will work here too
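One way to picture attribute “slicing” is a list subclass whose attribute lookups fall through to its items. The sketch below is illustrative only: `SlicingList` and the stand-in atom objects are hypothetical names, not part of moldesign, and the library's actual implementation may differ.

```python
from types import SimpleNamespace


class SlicingList(list):
    """Hypothetical stand-in for AtomList (not the moldesign implementation)."""

    @property
    def atoms(self):
        # Synonym for self, mirroring the ``atoms`` attribute described above
        return self

    def __getattr__(self, name):
        # Called only when normal attribute lookup fails
        if name.startswith('__'):
            raise AttributeError(name)
        return [getattr(item, name) for item in self]


# Stand-in "atoms" carrying a mass attribute (plain floats, no units)
atomlist = SlicingList([SimpleNamespace(mass=1.0), SimpleNamespace(mass=12.0)])
masses = atomlist.mass
```

Because `__getattr__` is only consulted after ordinary lookup fails, regular list methods such as `append` keep working unchanged.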
-
-
class
moldesign.molecules.
Bond
(a1, a2, order=None)[source]¶ Bases:
object
A bond between two atoms.
Parameters: Notes
Comparisons and hashes involving bonds will return True if the atoms involved in the bonds are the same. Bond orders are not compared.
These objects are used to represent and pass bond data only - they are not used for storage.
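The comparison and hashing behaviour described in the notes can be sketched as follows. `FakeAtom` and `BondSketch` are hypothetical stand-ins written for this illustration, assuming only that atoms are hashable and expose an `index`; this is not the moldesign source.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FakeAtom:
    """Hypothetical minimal atom: hashable, with an index."""
    index: int


class BondSketch:
    """Hypothetical bond: equality and hash depend on the atoms only."""

    def __init__(self, a1, a2, order=None):
        # Store the atoms so that a1.index < a2.index
        if a1.index > a2.index:
            a1, a2 = a2, a1
        self.a1, self.a2, self.order = a1, a2, order

    def __eq__(self, other):
        if not isinstance(other, BondSketch):
            return NotImplemented
        # The bond order is deliberately left out of the comparison
        return (self.a1, self.a2) == (other.a1, other.a2)

    def __hash__(self):
        return hash((self.a1, self.a2))


single = BondSketch(FakeAtom(3), FakeAtom(1), order=1)
double = BondSketch(FakeAtom(1), FakeAtom(3), order=2)
```

With these definitions, the two bonds compare equal and collapse to one entry in a set, even though their orders differ.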
-
a1
¶ Atom – First atom in the bond; assigned so that
self.a1.index < self.a2.index
-
a2
¶ Atom – Second atom in the bond; assigned so that
self.a2.index > self.a1.index
-
order
¶ int – bond order (can be
None
); not used in comparisons
-
ff
¶ mdt.forcefield.BondTerm – the force-field term for this bond (or
None
if no forcefield is present)
-
name
¶ str – name of the bond
-
-
class
moldesign.molecules.
Atom
(name=None, atnum=None, mass=None, chain=None, residue=None, formal_charge=None, pdbname=None, pdbindex=None, element=None)[source]¶ Bases:
moldesign.molecules.atoms.AtomDrawingMixin
,moldesign.molecules.atoms.AtomGeometryMixin
,moldesign.molecules.atoms.AtomPropertyMixin
,moldesign.molecules.atoms.AtomReprMixin
A data structure representing an atom.
Atom
objects store information about individual atoms within a larger molecular system, providing access to atom-specific geometric, biomolecular, topological and property information. Each
Molecule
is composed of a list of atoms.
Atoms can be instantiated directly, but they will generally be created automatically as part of molecules.
Parameters: - name (str) – The atom’s name (if not passed, set to the element name + the atom’s index)
- atnum (int) – Atomic number (if not passed, determined from element if possible)
- mass (units.Scalar[mass]) – The atomic mass (if not passed, set to the most abundant isotopic mass)
- chain (moldesign.Chain) – biomolecular chain that this atom belongs to
- residue (moldesign.Residue) – biomolecular residue that this atom belongs to
- pdbname (str) – name from PDB entry, if applicable
- pdbindex (int) – atom serial number in the PDB entry, if applicable
- element (str) – Elemental symbol (if not passed, determined from atnum if possible)
Atom instance attributes:
-
name
¶ str – A descriptive name for this atom
-
element
¶ str – IUPAC elemental symbol (‘C’, ‘H’, ‘Cl’, etc.)
-
index
¶ int – the atom’s current index in the molecule (
self is self.parent.atoms[ self.index]
)
-
atnum
¶ int – atomic number (synonyms: atomic_num)
-
mass
¶ u.Scalar[mass] – the atom’s mass
-
position
¶ units.Vector[length] – atomic position vector. Once an atom is part of a molecule, this quantity will refer to
self.molecule.positions[self.index]
.
-
momentum
¶ units.Vector[momentum] – atomic momentum vector. Once an atom is part of a molecule, this quantity will refer to
self.molecule.momenta[self.index]
.
-
x,y,z
u.Scalar[length] – x, y, and z components of
atom.position
-
vx, vy, vz
u.Scalar[length/time] – x, y, and z components of
atom.velocity
-
px, py, pz
u.Scalar[momentum] – x, y, and z components of
atom.momentum
-
fx, fy, fz
u.Scalar[force] – x, y, and z components of
atom.force
-
residue
¶ moldesign.Residue – biomolecular residue that this atom belongs to
-
chain
¶ moldesign.Chain – biomolecular chain that this atom belongs to
-
parent
¶ moldesign.Molecule – molecule that this atom belongs to
-
index
int – index in the parent molecule:
atom is atom.parent.atoms[index]
Atom methods and properties
See also methods offered by the mixin superclasses:
AtomDrawingMixin
AtomGeometryMixin
AtomPropertyMixin
AtomReprMixin
-
bond_graph
¶ Mapping[Atom, int] – dictionary of this atom’s bonded neighbors, of the form
{bonded_atom1: bond_order1, ...}
-
bond_to
(other, order)[source]¶ Create or modify a bond with another atom
Parameters: Returns: bond object
Return type:
-
bonds
¶ List[Bond] – list of all bonds this atom is involved in
-
copy
(self)¶ Copy a group of atoms which may already have bonds, residues, and a parent molecule assigned. Do so by copying only the relevant entities, and creating a “mask” with deepcopy’s memo function to stop anything else from being copied.
Returns: list of copied atoms Return type: AtomList
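The “mask” idea can be illustrated with the standard library alone: entries placed in the `memo` dictionary before calling `copy.deepcopy` are returned as-is instead of being copied. The sketch below is a minimal example under that assumption; `Node` and `copy_without_parent` are hypothetical names, and the library's own masking logic is not shown here.

```python
import copy


class Node:
    """Hypothetical object holding a reference to a parent."""

    def __init__(self, name, parent=None):
        self.name = name
        self.parent = parent


def copy_without_parent(node):
    # Pre-populating memo maps the parent's id to None, so deepcopy
    # substitutes None rather than recursing into the parent.
    memo = {id(node.parent): None}
    return copy.deepcopy(node, memo)


molecule = Node('mol')
atom = Node('C1', parent=molecule)
clone = copy_without_parent(atom)
```

The original object is left untouched; only the copy has its parent reference masked out.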
-
elem
¶ str – elemental symbol
-
element
str – elemental symbol
-
force
¶ (units.Vector[force]) – atomic force vector. This quantity must be calculated - it is equivalent to
self.molecule.forces[self.index]
Raises: moldesign.NotCalculatedError
– if molecular forces have not been calculated
-
heavy_bonds
¶ List[Bond] – list of all heavy atom bonds (where BOTH atoms are not hydrogen)
Note
this returns an empty list if called on a hydrogen atom
-
nbonds
¶ int – the number of other atoms this atom is bonded to
-
num_bonds
¶ int – the number of other atoms this atom is bonded to
-
symbol
¶ str – elemental symbol
-
valence
¶ int – the sum of this atom’s bond orders
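The relationship between an atom's `bond_graph`, `num_bonds` and `valence` can be sketched with a plain dictionary. The helper names below are hypothetical, and atom names stand in for atom objects.

```python
def num_bonds(bond_graph):
    """Number of bonded neighbors in a {neighbor: bond_order} mapping."""
    return len(bond_graph)


def valence(bond_graph):
    """Sum of the bond orders in a {neighbor: bond_order} mapping."""
    return sum(bond_graph.values())


# Carbon in formaldehyde: one double bond to O, two single bonds to H
carbon_neighbors = {'O1': 2, 'H1': 1, 'H2': 1}
n_neighbors = num_bonds(carbon_neighbors)
total_order = valence(carbon_neighbors)
```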
-
velocity
¶ u.Vector[length/time, 3] – velocity of this atom; equivalent to
self.momentum/self.mass
-
class
moldesign.molecules.
Entity
(name=None, molecule=None, index=None, pdbname=None, pdbindex=None, **kwargs)[source]¶ Bases:
moldesign.molecules.atomcollections.AtomContainer
Generalized storage mechanism for hierarchical representation of biomolecules, e.g. by residue, chain, etc. Permits other groupings, provided that everything is tree-like.
All children of a given entity must have unique names. An individual child can be retrieved with
entity.childname
or
entity['childname']
or
entity[index]
Yields: Entity or mdt.Atom – this entity’s children, in order
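The three access styles, together with the uniqueness requirement, can be sketched with a small container. `EntitySketch` is a hypothetical class written for this illustration and assumes children are kept in insertion order; it is not the moldesign implementation.

```python
class EntitySketch:
    """Hypothetical tree node with uniquely named, ordered children."""

    def __init__(self, name):
        self.name = name
        self._children = {}  # name -> child, in insertion order

    def add(self, item, key=None):
        key = item.name if key is None else key
        if key in self._children:
            raise KeyError('%s already exists' % key)
        self._children[key] = item

    def __getitem__(self, key):
        # Integer keys index by position, anything else by name
        if isinstance(key, int):
            return list(self._children.values())[key]
        return self._children[key]

    def __getattr__(self, name):
        # Only reached when normal attribute lookup fails
        children = self.__dict__.get('_children', {})
        try:
            return children[name]
        except KeyError:
            raise AttributeError(name) from None

    def __iter__(self):
        return iter(self._children.values())


chain = EntitySketch('A')
residue = EntitySketch('ALA1')
chain.add(residue)
```

Here `chain.ALA1`, `chain['ALA1']` and `chain[0]` all return the same child, and adding a second child named `ALA1` raises `KeyError`.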
-
class
moldesign.molecules.
Instance
(name=None, molecule=None, index=None, pdbname=None, pdbindex=None, **kwargs)[source]¶ Bases:
moldesign.molecules.biounits.Entity
The singleton biomolecular container for each
Molecule
. Its children are generally PDB chains. Users won’t ever really see this object.
-
class
moldesign.molecules.
Residue
(**kwargs)[source]¶ Bases:
moldesign.molecules.biounits.Entity
A biomolecular residue - most often an amino acid, a nucleic base, or a solvent molecule. In PDB structures, also often refers to non-biochemical molecules.
Its children are almost always atoms.
-
parent
¶ mdt.Molecule – the molecule this residue belongs to
-
chain
¶ Chain – the chain this residue belongs to
-
add
(atom, key=None)[source]¶ Add a child to this entity.
Raises: KeyError
– if an object with this key already exists
Parameters: - item (Entity or mdt.Atom) – the child object to add
- key (str) – Key to retrieve this item (default:
item.name
)
-
assign_template_bonds
()[source]¶ Assign bonds from bioresidue templates.
Only assigns bonds that are internal to this residue (does not connect different residues). The topologies here assume pH 7.4 and may need to be corrected for other pH values.
See also
moldesign.Chain.assign_biopolymer_bonds for assigning inter-residue bonds
Raises: - ValueError – if
residue.resname
is not in bioresidue templates
- KeyError – if an atom in this residue is not recognized
-
atomnames
¶ Residue – synonym for
self
for the sake of readability, e.g.
molecule.chains['A'].residues[123].atomnames['CA']
-
atoms
¶
-
backbone
¶ AtomList – all backbone atoms for nucleic and protein residues (identified using PDB names); returns None for other residue types
-
code
¶ str – one-letter amino acid code or two-letter nucleic acid code, or ‘?’ otherwise
-
copy
()[source]¶ Copy a group of atoms which may already have bonds, residues, and a parent molecule assigned. Do so by copying only the relevant entities, and creating a “mask” with deepcopy’s memo function to stop anything else from being copied.
Returns: list of copied atoms Return type: AtomList
-
is_3prime_end
¶ bool – this is the last base in a strand
Raises: ValueError
– if this residue is not a DNA base
-
is_5prime_end
¶ bool – this is the first base in a strand
Raises: ValueError
– if this residue is not a DNA base
-
is_c_terminal
¶ bool – this is the last residue in a peptide
Raises: ValueError
– if this residue is not an amino acid
-
is_monomer
¶ bool – this residue is not part of a biopolymer
-
is_n_terminal
¶ bool – this is the first residue in a peptide
Raises: ValueError
– if this residue is not an amino acid
-
is_standard_residue
¶ bool – this residue is a “standard residue” for the purposes of a PDB entry.
In PDB files, this will be stored using ‘ATOM’ if this is a standard residue and ‘HETATM’ records if not.
Note
We currently define “standard” residues as those whose 3 letter residue code appears in the
moldesign.data.RESIDUE_DESCRIPTIONS
dictionary. Although this seems to work well, we’d welcome a PR with a less hacky method.
References
PDB format guide: http://www.wwpdb.org/documentation/file-format
-
markdown_summary
()[source]¶ Markdown-formatted information about this residue
Returns: markdown-formatted string Return type: str
-
next_residue
¶ Residue – The next residue in the chain (in the C-direction for proteins, 3’ direction for nucleic acids)
Raises: - NotImplementedError – If we don’t know how to deal with this type of biopolymer
- StopIteration – If there isn’t a next residue (i.e. it’s a 3’- or C-terminus)
-
prev_residue
¶ Residue – The previous residue in the chain (in the N-direction for proteins, 5’ direction for nucleic acids)
Raises: - NotImplementedError – If we don’t know how to deal with this type of biopolymer
- StopIteration – If there isn’t a previous residue (i.e. it’s a 5’- or N-terminus)
-
resname
¶ str – Synonym for pdbname
-
sidechain
¶ AtomList – all sidechain atoms for nucleic and protein residues (defined as non-backbone atoms); returns None for other residue types
-
type
¶ str – Classification of the residue (protein, solvent, dna, water, unknown)
-
-
class
moldesign.molecules.
Chain
(pdbname=None, **kwargs)[source]¶ Bases:
moldesign.molecules.biounits.Entity
Biomolecular chain class - its children are almost always residues.
-
parent
¶ mdt.Molecule – the molecule this chain belongs to
-
chain
¶ Chain – the chain this residue belongs to
-
assign_biopolymer_bonds
()[source]¶ Connect bonds between residues in this chain.
See also
moldesign.Residue.assign_template_bonds
Raises: - ValueError – if
residue.resname
is not in bioresidue templates
- KeyError – if an atom in this residue is not recognized
-
c_terminal
¶ moldesign.Residue – The chain’s C-terminus (or
None
if it does not exist)
-
copy
()[source]¶ Copy a group of atoms which may already have bonds, residues, and a parent molecule assigned. Do so by copying only the relevant entities, and creating a “mask” with deepcopy’s memo function to stop anything else from being copied.
Returns: list of copied atoms Return type: AtomList
-
fiveprime_end
¶ moldesign.Residue – The chain’s 5’ base (or
None
if it does not exist)
-
get_ligand
()[source]¶ Return a (single) ligand if it exists; raises ValueError if there’s not exactly one
This is a utility routine to get a single ligand from a chain. If there’s exactly one unclassified residue, it is returned. If not, ValueError is raised - use
Chain.unclassified_residues()
to get an iterator over all unclassified residues.
Returns: ligand residue Return type: moldesign.Residue Raises: ValueError
– if the chain does not contain exactly one unclassifiable residue
-
n_terminal
¶ moldesign.Residue – The chain’s N-terminus (or
None
if it does not exist)
-
nresidues
¶
-
num_residues
¶
-
numresidues
¶
-
polymer_residues
¶
-
residues
¶ ChildList – list of residues in this chain
-
sequence
¶ str – this chain’s residue sequence with one-letter residue codes
-
solvent_residues
¶
-
threeprime_end
¶ moldesign.Residue – The chain’s 3’ base (or
None
if it does not exist)
-
type
¶ str – the type of chain - protein, DNA, solvent, etc.
This field returns the type of chain, classified by the following rules:
1) If the chain contains only one type of residue, it is given that classification (so a chain containing only ions has type “ion”)
2) If the chain contains a biopolymer + ligands and solvent, it is classified as a biopolymer (i.e. ‘protein’, ‘dna’, or ‘rna’). This is the most common case with .pdb files from the PDB.
3) If the chain contains multiple biopolymer types, it will be given a hybrid classification (e.g. ‘dna/rna’, ‘protein/dna’) - this is rare!
4) If it contains multiple kinds of non-biopolymer residues, it will be called “solvent” (if all non-bio residues are water/solvent/ion) or given a hybrid name as in 3)
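The classification rules can be read as a small decision procedure over the residue types in a chain. The sketch below is one possible reading, not the library's code: `classify_chain` is a hypothetical function, and the ordering of hybrid names (protein, then dna, then rna for biopolymers; alphabetical otherwise) is an assumption chosen to reproduce the examples quoted above.

```python
BIOPOLYMERS = ('protein', 'dna', 'rna')      # assumed priority order
SOLVENT_LIKE = {'water', 'solvent', 'ion'}


def classify_chain(residue_types):
    """Classify a chain from an iterable of residue type strings."""
    kinds = set(residue_types)
    if not kinds:
        raise ValueError('chain has no residues')
    # Rule 1: a single residue type gives that classification
    if len(kinds) == 1:
        return next(iter(kinds))
    # Rules 2 and 3: biopolymers take precedence over everything else
    polymers = [kind for kind in BIOPOLYMERS if kind in kinds]
    if polymers:
        return '/'.join(polymers)
    # Rule 4: only non-biopolymer residues are present
    if kinds <= SOLVENT_LIKE:
        return 'solvent'
    return '/'.join(sorted(kinds))
```

For example, a chain of protein residues plus water and an unknown ligand is classified as `'protein'`, while a chain mixing DNA and RNA residues becomes `'dna/rna'`.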
-
unclassified_residues
¶
-
-
class
moldesign.molecules.
MolecularProperties
(mol, **properties)[source]¶ Bases:
moldesign.utils.classes.DotDict
Stores property values for a molecule. These objects will generally be created and updated by EnergyModels, not by users.
-
class
moldesign.molecules.
Molecule
(atomcontainer, name=None, bond_graph=None, copy_atoms=False, pdbname=None, charge=None, electronic_state_index=0)[source]¶ Bases:
moldesign.molecules.atomcollections.AtomContainer
,moldesign.molecules.molecule.MolConstraintMixin
,moldesign.molecules.molecule.MolPropertyMixin
,moldesign.molecules.molecule.MolDrawingMixin
,moldesign.molecules.molecule.MolReprMixin
,moldesign.molecules.molecule.MolTopologyMixin
,moldesign.molecules.molecule.MolSimulationMixin
Molecule
objects store a molecular system, including atoms, 3D coordinates, molecular properties, biomolecular entities, and other model-specific information. Interfaces with simulation models take place through the molecule object.
Molecule objects will generally be created by reading files or parsing other input; see, for example:
moldesign.read()
,moldesign.from_smiles()
,moldesign.from_pdb()
, etc.
This constructor is useful, however, for copying other molecular structures (see examples below).
Parameters: - atomcontainer (AtomContainer or AtomList or List[moldesign.Atom]) –
atoms that make up this molecule.
Note
If the passed atoms don’t already belong to a molecule, they will be assigned to this one. If they DO already belong to a molecule, they will be copied, leaving the original molecule untouched.
- name (str) – name of the molecule (automatically generated if not provided)
- bond_graph (dict) – dictionary specifying bonds between the atoms - of the form
{atom1:{atom2:bond_order, atom3:bond_order}, atom2:...}
This structure must be symmetric; we require
bond_graph[atom1][atom2] == bond_graph[atom2][atom1]
- copy_atoms (bool) – Create the molecule with copies of the passed atoms (they will be copied automatically if they already belong to another molecule)
- pdbname (str) – Name of the PDB file
- charge (units.Scalar[charge]) – molecule’s formal charge
- electronic_state_index (int) – index of the molecule’s electronic state
Examples
Use the
Molecule
class to create copies of other molecules and substructures thereof:
>>> benzene = mdt.from_name('benzene')
>>> benzene_copy = mdt.Molecule(benzene, name='benzene copy')
>>> protein = mdt.from_pdb('3AID')
>>> carbon_copies = mdt.Molecule([atom for atom in protein.atoms if atom.atnum==6])
>>> first_residue_copy = mdt.Molecule(protein.residues[0])
Molecule instance attributes:
-
atoms
¶ AtomList – List of all atoms in this molecule.
-
bond_graph
¶ dict – symmetric dictionary specifying bonds between the atoms:
bond_graph = {atom1:{atom2:bond_order, atom3:bond_order}, atom2:...}
bond_graph[atom1][atom2] == bond_graph[atom2][atom1]
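The symmetry requirement can be checked before a bond graph is handed to the constructor. The helper below is a hypothetical sketch, not a moldesign function; strings stand in for atom objects.

```python
def is_symmetric(bond_graph):
    """Return True if bond_graph[a][b] == bond_graph[b][a] for every bond."""
    for atom, neighbors in bond_graph.items():
        for other, order in neighbors.items():
            # A missing reverse entry counts as asymmetric
            if bond_graph.get(other, {}).get(atom) != order:
                return False
    return True


water = {
    'O': {'H1': 1, 'H2': 1},
    'H1': {'O': 1},
    'H2': {'O': 1},
}
broken = {
    'O': {'H1': 1},
    'H1': {},
}
```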
-
residues
¶ List[moldesign.Residue] – flat list of all biomolecular residues in this molecule
-
chains
¶ Dict[moldesign.Chain] – Biomolecular chains - individual chains can be accessed as
mol.chains[list_index]
or
mol.chains[chain_name]
-
name
¶ str – A descriptive name for this molecule
-
charge
¶ units.Scalar[charge] – molecule’s formal charge
-
constraints
¶ List[moldesign.geom.GeometryConstraint] – list of constraints
-
ndims
¶ int – length of the positions, momenta, and forces arrays (usually 3*self.num_atoms)
-
num_atoms
¶ int – number of atoms (synonym: natoms)
-
num_bonds
¶ int – number of bonds (synonym: nbonds)
-
positions
¶ units.Array[length] – Nx3 array of atomic positions
-
momenta
¶ units.Array[momentum] – Nx3 array of atomic momenta
-
masses
¶ units.Vector[mass] – vector of atomic masses
-
dim_masses
¶ units.Array[mass] – Nx3 array of atomic masses (for numerical convenience - allows you to calculate velocity, for instance, as
velocity = mol.momenta/mol.dim_masses
)
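The purpose of an Nx3 mass array is that it has the same shape as the momenta, so velocities follow from an elementwise division. The sketch below shows the idea with plain nested lists and unitless floats; the helper names are hypothetical, and the library itself works with unit-aware arrays.

```python
def make_dim_masses(masses):
    """Repeat each atomic mass three times to form an Nx3 table."""
    return [[m, m, m] for m in masses]


def velocities(momenta, dim_masses):
    """Elementwise momenta / dim_masses for Nx3 nested lists."""
    return [[p / m for p, m in zip(p_row, m_row)]
            for p_row, m_row in zip(momenta, dim_masses)]


masses = [1.0, 12.0]
momenta = [[1.0, 0.0, -2.0], [12.0, 24.0, 0.0]]
vel = velocities(momenta, make_dim_masses(masses))
```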
-
time
¶ units.Scalar[time] – current time in dynamics
-
energy_model
¶ moldesign.models.base.EnergyModelBase – Object that calculates molecular properties - driven by mol.calculate()
-
integrator
¶ moldesign.integrators.base.IntegratorBase – Object that drives movement of 3D coordinates in time, driven by mol.run()
-
is_biomolecule
¶ bool – True if this molecule contains at least 2 biochemical residues
Molecule methods and properties
See also methods offered by the mixin superclasses:
moldesign.molecules.AtomContainer
moldesign.molecules.MolPropertyMixin
moldesign.molecules.MolDrawingMixin
moldesign.molecules.MolSimulationMixin
moldesign.molecules.MolTopologyMixin
moldesign.molecules.MolConstraintMixin
moldesign.molecules.MolReprMixin
-
addatom
(newatom)[source]¶ Add a new atom to the molecule
Parameters: newatom (moldesign.Atom) – The atom to add (it will be copied if it already belongs to a molecule)
-
addatoms
(newatoms)[source]¶ Add new atoms to this molecule. For now, we really just rebuild the entire molecule in place.
Parameters: newatoms (List[moldesign.Atom]) –
-
bonds
¶ Iterator over all bonds in the molecule
Yields: moldesign.atoms.Bond – bond object
-
deletebond
(bond)[source]¶ Remove this bond from the molecule’s topology
Parameters: bond (Bond) – bond to remove
-
is_small_molecule
¶ bool – True if molecule’s mass is less than 500 Daltons (not mutually exclusive with
self.is_biomolecule
)
-
nbonds
¶ int – number of chemical bonds in this molecule
-
newbond
(a1, a2, order)[source]¶ Create a new bond
Parameters: - a1 (moldesign.Atom) – First atom in the bond
- a2 (moldesign.Atom) – Second atom in the bond
- order (int) – order of the bond
Returns: moldesign.Bond
-
num_bonds
int – number of chemical bonds in this molecule
-
velocities
¶ u.Vector[length/time] – Nx3 array of atomic velocities
-
class
moldesign.molecules.
Trajectory
(mol, unit_system=None, first_frame=False)[source]¶ Bases:
object
A
Trajectory
stores information about a molecule’s motion and how its properties change as it moves.
A trajectory object contains
- a reference to the
moldesign.Molecule
it describes, and
- a list of
Frame
objects, each one containing a snapshot of the molecule at a particular point in its motion.
Parameters: - mol (moldesign.Molecule) – the trajectory will describe the motion of this molecule
- unit_system (u.UnitSystem) – convert all attributes to this unit system (default:
moldesign.units.default
)
- first_frame (bool) – Create the trajectory’s first
Frame
from the molecule’s current position
-
mol
¶ moldesign.Molecule – the molecule object that this trajectory comes from
-
frames
¶ List[Frame] – a list of the trajectory frames in the order they were created
-
info
¶ str – text describing this trajectory
-
unit_system
¶ u.UnitSystem – convert all attributes to this unit system
-
DONOTAPPLY
= set(['kinetic_energy'])¶
-
MOL_ATTRIBUTES
= ['positions', 'momenta', 'time']¶
-
align_orbital_phases
(reference_frame=None)[source]¶ Try to remove orbital sign flips between frames. If reference_frame is not passed, we’ll start with frame 0 and align successive pairs of orbitals. If reference_frame is an int, we’ll move forwards and backwards from that frame number. Otherwise, we’ll try to align every orbital frame to those in reference_frame
Parameters: reference_frame (int or Frame) – Frame
containing the orbitals to align with (default: align each frame with the previous one)
-
apply_frame
(frame)[source]¶ Reconstruct the underlying molecule with the given frame. Right now, any data not passed is ignored, which may result in properties that aren’t synced up with each other ...
-
draw
(**kwargs)¶ TrajectoryViewer: create a trajectory visualization
Parameters: **kwargs (dict) – keyword arguments for ipywidgets.Box
-
draw3d
(**kwargs)[source]¶ TrajectoryViewer: create a trajectory visualization
Parameters: **kwargs (dict) – keyword arguments for ipywidgets.Box
-
kinetic_energy
¶
-
kinetic_temperature
¶
-
new_frame
(properties=None, **additional_data)[source]¶ Create a new frame, EITHER from the parent molecule or from a list of properties
Parameters: Returns: frame number (0-based)
Return type:
-
num_frames
¶ int – number of frames in this trajectory
-
plot
(x, y, **kwargs)[source]¶ Create a matplotlib plot of property x against property y
Parameters: Returns: the lines that were plotted
Return type: List[matplotlib.lines.Line2D]
-
rmsd
(atoms=None, reference=None)[source]¶ Calculate root-mean-square displacement for each frame in the trajectory.
The RMSD between times \(t\) and \(t_0\) is given by
\(\text{RMSD}(t;t_0) =\sqrt{\frac{1}{N} \sum_{i \in \text{atoms}} \left( \mathbf{R}_i(t) - \mathbf{R}_i(t_0) \right)^2}\),
where \(\mathbf{R}_i(t)\) is the position of atom \(i\) at time \(t\) and \(N\) is the number of atoms included in the sum.
Parameters: - atoms (list[mdt.Atom]) – list of atoms to calculate the RMSD for (default: all atoms in the
Molecule
)
- reference (u.Vector[length]) – Reference positions for RMSD. (default:
traj.frames[0].positions
)
Returns: list of RMSD displacements for each frame in the trajectory
Return type: u.Vector[length]
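As a worked illustration of the RMSD calculation, the sketch below computes one value per frame, assuming the conventional mean over atoms (a factor of 1/N inside the square root). The function name and the data layout (each frame as a list of (x, y, z) tuples of unitless floats) are assumptions made for this example; the library operates on unit-aware arrays.

```python
import math


def rmsd_per_frame(frames, reference=None):
    """RMSD of every frame relative to a reference set of positions.

    frames: list of frames, each a list of (x, y, z) tuples.
    reference: positions to compare against (default: the first frame).
    """
    ref = frames[0] if reference is None else reference
    values = []
    for positions in frames:
        # Sum of squared displacements over all atoms and components
        total = sum((a - b) ** 2
                    for pos, ref_pos in zip(positions, ref)
                    for a, b in zip(pos, ref_pos))
        values.append(math.sqrt(total / len(ref)))
    return values


frames = [
    [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)],
    [(0.0, 0.0, 2.0), (1.0, 0.0, 2.0)],  # both atoms shifted by 2 along z
]
values = rmsd_per_frame(frames)
```

In this example both atoms move by the same amount, so the RMSD of the second frame equals that displacement.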
-
slice_frames
(key, missing=None)[source]¶ Return an array giving the value of
key
at each frame.
Parameters: - key (str) – name of the property, e.g., time, potential_energy, annotation, etc
- missing – value to return if a given frame does not have this property
Returns: vector containing the value at each frame (or the value given in the
missing
keyword, for frames that lack the property); len = len(self)
Return type: moldesign.units.Vector
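The role of the `missing` argument can be sketched with frames modeled as plain dictionaries. This is a simplified stand-in written for illustration: the function below is hypothetical and returns a Python list rather than a unit-aware vector.

```python
def slice_frames(frames, key, missing=None):
    """Collect frame[key] for every frame, substituting `missing` where absent."""
    return [frame.get(key, missing) for frame in frames]


frames = [
    {'time': 0.0, 'potential_energy': -1.5},
    {'time': 1.0},  # no energy stored for this frame
]
times = slice_frames(frames, 'time')
energies = slice_frames(frames, 'potential_energy', missing=0.0)
```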
Submodules¶
moldesign.molecules.atomcollections module¶
-
class
moldesign.molecules.atomcollections.
AtomContainer
(*args, **kwargs)[source]¶ Bases:
object
Mixin functions for objects that have a
self.atoms
attribute with a list of atoms
-
atoms
¶ List[Atom] – a list of atoms
-
angle
(a1, a2, a3)[source]¶ Calculate the angle between three atoms.
Atoms can be passed as the atoms themselves or as the atom names
Parameters: a1, a2, a3 – atoms defining the angle Returns: units.Scalar[angle]
-
atoms_within
(radius, other=None, include_self=False)[source]¶ Return all atoms in an object within a given radius of this object
Parameters: - radius (u.Scalar[length]) – radius to search for atoms
- other (AtomContainer) – object containing the atoms to search (default: self.parent)
- include_self (bool) – if True, include the atoms from this object (since, by definition, their distance from this object is 0)
Returns: list of the atoms within
radius
of this object
Return type:
-
calc_displacements
()[source]¶ Calculate an array of displacements between all atoms in this object
Returns: array of pairwise displacements between atoms Return type: u.Array[length] Example
>>> displacements = self.calc_displacements()
>>> displacements[i, j] == (self.atoms[i].position - self.atoms[j].position)
-
calc_distance_array
(other=None)[source]¶ Calculate an array of pairwise distance between all atoms in self and other
Parameters: other (AtomContainer) – object to calculate distances to (default: self) Returns: 2D array of pairwise distances between the two objects Return type: u.Array[length] Example
>>> dists = self.calc_distance_array(other)
>>> dists[i, j] == self.atoms[i].distance(other.atoms[j])
-
center_of_mass
¶ units.Vector[length] – The (x,y,z) coordinates of this object’s center of mass
-
com
¶ units.Vector[length] – The (x,y,z) coordinates of this object’s center of mass
-
copy
()[source]¶ Copy a group of atoms which may already have bonds, residues, and a parent molecule assigned. Do so by copying only the relevant entities, and creating a “mask” with deepcopy’s memo function to stop anything else from being copied.
Returns: list of copied atoms Return type: AtomList
-
dihedral
(a1, a2, a3=None, a4=None)[source]¶ Calculate the dihedral angle between atoms a1, a2, a3, a4.
Atoms can be passed as the atoms themselves or as the atom names
Parameters: a1, a2, a3, a4 – atoms defining the dihedral Returns: units.Scalar[angle]
-
distance
(other)[source]¶ Returns closest distance between this and the other entity
Parameters: other (AtomContainer) – object to calculate distance to Returns: closest distance between self and other Return type: u.Scalar[length] Example
>>> distance = self.distance(other)
>>> distance == self.calc_distance_array(other).min()
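The relationship between the pairwise distance array and the closest distance can be sketched in pure Python. The function names and the coordinate layout (lists of (x, y, z) tuples of unitless floats) are assumptions for this illustration; they are not the library's implementation.

```python
import math


def distance_array(coords_a, coords_b):
    """Table of distances: result[i][j] = |coords_a[i] - coords_b[j]|."""
    return [[math.dist(p, q) for q in coords_b] for p in coords_a]


def closest_distance(coords_a, coords_b):
    """Smallest entry of the pairwise distance table."""
    return min(min(row) for row in distance_array(coords_a, coords_b))


group_a = [(0.0, 0.0, 0.0), (0.0, 0.0, 1.0)]
group_b = [(0.0, 0.0, 4.0), (3.0, 4.0, 0.0)]
table = distance_array(group_a, group_b)
nearest = closest_distance(group_a, group_b)
```

The closest approach here is between the second point of `group_a` and the first point of `group_b`.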
-
draw
(width=500, height=500, show_2dhydrogens=None, display=False)[source]¶ Visualize this molecule (Jupyter only).
Creates a 3D viewer and, for small molecules, a 2D viewer.
Parameters: Returns: moldesign.ui.SelectionGroup
-
draw2d
(highlight_atoms=None, show_hydrogens=None, **kwargs)[source]¶ Draw this object in 2D. Jupyter only.
Parameters: - highlight_atoms (List[Atom]) – atoms to highlight when the structure is drawn
- show_hydrogens (bool) – whether to draw the hydrogens or not (default: True if there are 10 or fewer heavy atoms, False otherwise)
Returns: 2D viewer object
Return type: mdt.ChemicalGraphViewer
-
draw3d
(highlight_atoms=None, **kwargs)[source]¶ Draw this object in 3D. Jupyter only.
Parameters: highlight_atoms (List[Atom]) – atoms to highlight when the structure is drawn Returns: 3D viewer object Return type: mdt.GeometryViewer
-
get_atoms
(**queries)[source]¶ Allows keyword-based atom queries.
Parameters: **queries (dict) – parameters to match Returns: the atoms matching this query Return type: AtomList
-
heavy_atoms
¶ AtomList – a list of all heavy atoms (i.e., non-hydrogen) in this object
-
mass
¶ u.Scalar[mass] – total mass of this object
-
natoms
¶ int – number of atoms in this object
-
num_atoms
¶ int – number of atoms in this object
-
positions
¶ u.Array[length] – (Nx3) array of atomic positions
-
rotate
(angle, axis, center=None)[source]¶ Rotate this object in 3D space
Parameters: - angle (u.Scalar[angle]) – angle to rotate by
- axis (u.Vector[length]) – axis to rotate about (len=3)
- center (u.Vector[length]) – center of rotation (len=3) (default: origin)
-
-
class
moldesign.molecules.atomcollections.
AtomList
(*args, **kwargs)[source]¶ Bases:
list
,moldesign.molecules.atomcollections.AtomContainer
A list of atoms that allows attribute “slicing” - accessing an attribute of the list will return a list of atom attributes.
Example
>>> atomlist.mass == [atom.mass for atom in atomlist.atoms]
>>> getattr(atomlist, attr) == [getattr(atom, attr) for atom in atomlist.atoms]
-
atoms
¶ This is a synonym for self so that AtomContainer methods will work here too
-
moldesign.molecules.atoms module¶
-
class
moldesign.molecules.atoms.
Atom
(name=None, atnum=None, mass=None, chain=None, residue=None, formal_charge=None, pdbname=None, pdbindex=None, element=None)[source]¶ Bases:
moldesign.molecules.atoms.AtomDrawingMixin
,moldesign.molecules.atoms.AtomGeometryMixin
,moldesign.molecules.atoms.AtomPropertyMixin
,moldesign.molecules.atoms.AtomReprMixin
A data structure representing an atom.
Atom
objects store information about individual atoms within a larger molecular system, providing access to atom-specific geometric, biomolecular, topological and property information. Each
Molecule
is composed of a list of atoms.
Atoms can be instantiated directly, but they will generally be created automatically as part of molecules.
Parameters: - name (str) – The atom’s name (if not passed, set to the element name + the atom’s index)
- atnum (int) – Atomic number (if not passed, determined from element if possible)
- mass (units.Scalar[mass]) – The atomic mass (if not passed, set to the most abundant isotopic mass)
- chain (moldesign.Chain) – biomolecular chain that this atom belongs to
- residue (moldesign.Residue) – biomolecular residue that this atom belongs to
- pdbname (str) – name from PDB entry, if applicable
- pdbindex (int) – atom serial number in the PDB entry, if applicable
- element (str) – Elemental symbol (if not passed, determined from atnum if possible)
Atom instance attributes:
-
name
¶ str – A descriptive name for this atom
-
element
¶ str – IUPAC elemental symbol (‘C’, ‘H’, ‘Cl’, etc.)
-
index
¶ int – the atom’s current index in the molecule (
self is self.parent.atoms[ self.index]
)
-
atnum
¶ int – atomic number (synonyms: atomic_num)
-
mass
¶ u.Scalar[mass] – the atom’s mass
-
position
¶ units.Vector[length] – atomic position vector. Once an atom is part of a molecule, this quantity will refer to
self.molecule.positions[self.index]
.
-
momentum
¶ units.Vector[momentum] – atomic momentum vector. Once an atom is part of a molecule, this quantity will refer to
self.molecule.momenta[self.index]
.
-
x,y,z
u.Scalar[length] – x, y, and z components of
atom.position
-
vx, vy, vz
u.Scalar[length/time] – x, y, and z components of
atom.velocity
-
px, py, pz
u.Scalar[momentum] – x, y, and z components of
atom.momentum
-
fx, fy, fz
u.Scalar[force] – x, y, and z components of
atom.force
-
residue
¶ moldesign.Residue – biomolecular residue that this atom belongs to
-
chain
¶ moldesign.Chain – biomolecular chain that this atom belongs to
-
parent
¶ moldesign.Molecule – molecule that this atom belongs to
-
index
int – index in the parent molecule:
atom is atom.parent.atoms[index]
Atom methods and properties
See also methods offered by the mixin superclasses:
-
bond_graph
¶ Mapping[Atom, int] – dictionary of this atom’s bonded neighbors, of the form
{bonded_atom1: bond_order1, ...}
-
bond_to
(other, order)[source]¶ Create or modify a bond with another atom
Parameters: Returns: bond object
Return type:
-
bonds
¶ List[Bond] – list of all bonds this atom is involved in
-
copy
(self)¶ Copy a group of atoms which may already have bonds, residues, and a parent molecule assigned. Do so by copying only the relevant entities, and creating a “mask” with deepcopy’s memo function to stop anything else from being copied.
Returns: list of copied atoms Return type: AtomList
-
elem
¶ str – elemental symbol
-
element
str – elemental symbol
-
force
¶ (units.Vector[force]) – atomic force vector. This quantity must be calculated - it is equivalent to
self.molecule.forces[self.index]
Raises: moldesign.NotCalculatedError
– if molecular forces have not been calculated
-
heavy_bonds
¶ List[Bond] – list of all heavy atom bonds (where BOTH atoms are not hydrogen)
Note
this returns an empty list if called on a hydrogen atom
-
nbonds
¶ int – the number of other atoms this atom is bonded to
-
num_bonds
¶ int – the number of other atoms this atom is bonded to
-
symbol
¶ str – elemental symbol
-
valence
¶ int – the sum of this atom’s bond orders
-
velocity
¶ u.Vector[length/time, 3] – velocity of this atom; equivalent to
self.momentum/self.mass
-
class
moldesign.molecules.atoms.
AtomDrawingMixin
[source]¶ Bases:
object
Functions for creating atomic visualizations.
Note
This is a mixin class designed only to be mixed into the
Atom
class. Routines are separated here for code organization only - they could be included in the main Atom class without changing any functionality
-
draw
(width=300, height=300)[source]¶ Draw a 2D and 3D viewer with this atom highlighted (notebook only)
Parameters: Returns: viewer object
Return type: ipy.HBox
-
-
class
moldesign.molecules.atoms.
AtomGeometryMixin
[source]¶ Bases:
object
Functions measuring distances between atoms and other things.
Note
This is a mixin class designed only to be mixed into the
Atom
class. Routines are separated here for code organization only - they could be included in the main Atom class without changing any functionality
-
atoms_within
(self, radius, other=None, include_self=False)¶ Return all atoms in an object within a given radius of this object
Parameters: - radius (u.Scalar[length]) – radius to search for atoms
- other (AtomContainer) – object containing the atoms to search (default:self.parent)
- include_self (bool) – if True, include the atoms from this object (since, by definition, their distance from this object is 0)
Returns: list of the atoms within
radius
of this object
Return type:
-
calc_distances
(*args, **kwargs)¶ calc_distance_array(self, other=None) Calculate an array of pairwise distances between all atoms in self and other
- Args:
- other (AtomContainer): object to calculate distances to (default: self)
- Returns:
- u.Array[length]: 2D array of pairwise distances between the two objects
- Example:
>>> dists = self.calc_distance_array(other)
>>> dists[i, j] == self.atoms[i].distance(other.atoms[j])
-
distance
(self, other)¶ Returns closest distance between this and the other entity
Parameters: other (AtomContainer) – object to calculate distance to Returns: closest distance between self and other Return type: u.Scalar[length] Example
>>> distance = self.distance(other)
>>> distance == self.calc_distance_array(other).min()
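The relationship between `distance` and the pairwise distance array can be sketched in plain Python. This is only an illustration under stated assumptions: coordinates are bare `(x, y, z)` tuples without units, and the helper names `pairwise_distances` and `closest_distance` are hypothetical, not part of the library.

```python
import math


def pairwise_distances(coords_a, coords_b):
    """Return a nested list d where d[i][j] is the distance between
    point i of coords_a and point j of coords_b."""
    return [[math.dist(p, q) for q in coords_b] for p in coords_a]


def closest_distance(coords_a, coords_b):
    """Smallest entry of the pairwise distance array."""
    return min(min(row) for row in pairwise_distances(coords_a, coords_b))


# Two small groups of points (hypothetical coordinates, arbitrary length unit)
group_a = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
group_b = [(4.0, 0.0, 0.0), (0.0, 3.0, 0.0)]

dists = pairwise_distances(group_a, group_b)
closest = closest_distance(group_a, group_b)
```

Here `dists[i][j]` plays the role of `dists[i, j]` in the example above, and `closest` is its minimum.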
-
-
class
moldesign.molecules.atoms.
AtomPropertyMixin
[source]¶ Bases:
object
Functions accessing computed atomic properties.
Note
This is a mixin class designed only to be mixed into the
Atom
class. Routines are separated here for code organization only - they could be included in the main Atom class without changing any functionality
-
basis_functions
¶ List[mdt.orbitals.AtomicBasisFunction] – This atom’s basis functions, if available (
None
otherwise)
-
ff
¶ moldesign.utils.DotDict – This atom’s force field parameters, if available (
None
otherwise)
-
properties
¶ moldesign.utils.DotDict – Returns any calculated properties for this atom
-
-
class
moldesign.molecules.atoms.
AtomReprMixin
[source]¶ Bases:
object
Functions for printing out various strings related to the atom.
Note
This is a mixin class designed only to be mixed into the
Atom
class. Routines are separated here for code organization only - they could be included in the main Atom class without changing any functionality
moldesign.molecules.biounits module¶
-
class
moldesign.molecules.biounits.
ChildList
(parent)[source]¶ Bases:
moldesign.molecules.atomcollections.AtomContainer
A list of biochemical objects that can be accessed by name or by index.
-
atoms
¶ AtomList – a sorted list of all atoms in this entity and/or its children
-
-
class
moldesign.molecules.biounits.
Entity
(name=None, molecule=None, index=None, pdbname=None, pdbindex=None, **kwargs)[source]¶ Bases:
moldesign.molecules.atomcollections.AtomContainer
Generalized storage mechanism for hierarchical representation of biomolecules, e.g. by residue, chain, etc. Permits other groupings, provided that everything is tree-like.
All children of a given entity must have unique names. An individual child can be retrieved with
entity.childname
or entity['childname']
or entity[index]
Yields: Entity or mdt.Atom – this entity’s children, in order
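The three access styles described for `Entity` (attribute, key, and index) can be illustrated with a small stand-in class. This is a sketch only: `SketchEntity` and its internals are hypothetical and make no claim about how the library stores children.

```python
class SketchEntity:
    """Hypothetical container whose children can be retrieved by
    attribute name, by key, or by integer index."""

    def __init__(self):
        self._children = {}  # dicts preserve insertion order (Python 3.7+)

    def add(self, item, key):
        # Child names must be unique within one entity
        if key in self._children:
            raise KeyError('%s already exists' % key)
        self._children[key] = item

    def __getitem__(self, key):
        if isinstance(key, int):
            return list(self._children.values())[key]
        return self._children[key]

    def __getattr__(self, name):
        # Only called when normal attribute lookup fails
        children = self.__dict__.get('_children', {})
        try:
            return children[name]
        except KeyError:
            raise AttributeError(name) from None

    def __iter__(self):
        # Yield children in the order they were added
        return iter(self._children.values())


chain = SketchEntity()
chain.add('alanine residue', key='ALA1')
chain.add('glycine residue', key='GLY2')
```

With this sketch, `chain.ALA1`, `chain['ALA1']` and `chain[0]` all return the same child.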
-
class
moldesign.molecules.biounits.
Instance
(name=None, molecule=None, index=None, pdbname=None, pdbindex=None, **kwargs)[source]¶ Bases:
moldesign.molecules.biounits.Entity
The singleton biomolecular container for each
Molecule
. Its children are generally PDB chains. Users won’t ever really see this object.
moldesign.molecules.bonds module¶
-
class
moldesign.molecules.bonds.
Bond
(a1, a2, order=None)[source]¶ Bases:
object
A bond between two atoms.
Parameters: Notes
Comparisons and hashes involving bonds will return True if the atoms involved in the bonds are the same. Bond orders are not compared.
These objects are used to represent and pass bond data only - they are not used for storage.
-
a1
¶ Atom – First atom in the bond; assigned so that
self.a1.index < self.a2.index
-
a2
¶ Atom – Second atom in the bond; assigned so that
self.a2.index > self.a1.index
-
order
¶ int – bond order (can be
None
); not used in comparisons
-
ff
¶ mdt.forcefield.BondTerm – the force-field term for this bond (or
None
if no forcefield is present)
-
name
¶ str – name of the bond
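The comparison rule in the notes above (bonds are equal when their atoms match, regardless of order) can be sketched as follows. `SketchBond` is a hypothetical stand-in, and atoms are represented by their integer indices for brevity.

```python
class SketchBond:
    """Hypothetical stand-in for a bond between two atom indices."""

    def __init__(self, a1, a2, order=None):
        # Store the atoms so that a1 < a2, mirroring the documented ordering
        self.a1, self.a2 = sorted((a1, a2))
        self.order = order

    def __eq__(self, other):
        if not isinstance(other, SketchBond):
            return NotImplemented
        # Bond order is deliberately left out of the comparison
        return (self.a1, self.a2) == (other.a1, other.a2)

    def __hash__(self):
        # Hash must agree with __eq__, so it also ignores the bond order
        return hash((self.a1, self.a2))


single = SketchBond(3, 1, order=1)
double = SketchBond(1, 3, order=2)
unique_bonds = {single, double}
```

Because equality and hashing use only the atom pair, `unique_bonds` ends up holding a single element.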
-
moldesign.molecules.chain module¶
-
class
moldesign.molecules.chain.
Chain
(pdbname=None, **kwargs)[source]¶ Bases:
moldesign.molecules.biounits.Entity
Biomolecular chain class - its children are almost always residues.
-
parent
¶ mdt.Molecule – the molecule this chain belongs to
-
chain
¶ Chain – the chain this residue belongs to
-
assign_biopolymer_bonds
()[source]¶ Connect bonds between residues in this chain.
See also
moldesign.Residue.assign_template_bonds
Raises: ValueError
– if residue.resname
is not in bioresidue templates
KeyError
– if an atom in this residue is not recognized
-
c_terminal
¶ moldesign.Residue – The chain’s C-terminus (or
None
if it does not exist)
-
copy
()[source]¶ Copy a group of atoms which may already have bonds, residues, and a parent molecule assigned. Do so by copying only the relevant entities, and creating a “mask” with deepcopy’s memo function to stop anything else from being copied.
Returns: list of copied atoms Return type: AtomList
-
fiveprime_end
¶ moldesign.Residue – The chain’s 5’ base (or
None
if it does not exist)
-
get_ligand
()[source]¶ Return a (single) ligand if it exists; raises ValueError if there’s not exactly one
This is a utility routine to get a single ligand from a chain. If there’s exactly one unclassified residue, it is returned. If not, ValueError is raised - use
Chain.unclassified_residues()
to get an iterator over all unclassified residues.
Returns: ligand residue Return type: moldesign.Residue Raises: ValueError
– if the chain does not contain exactly one unclassifiable residue
-
n_terminal
¶ moldesign.Residue – The chain’s N-terminus (or
None
if it does not exist)
-
nresidues
¶
-
num_residues
¶
-
numresidues
¶
-
polymer_residues
¶
-
residues
¶ ChildList – list of residues in this chain
-
sequence
¶ str – this chain’s residue sequence with one-letter residue codes
-
solvent_residues
¶
-
threeprime_end
¶ moldesign.Residue – The chain’s 3’ base (or
None
if it does not exist)
-
type
¶ str – the type of chain - protein, DNA, solvent, etc.
This field returns the type of chain, classified by the following rules:
1) If the chain contains only one type of residue, it is given that classification (so a chain containing only ions has type “ion”)
2) If the chain contains a biopolymer + ligands and solvent, it is classified as a biopolymer (i.e. ‘protein’, ‘dna’, or ‘rna’). This is the most common case with .pdb files from the PDB.
3) If the chain contains multiple biopolymer types, it will be given a hybrid classification (e.g. ‘dna/rna’, ‘protein/dna’) - this is rare!
4) If it contains multiple kinds of non-biopolymer residues, it will be called “solvent” (if all non-bio residues are water/solvent/ion) or given a hybrid name as in 3)
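One possible reading of the classification rules above is sketched below. Everything here is an assumption made for illustration: the function name `classify_chain`, the residue type strings, and the ordering used to build hybrid names are hypothetical, and the library's actual logic may differ.

```python
BIOPOLYMERS = ('protein', 'dna', 'rna')      # assumed biopolymer type names
SOLVENT_LIKE = {'water', 'solvent', 'ion'}   # assumed non-bio solvent-like types


def classify_chain(residue_types):
    """Classify a chain from the types of its residues (illustrative only)."""
    kinds = set(residue_types)
    # Rule 1: a single residue type gives the chain that type
    if len(kinds) == 1:
        return next(iter(kinds))
    # Rules 2 and 3: any biopolymer content wins; several give a hybrid name
    polymers = [kind for kind in BIOPOLYMERS if kind in kinds]
    if polymers:
        return '/'.join(polymers)
    # Rule 4: only non-biopolymer residues remain
    if kinds <= SOLVENT_LIKE:
        return 'solvent'
    return '/'.join(sorted(kinds))


ion_chain = classify_chain(['ion', 'ion'])
protein_chain = classify_chain(['protein', 'protein', 'water', 'unknown'])
hybrid_chain = classify_chain(['dna', 'rna'])
solvent_chain = classify_chain(['water', 'ion'])
```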
-
unclassified_residues
¶
-
moldesign.molecules.coord_arrays module¶
This module contains python “descriptors” (nothing to do with chemoinformatic “descriptors”) that help maintain the links between an atom’s coordinates and its molecule’s coordinates
-
class
moldesign.molecules.coord_arrays.
AtomArray
(atomname, moleculename)[source]¶ Bases:
moldesign.molecules.coord_arrays.ProtectedArray
Descriptor for atom coordinates that are stored in the parent molecule.
Makes sure that the arrays and their references are maintained during both reassignment and copying/pickling
Parameters: - atomname (str) – name of the attribute in the atom instance
- parentname (str) – name of the corresponding attribute in the molecule instance
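To make the descriptor idea concrete, here is a minimal sketch of a Python descriptor that redirects an atom attribute to a row of an array owned by its molecule. All class names (`SketchAtomArray`, `SketchAtom`, `SketchMolecule`) are hypothetical, plain lists stand in for unit-aware arrays, and copying/pickling is not handled.

```python
class SketchAtomArray:
    """Hypothetical descriptor: reads and writes one row of an array
    that lives on the atom's parent molecule."""

    def __init__(self, atomname, moleculename):
        self.atomname = atomname          # private storage used before the atom has a molecule
        self.moleculename = moleculename  # name of the array attribute on the molecule

    def __get__(self, atom, owner=None):
        if atom is None:
            return self
        if atom.molecule is None:
            return getattr(atom, self.atomname)
        return getattr(atom.molecule, self.moleculename)[atom.index]

    def __set__(self, atom, value):
        if atom.molecule is None:
            setattr(atom, self.atomname, list(value))
        else:
            # Write in place so the molecule's array remains the only copy
            getattr(atom.molecule, self.moleculename)[atom.index][:] = value


class SketchAtom:
    position = SketchAtomArray('_position', 'positions')

    def __init__(self, position):
        self.molecule = None
        self.index = None
        self.position = position


class SketchMolecule:
    def __init__(self, atoms):
        self.positions = [list(atom.position) for atom in atoms]
        for index, atom in enumerate(atoms):
            atom.index = index
            atom.molecule = self


atom = SketchAtom([0.0, 0.0, 0.0])
mol = SketchMolecule([atom])
atom.position = [1.0, 2.0, 3.0]  # reassignment updates the molecule's array
```

After the reassignment, `atom.position` and `mol.positions[0]` refer to the same list object, which is the link the descriptor is meant to preserve.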
moldesign.molecules.molecule module¶
-
class
moldesign.molecules.molecule.
MolConstraintMixin
[source]¶ Bases:
object
Functions for applying and managing geometrical constraints.
Note
This is a mixin class designed only to be mixed into the
Molecule
class. Routines are separated here for code organization only - they could be included in the main Molecule class without changing any functionality
-
clear_constraints
()[source]¶ Clear all geometry constraints from the molecule.
Note
This does NOT clear integrator options - such as “constrain H bonds”
-
constrain_angle
(atom1, atom2, atom3, angle=None)[source]¶ Constrain the bond angle atom1-atom2-atom3
Parameters: - atom1 (moldesign.Atom) –
- atom2 (moldesign.Atom) –
- atom3 (moldesign.Atom) –
- angle ([angle]) – angle value (default: current angle)
Returns: constraint object
Return type: moldesign.geometry.AngleConstraint
-
constrain_atom
(atom, pos=None)[source]¶ Constrain the position of an atom
Parameters: - atom (moldesign.Atom) – The atom to constrain
- pos (moldesign.units.MdtQuantity) – position to fix this atom at (default: atom.position) [length]
Returns: constraint object
Return type: moldesign.geometry.FixedPosition
-
constrain_dihedral
(atom1, atom2, atom3, atom4, angle=None)[source]¶ Constrain the dihedral angle atom1-atom2-atom3-atom4
Parameters: - atom1 (moldesign.Atom) –
- atom2 (moldesign.Atom) –
- atom3 (moldesign.Atom) –
- atom4 (moldesign.Atom) –
- angle ([angle]) – angle value (default: current angle)
Returns: constraint object
Return type:
-
constrain_distance
(atom1, atom2, dist=None)[source]¶ Constrain the distance between two atoms
Parameters: - atom1 (moldesign.Atom) –
- atom2 (moldesign.Atom) –
- dist ([length]) – distance value (default: current distance)
Returns: constraint object
Return type: moldesign.geometry.DistanceConstraint
-
-
class
moldesign.molecules.molecule.
MolDrawingMixin
[source]¶ Bases:
object
Methods for visualizing molecular structure.
Note
This is a mixin class designed only to be mixed into the
Molecule
class. Routines are separated here for code organization only - they could be included in the main Molecule class without changing any functionality
-
class
moldesign.molecules.molecule.
MolPropertyMixin
[source]¶ Bases:
object
Functions for calculating and accessing molecular properties.
Note
This is a mixin class designed only to be mixed into the
Molecule
class. Routines are separated here for code organization only - they could be included in the main Molecule class without changing any functionality
-
calc_dipole
(**kwargs)¶ Calculate the dipole moment and return it
Returns: dipole moment at this position (len=3) Return type: units.Vector[length*charge]
-
calc_forces
(**kwargs)¶ Calculate forces and return them
Returns: units.Vector[force]
-
calc_potential_energy
(**kwargs)¶ Calculate potential energy and return it
Returns: potential energy at this position Return type: units.Scalar[energy]
-
calc_property
(name, **kwargs)[source]¶ Calculate the given property if necessary and return it
Parameters: name (str) – name of the property (e.g. ‘potential_energy’, ‘forces’, etc.) Returns: the requested property Return type: object
-
calc_wfn
(**kwargs)¶ Calculate the electronic wavefunction and return it
Returns: electronic wavefunction object Return type: moldesign.orbitals.ElectronicWfn
-
calculate_dipole
(**kwargs)[source]¶ Calculate the dipole moment and return it
Returns: dipole moment at this position (len=3) Return type: units.Vector[length*charge]
-
calculate_potential_energy
(**kwargs)[source]¶ Calculate potential energy and return it
Returns: potential energy at this position Return type: units.Scalar[energy]
-
calculate_wfn
(**kwargs)[source]¶ Calculate the electronic wavefunction and return it
Returns: electronic wavefunction object Return type: moldesign.orbitals.ElectronicWfn
-
dipole
¶ units.Vector[length*charge] – return the molecule’s dipole moment, if calculated (len=3).
Raises: NotCalculatedError
– If the dipole moment has not yet been calculated at this geometry
-
dynamic_dof
¶ int – Count the number of spatial degrees of freedom of the system, taking into account any constraints
Note
If there are other DOFs not taken into account here, this quantity can be set explicitly
-
forces
¶ units.Vector[force] – return the current force on the molecule, if calculated.
Raises: NotCalculatedError
– If the forces have not yet been calculated at this geometry
-
get_property
(name)[source]¶ Return the given property if already calculated; raise NotCalculatedError otherwise
Parameters: name (str) – name of the property (e.g. ‘potential_energy’, ‘forces’, etc.) Raises: NotCalculatedError
– If the molecular property has not yet been calculated at this geometry
Returns: the requested property Return type: object
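The difference between `calc_property` (calculate if necessary) and `get_property` (return only if already calculated) amounts to a small caching pattern. The sketch below illustrates that pattern with hypothetical names (`PropertyCacheSketch`, `fake_model`); the local `NotCalculatedError` is a stand-in for the library exception of the same name.

```python
class NotCalculatedError(Exception):
    """Stand-in for the library's exception of the same name."""


class PropertyCacheSketch:
    """Hypothetical holder of calculated properties for one geometry."""

    def __init__(self, model):
        self._model = model      # assumed callable: property name -> value
        self._properties = {}

    def get_property(self, name):
        # Never triggers a calculation
        try:
            return self._properties[name]
        except KeyError:
            raise NotCalculatedError(name) from None

    def calc_property(self, name):
        # Calculates only when the value is not cached yet
        if name not in self._properties:
            self._properties[name] = self._model(name)
        return self._properties[name]


calls = []


def fake_model(name):
    calls.append(name)
    return -76.0  # made-up value, for illustration only


mol = PropertyCacheSketch(fake_model)
try:
    mol.get_property('potential_energy')
    raised = False
except NotCalculatedError:
    raised = True

energy = mol.calc_property('potential_energy')  # runs the model once
cached = mol.get_property('potential_energy')   # served from the cache
mol.calc_property('potential_energy')           # no second model call
```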
-
homo
¶ int – The array index (0-based) of the highest occupied molecular orbital (HOMO).
Note
This assumes a closed shell ground state!
-
kinetic_energy
¶ u.Scalar[energy] – Classical kinetic energy \(\sum_{\text{atoms}} \frac{p^2}{2m}\)
-
kinetic_temperature
¶ [temperature] – temperature calculated using the equipartition theorem,
\(\frac{2 E_{\text{kin}}}{k_b f}\),
where \(E_{\text{kin}}\) is the kinetic energy and \(f\) is the number of degrees of freedom (see
dynamic_dof
)
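As a worked illustration of the two formulas above, the sketch below evaluates the classical kinetic energy and the equipartition temperature with plain floats in SI units. The function names and the sample masses and momenta are hypothetical, and unit handling is omitted.

```python
K_B = 1.380649e-23  # Boltzmann constant in J/K (exact SI value)


def kinetic_energy(momenta, masses):
    """Sum p^2 / (2 m) over atoms; momenta in kg*m/s, masses in kg."""
    return sum(sum(c * c for c in p) / (2.0 * m)
               for p, m in zip(momenta, masses))


def kinetic_temperature(e_kin, dof):
    """Equipartition temperature 2 E_kin / (k_b f)."""
    return 2.0 * e_kin / (K_B * dof)


# Two atoms with made-up masses and momenta
masses = [1.0e-26, 2.0e-26]
momenta = [(1.0e-23, 0.0, 0.0), (0.0, 2.0e-23, 0.0)]

e_kin = kinetic_energy(momenta, masses)          # 5e-21 J + 1e-20 J
temperature = kinetic_temperature(e_kin, dof=6)  # 3 DOF per atom, no constraints
```

Constraints would reduce `dof`, which raises the temperature computed from the same kinetic energy.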
-
lumo
¶ int – The array index (0-based) of the lowest unoccupied molecular orbital (LUMO).
Note
This assumes a closed shell ground state!
-
mass
¶ u.Scalar[mass] – the molecule’s mass
-
num_electrons
¶ int – The number of electrons in the system, based on the atomic numbers and self.charge
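The electron count and the closed-shell HOMO and LUMO indices described above follow from simple arithmetic. The sketch below assumes the charge is given as an integer multiple of the elementary charge and that each occupied orbital holds two electrons; the function names are hypothetical.

```python
def num_electrons(atomic_numbers, charge=0):
    """Electrons = sum of atomic numbers minus the net charge (in units of e)."""
    return sum(atomic_numbers) - charge


def homo_index(n_electrons):
    """0-based index of the highest occupied orbital, closed shell assumed."""
    if n_electrons % 2:
        raise ValueError('closed-shell assumption requires an even electron count')
    return n_electrons // 2 - 1


def lumo_index(n_electrons):
    """0-based index of the lowest unoccupied orbital, closed shell assumed."""
    return homo_index(n_electrons) + 1


water = [8, 1, 1]               # atomic numbers of O, H, H
n_elec = num_electrons(water)   # neutral molecule
homo = homo_index(n_elec)
lumo = lumo_index(n_elec)
```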
-
potential_energy
¶ units.Scalar[energy] – return the molecule’s current potential energy, if calculated.
Raises: NotCalculatedError
– If the potential energy has not yet been calculated at this geometry
-
properties
¶ MolecularProperties – Molecular properties calculated at this geometry
-
update_properties
(properties)[source]¶ This is intended mainly as a callback for long-running property calculations. When they are finished, they can call this method to update the molecule’s properties.
Parameters: properties (dict) – properties-like object. MUST contain a ‘positions’ attribute.
-
wfn
¶ moldesign.orbitals.ElectronicWfn – return the molecule’s current electronic state, if calculated.
Raises: NotCalculatedError
– If the electronic state has not yet been calculated at this geometry
-
-
class
moldesign.molecules.molecule.
MolReprMixin
[source]¶ Bases:
object
Methods for creating text-based representations of the molecule
Note
This is a mixin class designed only to be mixed into the
Molecule
class. Routines are separated here for code organization only - they could be included in the main Molecule class without changing any functionality
-
biomol_summary_markdown
()[source]¶ A markdown description of biomolecular structure.
Returns: Markdown string Return type: str
-
-
class
moldesign.molecules.molecule.
MolSimulationMixin
[source]¶ Bases:
object
Functions calculating energies, running dynamics, and minimizing geometry.
Note
This is a mixin class designed only to be mixed into the
Molecule
class. Routines are separated here for code organization only - they could be included in the main Molecule class without changing any functionality
-
calculate
(requests=None, wait=True, use_cache=True)[source]¶ Runs a potential energy calculation on the current geometry, returning the requested quantities. If requests is not passed, the properties specified in the energy_models DEFAULT_PROPERTIES will be calculated.
Parameters: - requests (List[str]) – list of quantities for the model to calculate, e.g. [‘dipole’, ‘forces’]
- wait (bool) – if True, wait for the calculation to complete before returning. If false, return a job object - this will not update the molecule’s properties!
- use_cache (bool) – Return cached results if possible
Returns: MolecularProperties
-
configure_methods
()[source]¶ Interactively configure this molecule’s simulation methods (notebooks only)
Returns: configuration widget Return type: ipywidgets.Box
-
minimize
(mol, nsteps=20, force_tolerance=<Quantity(0.00514220566583, 'eV / ang')>, frame_interval=None, _restart_from=0, _restart_energy=None, assert_converged=False)[source]¶ Perform an energy minimization (aka geometry optimization or relaxation).
If
force_tolerance
is not specified, the program defaults are used. If specified, the largest force component must be less than force_tolerance and the RMSD must be less than 1/3 of it. (based on GAMESS OPTTOL keyword)
Parameters: assert_converged (bool) – Raise an exception if the minimization does not converge. Returns: moldesign.trajectory.Trajectory
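The convergence test described above can be sketched as a two-part check. This is an interpretation, not the library's code: it assumes the "RMSD" criterion refers to the root-mean-square of the force components, and `is_converged` is a hypothetical helper operating on unitless floats.

```python
import math


def is_converged(forces, force_tolerance):
    """Return True when the largest force component is below the tolerance
    and the RMS of all components is below one third of it."""
    components = [c for force in forces for c in force]
    max_component = max(abs(c) for c in components)
    rms = math.sqrt(sum(c * c for c in components) / len(components))
    return max_component < force_tolerance and rms < force_tolerance / 3.0


tolerance = 0.005  # same order as the default shown in the signature, units dropped

# One small component: both criteria are met
relaxed = is_converged([(0.001, 0.0, 0.0), (0.0, 0.0, 0.0)], tolerance)

# Every component is below the tolerance, but the RMS exceeds tolerance / 3
strained = is_converged([(0.004, 0.004, 0.004), (0.004, 0.004, 0.004)], tolerance)
```

The second case shows why the RMS criterion matters: no single component fails, yet the geometry as a whole is not considered converged.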
-
run
(run_for)[source]¶ Starts the integrator’s default integration
Parameters: run_for (int or [time]) – number of steps or amount of time to run for Returns: moldesign.trajectory.Trajectory
-
set_energy_model
(model, **params)[source]¶ Associate an energy model with this molecule
Parameters: - model (moldesign.methods.EnergyModelBase) – The energy model to associate with this molecule
- **params (dict) – a dictionary of parameters for the model
Note
For convenience,
model
can be an instance, a class, or a constructor (with call signature model(**params) -> model instance)
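Accepting an instance, a class, or a constructor can be handled with a small normalization step. The sketch below is one way to do it, assuming that model instances are not themselves callable; `SketchModel` and `resolve_model` are hypothetical names.

```python
import functools


class SketchModel:
    """Hypothetical energy model that just records its parameters."""

    def __init__(self, **params):
        self.params = params


def resolve_model(model, **params):
    """Return a model instance from an instance, a class, or a constructor."""
    if callable(model):
        # Classes and constructor functions are both called with the parameters
        return model(**params)
    # Already an instance (assumed non-callable): use it as-is
    return model


from_class = resolve_model(SketchModel, basis='sto-3g')
existing = SketchModel(basis='3-21g')
from_instance = resolve_model(existing)
from_constructor = resolve_model(functools.partial(SketchModel, theory='rhf'),
                                 basis='sto-3g')
```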
-
set_integrator
(integrator, **params)[source]¶ Associate an integrator with this molecule
Parameters: - integrator (moldesign.integrators.IntegratorBase) – The integrator to associate with this molecule
- **params (dict) – a dictionary of parameters for the integrator
Note
For convenience,
integrator
can be an instance, a class, or a constructor (with call signature integrator(**params) -> integrator instance)
-
-
class
moldesign.molecules.molecule.
MolTopologyMixin
[source]¶ Bases:
object
Functions for building and keeping track of bond topology and biochemical structure.
Note
This is a mixin class designed only to be mixed into the
Molecule
class. Routines are separated here for code organization only - they could be included in the main Molecule class without changing any functionality
-
assert_atom
(atom)[source]¶ If passed an integer, just return self.atoms[atom]. Otherwise, assert that the atom belongs to this molecule
-
-
class
moldesign.molecules.molecule.
MolecularProperties
(mol, **properties)[source]¶ Bases:
moldesign.utils.classes.DotDict
Stores property values for a molecule. These objects will be generally created and updated by EnergyModels, not by users.
-
class
moldesign.molecules.molecule.
Molecule
(atomcontainer, name=None, bond_graph=None, copy_atoms=False, pdbname=None, charge=None, electronic_state_index=0)[source]¶ Bases:
moldesign.molecules.atomcollections.AtomContainer
,moldesign.molecules.molecule.MolConstraintMixin
,moldesign.molecules.molecule.MolPropertyMixin
,moldesign.molecules.molecule.MolDrawingMixin
,moldesign.molecules.molecule.MolReprMixin
,moldesign.molecules.molecule.MolTopologyMixin
,moldesign.molecules.molecule.MolSimulationMixin
Molecule
objects store a molecular system, including atoms, 3D coordinates, molecular properties, biomolecular entities, and other model-specific information. Interfaces with simulation models take place through the molecule object.
Molecule objects will generally be created by reading files or parsing other input; see, for example:
moldesign.read()
,moldesign.from_smiles()
,moldesign.from_pdb()
, etc.
This constructor is useful, however, for copying other molecular structures (see examples below).
Parameters: - atomcontainer (AtomContainer or AtomList or List[moldesign.Atom]) –
atoms that make up this molecule.
Note
If the passed atoms don’t already belong to a molecule, they will be assigned to this one. If they DO already belong to a molecule, they will be copied, leaving the original molecule untouched.
- name (str) – name of the molecule (automatically generated if not provided)
- bond_graph (dict) – dictionary specifying bonds between the atoms - of the form
{atom1:{atom2:bond_order, atom3:bond_order}, atom2:...}
This structure must be symmetric; we require bond_graph[atom1][atom2] == bond_graph[atom2][atom1]
- copy_atoms (bool) – Create the molecule with copies of the passed atoms (they will be copied automatically if they already belong to another molecule)
- pdbname (str) – Name of the PDB file
- charge (units.Scalar[charge]) – molecule’s formal charge
- electronic_state_index (int) – index of the molecule’s electronic state
Examples
Use the
Molecule
class to create copies of other molecules and substructures thereof:
>>> benzene = mdt.from_name('benzene')
>>> benzene_copy = mdt.Molecule(benzene, name='benzene copy')
>>> protein = mdt.from_pdb('3AID')
>>> carbon_copies = mdt.Molecule([atom for atom in protein.atoms if atom.atnum==6])
>>> first_residue_copy = mdt.Molecule(protein.residues[0])
Molecule instance attributes:
-
atoms
¶ AtomList – List of all atoms in this molecule.
-
bond_graph
¶ dict – symmetric dictionary specifying bonds between the atoms:
bond_graph = {atom1:{atom2:bond_order, atom3:bond_order}, atom2:...}
bond_graph[atom1][atom2] == bond_graph[atom2][atom1]
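The symmetry requirement on `bond_graph` can be checked, and maintained, with two small helpers. This is a sketch using strings as stand-ins for atom objects; `add_bond` and `is_symmetric` are hypothetical names, not library functions.

```python
def add_bond(bond_graph, a, b, order):
    """Record the bond in both directions so the graph stays symmetric."""
    bond_graph.setdefault(a, {})[b] = order
    bond_graph.setdefault(b, {})[a] = order


def is_symmetric(bond_graph):
    """Check bond_graph[a][b] == bond_graph[b][a] for every recorded bond."""
    for a, neighbors in bond_graph.items():
        for b, order in neighbors.items():
            reverse = bond_graph.get(b, {})
            if a not in reverse or reverse[a] != order:
                return False
    return True


graph = {}
add_bond(graph, 'C1', 'O1', 2)  # carbonyl double bond
add_bond(graph, 'C1', 'H1', 1)

one_sided = {'C1': {'O1': 2}}   # missing the reverse entry
```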
-
residues
¶ List[moldesign.Residue] – flat list of all biomolecular residues in this molecule
-
chains
¶ Dict[moldesign.Chain] – Biomolecular chains - individual chains can be accessed as
mol.chains[list_index]
or mol.chains[chain_name]
-
name
¶ str – A descriptive name for the molecule
-
charge
¶ units.Scalar[charge] – molecule’s formal charge
-
constraints
¶ List[moldesign.geom.GeometryConstraint] – list of constraints
-
ndims
¶ int – length of the positions, momenta, and forces arrays (usually 3*self.num_atoms)
-
num_atoms
¶ int – number of atoms (synonym: natoms)
-
num_bonds
¶ int – number of bonds (synonym: nbonds)
-
positions
¶ units.Array[length] – Nx3 array of atomic positions
-
momenta
¶ units.Array[momentum] – Nx3 array of atomic momenta
-
masses
¶ units.Vector[mass] – vector of atomic masses
-
dim_masses
¶ units.Array[mass] – Nx3 array of atomic masses (for numerical convenience - allows you to calculate velocity, for instance, as
velocity = mol.momenta/mol.dim_masses
)
-
time
¶ units.Scalar[time] – current time in dynamics
-
energy_model
¶ moldesign.models.base.EnergyModelBase – Object that calculates molecular properties - driven by mol.calculate()
-
integrator
¶ moldesign.integrators.base.IntegratorBase – Object that drives movement of 3D coordinates in time, driven by mol.run()
-
is_biomolecule
¶ bool – True if this molecule contains at least 2 biochemical residues
Molecule methods and properties
See also methods offered by the mixin superclasses:
moldesign.molecules.AtomContainer
moldesign.molecules.MolPropertyMixin
moldesign.molecules.MolDrawingMixin
moldesign.molecules.MolSimulationMixin
moldesign.molecules.MolTopologyMixin
moldesign.molecules.MolConstraintMixin
moldesign.molecules.MolReprMixin
-
addatom
(newatom)[source]¶ Add a new atom to the molecule
Parameters: newatom (moldesign.Atom) – The atom to add (it will be copied if it already belongs to a molecule)
-
addatoms
(newatoms)[source]¶ Add new atoms to this molecule. For now, we really just rebuild the entire molecule in place.
Parameters: newatoms (List[moldesign.Atom]) –
-
bonds
¶ Iterator over all bonds in the molecule
Yields: moldesign.atoms.Bond – bond object
-
deletebond
(bond)[source]¶ Remove this bond from the molecule’s topology
Parameters: bond (moldesign.Bond) – bond to remove
-
is_small_molecule
¶ bool – True if molecule’s mass is less than 500 Daltons (not mutually exclusive with
self.is_biomolecule
)
-
nbonds
¶ int – number of chemical bonds in this molecule
-
newbond
(a1, a2, order)[source]¶ Create a new bond
Parameters: - a1 (moldesign.Atom) – First atom in the bond
- a2 (moldesign.Atom) – Second atom in the bond
- order (int) – order of the bond
Returns: moldesign.Bond
-
num_bonds
int – number of chemical bonds in this molecule
-
velocities
¶ u.Vector[length/time] – Nx3 array of atomic velocities
moldesign.molecules.residue module¶
-
class
moldesign.molecules.residue.
Residue
(**kwargs)[source]¶ Bases:
moldesign.molecules.biounits.Entity
A biomolecular residue - most often an amino acid, a nucleic base, or a solvent molecule. In PDB structures, also often refers to non-biochemical molecules.
Its children are almost always atoms.
-
parent
¶ mdt.Molecule – the molecule this residue belongs to
-
chain
¶ Chain – the chain this residue belongs to
-
add
(atom, key=None)[source]¶ Add a child to this entity.
Raises: KeyError
– if an object with this key already exists
Parameters: - item (Entity or mdt.Atom) – the child object to add
- key (str) – Key to retrieve this item (default:
item.name
)
-
assign_template_bonds
()[source]¶ Assign bonds from bioresidue templates.
Only assigns bonds that are internal to this residue (does not connect different residues). The topologies here assume pH7.4 and may need to be corrected for other pHs
See also
moldesign.Chain.assign_biopolymer_bonds for assigning inter-residue bonds
Raises: ValueError
– if residue.resname
is not in bioresidue templates
KeyError
– if an atom in this residue is not recognized
-
atomnames
¶ Residue – synonym for
`self`
for the sake of readability – `molecule.chains['A'].residues[123].atomnames['CA']`
-
atoms
¶
-
backbone
¶ AtomList – all backbone atoms for nucleic and protein residues (identified using PDB names); returns None for other residue types
-
code
¶ str – one-letter amino acid code or two letter nucleic acid code, or ‘?’ otherwise
-
copy
()[source]¶ Copy a group of atoms which may already have bonds, residues, and a parent molecule assigned. Do so by copying only the relevant entities, and creating a “mask” with deepcopy’s memo function to stop anything else from being copied.
Returns: list of copied atoms Return type: AtomList
-
is_3prime_end
¶ bool – this is the last base in a strand
Raises: ValueError
– if this residue is not a DNA base
-
is_5prime_end
¶ bool – this is the first base in a strand
Raises: ValueError
– if this residue is not a DNA base
-
is_c_terminal
¶ bool – this is the last residue in a peptide
Raises: ValueError
– if this residue is not an amino acid
-
is_monomer
¶ bool – this residue is not part of a biopolymer
-
is_n_terminal
¶ bool – this is the first residue in a peptide
Raises: ValueError
– if this residue is not an amino acid
-
is_standard_residue
¶ bool – this residue is a “standard residue” for the purposes of a PDB entry.
In PDB files, this will be stored using ‘ATOM’ if this is a standard residue and ‘HETATM’ records if not.
Note
We currently define “standard” residues as those whose 3 letter residue code appears in the
moldesign.data.RESIDUE_DESCRIPTIONS
dictionary. Although this seems to work well, we’d welcome a PR with a less hacky method.
References
PDB format guide: http://www.wwpdb.org/documentation/file-format
-
markdown_summary
()[source]¶ Markdown-formatted information about this residue
Returns: markdown-formatted string Return type: str
-
next_residue
¶ Residue – The next residue in the chain (in the C-direction for proteins, 3’ direction for nucleic acids)
Raises: NotImplementedError
– If we don’t know how to deal with this type of biopolymer
StopIteration
– If there isn’t a next residue (i.e. it’s a 3’- or C-terminus)
-
prev_residue
¶ Residue – The previous residue in the chain (in the N-direction for proteins, 5’ direction for nucleic acids)
Raises: NotImplementedError
– If we don’t know how to deal with this type of biopolymer
StopIteration
– If there isn’t a previous residue (i.e. it’s a 5’- or N-terminus)
-
resname
¶ str – Synonym for pdbname
-
sidechain
¶ AtomList – all sidechain atoms for nucleic and protein residues (defined as non-backbone atoms); returns None for other residue types
-
type
¶ str – Classification of the residue (protein, solvent, dna, water, unknown)
-
moldesign.molecules.trajectory module¶
-
class
moldesign.molecules.trajectory.
Frame
[source]¶ Bases:
moldesign.utils.classes.DotDict
A snapshot of a molecule during its motion. This is really just a dictionary of properties. These properties are those accessed as
molecule.properties, and can vary substantially depending on the origin of the trajectory. They also include relevant dynamical data and metadata (such as time, momenta, minimization_step, etc.).
Properties can be accessed either as attributes (frame.property_name) or as keys (frame['property_name']).
Some properties with specific meaning are described below:
-
annotation
¶ str – text describing this frame (will be displayed automatically when visualized)
-
minimization_step
¶ int – for minimization trajectories
-
time
¶ u.Scalar[time] – time during a dynamical trajectory
Example
>>> mol = mdt.from_name('benzene')
>>> mol.set_energy_model(moldesign.methods.models.RHF(basis='3-21g'))
>>> traj = mol.minimize()
>>> starting_frame = traj.frames[0]
>>> assert starting_frame.potential_energy >= traj.frames[-1].potential_energy
>>> assert starting_frame.minimization_step == 0
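The dual attribute and key access that `Frame` inherits from `DotDict` can be illustrated with a minimal dictionary subclass. `SketchDotDict` is a hypothetical stand-in written for this example; the library's `DotDict` may behave differently in details.

```python
class SketchDotDict(dict):
    """Hypothetical dict whose items are also reachable as attributes."""

    def __getattr__(self, name):
        # Only called when normal attribute lookup fails
        try:
            return self[name]
        except KeyError:
            raise AttributeError(name) from None

    def __setattr__(self, name, value):
        # Attribute assignment stores a dictionary item
        self[name] = value


frame = SketchDotDict(time=0.5, minimization_step=0)
frame.annotation = 'starting geometry'
```

Both `frame.time` and `frame['time']` return the same stored value, and the attribute assignment above is visible as `frame['annotation']`.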
-
-
class
moldesign.molecules.trajectory.
SubSelection
[source]¶ Bases:
object
Descriptors to get bits of the trajectory:
trajectory.atoms[3].position -> array of positions
trajectory.atoms[5].distance( mol.chains['B'].residues[5] ) -> array of distances
trajectory.chains['A'].residue[23].com -> array of COMs
NOT IMPLEMENTED YET
-
class
moldesign.molecules.trajectory.
Trajectory
(mol, unit_system=None, first_frame=False)[source]¶ Bases:
object
A
Trajectory
stores information about a molecule’s motion and how its properties change as it moves.
A trajectory object contains
- a reference to the moldesign.Molecule it describes, and
- a list of Frame objects, each one containing a snapshot of the molecule at a particular point in its motion.
Parameters: - mol (moldesign.Molecule) – the trajectory will describe the motion of this molecule
- unit_system (u.UnitSystem) – convert all attributes to this unit system (default:
moldesign.units.default
)
- first_frame (bool) – Create the trajectory’s first
Frame
from the molecule’s current position
-
mol
¶ moldesign.Molecule – the molecule object that this trajectory comes from
-
frames
¶ List[Frame] – a list of the trajectory frames in the order they were created
-
info
¶ str – text describing this trajectory
-
unit_system
¶ u.UnitSystem – convert all attributes to this unit system
-
DONOTAPPLY
= set(['kinetic_energy'])¶
-
MOL_ATTRIBUTES
= ['positions', 'momenta', 'time']¶ List[str] – Always store these molecular attributes
-
align_orbital_phases
(reference_frame=None)[source]¶ Try to remove orbital sign flips between frames. If reference_frame is not passed, we’ll start with frame 0 and align successive pairs of orbitals. If reference_frame is an int, we’ll move forwards and backwards from that frame number. Otherwise, we’ll try to align every orbital frame to those in reference_frame
Parameters: reference_frame (int or Frame) – Frame
containing the orbitals to align with (default: align each frame with the previous one)
-
apply_frame
(frame)[source]¶ Reconstruct the underlying molecule with the given frame. Right now, any data not passed is ignored, which may result in properties that aren’t synced up with each other ...
-
draw
(**kwargs)¶ TrajectoryViewer: create a trajectory visualization
Parameters: **kwargs (dict) – keyword arguments for ipywidgets.Box
-
draw3d
(**kwargs)[source]¶ TrajectoryViewer: create a trajectory visualization
Parameters: **kwargs (dict) – keyword arguments for ipywidgets.Box
-
kinetic_energy
¶
-
kinetic_temperature
¶
-
new_frame
(properties=None, **additional_data)[source]¶ Create a new frame, EITHER from the parent molecule or from a list of properties
Parameters: Returns: frame number (0-based)
Return type:
-
num_frames
¶ int – number of frames in this trajectory
-
plot
(x, y, **kwargs)[source]¶ Create a matplotlib plot of property x against property y
Parameters: Returns: the lines that were plotted
Return type: List[matplotlib.lines.Line2D]
-
rmsd
(atoms=None, reference=None)[source]¶ Calculate root-mean-square displacement for each frame in the trajectory.
The RMSD between times \(t\) and \(t_0\) is given by
\(\text{RMSD}(t;t_0) =\sqrt{\sum_{i \in \text{atoms}} \left( \mathbf{R}_i(t) - \mathbf{R}_i(t_0) \right)^2}\),
where \(\mathbf{R}_i(t)\) is the position of atom i at time t.
Parameters: - atoms (list[mdt.Atom]) – list of atoms to calculate the RMSD for (default: all atoms in the
Molecule
)
- reference (u.Vector[length]) – Reference positions for RMSD. (default:
traj.frames[0].positions
)
Returns: list of RMSD displacements for each frame in the trajectory
Return type: u.Vector[length]
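The formula above can be evaluated frame by frame as in the sketch below. It follows the expression exactly as written, which sums over atoms without dividing by their number; whether the library applies such a normalization is not stated here. The function name `rmsd_per_frame` and the plain-tuple coordinates are assumptions made for illustration.

```python
import math


def rmsd_per_frame(frames, reference=None):
    """frames: list of frames, each a list of (x, y, z) positions.
    Returns one value per frame, following the formula as documented."""
    if reference is None:
        reference = frames[0]  # default reference: the first frame
    result = []
    for positions in frames:
        total = sum((a - b) ** 2
                    for r, r0 in zip(positions, reference)
                    for a, b in zip(r, r0))
        result.append(math.sqrt(total))
    return result


# Two atoms, two frames (made-up coordinates)
frames = [
    [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)],
    [(0.0, 3.0, 0.0), (1.0, 4.0, 0.0)],
]
displacements = rmsd_per_frame(frames)
```

In the second frame the atoms have moved by 3 and 4 length units, so the summed squares are 25.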
-
slice_frames
(key, missing=None)[source]¶ Return an array giving the value of
key
at each frame.
Parameters: - key (str) – name of the property, e.g., time, potential_energy, annotation, etc
- missing – value to return if a given frame does not have this property
Returns: vector containing the value at each frame, or the value given in the
missing
keyword (len= len(self) )
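The behavior of `slice_frames` can be sketched with plain dictionaries standing in for frames. This is an illustration only: the free function below is hypothetical, returns a Python list rather than a unit-aware vector, and says nothing about how the library assembles its result.

```python
def slice_frames(frames, key, missing=None):
    """Collect the value of `key` from every frame, substituting `missing`
    where a frame does not have that property."""
    return [frame.get(key, missing) for frame in frames]


# Hypothetical frames: the second one lacks a potential energy
frames = [
    {'time': 0.0, 'potential_energy': -1.0},
    {'time': 1.0},
    {'time': 2.0, 'potential_energy': -1.5},
]

times = slice_frames(frames, 'time')
energies = slice_frames(frames, 'potential_energy', missing=None)
```

The result always has one entry per frame, so it lines up with `times` for plotting or further analysis.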
Return type: moldesign.units.Vector