The Top-Level Namespace¶
The functions and classes listed here comprise the toolkit’s public API. These are functions and objects that can be accessed as
>>> import moldesign as mdt
>>> mdt.[name]
-
moldesign.
from_smiles
(smi, name=None, wait=True)[source]¶ Translate a smiles string to a 3D structure. This method uses OpenBabel to generate a plausible 3D conformation of the 2D SMILES topology. We only use the first result from the conformation generator.
Parameters: Returns: the translated molecule
Return type:
-
moldesign.
read
(f, format=None)[source]¶ Read in a molecule from a file, file-like object, or string. Will also depickle a pickled object.
Note
Files with
.bz2
or.gz
suffixes will be automatically decompressed. Currently does not support files with more than one record - only returns the first recordParameters: - f (str or file-like) – Either a path to a file, OR a string with the file’s contents, OR a file-like object
- format (str) – molecule format (pdb, xyz, sdf, etc.) or pickle format (recognizes p, pkl, or pickle); guessed from filename if not passed
Returns: - molecule parsed from the file (or python object, for
pickle files)
Return type: moldesign.Molecule or object
Raises: ValueError
– iff
isn’t recognized as a string, file path, or file-like object
-
moldesign.
write
(obj, filename=None, format=None, mode='w')[source]¶ Write a molecule to a file or string. Will also pickle arbitrary python objects.
Note
Files with
.bz2
or.gz
suffixes will be automatically compressed.Parameters: Returns: if filename is none, return the output file as a string (otherwise returns
None
)Return type:
-
moldesign.
write_trajectory
(traj, filename=None, format=None, overwrite=True)[source]¶ Write trajectory a file (if filename provided) or file-like buffer
Parameters: - traj (moldesign.molecules.Trajectory) – trajectory to write
- filename (str) – name of file (return a file-like object if not passed)
- format (str) – file format (guessed from filename if None)
- overwrite (bool) – overwrite filename if it exists
Returns: file-like object (only if filename not passed)
Return type:
-
moldesign.
from_pdb
(pdbcode, usecif=False)[source]¶ Import the given molecular geometry from PDB.org
Parameters: Returns: molecule object
Return type:
-
moldesign.
from_name
(name)[source]¶ Attempt to convert an IUPAC or common name to a molecular geometry.
Parameters: name (str) – molecular name (generally IUPAC - some common names are also recognized) Returns: molecule object Return type: moldesign.Molecule
-
moldesign.
distance
(a1, a2)[source]¶ Return distance between two atoms
Parameters: a1,a2 (mdt.Atom) – the two atoms Returns: the distance Return type: u.Scalar[length]
-
moldesign.
angle
(a1, a2, a3)[source]¶ The angle between bonds a2-a1 and a2-a3
Parameters: a1,a2,a3 (mdt.Atom) – the atoms describing the angle Returns: the distance Return type: u.Scalar[length]
-
moldesign.
dihedral
(a1, a2=None, a3=None, a4=None)[source]¶ Twist angle of bonds a1-a2 and a4-a3 around around the central bond a2-a3
- Can be called as
dihedral(a1, a2, a3, a4)
- OR
dihedral(a2, a2)
ORdihedral(bond)
Parameters: - a1 (mdt.Bond) – the central bond in the dihedral. OR
- a1,a2 (mdt.Atom) – the atoms describing the dihedral
- a3,a4 (mdt.Atom) – (optional) if not passed,
a1
anda2
will be treated as the central atoms in this bond, and a3 and a4 will be inferred.
Returns: angle - [0, 2 pi) radians
Return type: (units.Scalar[angle])
- Can be called as
-
moldesign.
distance_gradient
(a1, a2)[source]¶ Gradient of the distance between two atoms,
\[\frac{\partial \mathbf{R}_1}{\partial \mathbf{r}} ||\mathbf{R}_1 - \mathbf{R}_2|| = \frac{\mathbf{R}_1 - \mathbf{R}_2}{||\mathbf{R}_1 - \mathbf{R}_2||}\]Parameters: a1,a2 (mdt.Atom) – the two atoms Returns: (gradient w.r.t. first atom, gradient w.r.t. second atom) Return type: Tuple[u.Vector[length], u.Vector[length]]
-
moldesign.
angle_gradient
(a1, a2, a3)[source]¶ Gradient of the angle between bonds a2-a1 and a2-a3
\[\nabla \theta_{ijkl} = \frac{\partial \theta_{ijkl}}{\partial \mathbf R}\]Parameters: a1,a2,a3 (mdt.Atom) – the atoms describing the vector References
-
moldesign.
dihedral_gradient
(a1, a2, a3, a4)[source]¶ Cartesian gradient of a dihedral coordinate,
\[\nabla \theta_{ijkl} = \frac{\partial \theta_{ijkl}}{\partial \mathbf R}\]Parameters: a1,a2,a3,a4 (mdt.Atom) – the atoms describing the dihedral References
-
moldesign.
set_distance
(a1, a2, newlength, adjustmol=True)[source]¶ Set the distance between two atoms. They will be adjusted along the vector separating them. If the two atoms are A) bonded, B) not part of the same ring system, and C)
adjustmol
is True, then the entire molecule’s positions will be modified as wellParameters: - a1,a2 (mdt.Atom) – atoms to adjust
- newlength (u.Scalar[length]) – new length to set
- adjustmol (bool) – Adjust all atoms on either side of this bond?
-
moldesign.
set_angle
(a1, a2, a3, theta, adjustmol=True)[source]¶ Set the angle between bonds a1-a2 and a3-a2. The atoms will be adjusted along the gradient of the angle. If
adjustmol
is True and the topology is unambiguous, then the entire molecule’s positions will be modified as wellParameters: - a1,a2,a3 (mdt.Atom) – atoms to adjust
- theta (u.Scalar[angle]) – new angle to set
- adjustmol (bool) – Adjust all atoms on either side of this bond?
-
moldesign.
set_dihedral
(a1, a2=None, a3=None, a4=None, theta=None, adjustmol=True)[source]¶ Set the twist angle of atoms a1 and a4 around the central bond a2-a3. The atoms will be adjusted along the gradient of the angle.
- Can be called as
set_dihedral(a1, a2, a3, a4, theta, adjustmol=True)
- OR
set_dihedral(a2, a2, theta, adjustmol=True)
ORset_dihedral(bond, theta, adjustmol=True)
If
adjustmol
is True and the topology is unambiguous, then the entire molecule’s positions will be modified as wellParameters: - a1 (mdt.Bond) – central bond in dihedral
- a1,a2 (mdt.Atom) – atoms around central bond in dihedral
- a4 (a3,) –
- theta (u.Scalar[angle]) – new angle to set
- adjustmol (bool) – Adjust all atoms on either side of this bond?
- Can be called as
-
moldesign.
gradient_descent
(*args, **kwargs)¶ A careful (perhaps overly careful) gradient descent implementation designed to relax structures far from equilibrium.
A backtracking line search is performed along the steepest gradient direction.
The maximum move for any single atom is also limited by
max_atom_move
Note
- This algorithm is good at stably removing large forces, but it’s very poorly suited to
- locating any type of critical point; don’t use this to find a minimum!
References
https://www.math.washington.edu/~burke/crs/408/lectures/L7-line-search.pdf
Parameters: - mol (moldesign.Molecule) – molecule to minimize
- max_atom_move (Scalar[length]) – maximum displacement of a single atom
- scaling (Scalar[length/force]) – unit of displacement per unit force
- gamma (float) – number between 0 and 1 indicating scale factor for backtracking search
- control (float) – threshold for terminating line search; this is a proportion (0<=``control``<=1) of the expected function decrease
- **kwargs (dict) – kwargs from
MinimizerBase
-
moldesign.
minimize
(*args, **kwargs)¶ Uses gradient descent until forces fall below a threshold, then switches to BFGS (unconstrained) or SLSQP (constrained).
Parameters: gd_threshold (u.Scalar[force]) – Use gradient descent if there are any forces larger than this; use an approximate hessian method (BFGS or SLSQP) otherwise Note
Not really that smart.
-
class
moldesign.
BasisSet
(mol, orbitals, name=None, h1e=None, overlaps=None, angulartype=None, **kwargs)[source]¶ Stores a basis, typically of atomic orbitals.
This is a special orbital type
-
class
moldesign.
AtomList
(*args, **kwargs)[source]¶ A list of atoms that allows attribute “slicing” - accessing an attribute of the list will return a list of atom attributes.
Example
>>> atomlist.mass == [atom.mass for atom in atomlist.atoms] >>> getattr(atomlist, attr) == [getattr(atom, attr) for atom in atomlist.atoms]
-
atoms
¶ This is a synonym for self so that AtomContainer methods will work here too
-
-
class
moldesign.
Bond
(a1, a2, order=None)[source]¶ A bond between two atoms.
Parameters: Notes
Comparisons and hashes involving bonds will return True if the atoms involved in the bonds are the same. Bond orders are not compared.
These objects are used to represent and pass bond data only - they are not used for storage.
-
a1
¶ Atom – First atom in the bond; assigned so that
self.a1.index < self.a2.index
-
a2
¶ Atom – Second atom in the bond; assigned so that
self.a2.index > self.a1.index
-
order
¶ int – bond order (can be
None
); not used in comparisons
-
ff
¶ mdt.forcefield.BondTerm – the force-field term for this bond (or
None
if no forcefield is present)
-
name
¶ str – name of the bond
-
-
class
moldesign.
Atom
(name=None, atnum=None, mass=None, chain=None, residue=None, formal_charge=None, pdbname=None, pdbindex=None, element=None)[source]¶ A data structure representing an atom.
Atom
objects store information about individual atoms within a larger molecular system, providing access to atom-specific geometric, biomolecular, topological and property information. EachMolecule
is composed of a list of atoms.Atoms can be instantiated directly, but they will generally be created automatically as part of molecules.
Parameters: - name (str) – The atom’s name (if not passed, set to the element name + the atom’s index)
- atnum (int) – Atomic number (if not passed, determined from element if possible)
- mass (units.Scalar[mass]) – The atomic mass (if not passed, set to the most abundant isotopic mass)
- chain (moldesign.Chain) – biomolecular chain that this atom belongs to
- residue (moldesign.Residue) – biomolecular residue that this atom belongs to
- pdbname (str) – name from PDB entry, if applicable
- pdbindex (int) – atom serial number in the PDB entry, if applicable
- element (str) – Elemental symbol (if not passed, determined from atnum if possible)
Atom instance attributes:
-
name
¶ str – A descriptive name for this atom
-
element
¶ str – IUPAC elemental symbol (‘C’, ‘H’, ‘Cl’, etc.)
-
index
¶ int – the atom’s current index in the molecule (
self is self.parent.atoms[ self.index]
)
-
atnum
¶ int – atomic number (synonyms: atomic_num)
-
mass
¶ u.Scalar[mass] – the atom’s mass
-
position
¶ units.Vector[length] – atomic position vector. Once an atom is part of a molecule, this quantity will refer to
self.molecule.positions[self.index]
.
-
momentum
¶ units.Vector[momentum] – atomic momentum vector. Once an atom is part of a molecule, this quantity will refer to
self.molecule.momenta[self.index]
.
-
x,y,z
u.Scalar[length] – x, y, and z components of
atom.position
-
vx, vy, vz
u.Scalar[length/time] – x, y, of
atom.velocity
-
px, py, pz
u.Scalar[momentum] – x, y, and z of
atom.momentum
-
fx, fy, fz
u.Scalar[force] – x, y, and z
atom.force
-
residue
¶ moldesign.Residue – biomolecular residue that this atom belongs to
-
chain
¶ moldesign.Chain – biomolecular chain that this atom belongs to
-
parent
¶ moldesign.Molecule – molecule that this atom belongs to
-
index
int – index in the parent molecule:
atom is atom.parent.atoms[index]
Atom methods and properties
See also methods offered by the mixin superclasses:
AtomDrawingMixin
AtomGeometryMixin
AtomPropertyMixin
AtomReprMixin
-
bond_graph
¶ Mapping[Atom, int] – dictionary of this atoms bonded neighbors, of the form
{bonded_atom1, bond_order1, ...}
-
bond_to
(other, order)[source]¶ Create or modify a bond with another atom
Parameters: Returns: bond object
Return type:
-
bonds
¶ List[Bond] – list of all bonds this atom is involved in
-
copy
(self)¶ Copy a group of atoms which may already have bonds, residues, and a parent molecule assigned. Do so by copying only the relevant entities, and creating a “mask” with deepcopy’s memo function to stop anything else from being copied.
Returns: list of copied atoms Return type: AtomList
-
elem
¶ str – elemental symbol
-
element
str – elemental symbol
-
force
¶ (units.Vector[force]) – atomic force vector. This quantity must be calculated - it is equivalent to
self.molecule.forces[self.index]
Raises: moldesign.NotCalculatedError
– if molecular forces have not been calculated
-
heavy_bonds
¶ List[Bond] – list of all heavy atom bonds (where BOTH atoms are not hydrogen)
Note
this returns an empty list if called on a hydrogen atom
-
nbonds
¶ int – the number of other atoms this atom is bonded to
-
num_bonds
¶ int – the number of other atoms this atom is bonded to
-
symbol
¶ str – elemental symbol
-
valence
¶ int – the sum of this atom’s bond orders
-
velocity
¶ u.Vector[length/time, 3] – velocity of this atom; equivalent to
self.momentum/self.mass
-
class
moldesign.
Entity
(name=None, molecule=None, index=None, pdbname=None, pdbindex=None, **kwargs)[source]¶ Generalized storage mechanism for hierarchical representation of biomolecules, e.g. by residue, chain, etc. Permits other groupings, provided that everything is tree-like.
All children of a given entity must have unique names. An individual child can be retrieved with
entity.childname
orentity['childname']
orentity[index]
Yields: Entity or mdt.Atom – this entity’s children, in order
-
class
moldesign.
Instance
(name=None, molecule=None, index=None, pdbname=None, pdbindex=None, **kwargs)[source]¶ The singleton biomolecular container for each
Molecule
. Its children are generally PDB chains. Users won’t ever really see this object.
-
class
moldesign.
Residue
(**kwargs)[source]¶ A biomolecular residue - most often an amino acid, a nucleic base, or a solvent molecule. In PDB structures, also often refers to non-biochemical molecules.
Its children are almost always residues.
-
parent
¶ mdt.Molecule – the molecule this residue belongs to
-
chain
¶ Chain – the chain this residue belongs to
-
add
(atom, key=None)[source]¶ Add a child to this entity.
Raises: KeyError
– if an object with this key already existsParameters: - item (Entity or mdt.Atom) – the child object to add
- key (str) – Key to retrieve this item (default:
item.name
)
-
assign_template_bonds
()[source]¶ Assign bonds from bioresidue templates.
Only assigns bonds that are internal to this residue (does not connect different residues). The topologies here assume pH7.4 and may need to be corrected for other pHs
See also
moldesign.Chain.assign_biopolymer_bonds for assigning inter-residue bonds
Raises: ValueError
– ifresidue.resname
is not in bioresidue templatesKeyError
– if an atom in this residue is not recognized
-
atomnames
¶ Residue – synonym for
`self`
for for the sake of readability –`molecule.chains['A'].residues[123].atomnames['CA']`
-
backbone
¶ AtomList – all backbone atoms for nucleic and protein residues (indentified using PDB names); returns None for other residue types
-
code
¶ str – one-letter amino acid code or two letter nucleic acid code, or ‘?’ otherwise
-
copy
()[source]¶ Copy a group of atoms which may already have bonds, residues, and a parent molecule assigned. Do so by copying only the relevant entities, and creating a “mask” with deepcopy’s memo function to stop anything else from being copied.
Returns: list of copied atoms Return type: AtomList
-
is_3prime_end
¶ bool – this is the last base in a strand
Raises: ValueError
– if this residue is not a DNA base
-
is_5prime_end
¶ bool – this is the first base in a strand
Raises: ValueError
– if this residue is not a DNA base
-
is_c_terminal
¶ bool – this is the first residue in a peptide
Raises: ValueError
– if this residue is not an amino acid
-
is_monomer
¶ bool – this residue is not part of a biopolymer
-
is_n_terminal
¶ bool – this is the last residue in a peptide
Raises: ValueError
– if this residue is not an amino acid
-
is_standard_residue
¶ bool – this residue is a “standard residue” for the purposes of a PDB entry.
In PDB files, this will be stored using ‘ATOM’ if this is a standard residue and ‘HETATM’ records if not.
Note
We currently define “standard” residues as those whose 3 letter residue code appears in the
moldesign.data.RESIDUE_DESCRIPTIONS
dictionary. Although this seems to work well, we’d welcome a PR with a less hacky method.References
PDB format guide: http://www.wwpdb.org/documentation/file-format
-
markdown_summary
()[source]¶ Markdown-formatted information about this residue
Returns: markdown-formatted string Return type: str
-
next_residue
¶ - Residue – The next residue in the chain (in the C-direction for proteins, 3’
- direction for nucleic acids)
Raises: NotImplementedError
– If we don’t know how to deal with this type of biopolymerStopIteration
– If there isn’t a next residue (i.e. it’s a 3’- or C-terminus)
-
prev_residue
¶ Residue –
- The next residue in the chain (in the N-direction for proteins, 5’ direction for
- nucleic acids)
Raises: NotImplementedError
– If we don’t know how to deal with this type of biopolymerStopIteration
– If there isn’t a previous residue (i.e. it’s a 5’- or N-terminus)
-
resname
¶ str – Synonym for pdbname
-
sidechain
¶ AtomList – all sidechain atoms for nucleic and protein residues (defined as non-backbone atoms); returns None for other residue types
-
type
¶ str – Classification of the residue (protein, solvent, dna, water, unknown)
-
-
class
moldesign.
Chain
(pdbname=None, **kwargs)[source]¶ Biomolecular chain class - its children are almost always residues.
-
parent
¶ mdt.Molecule – the molecule this residue belongs to
-
chain
¶ Chain – the chain this residue belongs to
-
assign_biopolymer_bonds
()[source]¶ Connect bonds between residues in this chain.
See also
moldesign.Residue.assign_template_bonds
Raises: ValueError
– ifresidue.resname
is not in bioresidue templatesKeyError
– if an atom in this residue is not recognized
-
c_terminal
¶ moldesign.Residue – The chain’s C-terminus (or
None
if it does not exist)
-
copy
()[source]¶ Copy a group of atoms which may already have bonds, residues, and a parent molecule assigned. Do so by copying only the relevant entities, and creating a “mask” with deepcopy’s memo function to stop anything else from being copied.
Returns: list of copied atoms Return type: AtomList
-
fiveprime_end
¶ moldesign.Residue – The chain’s 5’ base (or
None
if it does not exist)
-
get_ligand
()[source]¶ Return a (single) ligand if it exists; raises ValueError if there’s not exactly one
This is a utility routine to get a single ligand from a chain. If there’s exactly one residue, it is returned. If not, ValueError is raised - use
Chain.unclassified_residues()
to get an iterator over all unclassified residues.Returns: ligand residue Return type: moldesign.Residue Raises: ValueError
– if the chain does not contain exactly one unclassifiable residue
-
n_terminal
¶ moldesign.Residue – The chain’s N-terminus (or
None
if it does not exist)
-
residues
¶ ChildList – list of residues in this chain
-
sequence
¶ str – this chain’s residue sequence with one-letter residue codes
-
threeprime_end
¶ moldesign.Residue – The chain’s 3’ base (or
None
if it does not exist)
-
type
¶ str – the type of chain - protein, DNA, solvent, etc.
This field returns the type of chain, classified by the following rules: 1) If the chain contains only one type of residue, it is given that classification
(so a chain containing only ions has type “ion”- If the chain contains a biopolymer + ligands and solvent, it is classified as a biopolymer (i.e. ‘protein’, ‘dna’, or ‘rna’). This is the most common case with .pdb files from the PDB.
- If the chain contains multiple biopolymer types, it will be given a hybrid classification (e.g. ‘dna/rna’, ‘protein/dna’) - this is rare!
- If it contains multiple kinds of non-biopolymer residues, it will be called “solvent” (if all non-bio residues are water/solvent/ion) or given a hybrid name as in 3)
-
-
class
moldesign.
MolecularProperties
(mol, **properties)[source]¶ Stores property values for a molecule. These objects will be generally created and updated by EnergyModels, not by users.
-
class
moldesign.
Molecule
(atomcontainer, name=None, bond_graph=None, copy_atoms=False, pdbname=None, charge=None, electronic_state_index=0)[source]¶ Molecule
objects store a molecular system, including atoms, 3D coordinates, molecular properties, biomolecular entities, and other model-specific information. Interfaces with simulation models take place through the molecule object.Molecule objects will generally be created by reading files or parsing other input; see, for example:
moldesign.read()
,moldesign.from_smiles()
,moldesign.from_pdb()
, etc.This constructor is useful, however for copying other molecular structures (see examples below).
Parameters: - atomcontainer (AtomContainer or AtomList or List[moldesign.Atom]) –
atoms that make up this molecule.
Note
If the passed atoms don’t already belong to a molecule, they will be assigned to this one. If they DO already belong to a molecule, they will be copied, leaving the original molecule untouched.
- name (str) – name of the molecule (automatically generated if not provided)
- bond_graph (dict) – dictionary specifying bonds between the atoms - of the form
{atom1:{atom2:bond_order, atom3:bond_order}, atom2:...}
This structure must be symmetric; we requirebond_graph[atom1][atom2] == bond_graph[atom2][atom1]
- copy_atoms (bool) – Create the molecule with copies of the passed atoms (they will be copied automatically if they already belong to another molecule)
- pdbname (str) – Name of the PDB file
- charge (units.Scalar[charge]) – molecule’s formal charge
- electronic_state_index (int) – index of the molecule’s electronic state
Examples
Use the
Molecule
class to create copies of other molecules and substructures thereof: >>> benzene = mdt.from_name(‘benzene’) >>> benzene_copy = mdt.Molecule(benzene, name=’benzene copy’)>>> protein = mdt.from_pdb('3AID') >>> carbon_copies = mdt.Molecule([atom for atom in protein.atoms if atom.atnum==6]) >>> first_residue_copy = mdt.Molecule(protein.residues[0])
Molecule instance attributes:
-
atoms
¶ AtomList – List of all atoms in this molecule.
-
bond_graph
¶ dict – symmetric dictionary specifying bonds between the atoms:
bond_graph = {atom1:{atom2:bond_order, atom3:bond_order}, atom2:...}
bond_graph[atom1][atom2] == bond_graph[atom2][atom1]
-
residues
¶ List[moldesign.Residue] – flat list of all biomolecular residues in this molecule
-
chains
¶ Dict[moldesign.Chain] – Biomolecular chains - individual chains can be accessed as
mol.chains[list_index]
ormol.chains[chain_name]
-
name
¶ str – A descriptive name for molecule
-
charge
¶ units.Scalar[charge] – molecule’s formal charge
-
constraints
¶ List[moldesign.geom.GeometryConstraint] – list of constraints
-
ndims
¶ int – length of the positions, momenta, and forces arrays (usually 3*self.num_atoms)
-
num_atoms
¶ int – number of atoms (synonym: natoms)
-
num_bonds
¶ int – number of bonds (synonym: nbonds)
-
positions
¶ units.Array[length] – Nx3 array of atomic positions
-
momenta
¶ units.Array[momentum] – Nx3 array of atomic momenta
-
masses
¶ units.Vector[mass] – vector of atomic masses
-
dim_masses
¶ units.Array[mass] – Nx3 array of atomic masses (for numerical convenience - allows you to calculate velocity, for instance, as
velocity = mol.momenta/mol.dim_masses
-
time
¶ units.Scalar[time] – current time in dynamics
-
energy_model
¶ moldesign.models.base.EnergyModelBase – Object that calculates molecular properties - driven by mol.calculate()
-
integrator
¶ moldesign.integrators.base.IntegratorBase – Object that drives movement of 3D coordinates in time, driven by mol.run()
-
is_biomolecule
¶ bool – True if this molecule contains at least 2 biochemical residues
Molecule methods and properties
See also methods offered by the mixin superclasses:
moldesign.molecules.AtomContainer
moldesign.molecules.MolPropertyMixin
moldesign.molecules.MolDrawingMixin
moldesign.molecules.MolSimulationMixin
moldesign.molecules.MolTopologyMixin
moldesign.molecules.MolConstraintMixin
moldesign.molecules.MolReprMixin
-
addatom
(newatom)[source]¶ Add a new atom to the molecule
Parameters: newatom (moldesign.Atom) – The atom to add (it will be copied if it already belongs to a molecule)
-
addatoms
(newatoms)[source]¶ Add new atoms to this molecule. For now, we really just rebuild the entire molecule in place.
Parameters: newatoms (List[moldesign.Atom]) –
-
bonds
¶ Iterator over all bonds in the molecule
Yields: moldesign.atoms.Bond – bond object
-
deletebond
(bond)[source]¶ Remove this bond from the molecule’s topology
Parameters: Bond – bond to remove
-
is_small_molecule
¶ bool – True if molecule’s mass is less than 500 Daltons (not mutually exclusive with
self.is_biomolecule
)
-
nbonds
¶ int – number of chemical bonds in this molecule
-
newbond
(a1, a2, order)[source]¶ Create a new bond
Parameters: - a1 (moldesign.Atom) – First atom in the bond
- a2 (moldesign.Atom) – Second atom in the bond
- order (int) – order of the bond
Returns: moldesign.Bond
-
num_bonds
int – number of chemical bonds in this molecule
-
velocities
¶ u.Vector[length/time] – Nx3 array of atomic velocities
- atomcontainer (AtomContainer or AtomList or List[moldesign.Atom]) –
-
class
moldesign.
Trajectory
(mol, unit_system=None, first_frame=False)[source]¶ A
Trajectory
stores information about a molecule’s motion and how its properties change as it moves.- A trajectory object contains
- a reference to the
moldesign.Molecule
it describes, and - a list of
Frame
objects, each one containing a snapshot of the molecule at a
- a reference to the
particular point in its motion.
Parameters: - mol (moldesign.Molecule) – the trajectory will describe the motion of this molecule
- unit_system (u.UnitSystem) – convert all attributes to this unit system (default:
moldesign.units.default
) - first_frame (bool) – Create the trajectory’s first
Frame
from the molecule’s current position
-
mol
¶ moldesign.Molecule – the molecule object that this trajectory comes from
-
frames
¶ List[Frame] – a list of the trajectory frames in the order they were created
-
info
¶ str – text describing this trajectory
-
unit_system
¶ u.UnitSystem – convert all attributes to this unit system
-
align_orbital_phases
(reference_frame=None)[source]¶ Try to remove orbital sign flips between frames. If reference_frame is not passed, we’ll start with frame 0 and align successive pairs of orbitals. If reference_frame is an int, we’ll move forwards and backwards from that frame number. Otherwise, we’ll try to align every orbital frame to those in reference_frame
Parameters: reference_frame (int or Frame) – Frame
containing the orbitals to align with (default: align each frame with the previous one)
-
apply_frame
(frame)[source]¶ Reconstruct the underlying molecule with the given frame. Right now, any data not passed is ignored, which may result in properties that aren’t synced up with each other ...
-
draw
(**kwargs)¶ TrajectoryViewer: create a trajectory visualization
Parameters: **kwargs (dict) – keyword arguments for ipywidgets.Box
-
draw3d
(**kwargs)[source]¶ TrajectoryViewer: create a trajectory visualization
Parameters: **kwargs (dict) – keyword arguments for ipywidgets.Box
-
new_frame
(properties=None, **additional_data)[source]¶ Create a new frame, EITHER from the parent molecule or from a list of properties
Parameters: Returns: frame number (0-based)
Return type:
-
num_frames
¶ int – number of frames in this trajectory
-
plot
(x, y, **kwargs)[source]¶ Create a matplotlib plot of property x against property y
Parameters: Returns: the lines that were plotted
Return type: List[matplotlib.lines.Lines2D]
-
rmsd
(atoms=None, reference=None)[source]¶ Calculate root-mean-square displacement for each frame in the trajectory.
The RMSD between times \(t\) and \(t0\) is given by
\(\text{RMSD}(t;t_0) =\sqrt{\sum_{i \in \text{atoms}} \left( \mathbf{R}_i(t) - \mathbf{R}_i(t_0) \right)^2}\),
where \(\mathbf{R}_i(t)\) is the position of atom i at time t.
Parameters: - atoms (list[mdt.Atom]) – list of atoms to calculate the RMSD for (all atoms in the
Molecule
) - reference (u.Vector[length]) – Reference positions for RMSD. (default:
traj.frames[0].positions
)
Returns: list of RMSD displacements for each frame in the trajectory
Return type: u.Vector[length]
- atoms (list[mdt.Atom]) – list of atoms to calculate the RMSD for (all atoms in the
-
slice_frames
(key, missing=None)[source]¶ Return an array of giving the value of
key
at each frame.Parameters: - key (str) – name of the property, e.g., time, potential_energy, annotation, etc
- missing – value to return if a given frame does not have this property
Returns: - vector containing the value at each frame, or the value given
in the
missing
keyword) (len= len(self) )
Return type: moldesign.units.Vector
-
moldesign.
add_hydrogen
(mol, ph=None, wait=True)[source]¶ Add hydrogens to saturate atomic valences.
Parameters: - mol (moldesign.Molecule) – Molecule to saturate
- ph (float) – Assign formal charges and protonation using pH model; if None (the default), neutral protonation will be assigned where possible.
Returns: New molecule with all valences saturated
Return type:
-
moldesign.
guess_bond_orders
(mol, wait=True)[source]¶ Use OpenBabel to guess bond orders using geometry and functional group templates.
Parameters: mol (moldesign.Molecule) – Molecule to perceive the bonds of Returns: New molecule with assigned bonds Return type: moldesign.Molecule
-
moldesign.
add_water
(mol, min_box_size=None, padding=None, ion_concentration=None, neutralize=True, positive_ion='Na+', negative_ion='Cl-', wait=True)[source]¶ Solvate a molecule in a water box with optional ions
Parameters: - mol (moldesign.Molecule) – solute molecule
- min_box_size (u.Scalar[length] or u.Vector[length]) – size of the water box - either
a vector of x,y,z dimensions, or just a uniform cube length. Either this or
padding
(or both) must be passed - padding (u.Scalar[length]) – distance to edge of water box from the solute in each dimension
- neutralize (bool) – add ions to neutralize solute charge (in addition to specified ion concentration)
- positive_ion (str) – type of positive ions to add, if needed. Allowed values (from OpenMM modeller) are Cs, K, Li, Na (the default) and Rb
- negative_ion (str) – type of negative ions to add, if needed. Allowed values (from OpenMM modeller) are Cl (the default), Br, F, and I
- ion_concentration (float or u.Scalar[molarity]) – ionic concentration in addition to whatever is needed to neutralize the solute. (if float is passed, we assume the number is Molar)
Returns: new Molecule object containing both solvent and solute
Return type:
-
moldesign.
assign_forcefield
(mol, **kwargs)[source]¶ run_tleap(mol, forcefields=None, parameters=None, engine=None, image=None, wait=True, jobname=None, display=True) Drives tleap to create a prmtop and inpcrd file. Specifically uses the AmberTools 16 tleap distribution.
Defaults are as recommended in the ambertools manual.
Parameters: - mol (moldesign.Molecule) – Molecule to set up
- forcefields (List[str]) – list of the names of forcefields to use (see AmberTools manual for descriptions)
- parameters (List[ExtraAmberParameters]) – (optional) list of amber parameters for non-standard residues
- engine (pyccc.Engine) – Engine to run this job on (default:
moldesign.compute.get_engine()
) - image (str) – URL for the docker image
- wait (bool) – if True, block until this function completes and return the function’s return value. Otherwise, return a job object immediately that can be queried later.
- jobname – argument for
moldesign.compute.compute.run_job()
- display (bool) – if True, show logging output for this job
References
Ambertools Manual, http://ambermd.org/doc12/Amber16.pdf. See page 33 for forcefield recommendations.
-
moldesign.
parameterize
(mol, charges='esp', ffname='gaff2', **kwargs)[source]¶ Parameterize
mol
, typically using GAFF parameters.This will both assign a forcefield to the molecule (at
mol.ff
) and produce the parameters so that they can be used in other systems (e.g., so that this molecule can be simulated embedded in a larger protein)Note
‘am1-bcc’ and ‘gasteiger’ partial charges will be automatically computed if necessary. Other charge types must be precomputed.
Parameters: - mol (moldesign.Molecule) –
- charges (str or dict) –
- what partial charges to use? Can be a dict (
{atom:charge}
) OR - a string, in which case charges will be read from
mol.properties.[charges name]
; typical values will be ‘esp’, ‘mulliken’, ‘am1-bcc’, etc. Use ‘zero’ to set all charges to 0 (for QM/MM and testing) - what partial charges to use? Can be a dict (
- ffname (str) – Name of the gaff-like forcefield file (default: gaff2)
Returns: - Parameters for the molecule; this object can be used to create
forcefield parameters for other systems that contain this molecule
Return type:
-
moldesign.
calc_am1_bcc_charges
(mol, **kwargs)[source]¶ Calculate am1 bcc charges
Parameters: mol (moldesign.Molecule) – assign partial charges to this molecule (they will be stored at mol.properties['am1-bcc']
)Note
This will implicity run an AM1 energy minimization before calculating the final partial charges. For more control over this process, use the
moldesign.models.SQMPotential
energy model to calculate the charges.Returns: AM1-BCC partial charges on each atom Return type: Mapping[moldesign.Atom, units.Scalar[charge]]
-
moldesign.
calc_gasteiger_charges
(mol, **kwargs)[source]¶ Calculate gasteiger charges
Parameters: mol (moldesign.Molecule) – assign partial charges to this molecule Returns: - gasteiger partial charges on each atom
- (they will be stored at
mol.properties['gasteiger']
)
Return type: Mapping[moldesign.Atom, units.Scalar[charge]]
-
moldesign.
assign_formal_charges
(mol, ignore_nonzero=True)[source]¶ Assign formal charges to C,N,O,F atoms in this molecule based on valence
Parameters: - mol (moldesign.Molecule) – Molecule to assign formal charges to. The formal charges of its atoms and its total charge will be adjusted in place.
- ignore_nonzero (bool) – If formal charge is already set to a nonzero value, ignore this atom
Note
This method ONLY applies to C,N, O and F, based on a simple valence model. These results should be manually inspected for consistency.
Raises: UnhandledValenceError
– for cases not handled by the simple valence modelReferences
- These assignments are illustrated by the formal charge patterns in
- http://www.chem.ucla.edu/~harding/tutorials/formalcharge.pdf
-
moldesign.
add_missing_data
(mol)[source]¶ Add missing hydrogens, bond orders, and formal charges to a structure (often from the PDB)
Specifically, this is a convenience function that runs:
mdt.guess_bond_orders
,mdt.add_hydrogen
, andmdt.assign_formal_charges
Note
This does NOT add missing residues to biochemical structures. This functionality will be available as
moldesign.add_missing_residues()
Parameters: mol (moldesign.Molecule) – molecule to clean Returns: cleaned version of the molecule Return type: moldesign.Molecule
-
moldesign.
guess_histidine_states
(mol)[source]¶ Attempt to assign protonation states to histidine residues.
Note
This function is highly unlikely to give accurate results! It is intended for convenience when histidine states can easily be guessed from already-present hydrogens or when they are judged to be relatively unimportant.
- This can be done simply by renaming HIS residues:
- If HE2 and HD1 are present, the residue is renamed to HIP
- If only HE2 is present, the residue is renamed to HIE
- Otherwise, the residue is renamed to HID (the most common form)
Parameters: mol (moldesign.Molecule) – molecule to change (in place)
-
moldesign.
build_bdna
(sequence, **kwargs)[source]¶ Uses Ambertools’ Nucleic Acid Builder to build a 3D double-helix B-DNA structure.
Parameters: - sequence (str) – DNA sequence for one of the strands (a complementary sequence will automatically be created)
- engine (pyccc.Engine) – Engine to run this job on (default:
moldesign.compute.get_engine()
) - image (str) – URL for the docker image
- wait (bool) – if True, block until this function completes and return the function’s return value. Otherwise, return a job object immediately that can be queried later.
- jobname – argument for
moldesign.compute.compute.run_job()
- display (bool) – if True, show logging output for this job
Returns: B-DNA double helix
Return type:
-
moldesign.
build_dna_helix
(sequence, helix_type='B', **kwargs)[source]¶ Uses Ambertools’ Nucleic Acid Builder to build a 3D DNA double-helix.
Parameters: - sequence (str) – DNA sequence for one of the strands (a complementary sequence will automatically be created)
- helix_type (str) – Type of helix - ‘A’=Arnott A-DNA ‘B’=B-DNA (from standard templates and helical params), ‘LB’=Langridge B-DNA, ‘AB’=Arnott B-DNA, ‘SB’=Sasisekharan left-handed B-DNA
- engine (pyccc.Engine) – Engine to run this job on (default:
moldesign.compute.get_engine()
) - image (str) – URL for the docker image
- wait (bool) – if True, block until this function completes and return the function’s return value. Otherwise, return a job object immediately that can be queried later.
- jobname – argument for
moldesign.compute.compute.run_job()
- display (bool) – if True, show logging output for this job
All helix types except ‘B’ are taken from fiber diffraction data (see the refernce for details)
Returns: B-DNA double helix Return type: moldesign.Molecule References
See NAB / AmberTools documentation: http://ambermd.org/doc12/Amber16.pdf, pg 771-2
-
moldesign.
build_assembly
(mol, assembly_name)[source]¶ Create biological assembly using a bioassembly specification.
This routine builds a biomolecular assembly using the specification from a PDB header (if present, this data can be found in the “REMARK 350” lines in the PDB file). Assemblies are author-assigned structures created by copying, translating, and rotating a subset of the chains in the PDB file.
Parameters: - mol (moldesign.Molecule) – Molecule with assembly data (assembly data will be created by the
PDB parser at
molecule.properties.bioassembly
) - assembly_name (str OR int) – id of the biomolecular assembly to build.
Returns: mol – molecule containing the complete assembly
Return type: Raises: AttributeError
– If the molecule does not contain any biomolecular assembly dataKeyError
– If the specified assembly is not present
- mol (moldesign.Molecule) – Molecule with assembly data (assembly data will be created by the
PDB parser at
-
class
moldesign.
ChemicalGraphViewer
(mol, carbon_labels=True, names=None, display=False, _forcebig=False, **kwargs)[source]¶ Create a JSON-format graph representing the chemical structure and draw it using the NBMolViz 2D widget.
Parameters: - mol (moldesign.molecules.AtomContainer) – A collection of atoms (eg a list of atoms, a residue, a molecule. etc)
- carbon_labels (bool) – If True, draw atom names for carbons
- names (List[str]) – (optional) a list of strings to label the atoms in the drawing
(default:
[atom.name for atom in mol.atoms]
) - display (bool) – immediately display this drawing
-
class
moldesign.
DistanceGraphViewer
(atoms, distance_sensitivity=(<Quantity(3.0, 'ang')>, <Quantity(7.0, 'ang')>), bond_edge_weight=1.0, minimum_edge_weight=0.2, nonbond_weight_factor=0.66, angstrom_to_px=22.0, charge=-300, **kwargs)[source]¶ Create a 2D graph that includes edges with 3D information. This gives a 2D chemical that shows contacts from 3D space.
Parameters: - mol (moldesign.molecules.AtomContainer) – A collection of atoms (eg a list of atoms, a residue, a molecule. etc)
- distance_sensitivity (Tuple[u.Scalar[length]]) – a tuple containing the minimum and
maximum 3D distances to create edges for (default:
(3.0*u.ang, 7.0*u.ang)
) - bond_edge_weight (float) – edge weight for covalent bonds
- nonbond_weight_factor (float) – scale non-covalent edge weights by this factor
- angstrom_to_px (int) – number of pixels per angstrom
- charge (int) – the force-directed layout repulsive “charge”
-
class
moldesign.
GeometryViewer
(mol=None, style=None, display=False, render=True, **kwargs)[source]¶ Viewer for static and multiple-frame geometries
Variables: mol – Buckyball molecule -
calc_orb_grid
(orbname, npts, framenum)[source]¶ Calculate orbitals on a grid
Parameters: Returns: orbital values on a grid
Return type: moldesign.viewer.VolumetricGrid
-
draw_atom_vectors
(vecs, rescale_to=1.75, scale_factor=None, opacity=0.85, radius=0.11, render=True, **kwargs)[source]¶ For displaying atom-centered vector data (e.g., momenta, forces) :param rescale_to: rescale to this length (in angstroms) (not used if scale_factor is passed) :param scale_factor: Scaling factor for arrows: dimensions of [vecs dimensions] / [length] :param render: render immediately :param kwargs: keyword arguments for self.draw_arrow
-
handle_selection_event
(selection)[source]¶ Deals with an external selection event :param selection:todo :return:
-