3. Biopolymer API

3.1. Residues

class moldesign.Residue(**kwargs)

A biomolecular residue - most often an amino acid, a nucleic base, or a solvent molecule. In PDB structures, also often refers to non-biochemical molecules.

Its children are almost always atoms.

parent

mdt.Molecule – the molecule this residue belongs to

chain

Chain – the chain this residue belongs to

add(atom, key=None)

Add a child to this entity.

Raises:

KeyError – if an object with this key already exists

Parameters:
  • atom (Entity or mdt.Atom) – the child object to add
  • key (str) – Key to retrieve this item (default: atom.name)
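
The duplicate-key behaviour of `add` can be illustrated with a minimal stand-in container. `SketchAtom` and `SketchContainer` are hypothetical names, not moldesign classes; the sketch assumes only what this entry states, namely that the key defaults to the child's name and that a duplicate key raises KeyError.

```python
class SketchAtom:
    """Hypothetical stand-in for an atom; it only carries a name."""

    def __init__(self, name):
        self.name = name


class SketchContainer:
    """Hypothetical stand-in for a residue-like container of children."""

    def __init__(self):
        self.children = {}

    def add(self, atom, key=None):
        # The key defaults to the child's own name, as the entry describes.
        if key is None:
            key = atom.name
        # Refuse to overwrite an existing child stored under the same key.
        if key in self.children:
            raise KeyError('an item with key %r already exists' % key)
        self.children[key] = atom


residue = SketchContainer()
residue.add(SketchAtom('CA'))
residue.add(SketchAtom('CA'), key='CA_alt')  # an explicit key avoids the clash

try:
    residue.add(SketchAtom('CA'))  # same default key as the first atom
    duplicate_rejected = False
except KeyError:
    duplicate_rejected = True
```

Callers that may add the same atom name twice would need to either pass an explicit `key` or catch KeyError, as shown.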
assign_template_bonds()

Assign bonds from bioresidue templates.

Only assigns bonds that are internal to this residue (does not connect different residues). The topologies here assume pH 7.4 and may need to be corrected for other pH values.

See also

moldesign.Chain.assign_biopolymer_bonds for assigning inter-residue bonds

Raises:
  • ValueError – if residue.resname is not in bioresidue templates
  • KeyError – if an atom in this residue is not recognized
atomnames

Residue – synonym for `self` for the sake of readability – `molecule.chains['A'].residues[123].atomnames['CA']`

backbone

AtomList – all backbone atoms for nucleic and protein residues (identified using PDB names); returns None for other residue types

code

str – one-letter amino acid code or two-letter nucleic acid code, or ‘?’ otherwise

copy()

Copy a group of atoms which may already have bonds, residues, and a parent molecule assigned. Do so by copying only the relevant entities, and creating a “mask” with deepcopy’s memo function to stop anything else from being copied.

Returns:list of copied atoms
Return type:AtomList
is_3prime_end

bool – this is the last base in a strand

Raises:ValueError – if this residue is not a DNA base
is_5prime_end

bool – this is the first base in a strand

Raises:ValueError – if this residue is not a DNA base
is_c_terminal

bool – this is the last residue in a peptide

Raises:ValueError – if this residue is not an amino acid
is_monomer

bool – this residue is not part of a biopolymer

is_n_terminal

bool – this is the first residue in a peptide

Raises:ValueError – if this residue is not an amino acid
is_standard_residue

bool – this residue is a “standard residue” for the purposes of a PDB entry.

In PDB files, this residue’s atoms are stored in ‘ATOM’ records if it is a standard residue and in ‘HETATM’ records if not.

Note

We currently define “standard” residues as those whose three-letter residue code appears in the moldesign.data.RESIDUE_DESCRIPTIONS dictionary. Although this seems to work well, we’d welcome a PR with a less hacky method.

References

PDB format guide: http://www.wwpdb.org/documentation/file-format
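
The membership test described in the note can be sketched as follows. `SKETCH_STANDARD_CODES` and `pdb_record_name` are hypothetical names used for illustration; the set is a tiny stand-in for the keys of moldesign.data.RESIDUE_DESCRIPTIONS, whose actual contents are not reproduced here.

```python
# Hypothetical stand-in for the keys of moldesign.data.RESIDUE_DESCRIPTIONS;
# the real dictionary is larger and its exact contents are not assumed here.
SKETCH_STANDARD_CODES = {'ALA', 'GLY', 'SER'}


def pdb_record_name(resname, standard_codes=SKETCH_STANDARD_CODES):
    """Return the PDB record name used for the atoms of a residue."""
    # Standard residues are written as ATOM records, all others as HETATM.
    return 'ATOM' if resname in standard_codes else 'HETATM'


records = {name: pdb_record_name(name) for name in ('ALA', 'LIG')}
```

Here `'LIG'` is a made-up residue code that is absent from the stand-in set, so it falls through to the ‘HETATM’ branch.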

markdown_summary()

Markdown-formatted information about this residue

Returns:markdown-formatted string
Return type:str
next_residue

Residue – The next residue in the chain (in the C-direction for proteins, 3’ direction for nucleic acids)

Raises:
  • NotImplementedError – If we don’t know how to deal with this type of biopolymer
  • StopIteration – If there isn’t a next residue (i.e. it’s a 3’- or C-terminus)
prev_residue

Residue – The previous residue in the chain (in the N-direction for proteins, 5’ direction for nucleic acids)

Raises:
  • NotImplementedError – If we don’t know how to deal with this type of biopolymer
  • StopIteration – If there isn’t a previous residue (i.e. it’s a 5’- or N-terminus)
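
Because `next_residue` and `prev_residue` signal the end of a chain by raising StopIteration rather than returning None, code that walks along a chain has to catch that exception. The sketch below shows one way to do this with a hypothetical stand-in class (`SketchResidue` is not a moldesign class, and `residues_downstream` is an illustrative helper).

```python
class SketchResidue:
    """Hypothetical stand-in: a residue that knows its position in a list."""

    def __init__(self, name, siblings):
        self.name = name
        self._siblings = siblings
        siblings.append(self)

    @property
    def next_residue(self):
        idx = self._siblings.index(self) + 1
        if idx >= len(self._siblings):
            raise StopIteration('%s has no next residue' % self.name)
        return self._siblings[idx]

    @property
    def prev_residue(self):
        idx = self._siblings.index(self) - 1
        if idx < 0:
            raise StopIteration('%s has no previous residue' % self.name)
        return self._siblings[idx]


def residues_downstream(residue):
    """Collect residues from `residue` up to the last one in the chain."""
    found = [residue]
    while True:
        try:
            residue = residue.next_residue
        except StopIteration:
            # Reached the terminus: stop walking and return what we have.
            return found
        found.append(residue)


chain = []
for resname in ('MET', 'ALA', 'GLY'):
    SketchResidue(resname, chain)

names = [res.name for res in residues_downstream(chain[0])]
```

Catching the exception explicitly matters inside generator functions in particular: since Python 3.7, a StopIteration that escapes a generator body is converted to RuntimeError.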
resname

str – Synonym for pdbname

sidechain

AtomList – all sidechain atoms for nucleic and protein residues (defined as non-backbone atoms); returns None for other residue types

type

str – Classification of the residue (protein, solvent, dna, water, unknown)

3.2. Chains

class moldesign.Chain(pdbname=None, **kwargs)

Biomolecular chain class - its children are almost always residues.

parent

mdt.Molecule – the molecule this chain belongs to

chain

Chain – the chain this residue belongs to

assign_biopolymer_bonds()

Connect bonds between residues in this chain.

See also

moldesign.Residue.assign_template_bonds

Raises:
  • ValueError – if residue.resname is not in bioresidue templates
  • KeyError – if an atom in this residue is not recognized
c_terminal

moldesign.Residue – The chain’s C-terminus (or None if it does not exist)

copy()

Copy a group of atoms which may already have bonds, residues, and a parent molecule assigned. Do so by copying only the relevant entities, and creating a “mask” with deepcopy’s memo function to stop anything else from being copied.

Returns:list of copied atoms
Return type:AtomList
fiveprime_end

moldesign.Residue – The chain’s 5’ base (or None if it does not exist)

get_ligand()

Return a (single) ligand if it exists; raises ValueError if there’s not exactly one.

This is a utility routine to get a single ligand from a chain. If the chain contains exactly one unclassified residue, it is returned. If not, ValueError is raised - use Chain.unclassified_residues() to get an iterator over all unclassified residues.

Returns:ligand residue
Return type:moldesign.Residue
Raises:ValueError – if the chain does not contain exactly one unclassifiable residue
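
The "exactly one or raise" contract can be sketched without the library. `get_single_ligand` is a hypothetical helper, and treating a residue as unclassified when its type is `'unknown'` is an assumption made for illustration, not a statement about how moldesign classifies residues.

```python
def get_single_ligand(residue_types):
    """Return the name of the only unclassified residue.

    `residue_types` maps residue names to type strings. A residue is
    treated as unclassified when its type is 'unknown' (an assumption).
    """
    unclassified = [name for name, rtype in residue_types.items()
                    if rtype == 'unknown']
    if len(unclassified) != 1:
        raise ValueError('expected exactly 1 unclassified residue, found %d'
                         % len(unclassified))
    return unclassified[0]


ligand = get_single_ligand({'ALA1': 'protein', 'LIG2': 'unknown', 'HOH3': 'water'})

try:
    get_single_ligand({'LIG1': 'unknown', 'LIG2': 'unknown'})
    ambiguous_rejected = False
except ValueError:
    ambiguous_rejected = True
```

Raising on both zero and multiple candidates keeps the caller from silently picking the wrong residue when a structure contains several ligands.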
n_terminal

moldesign.Residue – The chain’s N-terminus (or None if it does not exist)

residues

ChildList – list of residues in this chain

sequence

str – this chain’s residue sequence with one-letter residue codes

threeprime_end

moldesign.Residue – The chain’s 3’ base (or None if it does not exist)

type

str – the type of chain - protein, DNA, solvent, etc.

This field returns the type of chain, classified by the following rules:

  1. If the chain contains only one type of residue, it is given that classification (so a chain containing only ions has type “ion”).
  2. If the chain contains a biopolymer + ligands and solvent, it is classified as a biopolymer (i.e. ‘protein’, ‘dna’, or ‘rna’). This is the most common case with .pdb files from the PDB.
  3. If the chain contains multiple biopolymer types, it will be given a hybrid classification (e.g. ‘dna/rna’, ‘protein/dna’) - this is rare!
  4. If it contains multiple kinds of non-biopolymer residues, it will be called “solvent” (if all non-bio residues are water/solvent/ion) or given a hybrid name as in 3).
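
The classification rules above can be sketched as a small function over a list of residue type strings. `classify_chain`, `BIOPOLYMER_ORDER` and `SOLVENT_LIKE` are hypothetical names; the ordering used for hybrid names is an assumption chosen to reproduce the examples ‘dna/rna’ and ‘protein/dna’, and the library's own implementation may differ.

```python
# Assumed ordering for hybrid names, chosen to match 'protein/dna' and 'dna/rna'.
BIOPOLYMER_ORDER = ('protein', 'dna', 'rna')
SOLVENT_LIKE = {'water', 'solvent', 'ion'}


def classify_chain(residue_types):
    """Classify a chain from the type strings of its residues."""
    kinds = set(residue_types)
    if not kinds:
        raise ValueError('chain has no residues')

    # Only one residue type present: the chain takes that type.
    if len(kinds) == 1:
        return next(iter(kinds))

    polymers = [kind for kind in BIOPOLYMER_ORDER if kind in kinds]

    # One biopolymer type plus ligands and/or solvent: report the biopolymer.
    if len(polymers) == 1:
        return polymers[0]

    # Several biopolymer types: hybrid name such as 'dna/rna'.
    if len(polymers) > 1:
        return '/'.join(polymers)

    # Only non-biopolymer residues of several kinds.
    if kinds <= SOLVENT_LIKE:
        return 'solvent'
    return '/'.join(sorted(kinds))  # alphabetical order is an assumption


examples = {
    'ions only': classify_chain(['ion', 'ion']),
    'protein with ligand and water': classify_chain(['protein', 'unknown', 'water']),
    'mixed nucleic acids': classify_chain(['dna', 'rna', 'water']),
    'water and ions': classify_chain(['water', 'ion']),
}
```

Checking the single-type case first means a chain made up only of ions is reported as ‘ion’ rather than ‘solvent’, matching the first rule.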