As of July 2023, the PDB archive has no Chemical Component Dictionary (CCD) data items to describe whether an atom within a peptide residue is part of the amino acid backbone or part of the N- or C-terminal groups of the residue. In addition, amino acid CCDs do not yet have standardised names for the backbone atoms, which makes it difficult to identify which atoms form the N- and C-terminal groups.
The Peptide Residues Remediation therefore focuses on ensuring that the backbone and terminal atoms are labelled in polypeptide residues, and that backbone atom names are standardised in polypeptide residue CCD files.
Example files are available on GitHub: https://github.com/wwPDB/backbone-extension
The backbone and terminal groups of peptide residues will be flagged using three new mmCIF items in the CCD category _chem_comp_atom, as follows:
Is the atom part of the backbone of the peptide residue?
Is the atom an N-terminal atom in the peptide residue?
Is the atom a C-terminal atom in the peptide residue?
Example: SER. Colour legend: blue: N-terminal flag; red: C-terminal flag; yellow: backbone flag
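As an illustration of how the three flags could be read by client software, the following is a minimal sketch in Python. The item names pdbx_backbone_atom_flag, pdbx_n_terminal_atom_flag and pdbx_c_terminal_atom_flag, the heavy-atom rows for SER and the helper functions are all assumptions made for this example; the example files on GitHub should be consulted for the actual item names and values. The parser handles only a simple whitespace-separated loop_ and is not a general mmCIF reader.

```python
# Hypothetical excerpt of a _chem_comp_atom loop. The three flag item names
# and the Y/N values are illustrative assumptions, not a released CCD file.
CIF_TEXT = """\
loop_
_chem_comp_atom.comp_id
_chem_comp_atom.atom_id
_chem_comp_atom.type_symbol
_chem_comp_atom.pdbx_backbone_atom_flag
_chem_comp_atom.pdbx_n_terminal_atom_flag
_chem_comp_atom.pdbx_c_terminal_atom_flag
SER N   N Y Y N
SER CA  C Y N N
SER C   C Y N Y
SER O   O Y N Y
SER CB  C N N N
SER OG  O N N N
SER OXT O Y N Y
#
"""


def parse_loop(text):
    """Parse one simple whitespace-delimited loop_ into a list of dicts."""
    names, rows = [], []
    for line in text.splitlines():
        line = line.strip()
        if not line or line == "loop_" or line.startswith("#"):
            continue
        if line.startswith("_"):
            # Keep only the item name after the category prefix.
            names.append(line.split(".", 1)[1])
        else:
            rows.append(dict(zip(names, line.split())))
    return rows


def atoms_with_flag(rows, flag):
    """Return the atom names whose given flag is set to 'Y'."""
    return [row["atom_id"] for row in rows if row.get(flag) == "Y"]


atoms = parse_loop(CIF_TEXT)
backbone = atoms_with_flag(atoms, "pdbx_backbone_atom_flag")
n_terminal = atoms_with_flag(atoms, "pdbx_n_terminal_atom_flag")
c_terminal = atoms_with_flag(atoms, "pdbx_c_terminal_atom_flag")
print("backbone:", backbone)
print("N-terminal:", n_terminal)
print("C-terminal:", c_terminal)
```

Keeping the flags as per-atom Y/N values means that an atom can carry more than one flag at once, for example a backbone atom that is also part of a terminal group.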
The following rules have been used to populate the backbone flag for all existing backbone atoms:
Example: PX1. The shortest path between the two termini, which is labelled as backbone, is shown in yellow. Ambiguous terminal groups are shown in grey.
Example: MVA. The CN atom is not labelled as backbone.
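The idea of taking the shortest path between the two termini, as in the PX1 example, can be sketched with a breadth-first search over the bond graph. This is only an illustration under stated assumptions: the function name shortest_path is hypothetical, the bond list is a hand-written heavy-atom graph of a proline-like residue rather than data read from a CCD file, and the start and end atoms are assumed to be known already. It finds the chain between the termini only and does not cover the other atoms that carry the backbone flag.

```python
from collections import deque


def shortest_path(bonds, start, end):
    """Return the shortest list of atoms from start to end, or None."""
    # Build an undirected adjacency map from the bond list.
    graph = {}
    for a, b in bonds:
        graph.setdefault(a, set()).add(b)
        graph.setdefault(b, set()).add(a)

    previous = {start: None}  # atom -> atom it was reached from
    queue = deque([start])
    while queue:
        atom = queue.popleft()
        if atom == end:
            # Walk back through the predecessors to rebuild the path.
            path = []
            while atom is not None:
                path.append(atom)
                atom = previous[atom]
            return path[::-1]
        for neighbour in sorted(graph.get(atom, ())):
            if neighbour not in previous:
                previous[neighbour] = atom
                queue.append(neighbour)
    return None


# Hand-written heavy-atom bonds of a proline-like residue (illustrative only).
BONDS = [
    ("N", "CA"), ("CA", "C"), ("C", "O"), ("C", "OXT"),
    ("CA", "CB"), ("CB", "CG"), ("CG", "CD"), ("CD", "N"),
]

chain = shortest_path(BONDS, "N", "C")
print(chain)
```

In this toy graph the ring atoms CB, CG and CD also connect N to CA, but only by a longer route, so they are not on the returned path.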
The following rules have been used to populate the N- and C-terminal flags:
Example: ALT. Sulphur replaces the O atom, so it is labelled as C-terminal.
Example: B2A. Boron replaces C, so it is labelled as C-terminal.
The following rules have been used to update the atom naming of all existing backbone atoms (excluding chromophores and capping residues; see below):
Example: 0JT. The carbon with both an amino group and a side chain is called CA.
Example: BSE. The carbon with the side chain is called CA, even when it is not the carbon that carries the amino group.
In some cases, side chain atom naming was inconsistent or clashed with the new backbone atom names, and has therefore been updated to align with the new annotation rules.
During the remediation process, many CCDs were found to have missing or incorrect leaving atoms on the backbone. Unassigned N- or C-terminal leaving atoms were particularly common. As part of the remediation, all missing or incorrect leaving atoms on the backbone have been fixed.
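A simple consistency check for this kind of problem might look like the sketch below. It is a heuristic written for illustration and not the procedure used in the remediation: it reports a terminus when atoms are flagged as belonging to it but none of them is marked as a leaving atom. The item pdbx_leaving_atom_flag is the existing CCD leaving-atom flag; the two terminal flag names, the helper functions and both atom lists are hypothetical.

```python
TERMINAL_FLAGS = (
    ("N", "pdbx_n_terminal_atom_flag"),  # assumed item name
    ("C", "pdbx_c_terminal_atom_flag"),  # assumed item name
)


def atom(atom_id, leaving, n_term, c_term):
    """Build one atom record with Y/N flags (hypothetical helper)."""
    return {
        "atom_id": atom_id,
        "pdbx_leaving_atom_flag": leaving,
        "pdbx_n_terminal_atom_flag": n_term,
        "pdbx_c_terminal_atom_flag": c_term,
    }


def termini_missing_leaving_atoms(atoms):
    """Return 'N' and/or 'C' for termini that have no leaving atom."""
    missing = []
    for label, flag in TERMINAL_FLAGS:
        terminal = [a for a in atoms if a[flag] == "Y"]
        has_leaving = any(a["pdbx_leaving_atom_flag"] == "Y" for a in terminal)
        if terminal and not has_leaving:
            missing.append(label)
    return missing


# Two made-up components: one complete, one with no N-terminal leaving atom.
complete = [
    atom("N", "N", "Y", "N"), atom("H2", "Y", "Y", "N"),
    atom("C", "N", "N", "Y"), atom("OXT", "Y", "N", "Y"),
]
incomplete = [
    atom("N", "N", "Y", "N"), atom("H2", "N", "Y", "N"),
    atom("C", "N", "N", "Y"), atom("OXT", "Y", "N", "Y"),
]

print(termini_missing_leaving_atoms(complete))
print(termini_missing_leaving_atoms(incomplete))
```

A check of this kind can only flag candidates for review; whether a leaving atom is genuinely missing still depends on the chemistry of the individual component.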
The annotation rules for chromophores and chromophore-like CCDs differ slightly from those for standard amino acid residues:
Example: CRO. The backbone heavy atoms are: N1, CA1, C1, N2, CA2, C2, O2, N3, CA3, C3, O3, OXT.
The annotation rules for N- and C-terminal cap residue CCDs differ from those for standard amino acid residues. These terminal caps are defined as: “Any CCD that only occurs on either the N- or C-terminal position of polypeptides and that does not have a standard amino acid backbone, but instead has a single group that connects them to the polypeptide sequence. This is usually a carboxyl group for N-terminal caps and an amino group for C-terminal caps.”
The rules for annotating the backbone and terminal flags for these CCDs are:
The peptide residues Chemical Component Dictionary remediation is part of the protein chemical modifications (PCMs) and post-translational modifications (PTMs) remediation project, a wwPDB collaborative project carried out principally by PDBe at EMBL-EBI and funded by BBSRC grant number BB/V018779/1.