The Biologically Interesting Molecule Reference Dictionary (BIRD)

Overview

The Biologically Interesting molecule Reference Dictionary (BIRD) contains information about selected biologically interesting molecules in the PDB archive, such as peptide-like antibiotics, peptide-like inhibitors, and common oligosaccharides.

BIRD is an external reference file (similar to the Chemical Component Dictionary) that provides information about the chemistry, biology, and structure of these molecules.

BIRD entries include molecular weight and formula, sequence and connectivity, descriptions of structural features and functional classification, natural source (if any), and external references to corresponding UniProt or Norine entries (if applicable).

The entire BIRD resource can be downloaded from the wwPDB FTP area: https://files.wwpdb.org/pub/pdb/data/bird/
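
As a minimal sketch of fetching a file from this area with the Python standard library: the base URL is the one given above, while the relative path `prd/prd-all.cif.gz` and the helper names `bird_url` and `fetch_text` are assumptions made for illustration. Check the directory listing for the actual file layout before relying on it.

```python
import gzip
import urllib.request

# Base URL as stated in this document.
BIRD_BASE_URL = "https://files.wwpdb.org/pub/pdb/data/bird/"

# ASSUMPTION: relative path of a combined definitions file. Verify it against
# the directory listing; it is not stated in this document.
ASSUMED_ALL_DEFINITIONS = "prd/prd-all.cif.gz"


def bird_url(relative_path, base=BIRD_BASE_URL):
    """Join a relative path onto the BIRD base URL with exactly one slash."""
    return base.rstrip("/") + "/" + relative_path.lstrip("/")


def fetch_text(url, timeout=60):
    """Download a URL and return its text, decompressing .gz files."""
    with urllib.request.urlopen(url, timeout=timeout) as response:
        raw = response.read()
    if url.endswith(".gz"):
        raw = gzip.decompress(raw)
    return raw.decode("utf-8", errors="replace")


if __name__ == "__main__":
    # Performs a network request only when run as a script.
    text = fetch_text(bird_url(ASSUMED_ALL_DEFINITIONS))
    print(text[:200])
```

Keeping the relative path as a parameter means the same helper works for any file found under the base URL.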

BIRD is regularly reviewed for consistency and accuracy. The dictionary is updated each week with new definitions as the corresponding PDB entries are released in the PDB archive. It is used to uniformly annotate PDB entries containing these molecules.

For a PDB entry that contains a BIRD molecule, the BIRD ID code appears only in the PDBx/mmCIF-formatted file of the entry.
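
A rough way to list the BIRD IDs mentioned in a PDBx/mmCIF file is a plain text scan. The sketch below assumes that IDs always have the form `PRD_` followed by six digits, as in the IDs quoted in this document (for example PRD_000154 and PRD_000203). The helper name `find_bird_ids` and the sample fragment are hypothetical and not copied from a real entry.

```python
import re

# ASSUMPTION: BIRD IDs look like PRD_ plus exactly six digits.
PRD_ID_PATTERN = re.compile(r"PRD_\d{6}(?!\d)")


def find_bird_ids(mmcif_text):
    """Return the distinct BIRD IDs in order of first appearance."""
    found = []
    for match in PRD_ID_PATTERN.finditer(mmcif_text):
        if match.group() not in found:
            found.append(match.group())
    return found


# Hypothetical fragment for illustration only.
fragment = """\
1  PRD_000204  A
2  PRD_000204  B
3  PRD_000203  C
"""
print(find_bird_ids(fragment))
```

A text scan does not say which chains or residues belong to the molecule; for that, the relevant mmCIF categories would need to be parsed properly.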

BIRD Definition Details

All the components that make up a BIRD molecule in a PDB entry are listed in the BIRD definition with complete linkages. For example, a modified gramicidin A molecule in PDB entry 1KQE consists of two truncated gramicidin A molecules linked by a non-polymer residue (succinic acid). This molecule is fully defined in BIRD (BIRD ID: PRD_000154).

A BIRD molecule may be represented in a PDB entry as a polymer with sequence information, as a single ligand with chemical information, or as a branched entity (for oligosaccharides). The preferred representation is specified in the BIRD file, along with a representative PDB entry code. All PDB entries containing the same BIRD molecule have a uniform representation. An important feature of BIRD is that it provides both sequence and chemical information, regardless of how the molecule is represented in the PDB archive.

Contents of a BIRD entry:
  • Molecular representation in PDB instances: polymer, single molecule, or branched
  • Chemical information such as formula and molecular weight
  • Structural feature (Type: _pdbx_molecule_features.type) and specific function (Class: _pdbx_molecule_features.class)
  • Molecular composition including a list of polymer and non-polymer components
  • Sequence information for all representations
  • Residue linkages within a polymer/branched chain as well as between a polymer/branched chain and non-polymer components
  • Source organism and sequence database reference if available (for naturally occurring BIRD molecules)
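
To relate the list above to the data items in a definition file, the sketch below maps each content area to item names taken from the PRD_000203 example in this document. The grouping is one reading of that example and should be treated as an assumption. The names `CONTENT_ITEMS` and `missing_content` are hypothetical.

```python
# Item names are taken from the PRD_000203 example in this document.
# ASSUMPTION: the grouping into content areas is illustrative, not official.
CONTENT_ITEMS = {
    "representation": ["_pdbx_reference_molecule.represent_as"],
    "chemical information": [
        "_pdbx_reference_molecule.formula",
        "_pdbx_reference_molecule.formula_weight",
    ],
    "type and class": [
        "_pdbx_reference_molecule.type",
        "_pdbx_reference_molecule.class",
    ],
    "composition": [
        "_pdbx_reference_entity_list.component_id",
        "_pdbx_reference_entity_nonpoly.chem_comp_id",
    ],
    "sequence": [
        "_pdbx_reference_entity_sequence.one_letter_codes",
        "_pdbx_reference_entity_poly_seq.mon_id",
    ],
    "linkages": [
        "_pdbx_reference_entity_poly_link.link_id",
        "_pdbx_reference_entity_link.link_id",
    ],
    "source and database reference": [
        "_pdbx_reference_entity_src_nat.organism_scientific",
        "_pdbx_reference_entity_poly.db_code",
    ],
}


def missing_content(cif_text):
    """Map each content area to the listed item names absent from the text."""
    # Compare whole whitespace-separated tokens so that, for example,
    # "..._molecule.type_evidence_code" does not count as "..._molecule.type".
    tokens = set(cif_text.split())
    report = {}
    for area, items in CONTENT_ITEMS.items():
        absent = [item for item in items if item not in tokens]
        if absent:
            report[area] = absent
    return report
```

Not every definition is expected to contain every item: a molecule with no natural source, for instance, would have no source information to record.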

Example: the vancomycin family illustrates the information available about BIRD molecules found in the PDB archive. The four vancomycin molecules listed below share the same structural feature (type: glycopeptide) and the same functional classification (class: antibiotic). They all contain a peptide core, with or without sugar decoration.

  • (A) Chloroorienticin A has a disaccharide and a monosaccharide decorating the peptide core (PRD_000203).
  • (B) Vancomycin has a disaccharide decorating the peptide core (PRD_000204).
  • (C) Vancomycin aglycon has only a peptide core (PRD_000206).
  • (D) Desvancosaminyl vancomycin is an intermediate in the vancomycin-biosynthesis pathway. It has one saccharide linked to the peptide core (PRD_000205).

Example BIRD definition file for PRD_000203:

data_PRD_000203
##
_pdbx_reference_molecule.prd_id                      PRD_000203
_pdbx_reference_molecule.name                        "Chloroorienticin A"
_pdbx_reference_molecule.represent_as                polymer
_pdbx_reference_molecule.type                        Glycopeptide
_pdbx_reference_molecule.type_evidence_code          ?
_pdbx_reference_molecule.class                       Antibiotic
_pdbx_reference_molecule.class_evidence_code         ?
_pdbx_reference_molecule.formula                     "C73 H88 Cl2 N10 O26"
_pdbx_reference_molecule.chem_comp_id                ?
_pdbx_reference_molecule.formula_weight              1592.437
_pdbx_reference_molecule.release_status              REL
_pdbx_reference_molecule.replaces                    ?
_pdbx_reference_molecule.replaced_by                 ?
_pdbx_reference_molecule.compound_details
; CHLOROORIENTICIN A IS A TRICYCLIC GLYCOPEPTIDE, A MEMBER OF THE VANCOMYCIN
FAMILY. THE SCAFFOLD IS A HEPTAPEPTIDE  WITH THE CONFIGURATION D-D-L-D-D-L-L.
IT IS FURTHER GLYCOSYLATED BY ONE DISACCHARIDE AND ONE MONOSACCHARIDE HERE,
CHLOROORIENTICIN A IS REPRESENTED BY GROUPING TOGETHER  THE SEQUENCE (SEQRES)
AND THE THREE LIGANDS (HET) BGC AND TWO  RER.
;
_pdbx_reference_molecule.description
; CHLOROORIENTICIN A IS A TRICYCLIC GLYCOPEPTIDE, GLYCOSYLATED BY ONE
DISACCHARIDE ON RESIDUE 4  AND ONE MONOSACCHARIDE ON RESIDUE 6.
;
_pdbx_reference_molecule.representative_PDB_id_code  1gac
##
loop_
_pdbx_reference_entity_list.prd_id
_pdbx_reference_entity_list.ref_entity_id
_pdbx_reference_entity_list.component_id
_pdbx_reference_entity_list.type
_pdbx_reference_entity_list.details
PRD_000203  1  1  polymer "PEPTIDE LIKE SEQUENCE RESIDUES 1 TO 7"
PRD_000203  2  2  non-polymer  BGC
PRD_000203  3  3  non-polymer  RER
PRD_000203  3  4  non-polymer  RER
##
loop_
_pdbx_reference_entity_nonpoly.prd_id
_pdbx_reference_entity_nonpoly.ref_entity_id
_pdbx_reference_entity_nonpoly.name
_pdbx_reference_entity_nonpoly.chem_comp_id
PRD_000203  2  BETA-D-GLUCOSE  BGC
PRD_000203  3  VANCOSAMINE     RER
##
loop_
_pdbx_reference_entity_link.prd_id
_pdbx_reference_entity_link.link_id
_pdbx_reference_entity_link.link_class
_pdbx_reference_entity_link.ref_entity_id_1
_pdbx_reference_entity_link.entity_seq_num_1
_pdbx_reference_entity_link.comp_id_1
_pdbx_reference_entity_link.atom_id_1
_pdbx_reference_entity_link.ref_entity_id_2
_pdbx_reference_entity_link.entity_seq_num_2
_pdbx_reference_entity_link.comp_id_2
_pdbx_reference_entity_link.atom_id_2
_pdbx_reference_entity_link.value_order
_pdbx_reference_entity_link.component_1
_pdbx_reference_entity_link.component_2
_pdbx_reference_entity_link.details
PRD_000203  1  PN  1  4  GHP  O4   2  ?  BGC  C1  sing  1  2  ?
PRD_000203  2  PN  1  6  OMY  ODE  3  ?  RER  C1  sing  1  4  ?
PRD_000203  3  NN  2  ?  BGC  O2   3  ?  RER  C1  sing  2  3  ?
##
loop_
_pdbx_reference_entity_poly_link.prd_id
_pdbx_reference_entity_poly_link.ref_entity_id
_pdbx_reference_entity_poly_link.link_id
_pdbx_reference_entity_poly_link.atom_id_1
_pdbx_reference_entity_poly_link.comp_id_1
_pdbx_reference_entity_poly_link.entity_seq_num_1
_pdbx_reference_entity_poly_link.atom_id_2
_pdbx_reference_entity_poly_link.comp_id_2
_pdbx_reference_entity_poly_link.entity_seq_num_2
_pdbx_reference_entity_poly_link.value_order
_pdbx_reference_entity_poly_link.component_id
PRD_000203  1  1  C   MLU  1  N    OMZ  2  sing  1
PRD_000203  1  2  C   OMZ  2  N    ASN  3  sing  1
PRD_000203  1  3  OH  OMZ  2  C5   GHP  4  sing  1
PRD_000203  1  4  C   ASN  3  N    GHP  4  sing  1
PRD_000203  1  5  C   GHP  4  N    GHP  5  sing  1
PRD_000203  1  6  C3  GHP  4  OCZ  OMY  6  sing  1
PRD_000203  1  7  C3  GHP  5  CG1  3FG  7  sing  1
PRD_000203  1  8  C   GHP  5  N    OMY  6  sing  1
PRD_000203  1  9  C   OMY  6  N    3FG  7  sing  1
##
_pdbx_reference_entity_poly.prd_id         PRD_000203
_pdbx_reference_entity_poly.ref_entity_id  1
_pdbx_reference_entity_poly.db_code        NOR00692
_pdbx_reference_entity_poly.db_name        Norine
_pdbx_reference_entity_poly.type           peptide-like
##
_pdbx_reference_entity_sequence.prd_id            PRD_000203
_pdbx_reference_entity_sequence.ref_entity_id     1
_pdbx_reference_entity_sequence.type              peptide-like
_pdbx_reference_entity_sequence.NRP_flag          Y
_pdbx_reference_entity_sequence.one_letter_codes  LYNFFYF
##
loop_
_pdbx_reference_entity_poly_seq.prd_id
_pdbx_reference_entity_poly_seq.ref_entity_id
_pdbx_reference_entity_poly_seq.num
_pdbx_reference_entity_poly_seq.mon_id
_pdbx_reference_entity_poly_seq.parent_mon_id
_pdbx_reference_entity_poly_seq.hetero
_pdbx_reference_entity_poly_seq.observed
PRD_000203  1  1  MLU  LEU  N  Y
PRD_000203  1  2  OMZ  TYR  N  Y
PRD_000203  1  3  ASN  ASN  N  Y
PRD_000203  1  4  GHP  PHE  N  Y
PRD_000203  1  5  GHP  PHE  N  Y
PRD_000203  1  6  OMY  TYR  N  Y
PRD_000203  1  7  3FG  PHE  N  Y
##
loop_
_pdbx_reference_entity_src_nat.prd_id
_pdbx_reference_entity_src_nat.ref_entity_id
_pdbx_reference_entity_src_nat.ordinal
_pdbx_reference_entity_src_nat.taxid
_pdbx_reference_entity_src_nat.organism_scientific
_pdbx_reference_entity_src_nat.db_code
_pdbx_reference_entity_src_nat.db_name
PRD_000203  1  1  31958  "Amycolatopsis orientalis (previously designated Norcardia orientalis and Streptomyces orientalis)"  ?  ?
PRD_000203  1  2  31958  "AMYCOLATOPSIS ORIENTALIS"  NOR00681  Norine
##
loop_
_pdbx_prd_audit.prd_id
_pdbx_prd_audit.date
_pdbx_prd_audit.processing_site
_pdbx_prd_audit.action_type
PRD_000203  2012-02-08  RCSB  "Create molecule"
PRD_000203  2012-07-09  PDBe  "Other modification"
PRD_000203  2012-12-12  RCSB  "Initial release"
##
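
The loop_ tables in the definition above can be read with a small amount of code. The following is a minimal sketch, not a full mmCIF parser: it handles only loop_ blocks whose values are bare or double-quoted tokens on a single line, and it does not support semicolon-delimited text fields. The names `parse_loops`, `tokenize`, and `ONE_LETTER` are hypothetical. The sample data is copied from two of the tables shown above.

```python
import re

# One token is either a double-quoted string or a run of non-space characters.
TOKEN = re.compile(r'"([^"]*)"|(\S+)')

# Only the parent residues that occur in this example.
ONE_LETTER = {"LEU": "L", "TYR": "Y", "ASN": "N", "PHE": "F"}

SAMPLE = """\
loop_
_pdbx_reference_entity_list.prd_id
_pdbx_reference_entity_list.ref_entity_id
_pdbx_reference_entity_list.component_id
_pdbx_reference_entity_list.type
_pdbx_reference_entity_list.details
PRD_000203  1  1  polymer "PEPTIDE LIKE SEQUENCE RESIDUES 1 TO 7"
PRD_000203  2  2  non-polymer  BGC
PRD_000203  3  3  non-polymer  RER
PRD_000203  3  4  non-polymer  RER
##
loop_
_pdbx_reference_entity_poly_seq.prd_id
_pdbx_reference_entity_poly_seq.ref_entity_id
_pdbx_reference_entity_poly_seq.num
_pdbx_reference_entity_poly_seq.mon_id
_pdbx_reference_entity_poly_seq.parent_mon_id
_pdbx_reference_entity_poly_seq.hetero
_pdbx_reference_entity_poly_seq.observed
PRD_000203  1  1  MLU  LEU  N  Y
PRD_000203  1  2  OMZ  TYR  N  Y
PRD_000203  1  3  ASN  ASN  N  Y
PRD_000203  1  4  GHP  PHE  N  Y
PRD_000203  1  5  GHP  PHE  N  Y
PRD_000203  1  6  OMY  TYR  N  Y
PRD_000203  1  7  3FG  PHE  N  Y
##
"""


def tokenize(line):
    """Split one data line into values, stripping double quotes."""
    return [bare or quoted for quoted, bare in TOKEN.findall(line)]


def parse_loops(text):
    """Return {category: [row_dict, ...]} for every loop_ block in text."""
    lines = [line.strip() for line in text.splitlines()]
    tables = {}
    i = 0
    while i < len(lines):
        if lines[i] != "loop_":
            i += 1
            continue
        i += 1
        names = []
        while i < len(lines) and lines[i].startswith("_"):
            names.append(lines[i])
            i += 1
        values = []
        while (i < len(lines) and lines[i]
               and not lines[i].startswith(("#", "_", "loop_", "data_"))):
            values.extend(tokenize(lines[i]))
            i += 1
        if not names:
            continue
        keys = [name.split(".", 1)[1] for name in names]
        tables[names[0].split(".", 1)[0]] = [
            dict(zip(keys, values[j:j + len(keys)]))
            for j in range(0, len(values), len(keys))
        ]
    return tables


tables = parse_loops(SAMPLE)
poly_seq = tables["_pdbx_reference_entity_poly_seq"]
print("-".join(row["mon_id"] for row in poly_seq))
print("".join(ONE_LETTER[row["parent_mon_id"]] for row in poly_seq))  # LYNFFYF
```

The second printed line is built from the parent_mon_id column and agrees with the one_letter_codes value LYNFFYF given in the definition. For real work, an established mmCIF parsing library is a safer choice than this sketch.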