Preparing PDBx/mmCIF files for Depositing Structures

To better support the increasing complexity and size of data submitted to the PDB archive, the wwPDB Deposition & Biocuration system is based on the PDBx/mmCIF data dictionary and file format. The system accepts, processes and distributes PDBx/mmCIF data files.

Depositors are encouraged to use the PDBx/mmCIF format for coordinate files whenever possible.

Generating PDBx/mmCIF format files automatically

PDBx/mmCIF is the official working format of the wwPDB for coordinate files. It is flexible, extensible, and can accommodate structures of any size.

Most structure refinement programs generate PDBx/mmCIF files that are ready for deposition. If a refinement program cannot be used to generate PDBx/mmCIF files, the pdb_extract program may be used instead.

Additional information about the PDBx/mmCIF format can be found in this FAQ.

PDBx/mmCIF format is especially useful:

  • When a PDBx/mmCIF file is the output of a final refinement. The developers of REFMAC, Phenix, and Buster are involved in the development of the PDBx/mmCIF format, and these programs will output PDBx/mmCIF format files that can be deposited without additional modification.
  • When the structure to be deposited is large. In this context, a large structure is defined as having more than 99,999 atoms and/or more than 62 polymer chains. These are the restrictions of the traditional PDB format. PDBx/mmCIF has no atom number restriction and virtually no chain number restriction. Please consult the "Using pdb_extract" sections of this guide for more information on depositing large structures.
  • When it is useful to avoid manual entry of additional information via the deposition interface. In addition to converting PDB to PDBx/mmCIF format, pdb_extract can be used to add sequence and other information to a coordinate file prior to deposition.
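
As a rough illustration of the "large structure" definition above, the following Python sketch checks an atom count and a set of chain identifiers against the two limits of the traditional PDB format. The function name and its inputs are hypothetical and not part of any wwPDB tool; it only assumes that the atom count and the polymer chain identifiers are already known.

```python
# Limits of the traditional PDB format, as stated in the text above.
MAX_PDB_ATOMS = 99_999
MAX_PDB_CHAINS = 62


def exceeds_pdb_limits(atom_count, chain_ids):
    """Return a list of reasons why a structure counts as "large".

    atom_count: total number of atoms in the model (int).
    chain_ids:  iterable of polymer chain identifiers (duplicates allowed).
    An empty list means the structure fits within both limits.
    """
    reasons = []
    if atom_count > MAX_PDB_ATOMS:
        reasons.append(f"{atom_count} atoms exceeds {MAX_PDB_ATOMS}")
    # Count each chain once, even if its identifier appears many times.
    chain_count = len(set(chain_ids))
    if chain_count > MAX_PDB_CHAINS:
        reasons.append(f"{chain_count} chains exceeds {MAX_PDB_CHAINS}")
    return reasons


if __name__ == "__main__":
    # Hypothetical structure: 120,000 atoms across 70 chains.
    chains = [f"C{i}" for i in range(70)]
    for reason in exceeds_pdb_limits(120_000, chains):
        print(reason)
```

A structure for which this returns a non-empty list cannot be represented in a single traditional PDB format file and is a natural candidate for PDBx/mmCIF.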

a) Refinement packages (Preferred)

Depositors are encouraged to use the latest version of their refinement software package to output up-to-date, mmCIF-compliant deposition files.

Recent versions of the refinement packages Phenix, REFMAC, and Buster generate PDBx/mmCIF files ready for deposition:

Phenix: Instructions are available at the Phenix website, https://www.phenix-online.org/documentation/overviews/xray-structure-deposition.html

CCP4: Instructions are available for CCP4i2 (http://www.ccp4.ac.uk/deposition_ccp4i2) or CCP4 Cloud (http://www.ccp4.ac.uk/deposition_ccp4cloud)

REFMAC (when used outside of CCP4i2 or CCP4 Cloud): To output a PDBx/mmCIF file from REFMAC, add a card that reads "pdbout format mmcif". REFMAC can also read PDBx/mmCIF coordinates: specify the PDBx/mmCIF file as the XYZIN argument.
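
For depositors who assemble REFMAC keyword scripts programmatically, the following Python sketch shows one way to add the card quoted above to an existing list of keyword lines. The helper name and the sample script are hypothetical. The sketch assumes that keyword matching is case-insensitive and that a script may end with an END line, so check both against the REFMAC documentation for your version.

```python
MMCIF_CARD = "pdbout format mmcif"  # the card quoted in the text above


def with_mmcif_output(cards):
    """Return a copy of REFMAC keyword cards with mmCIF output requested.

    cards: list of keyword lines from an existing REFMAC script.
    The card is added once, before a trailing END line if there is one.
    """
    # Drop blank lines and trailing whitespace; keep everything else as given.
    result = [c.rstrip() for c in cards if c.strip()]
    # Assumption: keywords are case-insensitive, so compare in lower case.
    if any(c.strip().lower() == MMCIF_CARD for c in result):
        return result
    if result and result[-1].strip().upper() == "END":
        return result[:-1] + [MMCIF_CARD, result[-1]]
    return result + [MMCIF_CARD]


if __name__ == "__main__":
    # Hypothetical keyword script; "ncyc 10" is only an illustrative card.
    script = ["ncyc 10", "END"]
    print("\n".join(with_mmcif_output(script)))
```

The helper only edits the keyword text. Running REFMAC itself, and the choice of input and output file names, is left to the depositor's usual workflow.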

Buster: Instructions are available at the Buster website (https://www.globalphasing.com/buster/wiki/index.cgi?DepositionMmCif)

b) pdb_extract

For non-crystallography depositions, the pdb_extract program is available as an online interface and as a standalone command-line program to convert PDB format files to PDBx/mmCIF format. It extracts and harvests data in PDBx/mmCIF format from structure determination programs. To prepare PDB format data files for use with pdb_extract and the wwPDB deposition tool, please follow the instructions in the OneDep FAQ.