"Remediation of the protein data bank archive" describes the scope and methods of the wwPDB's Remediation Project
K. Henrick; Z. Feng; W. F. Bluhm; D. Dimitropoulos; J. F. Doreleijers; S. Dutta;
J. L. Flippen-Anderson; J. Ionides; C. Kamada; E. Krissinel; C. L. Lawson; J. L. Markley;
H. Nakamura; R. Newman; Y. Shimizu; J. Swaminathan; S. Velankar; J. Ory; E. L. Ulrich;
W. Vranken; J. Westbrook; R. Yamashita; H. Yang; J. Young; M. Yousufuddin; H. M. Berman
Nucleic Acids Research 2008 36(Database issue):D426-D433; doi:10.1093/nar/gkm937
Effective February 1, 2008, structure factor amplitudes/intensities (for crystal structures)
and restraints (for NMR structures) will be a mandatory requirement for PDB deposition.
These data must be deposited at a member site of the Worldwide Protein Data Bank
RCSB PDB (www.pdb.org),
PDBj (www.pdbj.org), or
Data can be released as soon as they have been processed and approved. There is a
one-year limit on the length of time a structure and its experimental data can be put
on hold, including structures that are on hold until the associated paper is published (HPUB).
This policy was developed as a result of comments and recommendations from the PDB user community,
including the Commission on Biological Macromolecules of the International Union of Crystallography
and the NMR Task Force, and has been endorsed by the wwPDB Advisory Committee.
Questions relating to depositions should be sent to firstname.lastname@example.org.
The Worldwide Protein Data Bank Advisory Committee (wwPDBAC) met in Princeton, NJ on September 7, 2007. The
presentations by wwPDB members and the
report from the wwPDBAC are available.
After this meeting, a "Funding Forum" took place. At this forum, the wwPDBAC sought
advice on funding options for the continued operation of the wwPDB organization
from the representatives present from the agencies that fund the individual groups. A
report from that meeting is also available.
Realism about PDB
Nature Biotechnology 25, 845 - 846 (2007)
The PDB archive has been remediated by the wwPDB. It is available at
All data in the PDB archive reflect the new features incorporated as part of this wwPDB project, including standardized IUPAC nomenclature for chemical components. Users may have to download new software to view the files with the new nomenclature (e.g., RasMol, Chimera). Please see http://remediation.wwpdb.org/software.html for details.
A snapshot of the unremediated PDB archive (as of July 31, 2007) will be available at
Many thanks to the PDB community for all of the feedback on this project.
The PDB archive has been remediated and will be available starting August 1, 2007 from ftp://ftp.wwpdb.org.
All data in the PDB archive will reflect the new features incorporated as part of
this wwPDB project, including standardized IUPAC nomenclature for chemical components.
Users may have to download new software to view the files with the new nomenclature
(e.g., RasMol, Chimera). Please see http://remediation.wwpdb.org/software.html for details.
A snapshot of the unremediated PDB archive (as of July 31, 2007) will be available at ftp://ftp.rcsb.org.
An FAQ about this project and transition is available at http://www.wwpdb.org/docs.html.
Questions may also be sent to email@example.com.
Starting August 1, 2007, files processed and released into the archive by the
wwPDB sites will reflect the new features incorporated as part of the remediation project.
These files will follow the PDB Exchange Dictionary (PDBx) v1.045 and the
Protein Data Bank Contents Guide Version 3.1.
There is no change to how depositors submit their files. Any required changes in
nomenclature can be made automatically by the wwPDB during the annotation process.
The wwPDB (www.wwpdb.org) has collaborated on a project to remediate the PDB
archive and create a new set of corrected files.
The entire archive has been reviewed and remediated with the objectives of improving
the detailed chemical description of non-polymer and monomer chemical components;
standardizing atom nomenclature; updating sequence database references and taxonomies;
resolving any remaining differences between chemical and macromolecular sequences;
improving the representation of viruses; and verifying primary citation assignments.
In addition, the atom nomenclature for amino acids and nucleotides now conforms to IUPAC standards.
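For users updating their own software, the nomenclature change can be illustrated with a minimal Python sketch. It assumes only a small, illustrative subset of nucleotide atom-name changes (the asterisk in sugar atom names becoming a prime, and O1P/O2P becoming OP1/OP2). The table and the function name `remediated_atom_name` are hypothetical and are not the wwPDB's conversion procedure; the file format documentation remains the authoritative source for the full mapping.

```python
# Illustrative subset only (an assumption, not the complete wwPDB mapping):
# two phosphate oxygen names that differ between older and remediated files.
RENAMED_ATOMS = {
    "O1P": "OP1",
    "O2P": "OP2",
}


def remediated_atom_name(name):
    """Map an older nucleotide atom name to its remediated form.

    Names that are not covered by this small sketch are returned unchanged.
    """
    name = name.strip()
    if name in RENAMED_ATOMS:
        return RENAMED_ATOMS[name]
    # Sugar atoms: the asterisk becomes a prime, e.g. C1* -> C1'.
    return name.replace("*", "'")
```

A program that matches atoms by name (for example, when selecting backbone atoms) could pass older names through such a function before comparing them with names read from remediated files.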
A new FTP server containing the remediated data has been set up for testing.
The access details for this site are provided at
The new ftp site will be updated weekly in concert with the current production site at
Both sites share the same directory structure. Starting May 1, the
remediated data will be served using gzip compression.
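Because the remediated files are served with gzip compression, software that reads PDB-format files directly may need a decompression step. The following is a minimal Python sketch using the standard library; the function name `read_atom_records` is hypothetical, and the sketch assumes a PDB-format text file whose name ends in ".gz" when compressed.

```python
import gzip
from pathlib import Path


def read_atom_records(path):
    """Return the ATOM/HETATM lines from a PDB-format file, gzipped or not."""
    path = Path(path)
    # Pick the opener from the file extension; "rt" mode yields decoded text.
    opener = gzip.open if path.suffix == ".gz" else open
    with opener(path, "rt", encoding="ascii", errors="replace") as handle:
        return [
            line.rstrip("\n")
            for line in handle
            if line.startswith(("ATOM", "HETATM"))
        ]
```

The same function can be used for uncompressed files from the current production site, so a test comparing the two sites does not need separate code paths.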
Your input is very important to us. PDB users are encouraged to test the remediated data
files between April and July 2007. The details of the final transition will be announced on this website.
Detailed information about this project can be found at
Comments about the files should be sent to firstname.lastname@example.org.
Major announcements will be made at the wwPDB website (http://www.wwpdb.org) as well as on the individual member websites.
The directory 20070102 includes the 40,933 experimentally determined coordinate files
that were current (i.e., not obsolete) as of January 2, 2007. Coordinate data are available
in PDB, mmCIF, and XML formats. The date and time stamp of each file indicates the last time the file was modified.
Scripts are available to automatically download data:
Entries in the PDB archive have been processed by the three members of the wwPDB (RCSB PDB, PDBe, and PDBj).
wwPDB members work to annotate all data deposited to the PDB archive. Information
about data file formats, annotation procedures, and remediation efforts is described below.
Documentation for the different file formats for PDB data is available at
File Format Documentation
Entries in PDB format comply with the PDB Contents Guide v2.3 (July 1998).
Entries in mmCIF format comply with the PDB Exchange Dictionary v1.037 (January 2007).
Entries in XML format comply with the PDBML Schema v1.037 (January 2007).
Annotation procedures and policies are described at
There are some data items for which the processing procedures are ambiguous.
Over the course of the last 12 months, the annotation teams have worked to formalize
many aspects of PDB annotation policies and procedures. As a result, a consistent set
of annotation procedures is being defined.
Remediation project information is available at
All existing entries have been reviewed and errors have been corrected where possible.
One major change is that the atomic names will conform to IUPAC standards.
In addition, the chemical component dictionary has been updated and extended to
include more information about the chemical structures of each component.
The wwPDB Advisory Committee has reviewed and approved this effort.
Please consult this site to review test data files and the new dictionary.
The full new data set in PDB, mmCIF and XML formats will become available for review in April 2007.
Questions about these projects should be sent to email@example.com.