wwPDB 2008 News
Announcement: PDB Archive Version 3.15 to be Released
A new standardized version of the PDB archive will be available from
ftp://ftp.wwpdb.org in early 2009. All entries released prior
to December 2, 2008 will be re-released as PDB Format Version 3.15 files. This release will
overwrite all existing files. A snapshot of the archive before this release will be available from
For documentation, please see File Format Documentation.
Questions may be sent to email@example.com.
Announcement: New Releases to Follow Format Guide Version 3.20
Beginning December 2, 2008, all newly-released PDB entries will follow PDB File Format
Contents Guide Version 3.20 (PDF | HTML). This format includes
site and assembly annotation, and supports the nomenclature introduced in 2007 in the
Chemical Component Dictionary. The Version 3.20 Changes Guide
lists the changes in format from Version 3.1.
Please send any questions to firstname.lastname@example.org.
Announcement: Comprehensive Format Guide Version 3.2
During the past year, the wwPDB annotators have collaborated on a project
to clarify the details and procedures related to data processing and annotation.
The result is a PDB Contents Guide Version 3.2 that more fully describes
the PDB file format. This document is available as a PDF
and in HTML, and is accompanied by
a document highlighting these clarifications.
In the coming months, all files released by the wwPDB will follow the format as described in this document.
Details will be made available on this website and at www.wwpdb.org.
IUCr: wwPDB Exhibition Stand and Presentations
The wwPDB partners will be exhibiting at the XXI Congress & General
Assembly of the International Union of Crystallography (IUCr; August
23 - 31 in Osaka, Japan) at booth #14. Please stop by for website
demonstrations and to meet with wwPDB members from around the globe.
Helen M. Berman (RCSB PDB) will present a keynote lecture
entitled "What the Protein Data Bank tells us about the past,
present, and future of structural biology" on Sunday August 24.
On Saturday, August 30, John Westbrook (RCSB PDB) will present "Data Quality in the PDB Archive".
Download Statistics Available by Structure ID
Downloads from the PDB archive are one of the primary means of accessing scientific structure results.
While there are cross-links between the corresponding scientific publication and the PDB entry,
in many cases it is the structure file that is accessed and downloaded more frequently.
The wwPDB website has recently added statistics for FTP and HTTP (web) downloads and views for each PDB structure.
The high volume of data downloaded around the world underscores the importance of including informative,
accurate, and annotated PDB data in the archive. Data are available by month, starting from August 2007,
for each wwPDB site. These statistics can be accessed in a number of ways:
• A searchable database can be queried by ID or group of IDs. Results can display the wwPDB site and month accessed,
and include a line chart illustrating FTP and HTTP activity over time.
• Tables provide full and summary statistics. The summary table offers an overall view of activity.
For example, 18,051,769 FTP downloads and 4,122,104 HTTP downloads/web page views were made
across all wwPDB sites in July 2008. Complete reports are available for download in CSV and TAB formats.
• The Top 10 Download Statistics page offers a quick look at the structures downloaded most often
in the most recent month and overall since August 2007. For example, 1crn was the #1 structure
viewed and downloaded via HTTP from August 2007 - June 2008. In this table, mouse over the PDB ID
to view the structure title.
All download statistics are updated monthly, and collected on an aggregate, rather than
individual, basis. The wwPDB does not share server log information with third parties for marketing or other purposes.
To access these features, select Statistics>Downloads from the top menu bar at
The wwPDB website also offers links to member sites, documentation, news, and
deposition and processing statistics. Questions may be sent to email@example.com.
Workshop on Next Generation Validation Tools for the wwPDB
A meeting of the wwPDB X-ray Validation Task Force was held to collect recommendations
and develop consensus on additional validation that should be performed on PDB entries,
and to identify software applications to perform validation tasks.
The workshop was organized by Randy Read (Cambridge University), and sponsored by the
RCSB PDB & PDBe. Detailed information about the workshop is available at
Workshop on Next Generation Validation Tools for the wwPDB.
Recent wwPDB Papers
• wwPDB deposition tools, methods (including validation), and policies are described in
Data deposition and annotation at the Worldwide Protein Data Bank.
Shuchismita Dutta, Kyle Burkhardt, Ganesh J. Swaminathan, Takashi Kosada, Kim Henrick, Haruki Nakamura,
Helen M. Berman (2008) in Methods in Molecular Biology, vol. 426:
Structural Proteomics: High-Throughput Methods (Bostjan Kobe, Mitchell Guss, Thomas Huber, eds.), pp. 81-101.
• Issues relating to NMR depositions are discussed in
BioMagResBank (BMRB) as a partner in the Worldwide Protein Data Bank (wwPDB): new policies affecting biomolecular NMR depositions.
John L. Markley, Eldon L. Ulrich, Helen M. Berman, Kim Henrick, Haruki Nakamura, and Hideo Akutsu (2008)
J Biomol NMR 40(3): 153-155
PDB Archives More Than 50,000 Structures
With this week's update, the PDB archive reached a significant milestone in its 37-year history.
The 50,000th molecular structure was released into the archive, joining other
structures vital to pharmacology, bioinformatics, and education.
The Worldwide Protein Data Bank (wwPDB) has seen the archive double in size since 2004.
The PDB was founded in 1971 with seven structures at Brookhaven National Laboratory.
Today, the wwPDB receives approximately 25 new experimentally-determined structures
from scientists each day for inclusion in the archive. More than 5 million files are
downloaded from the PDB archive every month. Users include structural biologists,
computational biologists, biochemists, and molecular biologists in academia,
government, and industry as well as educators and students.
It is estimated that the size of the PDB archive will triple to 150,000 structures by the year 2014.
The 50,000th structure was released a week after another milestone event: the publication
of the 100th edition of the Molecule of the Month.
Proteins, one of the main building blocks for living organisms, come in a variety of
shapes, with the form of a protein corresponding to its function. The structures housed in
the PDB demonstrate great diversity in size, complexity, and function, including:
- Insulin, the protein deficient in diabetic patients
- p53 tumor suppressor, a protein often implicated in cancer
- Anthrax toxin, the disease-causing protein made by anthrax
- Amyloid peptide, a protein implicated in Alzheimer's disease
- Influenza proteins, structures which may help scientists design medicines to combat the flu
- Prion proteins, misshapen proteins that are the cause of many diseases, including mad cow disease
Time-stamped Copies of PDB Archive Available via FTP
A time-stamped snapshot of the PDB archive (ftp://ftp.wwpdb.org)
as of January 7, 2008 has been added to ftp://snapshots.rcsb.org/.
Snapshots of the PDB have been archived annually since 2004. It is hoped that these snapshots
will provide readily identifiable data sets for research on the PDB archive.
The script at ftp://snapshots.rcsb.org/rsyncSnapshots.sh
may be used to make a local copy of a snapshot or sections of the snapshot.
The directory 20080107 includes the 48,161 experimentally-determined coordinate files
that were current as of January 7, 2008. Coordinate data are available in PDB, mmCIF, and XML formats.
The date and time stamp of each file indicates the last time the file was modified.
Announcement: Data Processing Versioning Procedures
Data in the PDB archive currently follow either PDB File Format
Version 3.0 or 3.1. This is indicated in REMARK 4 of the file.
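As a minimal sketch of how the version might be read from a file, the Python function below scans for a REMARK 4 line and extracts the version number. It assumes the REMARK 4 text has the form "<ID> COMPLIES WITH FORMAT V. <version>, <date>"; that wording is an assumption here and should be checked against the format documentation. The function name format_version and the sample lines are illustrative only.

```python
import re

# Assumed layout of the REMARK 4 line (verify against the format guide):
#   REMARK   4 1ABC COMPLIES WITH FORMAT V. 3.1, 01-AUG-07
_REMARK4 = re.compile(
    r"^REMARK\s+4\s+\S+\s+COMPLIES WITH FORMAT V\.\s*(\d+(?:\.\d+)*)"
)


def format_version(lines):
    """Return the format version string found in REMARK 4, or None if absent."""
    for line in lines:
        match = _REMARK4.match(line)
        if match:
            return match.group(1)
    return None


# Hypothetical sample lines, not taken from a real entry.
sample = [
    "HEADER    HYPOTHETICAL ENTRY                      13-FEB-07   1ABC",
    "REMARK   4",
    "REMARK   4 1ABC COMPLIES WITH FORMAT V. 3.1, 01-AUG-07",
]
print(format_version(sample))
```

A blank "REMARK   4" spacer line is skipped because the pattern requires the entry ID and the "COMPLIES WITH FORMAT" text to follow.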
Version 3.0 is the format used for files released as a result of the
Since August 1, 2007, all files processed and released into the
archive follow Version 3.1. When modifications are made to files
released prior to that date, they are then re-released in Version 3.1.
Version 3.1 differs in descriptions of the biological unit (REMARK
300/350), geometry (REMARK 500), atom/residues modeled as zero
occupancy (REMARK 475/480), non-polymer residues with missing atoms
(REMARK 610), and metal coordination (REMARK 620). Documentation
describing the differences between these versions is available at
Beginning March 4, 2008, the REVDAT record will include the name "VERSN"
when a Version 3.0 file is re-released as Version 3.1.
For example, if the journal record is updated in an entry that still
follows Version 3.0, the REVDAT would appear as:
REVDAT   2   04-MAR-08 1ABC    1       JRNL   VERSN
REVDAT   1   13-FEB-07 1ABC    0
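Users who process files in bulk may want to detect this flag automatically. The following Python sketch parses REVDAT records by fixed column position and reports whether any of them lists VERSN. The column positions (modification number in columns 8-10, date in 14-22, ID code in 24-27, modification type in column 32, revised record names in columns 40-66) are assumptions based on the fixed-width layout of PDB files and should be confirmed against the format guide. The function names and sample records are illustrative.

```python
def parse_revdat(line):
    """Parse one fixed-column REVDAT record into a dict (assumed columns)."""
    padded = line.ljust(80)  # short lines are padded so slicing is safe
    mod_type = padded[31].strip()
    return {
        "mod_num": int(padded[7:10]),
        "date": padded[13:22].strip(),
        "id_code": padded[23:27].strip(),
        "mod_type": int(mod_type) if mod_type else None,
        "records": padded[39:66].split(),
    }


def has_versn_flag(lines):
    """Return True if any REVDAT record lists VERSN among its revised records."""
    return any(
        "VERSN" in parse_revdat(line)["records"]
        for line in lines
        if line.startswith("REVDAT")
    )


# Illustrative records laid out in fixed columns.
sample = [
    "REVDAT   2   04-MAR-08 1ABC    1       JRNL   VERSN",
    "REVDAT   1   13-FEB-07 1ABC    0",
]
for record in sample:
    print(parse_revdat(record))
print(has_versn_flag(sample))
```

Slicing by column rather than splitting on whitespace keeps the parser aligned with the fixed-width record layout, at the cost of failing on files whose spacing has been collapsed.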
There is no change to how depositors submit their files. Any required
changes in nomenclature can be made automatically by the wwPDB during
the annotation process.
Documentation about file formats and the Remediation Project is available at www.wwpdb.org.