Protein Data Bank: the single global archive for 3D macromolecular structure data has been published in the 2019 Database issue of Nucleic Acids Research. The paper was authored by "wwPDB Consortium" to underscore the importance of the global wwPDB partnership in managing the PDB archive.
This article describes the development of the PDB under the stewardship of wwPDB, including new archival content, the master format and dictionary, and major remediation efforts. The publication also emphasizes the importance of continued community interactions, outlines the scientific and technical challenges the PDB faces, and charts the road ahead.
The OneDep system for deposition, validation, and biocuration recently extended the range of metadata collected for structures solved by X-ray Free-Electron Laser (XFEL) and Serial Femtosecond Crystallography (SFX). Depositors can now provide details of sample delivery and data measurement, such as focusing optics, pulse energy, frequency, and number of crystals used, in new PDBx/mmCIF categories dedicated to XFEL and SFX experiments.
Authors of existing XFEL entries in the PDB archive will be contacted to provide additional information for their entries.
wwPDB has been certified by the CoreTrustSeal Board as a Trusted Digital Repository.
CoreTrustSeal was launched by the ICSU World Data System and the Data Seal of Approval as a unified global organization for certification of data repositories.
Requirements for this accreditation broadly cover the complete life cycle of PDB data management, project organization and oversight.
The wwPDB is dedicated to following CoreTrustSeal standards in order to sustain a freely accessible, single global archive of experimentally determined structure data for biological macromolecules as an enduring public good.
Contact authors submitting PDB data using OneDep are now required to provide their unique ORCiD identifiers with their deposition. This change will support wwPDB efforts to correctly attribute PDB structures to contact authors. At a later point, ORCiDs will be used to authenticate and reorganize access to deposition data within OneDep.
The wwPDB OneDep system for deposition, validation and biocuration will require contact authors to provide their unique ORCiD identifiers when preparing depositions later this summer. This change will enable wwPDB efforts to correctly attribute PDB structures to contact authors. At a later point, ORCiD will be used to authenticate and reorganize access to deposition data within OneDep.
The new version of OneDep improves the process of data replacement for PDB and EMDB entries prior to their release. This updated process ensures that all new data are checked and validated before the new files are merged into the deposition session. A new validation report is produced at each data update. It must be carefully reviewed, and any highlighted issues should be inspected. The report must be accepted by the depositor prior to resubmission of the data to the archive. These changes ensure that previous data are not accidentally removed in the process of replacement, and they also enable more efficient biocuration of the new files by reducing file replacement errors. Any major data errors or data inconsistencies between versions are displayed for review by the depositor so that these can be rectified prior to resubmission. Overall, these changes should improve the quality of the data and the transparency of the file replacement process. If you have any questions or feedback regarding the new file replacement process, please log into your deposition session and contact wwPDB via the communication tab.
In 2018, the Office of Research Integrity (ORI) of the U.S. Department of Health and Human Services announced its final Research Misconduct Finding in the case of H.M. Krishna Murthy. It was found that Murthy reported falsified and/or fabricated research in 10 journal publications and 12 corresponding PDB structures. While the ORI was gathering and evaluating evidence in this case, 5 Murthy structures in the PDB were obsoleted in accord with wwPDB policies in response to retraction of 4 journal publications. In response to a formal request from ORI received on April 23, 2018, the remaining 7 Murthy structures in the PDB were obsoleted, again in accord with wwPDB policies. ORI conducts its investigations so as to ensure due process for individuals accused of research misconduct, and strict confidentiality is maintained throughout. Only findings of research misconduct are made public.
In December 2009, the University of Alabama at Birmingham (UAB) announced that it planned to retract 12 PDB entries and 10 related publications authored by H.M. Krishna Murthy, in his capacity as Principal Investigator and UAB employee. Following wwPDB review of structures at the request of UAB and in accord with wwPDB policy, 5 of the structures were obsoleted upon retraction of the related publications by the journals. The remaining 7 structures were obsoleted, likewise in accord with wwPDB policy, upon receipt of the ORI request.
A detailed PDB history of this case is available.
The PDB is an archival resource that stores, annotates, and disseminates structure models and their related experimental data. The wwPDB has convened expert, community-driven Validation Task Forces for X-ray (in 2008), NMR (in 2009), and (in collaboration with the EMDataBank) Cryo-EM (in 2010) to advise on the most suitable criteria to use for validating structure entries (model, experimental data, and fit of model to data) when they are deposited. Recommendations of these validation task forces have been implemented as part of the wwPDB OneDep system for deposition, annotation, and validation of PDB structures.
The results of these wwPDB validation procedures are captured in a report that is provided to depositors and can be transmitted by them to the journal to which the corresponding manuscript is submitted. Availability of such a report greatly facilitates assessment of the reliability of structural data and its interpretation by journal editors and referees alike. The wwPDB has urged journals publishing structural data on biological macromolecules to require submission of the wwPDB validation report together with the manuscript. The continuing mission of the wwPDB partners is to safeguard the integrity and improve the quality of the structural archive, with the support of the international structural biology community.
For additional information, see
The Critical Assessment of Techniques for Protein Structure Prediction (CASP) experiments are community experiments that aim to advance the state of the art in protein structure modeling. Every other year since 1994, CASP has collected information on soon-to-be released experimental structures, passed on sequence data to the structure modeling community, and collected blind predictions of structure for assessment. About 100 modeling groups from around the world have participated. Results of CASP experiments are published in special issues of the journal Proteins (e.g., CASP12).
The success of CASP depends on the generosity of the structure determination community. We are now requesting targets for the CASP13 experiment, which will launch at the beginning of May. The CASP community needs modeling targets over a wide range of difficulty, for modeling with and without the aid of templates. X-ray, NMR and cryo-EM structures are all welcome, with particular interest in membrane proteins, protein complexes, and cryo-EM structures. We are also extending CASP to include more modeling efforts assisted by sparse experimental data, in collaboration with experimental groups working with NMR, SAXS, SANS, crosslinking, and FRET techniques, for which protein material is needed (of course this is not expected for most targets, but if available, it would be much appreciated!).
So, if you have anything suitable, we encourage you to mark your PDB deposition as a "CASP target" in wwPDB's OneDep deposition system. Alternatively, you can suggest your protein to CASP directly through the CASP13 target entry page.
For those of you who have not provided targets to CASP before, the procedure is simple and fast. We do not need the structure in advance of its PDB release, and if we are notified early enough (at least three weeks before release; more is better) there need be no delay in structure release. More details are available.
CASP target providers are regularly invited to contribute to special issue papers, for example:
2018: Kryshtafovych A, Albrecht R, Baslé A, Bule P, Caputo AT, Carvalho AL, Chao KL, Diskin R, Fidelis K, Fontes CMGA, Fredslund F, Gilbert HJ, Goulding CW, Hartmann MD, Hayes CS, Herzberg O, Hill JC, Joachimiak A, Kohring GW, Koning RI, Lo Leggio L, Mangiagalli M, Michalska K, Moult J, Najmudin S, Nardini M, Nardone V, Ndeh D, Nguyen TH, Pintacuda G, Postel S, van Raaij MJ, Roversi P, Shimon A, Singh AK, Sundberg EJ, Tars K, Zitzmann N, Schwede T. (2018). Target highlights from the first post-PSI CASP experiment (CASP12, May-August 2016). Proteins 86 (S1), 27-50. doi: 10.1002/prot.25392. PMID: 28960539
2016: Kryshtafovych A, Moult J, Baslé A, Burgin A, Craig TK, Edwards RA, Fass D, Hartmann MD, Korycinski M, Lewis RJ, Lorimer D, Lupas AN, Newman J, Peat TS, Piepenbrink KH, Prahlad J, van Raaij MJ, Rohwer F, Segall AM, Seguritan V, Sundberg EJ, Singh AK, Wilson MA, Schwede T. (2016). Some of the most interesting CASP11 targets through the eyes of their authors. Proteins 84 (S1), 34-50. doi: 10.1002/prot.24942. PMID: 26473983
2014: Kryshtafovych A, Moult J, Bales P, Bazan JF, Biasini M, Burgin A, Chen C, Cochran FV, Craig TK, Das R, Fass D, Garcia-Doval C, Herzberg O, Lorimer D, Luecke H, Ma X, Nelson DC, van Raaij MJ, Rohwer F, Segall A, Seguritan V, Zeth K, Schwede T. (2014). Challenging the state-of-the-art in protein structure prediction: Highlights of experimental target structures for the 10th critical assessment of techniques for protein structure prediction experiment CASP10. Proteins 82 (S2), 26-42. doi: 10.1002/prot.24489. PMID: 24318984
2011: Kryshtafovych A, Moult J, Bartual SG, Bazan JF, Berman H, Casteel DE, Christodoulou E, Everett JK, Hausmann J, Heidebrecht T, Hills T, Hui R, Hunt JF, Seetharaman J, Joachimiak A, Kennedy MA, Kim C, Lingel A, Michalska K, Montelione GT, Otero JM, Perrakis A, Pizarro JC, van Raaij MJ, Ramelot TA, Rousseau F, Tong L, Wernimont AK, Young J, Schwede T. (2011). Target highlights in CASP9: Experimental target structures for the critical assessment of techniques for protein structure prediction. Proteins 79 (S10), 6-20. doi: 10.1002/prot.23196. PMID: 22020785
Thanks,
CASP organizing committee:
John Moult, University of Maryland, USA
Krzysztof Fidelis, University of California, Davis, USA
Andriy Kryshtafovych, University of California, Davis, USA
Torsten Schwede, University of Basel, Switzerland
Get in touch: email@example.com
More information: http://www.predictioncenter.org/casp13/index.cgi
Submit a target: http://www.predictioncenter.org/casp13/targets_submission.cgi
Validation reports for all PDB structures have been updated to include new percentile statistics reflecting the state of the archive on December 31, 2017, and updated versions of third-party software and data: CCP4/Refmac (7.0 v44), Phenix (1.13), Mogul (2018), and the CSD archive (as539be). The LLDF statistic previously used to identify ligands that do not fit electron density well has been replaced by a combination of Real-space R-factor (RSR>0.4) and Real-space correlation coefficient (RSCC<0.8). The identification of standard amino acid or nucleotide residues that do not fit the electron density well has been corrected to take into account how reliably Refmac software reproduces the R-factors reported by authors. Documentation at wwpdb.org/validation has been updated to reflect these changes.
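As a rough illustration of the ligand criterion described above, the following minimal Python sketch applies the two thresholds to per-ligand values. It is not the wwPDB implementation: the `LigandFit` record, the function names, and the ligand labels are hypothetical, and because the announcement only says the two statistics are used in "combination", whether both conditions must hold is left as an explicit `require_both` parameter rather than assumed.

```python
from dataclasses import dataclass

# Thresholds quoted in the announcement.
RSR_THRESHOLD = 0.4   # real-space R-factor: values above this count against the fit
RSCC_THRESHOLD = 0.8  # real-space correlation coefficient: values below this count against the fit


@dataclass(frozen=True)
class LigandFit:
    """Hypothetical record holding the two fit statistics for one ligand."""
    name: str
    rsr: float
    rscc: float


def fit_flags(ligand: LigandFit) -> tuple[bool, bool]:
    """Return (rsr_too_high, rscc_too_low) using strict inequalities."""
    return ligand.rsr > RSR_THRESHOLD, ligand.rscc < RSCC_THRESHOLD


def is_poor_fit(ligand: LigandFit, require_both: bool = True) -> bool:
    """Combine the two flags.

    Assumption: the announcement does not say whether both conditions
    must hold, so the caller chooses the combination explicitly.
    """
    rsr_too_high, rscc_too_low = fit_flags(ligand)
    if require_both:
        return rsr_too_high and rscc_too_low
    return rsr_too_high or rscc_too_low


if __name__ == "__main__":
    # Made-up values for illustration only.
    examples = [
        LigandFit("LIG-A", rsr=0.50, rscc=0.70),
        LigandFit("LIG-B", rsr=0.50, rscc=0.90),
        LigandFit("LIG-C", rsr=0.20, rscc=0.95),
    ]
    for lig in examples:
        print(lig.name, fit_flags(lig), is_poor_fit(lig), is_poor_fit(lig, require_both=False))
```

Keeping the two flags separate makes it easy to see which statistic triggered a warning for a given ligand; the validation documentation at wwpdb.org/validation remains the authoritative description of how the reports apply these values.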
Updated reports are accessible from:
A copy of the previous version is archived at RCSB PDB and PDBj.
wwPDB validation reports provide an assessment of structure quality using widely accepted standards and criteria, recommended by community experts serving in Validation Task Forces. The wwPDB partners strongly encourage journal editors and referees to request them from authors as part of the manuscript submission and review process. The reports are date-stamped and display the wwPDB logo, and contain the same information, regardless of which wwPDB site processed the entry. Provision of wwPDB validation reports is already required by Nature, eLife, The Journal of Biological Chemistry, the International Union of Crystallography (IUCr) journals, FEBS journals, Journal of Immunology and Angew Chem Int Ed Engl as part of their manuscript-submission process.
Validation reports are also provided to depositors through the OneDep system for validation, deposition and biocuration of structure data. The wwPDB partners encourage the use of the stand-alone validation server and the web service API at any time prior to data deposition. Depositors are required to review and accept the reports as part of the data submission process. Validation reports will continue to be developed and improved as we receive recommendations from the expert Validation Task Forces for X-ray, NMR, and EM and from experts on ligand validation, and as we collect feedback from depositors and users.
An article focused on the structure validation reports produced by wwPDB is available: Validation of Structures in the Protein Data Bank (2017) Structure 25: 1916-1927 doi: 10.1016/j.str.2017.10.009.
The journal Database has published an article describing Worldwide Protein Data Bank biocuration supporting open access to high-quality 3D structural biology data.
All data deposited to the PDB undergo critical review by wwPDB biocurators. Structural data submitted are examined for self-consistency, standardized using controlled vocabularies, cross-referenced with other biological data resources, and validated for scientific/technical accuracy.
Biocuration is integral to PDB data archiving, as it facilitates accurate, consistent, and comprehensive representation of biomolecular structure data, which in turn allows efficient and effective usage by research scientists, educators, students, and the curious public worldwide.
This paper describes the importance of biocuration for structural biology data deposited to the PDB, wwPDB biocuration processes, and the role of expert biocurators in sustaining a high-quality archive.
Worldwide Protein Data Bank biocuration supporting open access to high-quality 3D structural biology data
Jasmine Y. Young, John D. Westbrook, Zukang Feng, Ezra Peisach, Irina Persikova, Raul Sala, Sanchayita Sen, John M. Berrisford, G. Jawahar Swaminathan, Thomas J. Oldfield, Aleksandras Gutmanas, Reiko Igarashi, David R. Armstrong, Kumaran Baskaran, Li Chen, Minyu Chen, Alice R. Clark, Luigi Di Costanzo, Dimitris Dimitropoulos, Guanghua Gao, Sutapa Ghosh, Swanand Gore, Vladimir Guranovic, Pieter M. S. Hendrickx, Brian P. Hudson, Yasuyo Ikegawa, Yumiko Kengaku, Catherine L. Lawson, Yuhe Liang, Lora Mak, Abhik Mukhopadhyay, Buvaneswari Narayanan, Kayoko Nishiyama, Ardan Patwardhan, Gaurav Sahni, Eduardo Sanz-García, Junko Sato, Monica R. Sekharan, Chenghua Shao, Oliver S. Smart, Lihua Tan, Glen van Ginkel, Huanwang Yang, Marina A. Zhuravleva, John L. Markley, Haruki Nakamura, Genji Kurisu, Gerard J. Kleywegt, Sameer Velankar, Helen M. Berman, Stephen K. Burley
(2018) Database 2018: bay002 doi: 10.1093/database/bay002
A snapshot of the PDB archive (ftp://ftp.wwpdb.org) as of January 1, 2018 has been added to ftp://snapshots.wwpdb.org and ftp://snapshots.pdbj.org. Snapshots have been archived annually since January 2005 to provide readily identifiable data sets for research on the PDB archive.
The directory 20180101 includes the 136,472 experimentally determined structures and related experimental data available at that time. Atomic coordinates and related metadata are available in PDBx/mmCIF, PDB, and XML file formats. The date and time stamp of each file indicates the last time the file was modified. The snapshot is 1034 GB.