The journal Database has published an article describing Worldwide Protein Data Bank biocuration supporting open access to high-quality 3D structural biology data.
All data deposited to the PDB undergo critical review by wwPDB biocurators. Structural data submitted are examined for self-consistency, standardized using controlled vocabularies, cross-referenced with other biological data resources, and validated for scientific/technical accuracy.
Biocuration is integral to PDB data archiving, as it facilitates accurate, consistent, and comprehensive representation of biomolecular structure data, which in turn allows efficient and effective usage by research scientists, educators, students, and the curious public worldwide.
This paper describes the importance of biocuration for structural biology data deposited to the PDB, wwPDB biocuration processes, and the role of expert biocurators in sustaining a high-quality archive.
Worldwide Protein Data Bank biocuration supporting open access to high-quality 3D structural biology data
Jasmine Y. Young, John D. Westbrook, Zukang Feng, Ezra Peisach, Irina Persikova, Raul Sala, Sanchayita Sen, John M. Berrisford, G. Jawahar Swaminathan, Thomas J. Oldfield, Aleksandras Gutmanas, Reiko Igarashi, David R. Armstrong, Kumaran Baskaran, Li Chen, Minyu Chen, Alice R. Clark, Luigi Di Costanzo, Dimitris Dimitropoulos, Guanghua Gao, Sutapa Ghosh, Swanand Gore, Vladimir Guranovic, Pieter M. S. Hendrickx, Brian P. Hudson, Yasuyo Ikegawa, Yumiko Kengaku, Catherine L. Lawson, Yuhe Liang, Lora Mak, Abhik Mukhopadhyay, Buvaneswari Narayanan, Kayoko Nishiyama, Ardan Patwardhan, Gaurav Sahni, Eduardo Sanz-García, Junko Sato, Monica R. Sekharan, Chenghua Shao, Oliver S. Smart, Lihua Tan, Glen van Ginkel, Huanwang Yang, Marina A. Zhuravleva, John L. Markley, Haruki Nakamura, Genji Kurisu, Gerard J. Kleywegt, Sameer Velankar, Helen M. Berman, Stephen K. Burley
(2018) Database 2018: bay002 doi: 10.1093/database/bay002
A snapshot of the PDB archive (ftp://ftp.wwpdb.org) as of January 1, 2018 has been added to ftp://snapshots.wwpdb.org and ftp://snapshots.pdbj.org. Snapshots have been archived annually since January 2005 to provide readily identifiable data sets for research on the PDB archive.
The directory 20180101 includes the 136,472 experimentally determined structures and experimental data available at that time. Atomic coordinates and related metadata are available in PDBx/mmCIF, PDB, and XML file formats. The date and time stamp of each file indicates the last time the file was modified. The snapshot is 1034 GB.
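For readers who script against the snapshots, the directory naming can be sketched in a few lines of Python. This is a minimal illustration, not an official client. It assumes that every annual snapshot follows the same YYYYMMDD pattern as the 20180101 directory named above and that each directory sits directly under the host root. The helper names `snapshot_dir` and `snapshot_url` are hypothetical.

```python
from datetime import date

# Host named in the announcement.
SNAPSHOT_HOST = "ftp://snapshots.wwpdb.org"


def snapshot_dir(year: int) -> str:
    """Return the directory name (YYYYMMDD) for the January 1 snapshot of `year`.

    Assumption: all annual snapshots are named like the 20180101 directory.
    """
    if year < 2005:
        raise ValueError("annual snapshots start in January 2005")
    return date(year, 1, 1).strftime("%Y%m%d")


def snapshot_url(year: int) -> str:
    """Build a snapshot URL.

    Assumption: the directory sits directly under the host root.
    """
    return f"{SNAPSHOT_HOST}/{snapshot_dir(year)}"


# Directory names for the annual snapshots from 2005 through 2018.
annual = [snapshot_dir(y) for y in range(2005, 2019)]
print(snapshot_url(2018))
```

Because a complete snapshot is about 1034 GB, it is worth checking a directory listing before starting any bulk transfer.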