RESEARCH NEWS

Protein Data Bank archives 50,000th molecule

14 April 2008



The Protein Data Bank (PDB), based at Rutgers, The State University of New Jersey, and the University of California-San Diego (UCSD), has added the 50,000th molecular structure to its archive, where it joins other structures vital to pharmacology, bioinformatics and education.

The PDB is the single worldwide repository for the three-dimensional structures of large biological molecules, including proteins and nucleic acids. This freely available online library allows biological researchers and students to study, store and share molecular information on a global scale.

Officially founded in 1971 with seven structures at Brookhaven National Laboratory, the archive is now managed by a consortium called the Worldwide Protein Data Bank (wwPDB).

Today, the PDB archive receives approximately 25 new experimentally determined structures from scientists each day – and more than 5 million files are downloaded from the PDB archive every month. Users include structural biologists, computational biologists, biochemists, and molecular biologists in academia, government and industry as well as educators and students.

‘Advances in science and technology have helped the archive grow by leaps and bounds in the last 10 years,’ said Helen M. Berman, director of the RCSB PDB and Rutgers Board of Governors professor of chemistry and chemical biology, noting that the size of the PDB has doubled in just the last three and a half years.

‘We are estimating that the PDB will not only double but triple to 150,000 structures by 2014,’ said Philip E. Bourne, associate director of the RCSB PDB and professor of pharmacology at the UCSD Skaggs School of Pharmacy and Pharmaceutical Sciences.

The RCSB PDB, based at Rutgers University and UCSD, is responsible for releasing PDB entries into the archive after they have been reviewed and annotated. At Rutgers, RCSB PDB members annotate structures and develop the sophisticated infrastructure needed to handle these complex data. The primary FTP site, which serves as the distribution point for users, is based at the San Diego Supercomputer Center (SDSC).

The RCSB PDB presents a comprehensive website and database that lets users search, analyse and visualise the structures of biological macromolecules and their relationships to sequence, function and disease. In addition, it features a ‘Molecule of the Month’ series, which recently published its 100th instalment.

Proteins, one of the main building blocks of living organisms, come in a variety of shapes, with the form of a protein corresponding to its function. The structures housed in the PDB show great diversity in size, complexity and function. They include insulin, the protein deficient in diabetic patients; p53 tumor suppressor, a protein often implicated in cancer; anthrax toxin, the disease-causing protein made by the anthrax bacterium; and amyloid peptide, a protein fragment implicated in Alzheimer’s disease.

Related internet links

RCSB Protein Data Bank
San Diego Supercomputer Center
Rutgers, The State University of New Jersey