Program Description:

PROSESS is composed of two parts: a front-end web interface (written in Perl and HTML) and a back-end for calculation (written in Java, Python, C and Fortran). As with most servers, PROSESS has a data entry page (Home), a Help page, an Input Format page, an Output Format page and a Contact page, each of which can be accessed through a menu bar located at the top of each page. The PROSESS server requires either a PDB-formatted file (for newly determined structures) or a PDB accession number (for previously determined structures) as input. The PDB file may contain a single protein structure or chain, or an ensemble of structures (up to 100) from an NMR structure calculation. The maximum number of residues is 10,000. Users may optionally add or paste the protein’s sequence (in FASTA format), a chemical shift file (in BMRB or Shifty format), an NOE data file (in Xplor/CNS format), or any combination of the above. Detailed descriptions, along with examples of the allowable formats, are given through hyperlinks to the PROSESS Input Format page.
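As an illustration of the input limits described above, the following minimal Python sketch counts the models and residues in PDB-formatted text before submission. It is not part of PROSESS: the function name `check_pdb_limits` and the constants are hypothetical, and the sketch assumes that the 10,000-residue limit applies to each model rather than to the whole ensemble.

```python
MAX_MODELS = 100       # ensemble limit stated in the text
MAX_RESIDUES = 10_000  # residue limit stated in the text (assumed to apply per model)


def check_pdb_limits(pdb_text):
    """Return (n_models, n_residues, problems) for PDB-formatted text."""
    residues_by_model = {}
    model_index = 0
    for line in pdb_text.splitlines():
        record = line[:6].strip()
        if record == "MODEL":
            model_index += 1
        elif record == "ATOM":
            # Chain ID is column 22; residue number plus insertion code
            # occupy columns 23-27 of a PDB ATOM record.
            residue = (line[21:22], line[22:27].strip())
            # Files without MODEL records are treated as a single model.
            key = max(model_index, 1)
            residues_by_model.setdefault(key, set()).add(residue)

    n_models = len(residues_by_model)
    n_residues = max((len(r) for r in residues_by_model.values()), default=0)

    problems = []
    if n_models == 0:
        problems.append("no ATOM records found")
    if n_models > MAX_MODELS:
        problems.append("more than %d models" % MAX_MODELS)
    if n_residues > MAX_RESIDUES:
        problems.append("more than %d residues" % MAX_RESIDUES)
    return n_models, n_residues, problems
```

A call such as `check_pdb_limits(open("my_structure.pdb").read())` (with a hypothetical file name) returns the counts together with a list of any limits that were exceeded.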

The back-end for PROSESS consists of more than a dozen different programs, many of which were developed and extensively tested in our laboratory over the past 10 years. These include VADAR for coordinate, atomic packing, H-bond, secondary structure and geometric analysis, GeNMR for calculating non-covalent, threading and solvent energetics, ShiftX for calculating chemical shift correlations, RCI for correlating structure mobility to chemical shifts, PREDITOR for calculating torsion angle-chemical shift agreement and SuperPose for evaluating structure similarities to known homologues. A number of other programs for calculating and comparing bond lengths, bond angles, H-bond planarity, volume variability and B-factor quality were also developed locally and added to the PROSESS back-end. PROSESS also incorporates several externally developed programs, including MolProbity to assess atomic clashes, REDUCE to identify His/Asn/Gln flips, Xplor-NIH to identify and quantify NOE restraint violations and NAMD to assess various energetic parameters.


Once the appropriate data files have been submitted, PROSESS returns an access hyperlink so that users may retrieve their output at a later time (data is securely stored on the site for up to 2 weeks). Alternatively, users can wait for the results to be presented on their computer screen. A typical PROSESS run takes 3-5 minutes. A screen-shot montage illustrating the typical output from a PROSESS run is shown in Figure 1. Every PROSESS output is divided into four “clickable” pages: 1) Global Structure Assessment; 2) Local (Per-residue) Structure Assessment; 3) Graphs and Figures; and 4) Similarity Assessment. At the top of each output page is a summary of the protein structure providing the date of submission, name of the protein, number of residues, secondary structure content and other data. Below this summary is a set (4-6, depending on the input) of colored bars with numerical (0-10) and color-coded assessments of the structure quality. These red-amber-green (RAG) color bars are intended to provide users with a quick overview of the protein’s structure from the perspective of its 1) overall quality; 2) covalent and geometric quality; 3) non-covalent/packing quality; 4) torsion angle quality; 5) chemical shift quality; and 6) NOE quality. Below the colored bars is an extensive set of tables.

Global Structure Assessment Page:

The Global Structure Assessment page lists more than 90 calculated parameters that are broadly grouped into five general categories (covalent, non-covalent, torsion, chemical shift and NOE). Each parameter is hyperlinked to a brief explanation of that parameter. The name of the program used to calculate that parameter is also provided. The value for the protein of interest is provided along with an expected value and a standard deviation determined from a set of 1000 high-resolution (<2.0 Å) X-ray structures. If the calculated value is more than 2 standard deviations above the expected value, it is flagged with a red comment. If the calculated value is more than 2 standard deviations below the expected value, it is flagged with a green comment. Values that are within acceptable limits (within 2 SD of the expected value) are colored black. If an ensemble of NMR structures is provided, the Global Structure Assessment page provides averages and standard deviations calculated over the full set of structures.
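The flagging rule described above can be sketched in a few lines of Python. This is only an illustration of the 2-standard-deviation rule as stated in the text, not the PROSESS implementation; the function name `flag_parameter` and its arguments are hypothetical.

```python
def flag_parameter(value, expected, sd, n_sd=2.0):
    """Return the flag color for one global parameter.

    value    -- value calculated for the protein of interest
    expected -- expected value from the reference set of X-ray structures
    sd       -- standard deviation from the same reference set
    n_sd     -- number of standard deviations tolerated (2 in the text)
    """
    if sd <= 0:
        raise ValueError("standard deviation must be positive")
    z = (value - expected) / sd
    if z > n_sd:
        return "red"    # more than n_sd SD above the expected value
    if z < -n_sd:
        return "green"  # more than n_sd SD below the expected value
    return "black"      # within acceptable limits
```

For example, with an expected value of 1.0 and a standard deviation of 0.2, a calculated value of 1.5 lies 2.5 SD above the expected value and would be flagged red.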

Local (Per-residue) Structure Assessment Page:

The Local Structure Assessment page provides tables that assess the residue-specific properties of the protein. Each residue is listed in a row and each property assessment is listed in a column. As with the Global Structure Assessment tables, short descriptions for each property or parameter are hyperlinked to the name of that parameter. Each column is also hyperlinked to a corresponding graph. Several sets of local structure assessments are provided, including: 1) a residue-specific structure description (secondary structure, turns, H-bonds, etc.); 2) an evaluation of residue-specific accessible surface area and volumes; 3) an evaluation of residue-specific torsion angles (backbone and side chain); 4) an evaluation of residue-specific bond lengths and angles; 5) an evaluation of residue-specific energies (H-bond, threading, covalent and non-covalent clashes); 6) an evaluation of residue-specific chemical shift agreement(s); 7) a residue-specific evaluation of flexibility/mobility; and 8) an evaluation of residue-specific NOE violations. Values that exceed normally allowable limits (as previously described by the programs that calculate these values) are colored red. If an ensemble of NMR structures is provided, the Local Structure Assessment page reports averages and standard deviations calculated over the full set of structures.
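For an NMR ensemble, the per-residue averaging described above might look like the following Python sketch. The data layout (one dictionary per model, mapping a residue identifier to a property value) and the name `per_residue_stats` are assumptions made for illustration. The text does not say whether PROSESS uses the sample or the population standard deviation; the sample form is assumed here.

```python
from statistics import mean, stdev


def per_residue_stats(values_by_model):
    """Average one per-residue property over all models of an ensemble.

    values_by_model -- list with one dict per model, each mapping a
                       residue identifier to the property value
    Returns a dict mapping residue identifier -> (mean, standard deviation).
    """
    residues = sorted(set().union(*values_by_model))
    stats = {}
    for res in residues:
        # Use only the models in which this residue has a value.
        vals = [model[res] for model in values_by_model if res in model]
        # Sample standard deviation (an assumption); undefined for one value.
        sd = stdev(vals) if len(vals) > 1 else 0.0
        stats[res] = (mean(vals), sd)
    return stats
```

The same routine applies to any of the per-residue properties listed above, for example accessible surface area or a backbone torsion angle, provided that angular properties are handled with care near the ±180° boundary.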

Similarity Assessment Page:

The Similarity Assessment page summarizes the results of BLAST searches of the protein sequence against the PDB. Those structures with Expect values <10⁻⁷ are listed, along with their resolution/Rfree values (if available). The RMSD between the input structure and each related structure is calculated and displayed. Structures that differ significantly from related structures (according to their RMSD and sequence identity) are flagged. The purpose of the Similarity Assessment page is to help users determine whether their structure is similar to one that has already been solved and, if so, whether there are structural differences that may be cause for concern.
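The Expect-value filter described above can be illustrated with a short Python sketch. It assumes a threshold of 10⁻⁷ (1e-7) and that BLAST hits have already been parsed into simple records. The `Hit` type, the function `significant_hits` and the PDB identifiers in the usage note are hypothetical placeholders, and sorting the hits by Expect value is a choice made here rather than something stated in the text.

```python
from typing import List, NamedTuple, Optional


class Hit(NamedTuple):
    pdb_id: str                  # identifier of the matched PDB entry
    evalue: float                # BLAST Expect value
    resolution: Optional[float]  # resolution in Angstroms, if available


def significant_hits(hits: List[Hit], max_evalue: float = 1e-7) -> List[Hit]:
    """Keep hits with an Expect value below max_evalue, best hit first."""
    kept = [hit for hit in hits if hit.evalue < max_evalue]
    return sorted(kept, key=lambda hit: hit.evalue)
```

Given the placeholder hits `Hit("1ABC", 1e-30, 1.8)`, `Hit("2XYZ", 1e-3, None)` and `Hit("3DEF", 1e-12, 2.1)`, only the first and third would be retained, in that order.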

Graphs and Figures Page:

The Graphs and Figures (G & F) page provides a variety of visual output that summarizes the results from both the Global and Local Structure Assessment pages. At the top of each G & F page is the standard set of red-amber-green bars. Below these synoptic bars is a set of thumbnail images and short titles so that users can navigate to different images, graphs or plots. Once clicked, the images expand to colorful, full-screen, high-resolution PNG images. Publication-quality PNG graphics are used so that users can paste these results directly into papers or reports. The first set of thumbnails is a collection of Ramachandran plots that map the location of backbone torsion angles for all residues, for glycine-only residues, for proline-only residues and for pre-proline residues. These plots highlight the residues in the core, allowed and disallowed regions of Ramachandran space. In addition to these Ramachandran plots, there are numerous local structure assessment plots, typically shown as color-coded bar graphs, with the parameter value on the Y-axis and the sequence displayed on the X-axis. Each graph and axis is titled to allow for easy identification. In addition to these graphs and charts, there is a series of static ribbon diagrams generated via MOLMOL that highlight the structural location of any local torsion, bond, packing, shift or NOE violations. Interactive images of the color-coded structures are also available using the Jmol applet.

References to Programs, Data Sources and Servers Used by PROSESS:

Willard, L., Ranjan, A., Zhang, H., Monzavi, H., Boyko, R.F., Sykes, B.D. and Wishart, D.S. (2003) VADAR: a web server for quantitative evaluation of protein structure quality. Nucleic Acids Res. 31, 3316-3319.

Davis, I.W., Leaver-Fay, A., Chen, V.B., Block, J.N., Kapral, G.J., Wang, X., Murray, L.W., Arendall, W.B. 3rd, Snoeyink, J., Richardson, J.S. and Richardson, D.C. (2007) MolProbity: all-atom contacts and structure validation for proteins and nucleic acids. Nucleic Acids Res. 35(Web Server issue), W375-383.

Berjanskii, M., Tang, P., Liang, J., Cruz, J.A., Zhou, J., Zhou, Y., Bassett, E., MacDonell, C., Lu, P., Lin, G. et al. (2009) GeNMR: a web server for rapid NMR-based protein structure determination. Nucleic Acids Res. 37(Web Server issue), W670-677.

Berjanskii, M.V. and Wishart, D.S. (2008) Application of the random coil index to studying protein flexibility. J. Biomol. NMR 40, 31-48.

Neal, S., Nip, A.M., Zhang, H. and Wishart, D.S. (2003) Rapid and accurate calculation of protein 1H, 13C and 15N chemical shifts. J. Biomol. NMR 26, 215-240.

Maiti, R., Van Domselaar, G.H., Zhang, H. and Wishart, D.S. (2004) SuperPose: a simple server for sophisticated structural superposition. Nucleic Acids Res. 32(Web Server issue), W590-594.

Berjanskii, M.V., Neal, S. and Wishart, D.S. (2006) PREDITOR: a web server for predicting protein torsion angle restraints. Nucleic Acids Res. 34, W63-69.

Schwieters, C.D., Kuszewski, J.J., Tjandra, N. and Clore, G.M. (2003) The Xplor-NIH NMR molecular structure determination package. J. Magn. Reson. 160, 65-73.

Phillips, J.C., Braun, R., Wang, W., Gumbart, J., Tajkhorshid, E., Villa, E., Chipot, C., Skeel, R.D., Kale, L. and Schulten, K. (2005) Scalable molecular dynamics with NAMD. J. Comput. Chem. 26, 1781-1802.

Koradi, R., Billeter, M. and Wüthrich, K. (1996) MOLMOL: a program for display and analysis of macromolecular structures. J. Mol. Graph. 14, 51-55.

Jmol: an open-source Java viewer for chemical structures in 3D.

Laskowski, R.A., MacArthur, M.W., Moss, D.S. and Thornton, J.M. (1993) PROCHECK: a program to check the stereochemical quality of protein structures. J. Appl. Crystallogr. 26, 283-291.