HPEPDOCK help

Help for using HPEPDOCK server

1. How to provide input for docked molecules

The HPEPDOCK server is for predicting the complex structure between a protein and a peptide through a hierarchical flexible peptide docking approach by fast conformational modeling and orientational sampling of peptides. Users need to provide inputs for the protein and the peptide to be docked. The HPEPDOCK server can accept four types of input for proteins and peptides:

Upload your pdb file in PDB format.
Provide your structure by PDB ID:ChainID (e.g. 3BFW:A).
Copy and paste your protein/peptide sequence in FASTA format.
Upload your protein/peptide sequence file in FASTA format.

NOTE: Only ONE type of input is needed for each molecule.

If more than one type of input is provided, the first one will be used. For input as "PDB ID:ChainID", users can provide a single chain ID or multiple chain IDs. For example, "3BFW:A" stands for chain A of the pdb file 3BFW; "1AHW:AB" stands for chains A and B of the pdb file 1AHW. If only the sequence is provided, the server will automatically construct a model structure from a homologous template in the Protein Data Bank using an in-house modeling pipeline of HH-suite, ClustalW2, and MODELLER. As our pipeline is currently designed to model single-chain proteins, users are recommended to submit their own pdb file if the protein contains multiple chains.
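Users who prepare many jobs may want to check their "PDB ID:ChainID" strings before submitting them. The following is a minimal sketch, not the server's own parser: the function name parse_pdb_chain_spec is hypothetical, and it assumes a classic four-character PDB ID (a digit followed by three alphanumeric characters) and single-character chain IDs.

```python
import re

# Assumption: classic 4-character PDB ID, then ":" and one or more
# single-character chain IDs. The server's real validation may differ.
SPEC_PATTERN = re.compile(r"([0-9][A-Za-z0-9]{3}):([A-Za-z0-9]+)")


def parse_pdb_chain_spec(spec):
    """Split a 'PDB ID:ChainID' string into (pdb_id, [chain, ...])."""
    match = SPEC_PATTERN.fullmatch(spec.strip())
    if match is None:
        raise ValueError(f"not a 'PDB ID:ChainID' string: {spec!r}")
    pdb_id, chains = match.groups()
    return pdb_id.upper(), list(chains)


if __name__ == "__main__":
    print(parse_pdb_chain_spec("3BFW:A"))
    print(parse_pdb_chain_spec("1AHW:AB"))
```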

Define a disulfide bond in a cyclic peptide.

HPepDock now supports the docking of cyclic peptides formed by a disulfide bond between two cysteine (Cys) residues. For sequence input, the disulfide bond is defined after the sequence, like this:

SDCAFRGCLLD -ss 3 8

where "-ss 3 8" means that residues 3 and 8 form a disulfide bond in the peptide.

For pdb input, the disulfide bond information in the pdb file will be automatically identified by our HPepDock protocol.

Currently, only one disulfide bond is supported for a cyclic peptide.
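Before submitting a cyclic peptide, users may want to confirm that the two positions given after "-ss" really are cysteines. The sketch below is an illustration only: the function name parse_cyclic_peptide is hypothetical, and it assumes one-letter sequences, 1-based residue numbering, and at most one "-ss I J" suffix per line, as described above.

```python
def parse_cyclic_peptide(line):
    """Parse 'SEQUENCE -ss I J' into (sequence, (i, j)).

    Returns (sequence, None) when no disulfide bond is given.
    Positions are 1-based, as in the example 'SDCAFRGCLLD -ss 3 8'.
    """
    tokens = line.split()
    if not tokens:
        raise ValueError("empty input line")
    sequence = tokens[0].upper()
    if len(tokens) == 1:
        return sequence, None
    if len(tokens) != 4 or tokens[1] != "-ss":
        raise ValueError(f"expected 'SEQUENCE -ss I J', got {line!r}")
    i, j = int(tokens[2]), int(tokens[3])
    if i == j:
        raise ValueError("a disulfide bond needs two different residues")
    for position in (i, j):
        if not 1 <= position <= len(sequence):
            raise ValueError(f"residue {position} is outside the sequence")
        if sequence[position - 1] != "C":
            raise ValueError(f"residue {position} is not a cysteine")
    return sequence, (i, j)


if __name__ == "__main__":
    print(parse_cyclic_peptide("SDCAFRGCLLD -ss 3 8"))
```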

Docking mode for different peptide inputs.

For multiple-peptide input, there are two docking modes. One is the default "Pose prediction" mode, where the top 100 binding modes are output for each peptide; the other is the "Virtual screening" mode, where only the top-scored binding mode is output for each peptide and all the docked peptides are ranked according to their binding scores.

For single-peptide input, the default "Pose prediction" mode is used.

Multiple-peptide input in one job.

HPepDock now supports multiple-peptide input (up to 10 peptides) in one docking job.

For sequence input, the peptide sequences should be entered on separate lines, one sequence per line, like this:

GPTIEEVD
SLNYIIKVKE
SDVAFRGNLLD
ATVRTYSC
NFDNPVYRKT
ASVSA
GAANDENY
......
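A multiple-peptide sequence input can be checked locally before submission. The sketch below is only an illustration under stated assumptions: the function name read_peptide_lines is hypothetical, the limit of 10 peptides is taken from the text above, and each line is assumed to hold a plain one-letter sequence of the 20 standard amino acids (no "-ss" suffix and no 3-letter codes).

```python
MAX_PEPTIDES = 10  # limit stated for one docking job
STANDARD_AMINO_ACIDS = set("ACDEFGHIKLMNPQRSTVWY")


def read_peptide_lines(text):
    """Return the peptide sequences found in `text`, one per non-empty line."""
    peptides = [line.strip().upper() for line in text.splitlines() if line.strip()]
    if len(peptides) > MAX_PEPTIDES:
        raise ValueError(f"{len(peptides)} peptides given, at most {MAX_PEPTIDES} allowed")
    for peptide in peptides:
        unknown = sorted(set(peptide) - STANDARD_AMINO_ACIDS)
        if unknown:
            raise ValueError(f"{peptide}: unsupported characters {unknown}")
    return peptides


if __name__ == "__main__":
    example = "GPTIEEVD\nSLNYIIKVKE\nSDVAFRGNLLD\n"
    print(read_peptide_lines(example))
```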

For pdb input, the coordinate file containing multiple peptides should be saved in NMR style, with each peptide in its own MODEL/ENDMDL block, like this:

MODEL 1
ATOM   4172  N   LYS A   1       6.968  16.518  15.137  1.00 12.07           N
ATOM   4173  CA  LYS A   1       6.459  17.792  14.561  1.00 10.15           C
ATOM   4174  C   LYS A   1       5.233  18.258  15.317  1.00 11.76           C
ATOM   4175  O   LYS A   1       4.501  17.454  15.881  1.00 11.37           O
ATOM   4176  CB  LYS A   1       6.274  17.708  13.031  1.00 13.63           C
ATOM   4177  CG  LYS A   1       5.366  16.537  12.622  1.00 14.29           C
......
ENDMDL
MODEL 2
ATOM      1  N   ALA A   1       7.279  -1.336  34.670  1.00 29.28           N1+
ATOM      2  CA  ALA A   1       8.285  -1.033  33.603  1.00 27.42           C
ATOM      3  C   ALA A   1       8.758   0.415  33.731  1.00 26.40           C
ATOM      4  O   ALA A   1       7.937   1.328  33.832  1.00 23.28           O
ATOM      5  CB  ALA A   1       7.652  -1.283  32.201  1.00 28.79           C
......
ENDMDL
......
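Users who have each peptide in a separate pdb file may combine them into one NMR-style file with a short script. The following is a minimal sketch, not a tool provided by the server: the function name combine_as_models is hypothetical, only ATOM and HETATM records are copied, and the MODEL line is written in the column layout of the standard PDB format, which is assumed to be accepted.

```python
def combine_as_models(peptide_pdb_texts):
    """Wrap each peptide's coordinate records in a MODEL/ENDMDL block."""
    lines = []
    for index, text in enumerate(peptide_pdb_texts, start=1):
        # Standard PDB layout: "MODEL" followed by a right-justified serial number.
        lines.append(f"MODEL     {index:>4}")
        for record in text.splitlines():
            # Keep coordinate records only; drop headers, TER, END, etc.
            if record.startswith(("ATOM", "HETATM")):
                lines.append(record.rstrip())
        lines.append("ENDMDL")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    first = "ATOM   4172  N   LYS A   1       6.968  16.518  15.137  1.00 12.07           N\nEND\n"
    second = "ATOM      1  N   ALA A   1       7.279  -1.336  34.670  1.00 29.28           N\nEND\n"
    print(combine_as_models([first, second]), end="")
```

In practice the input strings would be read from the individual peptide files, and the returned text written to a single file for upload.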

NOTE: The running time of a multiple-peptide job increases with the number of peptides in the input. Therefore, users are highly recommended to specify the binding site, if it is known, for jobs with many peptides.

Non-standard amino acids in a peptide.

HPepDock now supports peptide sequences that include non-standard amino acids. The sequence input for such peptides should use 3-letter residue names separated by hyphens, like this:

ARG-CYS-ARG-GLN-ARG-LYS-GLY-ARG-ARG-ILE-CYS-ILE-ARG-ILE-DPR-PRO

where the "DPR" is a non-standard amino acid. Users may check this residue list for all supported amino acids.
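A one-letter sequence can be converted to this 3-letter form with a small helper. The sketch below is an illustration only: the function name to_three_letter and its substitutions parameter are hypothetical, and the residue names used for substitutions should be checked against the server's list of supported amino acids.

```python
ONE_TO_THREE = {
    "A": "ALA", "R": "ARG", "N": "ASN", "D": "ASP", "C": "CYS",
    "Q": "GLN", "E": "GLU", "G": "GLY", "H": "HIS", "I": "ILE",
    "L": "LEU", "K": "LYS", "M": "MET", "F": "PHE", "P": "PRO",
    "S": "SER", "T": "THR", "W": "TRP", "Y": "TYR", "V": "VAL",
}


def to_three_letter(sequence, substitutions=None):
    """Convert a one-letter sequence to hyphen-separated 3-letter codes.

    `substitutions` maps 1-based positions to non-standard residue names
    that replace the standard residue at that position.
    """
    codes = [ONE_TO_THREE[residue] for residue in sequence.upper()]
    for position, name in (substitutions or {}).items():
        codes[position - 1] = name.upper()
    return "-".join(codes)


if __name__ == "__main__":
    # Reproduces the example above: residue 15 is replaced by "DPR".
    print(to_three_letter("RCRQRKGRRICIRIPP", {15: "DPR"}))
```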

Number of peptide conformations to represent peptide flexibility.

By default, HPepDock considers the flexibility of a peptide by first generating an ensemble of 1000 peptide conformations for the peptide using our MODPEP program and then docking the generated peptide conformations against the receptor molecule. The more peptide conformations are generated for a peptide, the more extensively the flexibility of the peptide is considered, but the more time the docking job will take.

For a docking/screening job with multiple-peptide input (e.g. 10 peptides), users may choose a smaller number of peptide conformations to reduce the total docking time at the cost of some docking accuracy.

Rigid docking with user-provided peptide structures.

By default, HPEPDOCK conducts ensemble docking of multiple peptide conformations, which are generated by our MODPEP program, to consider peptide flexibility. Users may also choose to only perform rigid docking with their own peptide structure(s) by checking the "Rigid docking ..." check box.

2. How to specify the binding site [optional]

HPEPDOCK supports both global docking and local docking. By default, the server performs global flexible-peptide docking to predict the binding complexes between a protein and a peptide. In this mode, binding site information is automatically extracted from the complex structures in the Protein Data Bank if available, so users do not need to provide any information about the binding site for the docking job.

However, the server also gives users an option to specify the binding site if users provide their protein input as a pdb file or structure, such that the predicted models will have a higher accuracy. Two types of binding site information can be provided.

Binding site reference pdb file.

Binding site residues on the receptor.

	195:A, 203-206:A, 108:B

The "number(s):chain" of the provided residues should be consistent with those in the input protein structure or pdb file.
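To check a residue list against the receptor before submission, users may expand it into individual residues. The following is a minimal sketch, not the server's parser: the function name parse_binding_site is hypothetical, and it assumes comma-separated "number(s):chain" items with non-negative residue numbers and inclusive ranges, as in the example above.

```python
def parse_binding_site(spec):
    """Expand '195:A, 203-206:A, 108:B' into a list of (chain, number) pairs."""
    residues = []
    for item in spec.split(","):
        item = item.strip()
        if not item:
            continue
        numbers, separator, chain = item.partition(":")
        if not separator or not chain:
            raise ValueError(f"missing chain ID in {item!r}")
        first, dash, last = numbers.partition("-")
        start = int(first)
        end = int(last) if dash else start
        if end < start:
            raise ValueError(f"descending range in {item!r}")
        # Ranges are treated as inclusive on both ends.
        residues.extend((chain, number) for number in range(start, end + 1))
    return residues


if __name__ == "__main__":
    print(parse_binding_site("195:A, 203-206:A, 108:B"))
```

The resulting pairs can then be compared with the chain IDs and residue numbers found in the ATOM records of the receptor pdb file.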

NOTE: Users are highly recommended to specify the binding site if available.

3. How to obtain your HPEPDOCK results

Once users submit their job, they will be redirected to a status web page showing the status of the job. The status page is automatically refreshed every 10 seconds until the job is finished. Users have three ways to obtain their docking results.

Keep the status page open until it shows the docking results when the job is finished.
Bookmark the status page and come back later to check the docking results.
Wait for the email notification if users provide a valid email address when they submit their job.

After the job is done, users will be redirected to the result page, from which they can download the following files:

Receptor PDB file uploaded by users or constructed by the server from the FASTA sequence provided by users.
The individual peptide models of the top 20 predictions that were docked to the receptor.
The compressed packages for the top 10 binding models, the top 100 binding models, or all the docking results.

Since the top 10 binding models are normally considered the most important, the result page also provides an interactive view of the top 10 models using the Jmol software, where the receptor and the peptide are colored differently. Users can view any one of the top 10 models, or all of them together, in different representations and styles.

The page also gives a summary of the rankings and docking scores for the top 10 binding models. However, it should be noted that the docking scores here do not reflect the real binding affinities, but a relative ranking of the peptide binding.

NOTE: The result page only displays a limited number of top models for visualization purpose. For the complete binding models, users should click to download all the docking results in a package through the result page.

In addition, if only a sequence is provided as the input for the protein, the page will also show the information about the modeled protein structure by homology modeling, including the used template, model quality, sequence alignment and sequence identity between the template and the input sequence.

NOTE: Users are recommended to download the docking results as soon as possible after a docking job is done, as the job results will only be stored on our server for two weeks.

4. Analyze the docking results

The result web page only displays a limited number of top binding models for visualization purposes. For the complete docking results, users should download all the results in a package.

There are three important files in the package, which users can use to analyze the docking results.

rec_xxxxxx.pdb -- The user-input receptor structure
hpepdock_all.pdb -- All the docked binding models of all the peptides
hpepdock_all.out -- The binding scores and sequences for all the binding models in "hpepdock_all.pdb"

Users may open the file with a text editor, such as "vim" on Linux, to check the binding scores, like this:

    vim hpepdock_all.out

To visualize all the peptide binding models, users may use a molecular visualization program like UCSF Chimera (https://www.cgl.ucsf.edu/chimera/).

For example, users can first open the receptor pdb file "rec_xxxxxx.pdb" within UCSF Chimera, and then view the binding models in the "hpepdock_all.pdb" file one by one through "Tools->Surface/Binding Analysis->ViewDock". For more information, users may check the documentation of UCSF Chimera.