School of Physics
Huazhong University of Science and Technology

Huang Laboratory

About DeepHomo2

DeepHomo2 is a deep learning framework for predicting inter-protein residue-residue contacts of homo-oligomeric complexes.

Download DeepHomo2.0

DeepHomo2 is freely available for academic and non-commercial users. [Download DeepHomo2]
The downloaded DeepHomo2 package includes the deep-learning models for DeepHomo2.0 and DeepHomoSeq.

Required Environment

Python 3.7 or later (https://www.python.org).

How to Install DeepHomo2.0

Installing the DeepHomo2.0 package is straightforward. Download the DeepHomo2 package and unpack it as follows:

        tar -xzf DeepHomo2.tar.gz
        cd DeepHomo2

You will then see a shell script named "deephomo2.sh", which predicts the residue-residue contacts with a single-line command.

However, running "deephomo2.sh" requires several third-party packages and programs, listed below.

Software requirements

1. Python package requirements:

2. HH-suite3 for producing the MSA.

You should install your own HH-suite3 (https://github.com/soedinglab/hh-suite) and set the "HHsuite_root" parameter in the "deephomo2.sh" file.

3. Uniclust database for searching.

You should download the Uniclust database (http://wwwuser.gwdg.de/~compbiol/uniclust/2020_03/) and set the "UniRef_database" parameter in the "deephomo2.sh" file.

4. CCMpred for DCA calculation.

You should install your own CCMpred (https://github.com/soedinglab/CCMpred) and set the "ccmpred" parameter in the "deephomo2.sh" file.

5. DSSP for calculating secondary structure and solvent accessibility.

You should install your own DSSP (https://swift.cmbi.umcn.nl/gv/dssp) and set the "dssp" parameter in the "deephomo2.sh" file.

6. ESM package and ESM-MSA pre-trained model for producing ESM-MSA features.

7. LoadHHM.py

You should download "LoadHHM.py" from RaptorX-Contact (https://github.com/j3xugit/RaptorX-Contact) and put the file in the "bin/" directory of the DeepHomo2 package.

8. FFTW3

Make sure that the FFTW3 library is available on your Linux system. If it is not, you may install it by typing "yum install fftw3" as root, or download and install it manually (https://www.fftw.org/).
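
Before the first run, it can help to confirm that every path configured in "deephomo2.sh" actually exists. The following is a minimal Python sketch and is not part of the DeepHomo2 package. The function name "find_missing" and all example paths are hypothetical; only the parameter names come from the requirements listed above. Because an hhblits database is usually given as a file prefix rather than a single file, the check accepts any path that exists or that is a prefix of existing files.

```python
import glob
import os


def find_missing(configured):
    """Return the parameter names whose configured path cannot be found.

    `configured` maps a parameter name from "deephomo2.sh" to the path set
    for it. A path counts as present if it exists, or if it is a prefix of
    at least one existing file (as with hhblits database prefixes).
    """
    missing = []
    for name, path in configured.items():
        expanded = os.path.expanduser(path)
        # glob.escape protects any special characters in the path itself.
        if not glob.glob(glob.escape(expanded) + "*"):
            missing.append(name)
    return missing


if __name__ == "__main__":
    # Hypothetical example values; replace with the paths in your deephomo2.sh.
    configured = {
        "HHsuite_root": "~/software/hh-suite",
        "UniRef_database": "~/databases/UniRef30_2020_03",
        "ccmpred": "~/software/CCMpred/bin/ccmpred",
        "dssp": "~/software/dssp/mkdssp",
        "LoadHHM.py": "bin/LoadHHM.py",
    }
    for name in find_missing(configured):
        print("Missing or misconfigured:", name)
```

This check only looks at the file system. It does not verify that the installed programs run correctly or that the Python packages can be imported.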

How to Use DeepHomo2

Running DeepHomo2 is straightforward and can be as simple as this:

        deephomo2.sh monomer.pdb -out contacts.out

where "monomer.pdb" is the input monomer PDB file, including hydrogens, and "contacts.out" is the output file containing the predicted inter-protein residue-residue contacts of its homo-dimer. For detailed usage information, just type "deephomo2.sh", which prints the following:

USAGE: deephomo2.sh monomer.pdb [options]

Descriptions:
    monomer.pdb : input, the file of the target structure (*.pdb)
    -cov        : the coverage of hhblits, default --> 0.4
    -ecut       : the e-value cutoff of hhblits, default --> 0.001
    -ncpu       : the number of cpu for hhblits, default --> 3
    -db         : the database for hhblits, default --> UniRef30_2020_03
    -lencut     : the cutoff of sequence length, default --> 500
    -model      : the pretrained model of DeepHomo2.0, default --> models/DeepHomo2.pkl
    -ntop       : output the top n predicted contacts, default --> all
    -out        : the output filename for predicted contacts, default --> contacts.out
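
When DeepHomo2 is driven from a Python script, the command line can be assembled from the options above. The sketch below is illustrative only: the helper name "build_command" is hypothetical, and it assumes that "-ncpu" and "-ntop" take a space-separated value in the same way as "-out" does in the example command. The actual run is left commented out because it requires a fully configured installation.

```python
import subprocess


def build_command(pdb_file, out_file="contacts.out", ncpu=None, ntop=None,
                  script="./deephomo2.sh"):
    """Assemble the argument list for one deephomo2.sh run.

    Options left as None are omitted, so the script's own defaults apply.
    """
    cmd = [script, pdb_file, "-out", out_file]
    if ncpu is not None:
        cmd += ["-ncpu", str(ncpu)]
    if ntop is not None:
        cmd += ["-ntop", str(ntop)]
    return cmd


if __name__ == "__main__":
    cmd = build_command("monomer.pdb", out_file="contacts.out", ncpu=8, ntop=50)
    print(" ".join(cmd))
    # Uncomment to run the prediction (needs a configured DeepHomo2 install):
    # subprocess.run(cmd, check=True)
```

Passing the arguments as a list avoids shell quoting problems with file names that contain spaces.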

Examples

Here is a demo to run DeepHomo2:

    cd example
    ../deephomo2.sh T0792.pdb -out T0792_contacts.out

where "T0792.pdb" is the AlphaFold2-predicted structure of the CASP target T0792. The predicted residue-residue contacts are saved in "T0792_contacts.out", which looks like this:

Number   ResNum1  ResName1 ResNum2  ResName2 Predicted_Score
1        54:A     THR      73:A     ARG      0.5018
2        55:A     ASP      73:A     ARG      0.5010
3        58:A     LEU      73:A     ARG      0.4979
4        66:A     GLU      76:A     ASN      0.4542
5        54:A     THR      58:A     LEU      0.4253
6        64:A     THR      76:A     ASN      0.4203
7        58:A     LEU      65:A     ALA      0.4146
8        58:A     LEU      74:A     ILE      0.4124
9        58:A     LEU      71:A     GLY      0.4088
10       58:A     LEU      63:A     VAL      0.4038
...

where the residue numbers and names are based on the user-input structure.
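
For downstream analysis, the output file can be read into simple records. The following Python sketch assumes the whitespace-separated six-column layout shown above. The names "Contact" and "parse_contacts" and the 0.5 score threshold are illustrative choices, not part of DeepHomo2.

```python
from typing import List, NamedTuple


class Contact(NamedTuple):
    rank: int
    res1: str     # residue number and chain, e.g. "54:A"
    name1: str    # three-letter residue name, e.g. "THR"
    res2: str
    name2: str
    score: float  # predicted contact score


def parse_contacts(text: str) -> List[Contact]:
    """Parse the text of a contacts file into a list of Contact records."""
    contacts = []
    for line in text.splitlines():
        fields = line.split()
        # Skip the header, blank lines, and anything not shaped like a record.
        if len(fields) != 6 or not fields[0].isdigit():
            continue
        contacts.append(Contact(int(fields[0]), fields[1], fields[2],
                                fields[3], fields[4], float(fields[5])))
    return contacts


# The first three records of the example output shown above.
sample = """\
Number   ResNum1  ResName1 ResNum2  ResName2 Predicted_Score
1        54:A     THR      73:A     ARG      0.5018
2        55:A     ASP      73:A     ARG      0.5010
3        58:A     LEU      73:A     ARG      0.4979
"""

# Keep only contacts at or above an arbitrary, illustrative score threshold.
top = [c for c in parse_contacts(sample) if c.score >= 0.5]
for c in top:
    print(c.res1, c.name1, c.res2, c.name2, c.score)
```

To parse a real result, read the file first, for example with open("T0792_contacts.out").read(), and pass the text to "parse_contacts".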

Citation:

Lin, P., Yan, Y., Huang, S.-Y. Improved protein-protein interaction prediction of homo-oligomeric complexes by Transformer-enhanced deep learning. (Submitted)

Other references

Ekeberg, M., Lövkvist, C., Lan, Y., Weigt, M., & Aurell, E. (2013). Improved contact prediction in proteins: Using pseudolikelihoods to infer Potts models. Physical Review E, 87(1), 012707. doi:10.1103/PhysRevE.87.012707
Balakrishnan, S., Kamisetty, H., Carbonell, J. G., Lee, S.-I., & Langmead, C. J. (2011). Learning generative models for protein fold families. Proteins, 79(4), 1061-1078. doi:10.1002/prot.22934
Steinegger, M., Meier, M., Mirdita, M., Vöhringer, H., Haunsberger, S. J., & Söding, J. (2019). HH-suite3 for fast remote homology detection and deep protein annotation. BMC Bioinformatics, 20, 473. doi:10.1186/s12859-019-3019-7
Mirdita, M., von den Driesch, L., Galiez, C., Martin, M. J., Söding, J., & Steinegger, M. (2017). Uniclust databases of clustered and deeply annotated protein sequences and alignments. Nucleic Acids Research, 45(D1), D170-D176. doi:10.1093/nar/gkw1081
Rao, R. M., Liu, J., Verkuil, R., et al. (2021). MSA Transformer. International Conference on Machine Learning, PMLR, 8844-8856.
Wang, S., Sun, S., Li, Z., et al. (2017). Accurate de novo prediction of protein contact map by ultra-deep learning model. PLoS Computational Biology, 13(1), e1005324.
Jumper, J., Evans, R., Pritzel, A., et al. (2021). Highly accurate protein structure prediction with AlphaFold. Nature, 596(7873), 583-589.