School of Physics, Huazhong University of Science and Technology
Welcome to Huang Lab
EM2NA
Automated RNA/DNA model building from cryo-EM maps using deep learning

Copyright © 2024 Tao Li, Jiahua He, Sheng-You Huang and Huazhong University of Science and Technology
Released under GNU General Public License Version 3

EM2NA is freely available for academic or non-commercial users. If you have any questions regarding EM2NA, please don't hesitate to contact us at huangsy@hust.edu.cn

Reference:
Li T, Cao H, He J, Huang SY. Automated detection and de novo structure modeling of nucleic acids from cryo-EM maps. Nat Commun. 2024 Oct 30;15(1):9367. doi: 10.1038/s41467-024-53721-4.

Release Notes:
2024-10-01: Updated to v1.3. Source code is now included for some binaries.
2024-08-09: Updated to v1.2. Fixed incorrect N1/N9 positions. Output is now .cif instead of .pdb to support large ribosomes. Added time limit controls. Added map grid trunking stride controls. Precompiled interp3d builds are provided for libgfortran.so.3/4/5. Added pillow==10.2.0 to environment.yml (pillow==10.4.0 conflicts with numpy==1.20.x).
2024-06-03: EM2NA could fail on Ubuntu systems because it used non-POSIX shell syntax. v1.1 fixes this and should now work on Ubuntu (and other Linux systems).
Download EM2NA
EM2NA is developed for building DNA/RNA structures from raw cryo-EM maps of protein-DNA/RNA complexes or DNA/RNA systems.

Partial list of files
     EM2NA.sh: EM2NA main script
     preds.py: Python prediction program of EM2NA
     frn.py: PyTorch implementation of the Filter Response Normalization layer used in EM2NA
     interp3d.f90: Fortran source code for the interpolation of the EM grid
     scunet.py: PyTorch implementation of 3D Swin-Conv-UNet
     utils.py: Python utilities
     environment.yml: Required packages for the Python virtual environment
     lib/: Library of ideal DNA/RNA helices
     bin/: Binary files
Install EM2NA

Requirements
     CentOS Linux 7.x (or other Unix-based systems)
     LKH-3 (3.0.6) (http://webhotel4.ruc.dk/~keld/research/LKH-3)

LKH-3 can be installed by following the documentation on its website. The CSSR program is already included under GPL v3.

Quick installation of Python and the required online packages:
$ conda env create -f environment.yml
If any packages are still missing, install them in your EM2NA environment with
$ pip install missing-packages
or
$ conda install missing-packages
NOTE: To run the Python scripts and EM2NA properly, users should set the following variables in EM2NA.sh:
  1. Set "activate" to the path of the conda activator, for example
activate="/home/taoli/anaconda3/bin/activate"
  2. Set "EM2NA_env" to the name of the conda environment, for example
EM2NA_env="em2na"
If the environment is built with a different name, users should modify "EM2NA_env" accordingly.
  3. Set "LKH_dir" to the path of LKH-3, for example
LKH_dir="/home/taoli/LKH-3.0.6"
  4. Set "EM2NA_home" to the path of EM2NA, for example
EM2NA_home="/home/taoli/EM2NA_v1.3"

In addition to the online packages, the interpolation program "interp3d.f90" should be built as a Python package "interp3d" using f2py in the conda virtual environment of EM2NA. Users can either 1. build interp3d from source or 2. use our precompiled interp3d.

1. Build interp3d from source (recommended; requires gfortran and libgfortran to be installed)
1.1 Check where your gfortran is:
$ which gfortran
1.2 Activate the EM2NA environment and build interp3d with f2py:
$ conda activate em2na
$ f2py -c ./interp3d.f90 -m interp3d \
--f90exec=/path/to/your/gfortran --f77exec=/path/to/your/gfortran
2. Use our precompiled interp3d
2.1 By default, a compiled interp3d package that requires libgfortran.so.3 is already provided in the EM2NA home directory. Two other versions of interp3d, which require libgfortran.so.4 or libgfortran.so.5, are provided in the directory lib_interp3d/. Check the shared-library (.so) dependencies of all 3 versions:
$ ldd lib_interp3d/libgfortran*/*
2.2 Copy the version that matches your system into the current directory (the EM2NA home directory), for example the libgfortran.so.4 version:
$ cp lib_interp3d/libgfortran4/interp3d.cpython-38-x86_64-linux-gnu.so .
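Whichever route is used, it can help to confirm that the chosen interp3d build actually loads. The following is a minimal sketch, not part of EM2NA: the helper names missing_libraries and can_import are hypothetical, and the sample ldd text is made up for illustration. It assumes ldd marks unresolved dependencies with "=> not found", and that Python is started in the EM2NA conda environment with the compiled module on the import path (for example, from the EM2NA home directory).

```python
import importlib
from typing import List, Tuple


def missing_libraries(ldd_output: str) -> List[str]:
    """Return the names of shared libraries that ldd reports as 'not found'."""
    missing = []
    for line in ldd_output.splitlines():
        if "=> not found" in line:
            # The library name is the text before the '=>' arrow.
            missing.append(line.split("=>")[0].strip())
    return missing


def can_import(module_name: str) -> Tuple[bool, str]:
    """Try to import a module; return (success, error message)."""
    try:
        importlib.import_module(module_name)
    except ImportError as exc:
        # Covers both a missing module and a missing shared library.
        return False, str(exc)
    return True, ""


# Illustrative ldd output (made up for this example, not from a real system).
sample = (
    "\tlinux-vdso.so.1 (0x00007ffd3a5f2000)\n"
    "\tlibgfortran.so.4 => not found\n"
    "\tlibc.so.6 => /lib64/libc.so.6 (0x00007f2a1c000000)\n"
)
print(missing_libraries(sample))  # ['libgfortran.so.4']

# Reports whether interp3d can be loaded in the current environment.
print(can_import("interp3d"))
```

If missing_libraries returns an empty list for one of the precompiled versions, that version is a reasonable candidate to copy; a failed import afterwards usually points to a Python version or environment mismatch rather than libgfortran.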
How to Run EM2NA

Running EM2NA is straightforward and takes a single command:
$ /path/to/EM2NA/EM2NA.sh input_map.mrc output_dir [Options]

        Required arguments:
                 input_map.mrc: File name of the input EM density map in MRC2014 format.
                 output_dir: Directory for the outputs (all intermediate and output files are written to this directory). The built RNA/DNA structures are saved in the "output.cif" file.
        Options:
                --seq  input_seq.fasta:    Input sequence(s) in .fasta format.
                --contour  contour:    Contour level of the input map. (default: '1e-6')
                -g  GPU_ID:    ID(s) of GPU devices to use, e.g. '0' for GPU #0, and '2,3,6' for GPUs #2, #3, and #6. (default: '0')
                -b  BATCH_SIZE:    Number of boxes input in one batch. (default: 40)
                --natype  NA_TYPE:    Nucleic-acid type ['DNA', 'RNA', or 'AUTO']. If 'AUTO', the type is detected automatically by the program. (default: 'AUTO')
                --usecpu:    Run deep learning prediction on CPU instead of GPU.
                --ncpu  NCPU:    Number of CPUs used to accelerate local maxima detection. (default: '4')
                --keep_temp_files:    Keep the temp files, including predicted atom probability maps. By default, for memory-friendly usage, EM2NA cleans up all temp files.

Please reduce BATCH_SIZE if CUDA runs out of memory.

Examples
$ /path/to/EM2NA/EM2NA.sh emd_0586.map output --seq 6O1D.fasta -g "0"
Try the following command to build a model for EMD-26856 (PDB 7UXA):
$ /path/to/EM2NA/EM2NA.sh emd_26856.map output --seq 7UXA.fasta -g "0"
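When several maps need to be processed, the command line can also be assembled from a script. The sketch below is an illustration only: build_em2na_command is a hypothetical helper that is not shipped with EM2NA, and it simply maps keyword arguments onto the options documented above. It builds the argument list without running anything; the result could then be passed to subprocess.run(cmd, check=True).

```python
import shlex
from typing import List, Optional


def build_em2na_command(
    script: str,
    input_map: str,
    output_dir: str,
    seq: Optional[str] = None,
    contour: Optional[str] = None,
    gpu: Optional[str] = None,
    batch_size: Optional[int] = None,
    natype: Optional[str] = None,
    use_cpu: bool = False,
    ncpu: Optional[int] = None,
    keep_temp_files: bool = False,
) -> List[str]:
    """Assemble an EM2NA.sh argument list from the documented options."""
    if natype is not None and natype not in ("DNA", "RNA", "AUTO"):
        raise ValueError("natype must be 'DNA', 'RNA', or 'AUTO'")
    # The two required positional arguments come first.
    cmd = [script, input_map, output_dir]
    if seq is not None:
        cmd += ["--seq", seq]
    if contour is not None:
        cmd += ["--contour", contour]
    if gpu is not None:
        cmd += ["-g", gpu]
    if batch_size is not None:
        cmd += ["-b", str(batch_size)]
    if natype is not None:
        cmd += ["--natype", natype]
    if use_cpu:
        cmd.append("--usecpu")
    if ncpu is not None:
        cmd += ["--ncpu", str(ncpu)]
    if keep_temp_files:
        cmd.append("--keep_temp_files")
    return cmd


# Rebuild the EMD-26856 example from above.
cmd = build_em2na_command(
    "/path/to/EM2NA/EM2NA.sh", "emd_26856.map", "output",
    seq="7UXA.fasta", gpu="0",
)
print(" ".join(shlex.quote(part) for part in cmd))
# /path/to/EM2NA/EM2NA.sh emd_26856.map output --seq 7UXA.fasta -g 0
```

Options left as None are omitted, so EM2NA falls back to its own defaults; passing a smaller batch_size is the documented remedy when CUDA runs out of memory.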