School of Physics, Huazhong University of Science and Technology
Welcome to Huang Lab
About EMReady2

Improvement of cryo-EM maps for protein and nucleic acid using heterogeneity-aware deep learning

Copyright © 2024 Hong Cao, Jiahua He, Tao Li, Sheng-You Huang and Huazhong University of Science and Technology

EMReady2 is freely available for academic or non-commercial users. For a commercial license, please contact us at huangsy@hust.edu.cn.

Citation of the following references should be included in any publication that uses the data or results generated by EMReady2:
1. Hong Cao*, Jiahua He*, Tao Li, and Sheng-You Huang. Improvement of cryo-EM maps for protein and nucleic acid using heterogeneity-aware deep learning. In submission (2024).
2. Jiahua He, Tao Li, and Sheng-You Huang*. Improvement of cryo-EM maps by simultaneous local and non-local deep learning. Nature Communications, 2023;14:3217.
Download EMReady2

Click here to download EMReady v2.2.2 (new). Please send your feedback to huangsy@hust.edu.cn.
Release Notes:

2025-03-07: We have improved the compatibility of some packages in environment.yml and replaced the original interp3d.f90 with interp3d.py, so that configuration issues no longer prevent users from trying EMReady2.

2024-09-23: When installing the emready2_env environment, the default version of setuptools was too new, which caused errors on some systems when using f2py to compile interp3d. The version of setuptools is now specified in environment.yml. Click here to download EMReady v2.2.1.

2024-09-02: If the --interp_back option was not included when running EMReady2, an additional run of EMReady2 was needed to obtain the EMReady2-processed map interpolated back to the original grid. To address this, we provide a separate script, interp_back.py, which performs the inverse interpolation. Click here to download EMReady v2.2.

2024-07-11: EMReady2 could fail on Ubuntu because of non-POSIX shell syntax. We have updated EMReady2 to v2.1, which should now work on Ubuntu and other Linux systems. Click here to download EMReady v2.1.

2024-07-06:
1. EMReady v2 now supports map improvement for both proteins and nucleic acids. Additionally, EMReady v2 now accepts 2-10 Angstrom cryo-EM and STA density maps as input.
   a. The pixel size of the output map is set to be the same as that of the input map.
   b. A mask option has been added to allow users to select or exclude some map regions.
   c. The algorithm is more robust to weak density signals.
List of files
     EMReady2.sh: The main program wrapped in a shell script to run EMReady2
     pred.py: Python script of EMReady2
     frn.py: PyTorch implementation of the Filter Response Normalization layer used in EMReady2
     interp3d.f90: Fortran source code for the interpolation of the EM grid
     scunet.py: PyTorch implementation of the 3D Swin-Conv-UNet used in EMReady2
     utils.py: Python utilities used in EMReady2
     interp_back.py: Python script for inverse interpolation
     model_state_dicts/:
          model_state_dicts/model_grid_size_1.0.pth: the trained model with 1.0 Angstrom grid size
          model_state_dicts/model_grid_size_0.5.pth: the trained model with 0.5 Angstrom grid size
     environment.yml: Required packages for the Python virtual environment of EMReady2
Notes: Compared to EMReady, EMReady v2 adds many new functions that require additional Python packages. Running it in the EMReady environment will cause errors; please set up a new environment.
Install EMReady2

Quick installation of required online packages
$ conda env create -f environment.yml

Details of required online packages
Python (3.9.12) (https://www.python.org)
     Python package requirements:
          pytorch (2.0.0) (https://pytorch.org)
          pytorch-cuda (11.7) (https://pytorch.org)
          biopython (1.81) (https://biopython.org/)
          numpy (1.24.2) (https://www.numpy.org)
          einops (0.6.1) (https://einops.rocks/)
          mrcfile (1.3.0) (https://github.com/ccpem/mrcfile)
          timm (0.9.2) (https://github.com/rwightman/pytorch-image-models)
          tqdm (4.60.0) (https://github.com/tqdm/tqdm)

NOTE: To run the Python scripts properly, users should set the following variables in EMReady2.sh:
     1. Set "EMReady_home" to the root directory of EMReady2. For example, if EMReady2 is unzipped to "/home/hcao/data/EMReady2", set
EMReady_home="/home/hcao/data/EMReady2"
     2. Set "activate" to the path of the conda activator and "EMReady_env" to the name of the conda environment, for example
activate="/home/hcao/data/anaconda3/bin/activate"
EMReady_env="emready2_env"
     If the environment is built with a different name, users should modify "EMReady_env" accordingly.
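As a quick sanity check after creating the environment, the installed package versions can be compared with the list above. The following is only a minimal sketch and not part of the EMReady2 distribution: the function name `find_mismatches` is hypothetical, and the distribution names (for example "torch" for the conda package "pytorch") are assumptions about how the packages register themselves with Python.

```python
from importlib import metadata

# Versions copied from the requirement list above. The keys are the names
# looked up through importlib.metadata, which is an assumption: the conda
# package "pytorch" is usually registered as "torch".
REQUIRED = {
    "torch": "2.0.0",
    "biopython": "1.81",
    "numpy": "1.24.2",
    "einops": "0.6.1",
    "mrcfile": "1.3.0",
    "timm": "0.9.2",
    "tqdm": "4.60.0",
}


def find_mismatches(required, get_version=metadata.version):
    """Return {package: (wanted, found)} for packages that differ or are missing."""
    problems = {}
    for name, wanted in required.items():
        try:
            found = get_version(name)
        except metadata.PackageNotFoundError:
            found = None
        # Compare only the leading part so that local build tags such as
        # "2.0.0+cu117" still count as a match.
        if found is None or not found.startswith(wanted):
            problems[name] = (wanted, found)
    return problems


if __name__ == "__main__":
    for name, (wanted, found) in find_mismatches(REQUIRED).items():
        print(f"{name}: expected {wanted}, found {found or 'not installed'}")
```

Run it with emready2_env activated. An empty output means every listed package was found at the listed version; a different version is not necessarily a problem, since the list above documents the tested versions rather than hard limits.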
In addition to the online packages, the interpolation program "interp3d.f90" should be built as a Python package "interp3d" using f2py in the conda virtual environment of EMReady2:
$ conda activate emready2_env
$ f2py -c ./interp3d.f90 -m interp3d
     f2py needs a Fortran compiler. If gfortran is not installed, users can install it with the system package manager, for example on Debian/Ubuntu systems:
$ sudo apt-get install gfortran
     Alternatively, users can install gfortran via conda (in the same conda environment as EMReady2) and point f2py to it:
$ conda install -c conda-forge gfortran==11.4
$ f2py -c ./interp3d.f90 -m interp3d --fcompiler=gnu95 --f77exec=/path/to/gfortran \
--f90exec=/path/to/gfortran
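To confirm that the build produced an importable module, a short check like the one below may help. It is a sketch under the assumption that the compiled module is named `interp3d` (as given by `-m interp3d`) and that the check is run from the directory where it was built; the helper name `module_available` is hypothetical.

```python
import importlib.util


def module_available(name):
    """Return True if a top-level module `name` can be found on the import path."""
    return importlib.util.find_spec(name) is not None


if __name__ == "__main__":
    # Run from the EMReady2 directory with emready2_env activated.
    if module_available("interp3d"):
        print("interp3d is importable")
    else:
        print("interp3d not found: rebuild it with f2py as described above")
```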
How to Run EMReady2
$ ./EMReady2.sh in_map.mrc out_map.mrc [Options]

Notes:
1. Users can specify a larger STRIDE of the sliding window (default=12) to reduce the number of overlapping boxes to calculate. If users run out of memory, they may set it to a larger value. Because the size of the sliding boxes is 48×48×48, the maximum value of STRIDE is 48. However, STRIDE should not be too large; otherwise, inconsistencies among sliding boxes will be introduced into the processed map. In most cases, the default value (12) is recommended. For larger density maps, a stride of 24 is a reasonable choice.
2. By default, EMReady2 runs on GPU(s). Users can adjust BATCH_SIZE according to the VRAM of their GPU. Empirically, an NVIDIA A100 with 40 GB VRAM can afford a BATCH_SIZE of 200. Users can run EMReady2 on CPUs by setting --use_cpu, but this may take a very long time for large density maps.
3. Users can provide a mask map in MRC2014 format with option -m, where the contour threshold used to binarize the mask map can be specified with option -c. Alternatively, the mask map can be generated from a given input structure with option -p, where the mask is generated around the heavy atoms within a radius specified by option -r. Users can also apply the mask inversely with option --inverse. In addition, users can save the mask with option -mo.
4. There are two EMReady2 models, trained at two different grid sizes: 0.5 Angstrom and 1.0 Angstrom. The model is selected automatically according to the grid size of the input map. Specifically, if the grid size of the input map is less than 1.0 Angstrom, the model with 0.5 Angstrom grid size is used; otherwise, the model with 1.0 Angstrom grid size is used.
5. During processing, the input map is interpolated to a grid size of 0.5 or 1.0 Angstrom, depending on the model used. By default, the grid size of the output processed map is 0.5 or 1.0 Angstrom. However, users can choose to interpolate the output processed map back to the original grid size with option --interp, in which case the processed map at the EMReady2 model's grid size (0.5 or 1.0 Angstrom) is also saved in the same directory as *_grid_size_0.5.mrc or *_grid_size_1.0.mrc.
6. Use python interp_back.py -i input_map -o output_map -f reference_map to perform the inverse interpolation of the EMReady2-processed map. Here, input_map is the path to the non-interpolated EMReady2-processed map, output_map is the path to the inversely interpolated EMReady2-processed map, and reference_map is the path to the original map input to EMReady2.

Examples
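To get a feeling for how STRIDE, the grid size and BATCH_SIZE interact (Notes 1, 2, 4 and 5), the arithmetic can be sketched as follows. This is an illustration, not EMReady2's own code: the function names are hypothetical, and the tiling rule (boxes start every STRIDE voxels, plus one box for the remainder) and the rounding of the interpolated shape are assumptions.

```python
import math

BOX_SIZE = 48  # edge length of the sliding box, from Note 1


def select_model_grid_size(voxel_size):
    """Grid size (Angstrom) of the model chosen for an input grid size (Note 4)."""
    return 0.5 if voxel_size < 1.0 else 1.0


def interpolated_shape(shape, voxel_size):
    """Approximate map shape after interpolation to the model grid (Note 5).

    Assumption: the physical extent of the map is preserved and the voxel
    count is rounded to the nearest integer.
    """
    grid = select_model_grid_size(voxel_size)
    return tuple(round(n * voxel_size / grid) for n in shape)


def count_boxes(shape, stride=12, box=BOX_SIZE):
    """Number of sliding boxes needed to cover a map of the given shape.

    Assumption: boxes start every `stride` voxels along each axis and one
    more box covers the remainder (with padding if the map is smaller
    than a single box).
    """
    if not 1 <= stride <= box:
        raise ValueError("stride must be between 1 and the box size")
    per_axis = [math.ceil(max(n - box, 0) / stride) + 1 for n in shape]
    return math.prod(per_axis)


if __name__ == "__main__":
    # Hypothetical input: a 300x300x300 map with a 0.8 Angstrom grid size.
    shape = interpolated_shape((300, 300, 300), voxel_size=0.8)
    for stride in (12, 24, 48):
        boxes = count_boxes(shape, stride)
        batches = math.ceil(boxes / 200)  # BATCH_SIZE of 200 from Note 2
        print(f"stride={stride}: {boxes} boxes, {batches} batches")
```

Under these assumptions, a 240×240×240 map at 1.0 Angstrom needs 4913 boxes with the default stride of 12, 729 with a stride of 24, and 125 with a stride of 48, which shows why a larger stride shortens the run at the cost of less overlap between neighbouring boxes.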