This package provides an implementation of the inference pipeline of AlphaFold v2.1.1 This is a completely new model that was entered in CASP14 and published in Nature. For simplicity, we refer to this model as AlphaFold throughout the rest of this document. https://github.com/deepmind/alphafold
To use Alphafold with CUDA, you will also need to accept license for cuDNN library. You will find the link on https://metavo.metacentrum.cz/cs/myaccount/licence.html.
Application is prepared as singularity image and downloaded datasets with example scripts to run with PBS in Metacentrum modules. Location:
There are four models with two speed/quality tradeoff. All combinations of these parameters are prepared in example scripts in Metacentrum.
You can control which AlphaFold model to run by adding the --model_preset= flag. We provide the following models:
monomer: This is the original model used at CASP14 with no ensembling.
monomer_casp14: This is the original model used at CASP14 with num_ensemble=8, matching our CASP14 configuration. This is largely provided for reproducibility as it is 8x more computationally expensive for limited accuracy gain (+0.1 average GDT gain on CASP14 domains).
monomer_ptm: This is the original CASP14 model fine tuned with the pTM head, providing a pairwise confidence measure. It is slightly less accurate than the normal monomer model.
multimer: This is the AlphaFold-Multimer model. To use this model, provide a multi-sequence FASTA file. In addition, the UniProt database should have been downloaded.
You can control MSA speed/quality tradeoff by adding --db_preset=reduced_dbs or --db_preset=full_dbs to the run command. We provide the following presets:
reduced_dbs: This preset is optimized for speed and lower hardware requirements. It runs with a reduced version of the BFD database. It requires 8 CPU cores (vCPUs), 8 GB of RAM, and 600 GB of disk space.
full_dbs: This runs with all genetic databases used at CASP14.
Tips and useful information
Example of run with different models and speed/quality tradeoff with example file seq.fasta, multimer with multi.fasta. This test shows difference of RAM consuming a length of run with the same test. Results form one run, data could be updated.
|model||speed/quality||RAM||Duration||cluster - GPU|
|monomer||full_dbs||185 GB||28 min||glados - RTX2080 - 8GB|
|monomer||reduced_dbs||153 GB||36 min||glados - RTX2080 - 8GB|
|monomer_casp14||full_dbs||197 GB||35 min||zia - A100 - 40GB|
|monomer_casp14||reduced_dbs||38 GB||36 min||zia - A100 - 40GB|
|monomer_ptm||full_dbs||190 GB||36 min||gita - RTX2080 Ti - 11GB|
|monomer_ptm||reduced_dbs||55 GB||32 min||gita - RTX2080 Ti - 11GB|
|multimer||full_dbs||119 GB||75 min||zia - A100 - 40GB|
|multimer||reduced_dbs||40 GB||73 min||zia - A100 - 40GB|