CD-HIT

Description

CD-HIT is a widely used program for clustering and comparing protein or nucleotide sequences. It is very fast and can handle extremely large databases. CD-HIT significantly reduces the computational and manual effort in many sequence analysis tasks, and it aids in understanding the structure of a dataset and correcting its bias.

License

GPLv2

Usage

Upcoming modulesystem change alert!

Due to the large number of applications and their versions, it is not practical to keep them explicitly listed on our wiki pages. Therefore, an upgrade of the modulefiles is underway. One feature of this upgrade is a default module for every application. The default module is loaded without a version number and provides some (usually the latest) version.

You can test the new version now by adding a line

source /cvmfs/software.metacentrum.cz/modulefiles/5.1.0/loadmodules

to your script before loading a module. Then you can list all versions of cdhit and load the default version of cdhit as follows:

module avail cdhit/ # list available modules
module load cdhit   # load (default) module


If you wish to keep using the current system, that is still possible. Simply list all modules with

module avail cdhit

and choose the explicit version you want to use.
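
Putting the steps above together, a job script might look like the following minimal sketch. It uses the new modulefiles and then clusters a protein FASTA file at 90% sequence identity. The file names `proteins.fasta` and `proteins_nr90` are hypothetical placeholders, and the sketch assumes that the loaded module puts the `cd-hit` binary on your `PATH`. If the module or the input file is not available, the script only prints the command it would run.

```shell
#!/bin/bash
# Hypothetical file names; replace them with your own data.
INPUT="proteins.fasta"
OUTPUT="proteins_nr90"

# Enable the new modulefiles and load the default cdhit module.
# The check lets the script run harmlessly on machines without CVMFS.
LOADMODULES=/cvmfs/software.metacentrum.cz/modulefiles/5.1.0/loadmodules
if [ -r "$LOADMODULES" ]; then
    source "$LOADMODULES"
    module load cdhit
fi

# -i  input FASTA file
# -o  output file with representative sequences
#     (cluster membership is written to "$OUTPUT.clstr")
# -c  sequence identity threshold (0.9 = 90%)
# -n  word length; 5 is the usual choice for thresholds of 0.7 and above
CMD="cd-hit -i $INPUT -o $OUTPUT -c 0.9 -n 5"

# Run the command only if cd-hit and the input file are both present.
# $CMD is left unquoted on purpose so that it splits into arguments;
# this assumes the file names contain no spaces.
if command -v cd-hit >/dev/null 2>&1 && [ -f "$INPUT" ]; then
    $CMD
else
    echo "Would run: $CMD"
fi
```

The identity threshold and word length shown here are common starting values for protein data, not site defaults; see the CD-HIT documentation for the combinations recommended at lower thresholds and for nucleotide sequences.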

Documentation

Documentation is available at http://weizhong-lab.ucsd.edu/cd-hit/ref.php

Homepage

http://weizhong-lab.ucsd.edu/cd-hit/