RepeatExplorer


Description

RepeatExplorer is a computational pipeline for the discovery and characterization of repetitive sequences in eukaryotic genomes. It takes high-throughput genome sequencing data as input and performs graph-based clustering of sequence read similarities to identify repetitive elements in the analyzed samples. The principles of the analysis are described in Novak et al. (2010), and examples of its application can be found in a number of published papers (see Appendix). Although the repeat identification algorithm generally works for any genome, some parts of the pipeline (e.g. protein domain-based classification of mobile elements) were developed primarily for plant genomics. A custom repeat database can be supplied to improve the sensitivity of classification for non-plant repeats.
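To illustrate the general idea of graph-based clustering of read similarities, here is a toy sketch in Python. It is not the pipeline's actual algorithm: it assumes that pairwise similarity hits between reads are already available and simply groups reads into connected components of the similarity graph, whereas the real pipeline uses a more elaborate clustering procedure. The function name `cluster_reads` and the read IDs are hypothetical.

```python
from collections import defaultdict


def cluster_reads(similar_pairs):
    """Group read IDs into clusters (connected components of the similarity graph).

    similar_pairs: iterable of (read_a, read_b) tuples, one per similarity hit.
    Returns a list of sets of read IDs, largest cluster first.
    """
    parent = {}

    def find(x):
        # Union-find lookup with path halving.
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    # Each similarity hit is an edge; merge the two components it connects.
    for a, b in similar_pairs:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_a] = root_b

    clusters = defaultdict(set)
    for read in list(parent):
        clusters[find(read)].add(read)

    # Large clusters correspond to sequences present in many reads,
    # i.e. candidate repetitive elements in this simplified picture.
    return sorted(clusters.values(), key=len, reverse=True)


# Hypothetical similarity hits between six reads.
pairs = [("r1", "r2"), ("r2", "r3"), ("r3", "r4"), ("r5", "r6")]
clusters = cluster_reads(pairs)
print([len(c) for c in clusters])  # [4, 2]
```

In this simplified picture, reads derived from a highly repeated element share many similarity hits and end up in one large cluster, while reads from single-copy regions remain in very small clusters or are absent from the graph altogether.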

Availability

Available through the Galaxy portal and as a script version; an older script version from 11.10.2013 is also available.

Licence

GNU GPL version 3

Use

Galaxy portal: https://galaxy-elixir.cerit-sc.cz/ (wiki: https://wiki.metacentrum.cz/wiki/Galaxy)

Script version from the portal:

module add repeatexplorerREportal
seqclust_cmd.py -h

Older script version:

module add repeatexplorer
seqclust_cmd.py -h

Documentation

http://repeatexplorer.umbr.cas.cz/static/html/help/manual.html

Papers about RepeatExplorer: http://www.biomedcentral.com/1471-2105/11/378 and http://bioinformatics.oxfordjournals.org/content/29/6/792

Program manager

meta@cesnet.cz

Homepage

http://repeatexplorer.umbr.cas.cz/