cWGAP Help

Genome Annotation

Input files must be in Genbank format (.gbk or .gb) and must already be annotated, either automatically or manually. We use Prokka in-house on our bacterial genomes, and its rapid annotations of prokaryotic genomes run flawlessly in cWGAP. Other possible annotation pipelines are listed below. cWGAP runs alignments on the FASTA sequence taken from the Genbank file, but it takes the gene names from the Genbank annotations, so irregularities between annotations can change your results. It is best to stay consistent and use the same annotation pipeline on all genomes. When cWGAP fails, the cause is often a poor annotation.

TL;DR: We have thoroughly tested cWGAP on annotations from Prokka and Genbank. With any other annotation source your mileage may vary, because Genbank-format files vary widely.
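
Because cWGAP depends on both the sequence and the gene names in each Genbank file, a quick pre-flight check can catch a poor annotation before a run. The following is a minimal sketch, not part of cWGAP: the function name `summarize_genbank` and the sample record are hypothetical, and the line-based parsing is a rough heuristic that assumes the usual Genbank flat-file layout (feature keys indented five spaces, qualifiers starting with `/`). It counts records, CDS features, CDS features that carry a `/gene` qualifier, and sequence letters.

```python
import io

# Hypothetical, hand-written record used only to exercise the sketch.
SAMPLE = """\
LOCUS       DEMO_1                    12 bp    DNA     linear   BCT 01-JAN-2000
FEATURES             Location/Qualifiers
     source          1..12
                     /organism="Example bacterium"
     CDS             1..9
                     /gene="abcA"
                     /locus_tag="DEMO_00001"
     CDS             complement(4..12)
                     /locus_tag="DEMO_00002"
ORIGIN
        1 atgaaacgtt aa
//
"""


def summarize_genbank(handle):
    """Return rough counts from a Genbank flat file (heuristic, not a full parser)."""
    summary = {"records": 0, "cds": 0, "cds_with_gene": 0, "sequence_length": 0}
    section = None        # "features", "origin", or None
    in_cds = False        # True while reading the qualifiers of a CDS feature
    cds_has_gene = False  # True once the current CDS has shown a /gene qualifier

    def close_cds():
        # Finish the CDS being read, if any, and record whether it had a gene name.
        nonlocal in_cds, cds_has_gene
        if in_cds and cds_has_gene:
            summary["cds_with_gene"] += 1
        in_cds = False
        cds_has_gene = False

    for raw in handle:
        line = raw.rstrip("\n")
        if line.startswith("LOCUS"):
            summary["records"] += 1
            section = None
        elif line.startswith("FEATURES"):
            section = "features"
        elif line.startswith("ORIGIN"):
            close_cds()
            section = "origin"
        elif line.startswith("//"):
            close_cds()
            section = None
        elif section == "features":
            if line[:5] == "     " and line[5:6].strip():
                # A feature key starts in the sixth column: a new feature begins.
                close_cds()
                if line.split()[0] == "CDS":
                    summary["cds"] += 1
                    in_cds = True
            elif in_cds and line.strip().startswith("/gene="):
                cds_has_gene = True
        elif section == "origin":
            # Sequence lines mix position numbers, spaces, and bases; count letters only.
            summary["sequence_length"] += sum(ch.isalpha() for ch in line)
    return summary


print(summarize_genbank(io.StringIO(SAMPLE)))
```

A file with no sequence, no CDS features, or many CDS features without gene names is a likely candidate for re-annotation before it goes into cWGAP. To check a real file, pass an open file handle instead of the `io.StringIO` wrapper.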

Here are the annotation services we know of; many others are available:


| Program | Description | Install? OS? |
| --- | --- | --- |
| Prokka | Prokka is a software tool for the rapid annotation of prokaryotic genomes. A typical 4 Mbp genome can be fully annotated in less than 10 minutes on a quad-core computer, and Prokka scales well to 32-core SMP systems. It produces GFF3, GBK and SQN files that are ready for editing in Sequin and ultimately for submission to Genbank/DDBJ/ENA. | Yes, Linux |
| RAST | Rapid Annotation using Subsystem Technology. The NMPDR, SEED-based, prokaryotic genome annotation service. | No, web-based |
| PGAP | An automatic annotation pipeline developed by NCBI, which runs it on all finished genomes before submission to NCBI Genbank. This is a very reliable method. | No, only NCBI |
| BG7 | Annotation system specially designed for bacterial genomes and NGS data. | Yes, JAR (any) |
| MAKER | MAKER is a portable and easily configurable genome annotation pipeline. Its purpose is to allow smaller eukaryotic and prokaryotic genome projects to independently annotate their genomes and to create genome databases. | Yes, Linux |
| DIYA | Do-It-Yourself Annotator (DIYA) is a modular and configurable open source pipeline framework, used for the rapid annotation of microbial genome sequences. | Yes, Linux |
| Ergatis | Ergatis is by far the most powerful option, but installing it is overkill if you are annotating only one genome. | Yes, Linux |
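
One way to follow the advice about staying consistent is to generate the same Prokka command for every genome, so that all annotations share identical settings. This is a minimal sketch under stated assumptions: the helper name `prokka_command`, the `annotations` output folder, and the input file names are hypothetical. The only Prokka options used are `--outdir`, `--prefix`, and `--cpus`. The sketch only builds and prints the commands; it does not run Prokka.

```python
from pathlib import Path


def prokka_command(fasta, outdir_root="annotations", cpus=4):
    """Build one Prokka command line (as a list) for a single FASTA file."""
    fasta = Path(fasta)
    prefix = fasta.stem  # e.g. "strainA" for "genomes/strainA.fasta"
    return [
        "prokka",
        "--outdir", str(Path(outdir_root) / prefix),  # one output folder per genome
        "--prefix", prefix,                           # output files share this name
        "--cpus", str(cpus),
        str(fasta),
    ]


# Hypothetical input files; replace with your own assemblies.
genomes = ["genomes/strainA.fasta", "genomes/strainB.fasta"]

for genome in genomes:
    print(" ".join(prokka_command(genome)))
```

If Prokka is installed, each list can be passed to `subprocess.run(cmd, check=True)`. The resulting `.gbk` file in each output folder is the Genbank input that cWGAP expects.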