proGenomes v3 is an update of proGenomes (Nucleic Acids Res doi: 10.1093/nar/gkw989) and proGenomes v2 (Nucleic Acids Res doi: 10.1093/nar/gkz1002).
Version 3.0 provides over 900,000 consistently annotated bacterial and archaeal genomes. Genomes were screened for high completeness and low contamination using CheckM and GUNC. Taxonomic annotations are provided as species clusters (Mende et al., Nature Methods, 2013) as well as GTDB and NCBI taxonomy. Functional annotations of over 4 billion genes are provided as eggNOG orthologous groups (Huerta-Cepas et al., NAR, 2016).
We further provide a set of 40 universal, single-copy genes for each genome (Ciccarelli et al., Science, 2006; Sorek et al., Science, 2007) to support phylogenetic studies.
Additionally, more than 40,000 representative genomes covering all species clusters are available for direct download. These can be used for the annotation of metagenomics datasets, large-scale phylogenetics and other comparative approaches. Within the representative genomes, we also provide habitat-specific sets.
Start exploring proGenomes by searching for a taxonomic group, a species cluster or an individual genome.
Previous versions of the database are available at the following addresses: http://progenomes1.embl.de/ and http://progenomes2.embl.de/
proGenomes v3 is free for academic and non-commercial use. For commercial use or customized versions, please contact biobyte solutions GmbH.
We hope you find the database easy to use. However, if you encounter any problems or have questions, please