Genome Taxonomy Database

Content
Data types captured: Proposed prokaryotic nomenclature, phylogenomic data
Contact
Research center: Australian Centre for Ecogenomics, University of Queensland
Authors:
  • Phil Hugenholtz
  • Maria Chuvochina
  • Christian Rinke
Primary citation: PMID 30148503
Release date: 2018
Access
Website: gtdb.ecogenomic.org
Download URL: gtdb.ecogenomic.org/downloads
Web service URL: gtdb.ecogenomic.org/tree
Miscellaneous
License: CC BY-SA 4.0
Version: 09-RS220 (24th April 2024)
Curation policy: mixed

The Genome Taxonomy Database (GTDB) is an online database that maintains information on a proposed nomenclature of prokaryotes, following a phylogenomic approach based on a set of conserved single-copy proteins. In addition to resolving paraphyletic groups, this method also reassigns taxonomic ranks algorithmically, updating names in both cases.[1] Information for archaea was added in 2020,[2] along with a species classification based on average nucleotide identity.[3] Each update incorporates new genomes as well as automated and manual curation of the taxonomy.[4]

An open-source tool called GTDB-Tk is available to classify draft genomes into the GTDB hierarchy.[5] The GTDB system, via GTDB-Tk, has been used to catalogue not-yet-named bacteria in the human gut microbiome and other metagenomic sources.[6][7]

The GTDB was incorporated into Bergey's Manual of Systematics of Archaea and Bacteria in 2019 as its phylogenomic resource.[8]

Methodology

The genomes used to construct the phylogeny are obtained from NCBI (RefSeq and GenBank), and GTDB releases are indexed to RefSeq releases, starting with release 76. Increasingly, this dataset includes draft genomes of uncultured microorganisms obtained from metagenomes and single cells, which improves genomic representation of the microbial world. All genomes are independently quality-controlled using CheckM before inclusion in GTDB.[9]

Genomes first undergo gene calling to extract genes. The taxonomy is based on two trees: one for Bacteria, inferred with FastTree under a WAG model from an aligned, concatenated set of 120 single-copy marker proteins, and one for Archaea, inferred with IQ-TREE under the PMSF model from a concatenated set of 53 marker proteins (since RS207; 122 before). Additional marker sets, including concatenated ribosomal proteins and ribosomal RNA genes, are also used to cross-validate tree topologies.[9] The relative evolutionary divergence (RED) metric, which determines the taxonomic ranks used, is derived from the two main trees by the PhyloRank program.[1]
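
As a rough illustration of how a RED value can be derived from a rooted tree, the sketch below follows the interpolation described in the primary citation:[1] the root has RED 0, every leaf has RED 1, and an internal node gets p + (d/u) × (1 − p), where p is the parent's RED, d is the branch length to the parent, and u is the mean distance from the parent to the leaves below the node. The Node class and function names are hypothetical, and this is not PhyloRank's implementation.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Node:
    """Hypothetical minimal tree node; not a PhyloRank data structure."""
    name: str
    length: float = 0.0                      # branch length to the parent
    children: List["Node"] = field(default_factory=list)


def leaf_distances(node: Node) -> List[float]:
    """Distances from `node` down to every leaf below it."""
    if not node.children:
        return [0.0]
    return [c.length + d for c in node.children for d in leaf_distances(c)]


def red_values(root: Node) -> Dict[str, float]:
    """Assign RED to every named node: root = 0, leaves = 1."""
    red: Dict[str, float] = {root.name: 0.0}

    def visit(node: Node) -> None:
        p = red[node.name]
        for child in node.children:
            if not child.children:
                red[child.name] = 1.0
                continue
            # u: mean distance from the parent to the leaves under `child`
            dists = [child.length + d for d in leaf_distances(child)]
            u = sum(dists) / len(dists)
            red[child.name] = p + (child.length / u) * (1.0 - p)
            visit(child)

    visit(root)
    return red


# Toy tree: root -> X (length 1) -> a, b (length 1 each); root -> c (length 2)
toy = Node("root", children=[
    Node("X", 1.0, [Node("a", 1.0), Node("b", 1.0)]),
    Node("c", 2.0),
])
toy_red = red_values(toy)
```

In the toy tree, node X sits halfway between the root and its leaves, so its RED is 0.5.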

Species are delimited using average nucleotide identity (ANI) and alignment fraction (AF), both calculated by skani. For species that existed in a previous release, GTDB compares the quality and position of the current and candidate representative genomes and may switch to a new species representative genome.[9]
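
A minimal sketch of this kind of threshold test is shown below. The function names and the default cut-offs (95% ANI, 0.5 AF) are assumptions chosen for illustration only; the values actually used in a given release are documented in that release's METHODS.txt.

```python
from typing import Dict, Optional, Tuple


def same_species(ani: float, af: float,
                 ani_min: float = 95.0, af_min: float = 0.5) -> bool:
    """True if a genome pair passes both (assumed) thresholds."""
    return ani >= ani_min and af >= af_min


def closest_representative(
    hits: Dict[str, Tuple[float, float]],
    ani_min: float = 95.0,
    af_min: float = 0.5,
) -> Optional[str]:
    """Pick the representative with the highest ANI among passing hits.

    `hits` maps a representative genome ID to (ANI in percent, AF in 0..1).
    Returns None when no representative passes, i.e. a candidate new species.
    """
    passing = {rep: v for rep, v in hits.items()
               if same_species(v[0], v[1], ani_min, af_min)}
    if not passing:
        return None
    return max(passing, key=lambda rep: passing[rep][0])


# Made-up comparison results for one query genome
example_hits = {"rep_1": (96.4, 0.82), "rep_2": (97.1, 0.31), "rep_3": (93.0, 0.90)}
assigned = closest_representative(example_hits)
```

Here rep_2 has the highest ANI but fails the alignment-fraction test, so the query is assigned to rep_1.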

Taxonomy comes from the following sources:

GTDB personnel curate the taxonomy from the aforementioned sources by checking them against the results of PhyloRank and the tree.

  • The tree node corresponding to a taxon name may have a RED inappropriate for its rank. The name may either be moved onto another node or (by changing the Latin suffix) into a different rank.[1]
    • Splitting may happen at the level of species or genera if the divergence turns out to be too high. Doing so creates new taxa.[3]
  • The taxon may turn out to be polyphyletic. The curator first restricts the taxon to the clade containing its type material. A new taxon is created for each of the other clades.[1]

For each new taxon, the curators try to find a name already proposed for it in the literature. If no name has been proposed, the taxon is given a placeholder name by adding a suffix to the original name, e.g. Lactobacillus gasseri_A. After "Z" comes "AA".[1]
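
The suffix sequence can be sketched as spreadsheet-style column lettering. The function names are hypothetical, and the continuation beyond "AA" (AB, AC, ...) is an assumption; the text above only confirms that "AA" follows "Z".

```python
def placeholder_suffix(index: int) -> str:
    """Map 0 -> 'A', 25 -> 'Z', 26 -> 'AA' (bijective base-26 lettering)."""
    if index < 0:
        raise ValueError("index must be non-negative")
    letters = []
    n = index + 1
    while n > 0:
        n, rem = divmod(n - 1, 26)
        letters.append(chr(ord("A") + rem))
    return "".join(reversed(letters))


def placeholder_name(original: str, index: int) -> str:
    """Append the index-th suffix to an existing taxon name."""
    return f"{original}_{placeholder_suffix(index)}"


first = placeholder_name("Lactobacillus gasseri", 0)
```

Calling placeholder_name("Lactobacillus gasseri", 0) yields the example name given above.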

Contents of the database

Each release contains:[10]

  • Taxonomy tables containing the assignment of all included genome assemblies to the phylum-to-species taxonomy. (One per domain.)
  • Files containing the metadata given to each genome assembly, including original taxonomy from NCBI, original strain identifier, GTDB taxonomy, quality estimates, and presence of important genes (tRNA and rRNA). (One per domain.)
  • Species tree Newick files containing the species-representative genomes (1 per species), built as described in the previous section. (One per domain.)
  • For species-representative genomes:
    • alignments of marker genes identified from these genomes
    • file containing one 16S rRNA sequence from each species
    • tarballs containing amino acid and nucleotide versions of all predicted genes in these genomes
    • tarball containing the full contents of all these genomes
  • For all genomes that pass quality check:
    • alignments of marker genes identified from these genomes
    • file containing all 16S rRNA sequences identified from these genomes
  • Auxiliary files; see the full FILE_DESCRIPTIONS.txt.
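
As an illustration of working with a taxonomy table, the sketch below parses one row. It assumes a tab-separated layout with a genome accession followed by a semicolon-delimited lineage whose entries carry rank prefixes such as d__, p__ and s__; the accession and lineage in the example are made up, and FILE_DESCRIPTIONS.txt remains the authoritative description of the files.

```python
from typing import Dict, Tuple

# Assumed rank prefixes used in the lineage strings
RANK_PREFIXES = {
    "d": "domain", "p": "phylum", "c": "class", "o": "order",
    "f": "family", "g": "genus", "s": "species",
}


def parse_taxonomy_line(line: str) -> Tuple[str, Dict[str, str]]:
    """Split one row into (accession, {rank name: taxon name})."""
    accession, lineage = line.rstrip("\n").split("\t")
    ranks: Dict[str, str] = {}
    for entry in lineage.split(";"):
        prefix, _, name = entry.strip().partition("__")
        ranks[RANK_PREFIXES[prefix]] = name
    return accession, ranks


# Hypothetical row, for illustration only
row = ("GENOME_0001\td__Bacteria;p__Bacillota;c__Bacilli;o__Lactobacillales;"
       "f__Lactobacillaceae;g__Lactobacillus;s__Lactobacillus gasseri_A\n")
accession, ranks = parse_taxonomy_line(row)
```

After parsing, ranks["species"] holds the species name, including any placeholder suffix.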

The web interface displays a tree based on the taxonomy (not the entire Newick file), down to the genome assembly level. Each genome assembly has a page detailing its metadata and a history of how it was classified in each GTDB release. The interface also provides a search function.

Effects on the accepted taxonomy

GTDB "has now become an important resource for prokaryotic taxonomy". Both its species tree and elements of its methodology are used by taxonomists to improve the current, accepted taxonomy under the Prokaryotic Code. For example, a taxonomist may make references to the GTDB tree on top of their own phylogenetic tree to further support a taxonomic proposal.[11]

There have been even more ambitious proposals to import large parts of the database into the accepted taxonomy. A 2022 article in the IJSEM, written by third-party authors, proposes to assign names based on meaningless Latin syllables to over 65 thousand GTDB taxa,[12] though none of these names have made their way into the LPSN. A 2023 article by the GTDB team proposes to import 223 higher-order taxa into the Prokaryotic Code system and 49 under the SeqCode system.[13] Many of the names published under the Prokaryotic Code have already been validated.[14] (The SeqCode requires registration of the names for valid publication, which has also been done.)

References

  1. Parks, DH; Chuvochina, M; Waite, DW; Rinke, C; Skarshewski, A; Chaumeil, PA; Hugenholtz, P (November 2018). "A standardized bacterial taxonomy based on genome phylogeny substantially revises the tree of life" (PDF). Nature Biotechnology. 36 (10): 996–1004. bioRxiv 10.1101/256800. doi:10.1038/nbt.4229. PMID 30148503. S2CID 52093100.
  2. Rinke, Christian; Chuvochina, Maria; Mussig, Aaron J.; Chaumeil, Pierre-Alain; Davín, Adrián A.; Waite, David W.; Whitman, William B.; Parks, Donovan H.; Hugenholtz, Philip (21 June 2021). "A standardized archaeal taxonomy for the Genome Taxonomy Database" (PDF). Nature Microbiology. 6 (7): 946–959. doi:10.1038/s41564-021-00918-8. ISSN 2058-5276. PMID 34155373. S2CID 235595884.
  3. Parks, DH; Chuvochina, M; Chaumeil, PA; Rinke, C; Mussig, AJ; Hugenholtz, P (September 2020). "A complete domain-to-species taxonomy for Bacteria and Archaea". Nature Biotechnology. 38 (9): 1079–1086. bioRxiv 10.1101/771964. doi:10.1038/s41587-020-0501-8. PMID 32341564. S2CID 216560589.
  4. For information on each update, see the relevant change logs. For notable, paper-worthy changes, see the "Cite GTDB" section on the About page.
  5. Chaumeil, PA; Mussig, AJ; Hugenholtz, P; Parks, DH (15 November 2019). "GTDB-Tk: a toolkit to classify genomes with the Genome Taxonomy Database". Bioinformatics. 36 (6): 1925–1927. doi:10.1093/bioinformatics/btz848. PMC 7703759. PMID 31730192.
  6. Almeida, Alexandre; Nayfach, Stephen; Boland, Miguel; Strozzi, Francesco; Beracochea, Martin; Shi, Zhou Jason; Pollard, Katherine S.; Sakharova, Ekaterina; Parks, Donovan H.; Hugenholtz, Philip; Segata, Nicola; Kyrpides, Nikos C.; Finn, Robert D. (20 July 2020). "A unified catalog of 204,938 reference genomes from the human gut microbiome". Nature Biotechnology. 39 (1): 105–114. doi:10.1038/s41587-020-0603-3. PMC 7801254. PMID 32690973.
  7. Nayfach, Stephen; et al. (9 November 2020). "A genomic catalog of Earth's microbiomes". Nature Biotechnology. 39 (4): 499–509. doi:10.1038/s41587-020-0718-6. PMC 8041624. PMID 33169036.
  8. "Incorporation of Phylogenomics into BMSAB". Bergey's Manual Trust.
  9. "METHODS.txt (GTDB release 220)". data.gtdb.ecogenomic.org. 2024.
  10. "220.0/FILE_DESCRIPTIONS.txt".
  11. Gupta, Radhey S.; Patel, Sudip; Saini, Navneet; Chen, Shu (1 November 2020). "Robust demarcation of 17 distinct Bacillus species clades, proposed as novel Bacillaceae genera, by phylogenomics and comparative genomic analyses: description of Robertmurraya kyonggiensis sp. nov. and proposal for an emended genus Bacillus limiting it only to the members of the Subtilis and Cereus clades of species". International Journal of Systematic and Evolutionary Microbiology. 70 (11): 5753–5798. doi:10.1099/ijsem.0.004475.
  12. Pallen, MJ; Rodriguez-R, LM; Alikhan, NF (September 2022). "Naming the unnamed: over 65,000 Candidatus names for unnamed Archaea and Bacteria in the Genome Taxonomy Database" (PDF). International Journal of Systematic and Evolutionary Microbiology. 72 (9). doi:10.1099/ijsem.0.005482. PMID 36125864.
  13. Chuvochina, M; Mussig, AJ; Chaumeil, PA; Skarshewski, A; Rinke, C; Parks, DH; Hugenholtz, P (17 January 2023). "Proposal of names for 329 higher rank taxa defined in the Genome Taxonomy Database under two prokaryotic codes". FEMS Microbiology Letters. 370. doi:10.1093/femsle/fnad071. PMC 10408702. PMID 37480240.
  14. Oren, Aharon; Göker, Markus (1 February 2024). "Validation List no. 215. Valid publication of new names and new combinations effectively published outside the IJSEM". International Journal of Systematic and Evolutionary Microbiology. 74 (1). doi:10.1099/ijsem.0.006173.

Further reading

  • AnnoTree - third-party tool for visualization of genome annotations using the GTDB (R95 or R214) species tree.
