Two Weeks in the World 2020 dataset

The Two Weeks in the World (TWIW) 2020 dataset consists of more than 3,000 bacterial isolates with accompanying metadata and genomes. Among other things, the dataset is being used to shed light on the “true” burden of disease, as samples were collected randomly rather than by targeting specific species and/or source types. The data reveal a clear and consistent dominance of urological pathogens, of which Escherichia coli is the most common. Between regions (as defined by the World Bank), the representation of source types differs, as do the source-pathogen correlations. While Escherichia coli from urine samples is the dominant source-pathogen grouping, Staphylococci from skin/wound infections form the second-largest. In skin/wound infections, however, Gram-negative bacteria make up a relatively larger share in the Sub-Saharan Africa (SSA), South Asia (SA), and Middle East and North Africa (MENA) regions than elsewhere.

Overall, the dataset is diverse, with 153 species and 46 genera represented. Grouping the isolates into majority groups shows, however, that 90% of the samples fall within 9 genera when grouping at genus level, or within 27 species when grouping at species level. In comparison, the bacterial species and genera targeted by the WHO Global Antimicrobial Resistance and Use Surveillance System (GLASS) account for only approximately 60% of the dataset.
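One way to derive such majority groups is to rank taxa by isolate count and keep adding them until the cumulative share reaches the chosen coverage. The sketch below assumes this greedy approach; the function name `majority_groups`, the `threshold` parameter, and the toy labels are hypothetical and are not taken from the TWIW analysis itself.

```python
from collections import Counter


def majority_groups(labels, threshold=0.9):
    """Return the most frequent labels that together cover `threshold` of all samples.

    Hypothetical helper: `labels` holds one taxon label (genus or species) per isolate.
    """
    if not labels:
        return []
    counts = Counter(labels)
    total = sum(counts.values())
    selected, covered = [], 0
    # Walk through taxa from most to least frequent, accumulating coverage.
    for label, n in counts.most_common():
        selected.append(label)
        covered += n
        if covered / total >= threshold:
            break
    return selected


# Toy data, invented for illustration only (not TWIW counts).
genera = ["Escherichia"] * 6 + ["Staphylococcus"] * 3 + ["Klebsiella"]
print(majority_groups(genera, threshold=0.85))
```

Running the same function once on genus labels and once on species labels would give the two group sizes compared above, under the stated assumption about how the groups were formed.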

The dataset has also been used to identify resistance-conferring genes in the bacterial genomes. The distribution of these genes differs little between regions, although North America (NA) stands out. This is, however, probably due to a higher prevalence of Gram-positive isolates there than in any other region. A total of 17,396 genes were identified, the majority of which confer resistance to aminoglycosides and beta-lactams.

For beta-lactams specifically, the identified genes show a high degree of species specificity, and only 14 genes are found globally. Only one of these, blaTEM-1B, is also universal with regard to the species in which it was found.
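The distinction between global and universal genes can be illustrated with a small sketch. It assumes that "global" means a gene was observed in every region and that "universal" means a global gene was also observed in every species under consideration; both readings, the function name `classify_genes`, and the record layout are assumptions made for illustration, not the documented TWIW method.

```python
from collections import defaultdict


def classify_genes(hits, all_regions, all_species):
    """Split genes into 'global' and 'universal' sets.

    Hypothetical helper: `hits` is an iterable of (gene, region, species) tuples,
    one per gene detection in an isolate.
    """
    regions_by_gene = defaultdict(set)
    species_by_gene = defaultdict(set)
    for gene, region, species in hits:
        regions_by_gene[gene].add(region)
        species_by_gene[gene].add(species)

    # Assumed definition: global = seen in every region.
    global_genes = {
        g for g, seen in regions_by_gene.items() if seen >= set(all_regions)
    }
    # Assumed definition: universal = global and seen in every species considered.
    universal_genes = {
        g for g in global_genes if species_by_gene[g] >= set(all_species)
    }
    return global_genes, universal_genes


# Toy records with placeholder names, invented for illustration only.
hits = [
    ("gene_a", "R1", "sp1"),
    ("gene_a", "R2", "sp2"),
    ("gene_b", "R1", "sp1"),
    ("gene_b", "R2", "sp1"),
    ("gene_c", "R1", "sp1"),
]
global_genes, universal_genes = classify_genes(hits, {"R1", "R2"}, {"sp1", "sp2"})
print(sorted(global_genes), sorted(universal_genes))
```

Passing the species set explicitly keeps the definition of "universal" adjustable, for example restricting it to the majority species rather than all species in the dataset.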

