Machine learning with k-mers

Presentation from the Research Group for Genomic Epidemiology – 04 April 2022

Validating machine learning models with in-vitro experiments

Machine learning is an automated pattern-detection method that can predict unknown phenotypes from previously observed data. It could help detect novel resistance genes by revealing correlations between genes and resistance phenotypes. However, such findings should be interpreted carefully, because a model may rank a gene as important on the basis of a spurious correlation rather than a causal effect. In the presentation, we discussed how the catB3-2 gene could be found to correlate with tobramycin resistance as a consequence of linkage disequilibrium with the aac(6’)-Ib-cr gene, which is mainly responsible for the tobramycin resistance.
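The linkage effect described above can be illustrated with a small sketch. It is not the analysis from the presentation: the isolate counts are invented, the phenotype is assumed to depend only on the causal gene, and the names "causal" and "passenger" are hypothetical labels standing in for aac(6’)-Ib-cr and catB3-2. The sketch computes the phi coefficient (Pearson correlation for two binary variables) between each gene and the resistance phenotype, then compares resistance rates with and without the passenger gene inside each stratum of the causal gene.

```python
from math import sqrt


def phi(xs, ys):
    """Phi coefficient: Pearson correlation of two equal-length 0/1 lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / n
    # For a 0/1 variable with mean m, the variance is m * (1 - m).
    return cov / sqrt(mean_x * (1 - mean_x) * mean_y * (1 - mean_y))


def resistance_rate(rows):
    """Fraction of resistant isolates among (causal, passenger, resistant) rows."""
    return sum(resistant for _, _, resistant in rows) / len(rows)


# Synthetic isolates (invented counts): (causal gene, passenger gene, resistant).
# Assumption: resistance is determined by the causal gene alone, and the
# passenger gene usually travels with it (linkage disequilibrium).
isolates = (
    [(1, 1, 1)] * 40    # both genes present, resistant
    + [(1, 0, 1)] * 5   # causal gene only, still resistant
    + [(0, 1, 0)] * 5   # passenger gene only, susceptible
    + [(0, 0, 0)] * 50  # neither gene, susceptible
)

causal = [c for c, _, _ in isolates]
passenger = [p for _, p, _ in isolates]
resistant = [r for _, _, r in isolates]

# Marginal association of each gene with the phenotype.
phi_causal = phi(causal, resistant)
phi_passenger = phi(passenger, resistant)

# Association of the passenger gene once the causal gene is held fixed.
stratified_diffs = {}
for causal_state in (0, 1):
    stratum = [row for row in isolates if row[0] == causal_state]
    with_passenger = [row for row in stratum if row[1] == 1]
    without_passenger = [row for row in stratum if row[1] == 0]
    stratified_diffs[causal_state] = (
        resistance_rate(with_passenger) - resistance_rate(without_passenger)
    )

print("phi(causal, resistance):   ", round(phi_causal, 3))
print("phi(passenger, resistance):", round(phi_passenger, 3))
print("rate difference by causal-gene stratum:", stratified_diffs)
```

In this toy data the passenger gene correlates strongly with resistance overall, yet the difference in resistance rate disappears once the causal gene is held fixed. A check of this kind can only flag a feature as a likely passenger under the stated assumptions; it does not replace experimental validation.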

The presentation continued with a short introduction to other research projects focusing on the validation of earlier machine-learning-based tools such as PlasmidHostFinder (https://cge.cbs.dtu.dk/services/PlasmidHostFinder-1.0/). Overall, the main take-home messages were that important features should be interpreted with caution and that in-vitro experiments are required to validate machine learning models.

Derya Aytan's presentation