Collaboration is the key to scientific breakthroughs within data science. The Novo Nordisk Foundation has therefore awarded a total of DKK 45 million for two major research collaborations that will seek answers to important scientific questions across disciplines and geographical boundaries.
This will be achieved through the Data Science Collaborative Research Programme, which aims to support excellent and ambitious ideas stemming from data science – a research area that includes artificial intelligence, machine learning and managing large data sets.
One project, Machine Learning Methods for Data-driven Discovery of Antibiotic Resistance Plasmid Dissemination and Evolution, seeks to improve understanding of how antibiotic resistance occurs. Specifically, the researchers will use artificial intelligence and machine learning to learn more about plasmids, tiny DNA fragments that can transfer genetic properties, including antibiotic resistance, between bacteria.
Finding out more about plasmids is not straightforward. First, researchers must determine whether a DNA fragment is a plasmid or part of the bacterium’s own DNA. This is time-consuming and requires a lot of computing power. Technologies that can identify the key patterns and structures showing where the bacterial DNA stops and the plasmid starts would therefore be very advantageous. Currently, no such technology exists, so the first step is to develop an advanced data model that can help to identify the plasmids.
Once the data model is in place, the researchers can begin to seek important knowledge on how antibiotic resistance and other positive and negative properties are exchanged at the microbial level.
Søren Sørensen, Professor at the Department of Biology of the University of Copenhagen, is leading the project. He compares the research to a jigsaw puzzle, in which all the pieces currently look much too similar to tell how they fit together and what the puzzle portrays.
“We have to assemble a puzzle in which the plasmid pieces are a challenge because we know so little about them. Their known properties are so similar that we end up with many identical pieces that we cannot place. As we gain more knowledge about the plasmids, they will differ in appearance so we can place them correctly,” explains Søren Sørensen, who will seek this new information in collaboration with researchers from the Novo Nordisk Foundation Center for Protein Research at the University of Copenhagen and Bielefeld University in Germany.
Machine learning tailored to the life sciences
The second project, Center for Basic Machine Learning Research in Life Science, will carry out collaborative research on using machine learning to solve fundamental problems in the life sciences. The Center will be based at the Department of Biology of the University of Copenhagen and will include leading researchers within machine learning from the Department of Applied Mathematics and Computer Science of the Technical University of Denmark and the Department of Computer Science of the University of Copenhagen.
Most of us are familiar with machine learning in connection with analysing pictures, text and speech through services that many large technology companies have made commonplace. Machine learning has not yet achieved the same traction within the life sciences, and the new Center will therefore take the initiative to discover new patterns and create new scientific results based on data originating in the life sciences.
Data in the life sciences are often incomplete and noisy, which makes machine learning difficult to apply. One purpose of the research collaboration is to find better methods for developing data representations that are more suitable for machine learning. The Center will therefore develop fundamental machine learning algorithms and methods that are especially well suited to the life sciences. For example, these methods must be able to manage the uncertainty inherent in noisy data.
“Data come from vastly different sources. Having models that can crunch different types of data and integrate them into a single model is therefore important. We will also strive to incorporate uncertainty into the models and thus give them extra value within the life sciences,” explains Ole Winther, Professor, Department of Biology, University of Copenhagen, who has been awarded the grant to establish the new Center.
The grants have been awarded in open competition through the Data Science Collaborative Research Programme and are two of the 12 projects receiving funding in 2020 through the Foundation’s new Data Science Initiative. The Foundation has allocated DKK 410 million to the initiative in 2020–2022. The 2021 programmes open at the end of December 2020.
“The two grants we have awarded for research collaborations demonstrate the importance of researchers across institutions and with different types of expertise collaborating to develop and expand research. International competition is fierce in data science, and exploiting the synergy that arises by integrating competencies instead of keeping Denmark’s experts in individual silos is especially important. In this case, 1 + 1 actually results in more than 2,” says Lene Oddershede, Senior Vice President, Natural & Technical Sciences, Novo Nordisk Foundation.
Data Science Collaborative Research Programme grants in 2020
Machine Learning Methods for Data-driven Discovery of Antibiotic Resistance Plasmid Dissemination and Evolution, Søren Sørensen, Professor, Department of Biology, University of Copenhagen: DKK 14,983,392
Center for Basic Machine Learning Research in Life Science, Ole Winther, Professor, Department of Biology, University of Copenhagen: DKK 29,984,002
Further information
Sabina Askholm Larsen, Communications Partner, +45 2367 3226
Søren Sørensen, Professor, University of Copenhagen, +45 5182 7007
Ole Winther, Professor, University of Copenhagen, +45 3011 3583