Winter School in Mathematical & Computational Biology, 1-5 July 2019, Brisbane
Technological improvements have allowed for the collection of data from different molecular compartments (e.g. gene expression, protein abundance), resulting in multiple omics datasets (e.g. transcriptomics, proteomics) from the same set of biospecimens or individuals. We propose to adopt a holistic systems biology approach by statistically integrating data from multiple biological compartments. Such an approach provides improved biological insights compared with traditional single-omics analyses, as it takes into account interactions between omics layers.
In this talk, I will present DIABLO, a multivariate dimension reduction method that addresses key data integration challenges: the complexity and sheer size of the datasets, each with few samples and many molecules, and the heterogeneous nature of data measured on different scales and technological platforms. DIABLO is a hypothesis-free method that constructs combinations of variables (e.g. cytokines, transcripts, proteins, metabolites) that are maximally correlated across data types in order to identify a minimal subset of markers, i.e. a multi-omics signature. This signature can highlight novel findings and is also the starting point for network modelling. DIABLO is not limited to data-driven analysis: it can also handle pathway-based analysis, or a mix of knowledge- and data-driven analyses.
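To make the idea of "combinations of variables that are maximally correlated across data types" more concrete, here is a minimal sketch in Python. It is not the DIABLO implementation: it is a simplified, unsupervised, two-block illustration that alternates between the two data types and keeps only a fixed number of variables per block through soft-thresholding. The function names (`soft_threshold`, `sparse_component`), the simulated data, and the choice to maximise covariance between standardised blocks are all assumptions made for illustration.

```python
import numpy as np


def soft_threshold(w, keep):
    """Shrink w so that only its `keep` largest-magnitude entries stay nonzero."""
    if keep >= w.size:
        return w
    # The (keep + 1)-th largest magnitude acts as the shrinkage cutoff.
    cutoff = np.sort(np.abs(w))[-(keep + 1)]
    return np.sign(w) * np.maximum(np.abs(w) - cutoff, 0.0)


def sparse_component(X, Y, keep_x, keep_y, n_iter=100):
    """One pair of sparse weight vectors (u, v) linking blocks X and Y.

    Hypothetical helper: alternating power iterations on the cross-product
    matrix of the standardised blocks, with soft-thresholding for sparsity.
    """
    X = (X - X.mean(axis=0)) / X.std(axis=0)
    Y = (Y - Y.mean(axis=0)) / Y.std(axis=0)
    M = X.T @ Y  # cross-product between the two data types
    # Start from the leading right singular vector of M.
    v = np.linalg.svd(M, full_matrices=False)[2][0]
    for _ in range(n_iter):
        u = soft_threshold(M @ v, keep_x)
        u = u / np.linalg.norm(u)
        v = soft_threshold(M.T @ u, keep_y)
        v = v / np.linalg.norm(v)
    # Components: one score per sample for each data type.
    return u, v, X @ u, Y @ v


# Simulated example (assumed data): 40 samples, two "omics" blocks that
# share one latent signal carried by a handful of variables in each block.
rng = np.random.default_rng(0)
n = 40
z = rng.normal(size=n)
X = rng.normal(size=(n, 50))
Y = rng.normal(size=(n, 30))
X[:, :5] += 3.0 * z[:, None]   # 5 informative variables in block X
Y[:, :4] += 3.0 * z[:, None]   # 4 informative variables in block Y

u, v, t_x, t_y = sparse_component(X, Y, keep_x=5, keep_y=4)
selected_x = np.flatnonzero(u)
selected_y = np.flatnonzero(v)
component_corr = np.corrcoef(t_x, t_y)[0, 1]
print("selected in X:", selected_x)
print("selected in Y:", selected_y)
print("correlation between components:", round(component_corr, 3))
```

The nonzero entries of `u` and `v` play the role of a signature in this toy setting: a small set of variables from each block whose combinations vary together across samples. A method such as DIABLO goes further than this sketch, for instance by handling more than two data types.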
I will illustrate the use of DIABLO in studies we have analysed involving bulk omics, microbiome, and single-cell data.
The slides can be found at this link, in the tab ‘Presentation’.