
Integration of heterogeneous data: a multi-omics application

Vasconcelos, Ana Gabriela Pereira De

Biblioteca Digital de Teses e Dissertações da USP; Universidade de São Paulo; Instituto de Matemática e Estatística 2020-08-25

Online access. The library also holds print copies.

  • Title:
    Integration of heterogeneous data: a multi-omics application
  • Author: Vasconcelos, Ana Gabriela Pereira De
  • Advisor: Soler, Julia Maria Pavan
  • Subjects: Análise Multivariada; Dados Multi-Ômicos; Fatorização De Matrizes; Generalized Low Rank Models; Matrix Factorization; Multi-Omics; Multivariate Analysis
  • Notes: Dissertation (Master's)
  • Description: Large amounts of data are now being collected in research areas such as public health, agriculture, and marketing, so high-dimensional databases are becoming common. In particular, advances in technology have made many kinds of biological information available at low cost: data on the genome, miRNA (microRNA), mRNA (messenger RNA), gene expression, proteins, methylation, lipids, metabolism, phenotypes, and so on. Many studies have analyzed each type of data individually, but more recently there has been increasing interest in integrating different data types to gather more information. However, many classical methodologies used to this end assume that the data matrix is complete and numerical, so the heterogeneity of datasets with different variable types is not considered. Alternatively, Generalized Low Rank Models (GLRM) are a tool capable of dealing with large heterogeneous datasets. Although GLRM is intended for a single database, this project shows that it is flexible enough to handle abstract data from different sources by using different loss functions, each suited to a variable type. GLRM is a powerful method that can handle problems of different natures, but it is very recent, so its potential for multi-omics is still being explored. In this context, the present work introduces GLRM and explores its possibilities for dimensionality reduction in supervised and unsupervised analyses using simulated and real multi-omics datasets.
  • DOI: 10.11606/D.45.2020.tde-01092020-164939
  • Publisher: Biblioteca Digital de Teses e Dissertações da USP; Universidade de São Paulo; Instituto de Matemática e Estatística
  • Date of creation/publication: 2020-08-25
  • Format: Adobe PDF
  • Language: English
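
The description's central idea, a single low-rank factorization with a loss function chosen per variable type, can be illustrated with a minimal sketch. This is not the dissertation's implementation: the function name `glrm_fit`, the step size, the penalty weight, and the toy data are all assumptions made for illustration. It approximates a table A by a product X @ Y, using quadratic loss on numeric columns, logistic loss on 0/1 columns, and skipping missing entries.

```python
import numpy as np


def glrm_fit(A, is_binary, rank=2, steps=300, lr=0.005, reg=0.1, seed=0):
    """Approximate A by X @ Y with a per-column loss (hypothetical helper).

    Numeric columns use quadratic loss, 0/1 columns use logistic loss,
    and NaN entries are treated as missing and ignored.
    """
    rng = np.random.default_rng(seed)
    m, n = A.shape
    X = 0.1 * rng.standard_normal((m, rank))
    Y = 0.1 * rng.standard_normal((rank, n))
    observed = ~np.isnan(A)
    A0 = np.where(observed, A, 0.0)  # placeholder zeros, masked out below
    history = []
    for _ in range(steps):
        Z = X @ Y
        # Entrywise loss, selected by column type.
        quad = 0.5 * (Z - A0) ** 2
        logit = np.logaddexp(0.0, Z) - A0 * Z
        loss = np.where(is_binary, logit, quad)
        penalty = 0.5 * reg * (np.sum(X ** 2) + np.sum(Y ** 2))
        history.append(float(np.sum(loss[observed]) + penalty))
        # Derivative of each entry's loss with respect to Z.
        G = np.where(is_binary, 1.0 / (1.0 + np.exp(-Z)) - A0, Z - A0)
        G = np.where(observed, G, 0.0)
        # Simultaneous gradient step on both factors.
        X, Y = X - lr * (G @ Y.T + reg * X), Y - lr * (X.T @ G + reg * Y)
    return X, Y, history


# Toy data (assumed): 40 samples, 4 numeric and 2 binary variables
# generated from a shared rank-2 structure, with one missing value.
rng = np.random.default_rng(1)
latent = rng.standard_normal((40, 2)) @ rng.standard_normal((2, 6))
A = latent.copy()
A[:, 4:] = (latent[:, 4:] > 0).astype(float)
A[0, 1] = np.nan
is_binary = np.array([False] * 4 + [True] * 2)

X, Y, history = glrm_fit(A, is_binary)
print(X.shape, Y.shape)
```

The rows of `X` act as a reduced representation of the samples, which is the sense in which such a model performs dimensionality reduction on mixed-type data. Plain gradient descent is used here only for brevity; other fitting schemes, such as alternating minimization, are equally possible.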
