
Designing convolutional neural network architectures based on dynamical system concepts

Ferreira, Martha Dais

Biblioteca Digital de Teses e Dissertações da USP; Universidade de São Paulo; Instituto de Ciências Matemáticas e de Computação 2019-02-26

Online access. The library also holds print copies.

  • Title:
    Designing convolutional neural network architectures based on dynamical system concepts
  • Author: Ferreira, Martha Dais
  • Advisor: Mello, Rodrigo Fernandes de
  • Subjects: Dynamical Systems; Deep Learning; Convolutional Neural Networks; Statistical Learning Theory; Image-Based False Nearest Neighbors
  • Notes: Thesis (Doctoral)
  • Description: Technological advances have motivated the production and storage of large amounts of data and, consequently, the need to process them in order to support decision making. In this context, Deep Learning (DL) has emerged and provided major advances in solving complex supervised tasks through the direct manipulation of raw data content, such as images, audio and video. Convolutional Neural Networks (CNNs) are among the state-of-the-art strategies in DL, achieving relevant performance results in tasks from different domains. Currently, the design of CNN architectures is one of the major challenges in the practical use of DL, since it requires considerable knowledge of the application domain and of linear and nonlinear algebraic transformations. Architectures are designed either manually, using empirical procedures, or with the support of evolutionary algorithms, an option that consumes excessive computational resources while analyzing candidate solutions. In addition to architecture design, the possibility of overfitting has led the scientific community to study the effect of such complex models and whether they produce some memorization effect on training sets. These two main challenges motivated this PhD thesis, which proposes to support the automatic design of CNN architectures based on Dynamical System (DS) concepts. Initially, CNN architectures were algebraically formulated, allowing conclusions to be drawn on the relationships between CNN input data organizations and spatial immersions from DS, which led to the development of an immersion tool called Image-based False Nearest Neighbors (IFNN). IFNN estimates the convolutional mask sizes and helps in finding an adequate number of convolutional units per CNN layer by taking advantage of well-known effects caused by the reconstruction of phase spaces. This tool is based on the False Nearest Neighbors (FNN) method, typically used to estimate the minimal embedding dimension needed to represent recurrence patterns of time series. Experiments confirm that architectures designed with the support of IFNN usually produce results similar to those of deeper (and thus more complex) architectures. Based on those experiments, we concluded that IFNN supports the design of simpler, shallower yet efficient CNN architectures, which are faster to train and provide tighter learning guarantees according to Statistical Learning Theory (SLT), thus requiring smaller training samples. Finally, the CNN architectures obtained with IFNN were analyzed based on their shattering coefficients in an attempt to verify their relative complexities and, most essentially, the cardinalities of their spaces of admissible functions, a.k.a. biases.
  • DOI: 10.11606/T.55.2019.tde-26042019-082539
  • Publisher: Biblioteca Digital de Teses e Dissertações da USP; Universidade de São Paulo; Instituto de Ciências Matemáticas e de Computação
  • Creation/publication date: 2019-02-26
  • Format: Adobe PDF
  • Language: English
