Hierarchical classification on batch and streaming data with applications to entomology

Parmezan, Antonio Rafael Sabino

Biblioteca Digital de Teses e Dissertações da USP; Universidade de São Paulo; Instituto de Ciências Matemáticas e de Computação 2022-02-25

  • Title:
    Hierarchical classification on batch and streaming data with applications to entomology
  • Author: Parmezan, Antonio Rafael Sabino
  • Advisor: Batista, Gustavo Enrique de Almeida Prado Alves
  • Subjects: Machine Learning; Data Stream; Concept Drift; Hierarchical Classification; Batch Learning
  • Notes: Thesis (Doctorate)
  • Description: Traditional supervised machine learning algorithms classify data in a flat way, i.e., they seek to associate each example with a class from a finite, usually small set of classes that has no structural dependencies. However, there are more challenging problems in which classes can be divided into subclasses or grouped into superclasses. This structural dependency between classes demands methods designed for hierarchical classification. A hierarchical classification algorithm considers the structural information embedded in the class hierarchy and uses it to decompose the original problem's feature space into subproblems with fewer classes. Such decomposition reduces the complexity of the classification function as well as the prediction error. This thesis advances the state of the art by proposing novel algorithms for hierarchical classification under two learning paradigms: (i) batch, where learning takes place offline on a fixed-size sample of examples that (ideally) come from a stationary probability distribution and are independently and identically distributed; and (ii) streaming, in which learning is performed online from a usually uninterrupted and ordered sequence of examples made available by systems or devices at various update rates and without human intervention. The features that describe the streaming examples may drift over time due to the non-stationary nature of the environment in which the examples are generated. In this context, the main contributions of this thesis include: (i) the most extensive and comprehensive study to date of the impact of climatic-environmental conditions on bee and wasp wing-beat frequencies. From a practical standpoint, the work builds base components for (online) (hierarchical) classification of flying insects; (ii) a method that combines local approaches to quickly and efficiently obtain a hierarchical decision model that faithfully represents the music genre identification scenario. The approach was also validated on hymenopteran data; (iii) a reference process that uses optical sensors and hierarchical classifiers to identify pollinating flying insects under natural field conditions. The results obtained provided answers to ten research questions; (iv) the first algorithm for hierarchical classification of data streams. It is based on nearest neighbors and works incrementally; (v) a framework and (vi) a collection of methods for hierarchical labeling of streaming data.
  • DOI: 10.11606/T.55.2022.tde-03102022-171351
  • Publisher: Biblioteca Digital de Teses e Dissertações da USP; Universidade de São Paulo; Instituto de Ciências Matemáticas e de Computação
  • Creation/publication date: 2022-02-25
  • Format: Adobe PDF
  • Language: English
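
The decomposition mentioned in the abstract (splitting a flat problem into subproblems with fewer classes, here combined with an incremental nearest-neighbor rule) can be illustrated with a minimal sketch. This is not the algorithm proposed in the thesis: the class hierarchy, the class names, the `TopDownNearestNeighbor` class, and the single-feature toy values are all hypothetical and chosen only for illustration. The sketch keeps one 1-nearest-neighbor memory per parent node and predicts top-down, one level at a time.

```python
import math

# Hypothetical two-level class hierarchy (illustrative only, not from the thesis).
HIERARCHY = {
    "root": ["bee", "wasp"],
    "bee": ["honey_bee", "bumble_bee"],
    "wasp": ["paper_wasp", "yellowjacket"],
}


def path_to(leaf, hierarchy):
    """Return the classes from just below the root down to `leaf`."""
    parent_of = {c: p for p, children in hierarchy.items() for c in children}
    path = [leaf]
    while parent_of[path[-1]] != "root":
        path.append(parent_of[path[-1]])
    return path[::-1]


class TopDownNearestNeighbor:
    """One local 1-NN memory per parent node; prediction walks the hierarchy top-down."""

    def __init__(self, hierarchy):
        self.hierarchy = hierarchy
        # Each parent node stores (features, child_label) pairs for its own subproblem.
        self.memory = {parent: [] for parent in hierarchy}

    def learn_one(self, x, leaf):
        """Incrementally store one labeled example at every level of its path."""
        parent = "root"
        for child in path_to(leaf, self.hierarchy):
            self.memory[parent].append((x, child))
            parent = child

    def predict_one(self, x):
        """Return the predicted path, choosing the nearest stored example at each level."""
        node, path = "root", []
        while node in self.hierarchy and self.memory[node]:
            _, node = min(self.memory[node], key=lambda ex: math.dist(ex[0], x))
            path.append(node)
        return path


# Toy usage with made-up one-dimensional feature values.
model = TopDownNearestNeighbor(HIERARCHY)
model.learn_one((230.0,), "honey_bee")
model.learn_one((180.0,), "bumble_bee")
model.learn_one((120.0,), "paper_wasp")
model.learn_one((150.0,), "yellowjacket")
print(model.predict_one((225.0,)))
```

Each local memory only has to separate the children of one node, which is the sense in which every subproblem has fewer classes than the flat problem. Because `learn_one` simply appends to the memories, the model can be updated one example at a time; handling concept drift (for instance by forgetting old examples) is deliberately left out of this sketch.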
