
Recognition of online handwritten mathematical expressions using contextual information

Aguilar, Frank Dennis Julca

Biblioteca Digital de Teses e Dissertações da USP; Universidade de São Paulo; Instituto de Matemática e Estatística 2016-04-29

Online access. The library also holds print copies.

  • Title:
    Recognition of online handwritten mathematical expressions using contextual information
  • Author: Aguilar, Frank Dennis Julca
  • Advisor: Hirata, Nina Sumiko Tomita
  • Subjects: Contextual Information; Graph Parsing; Mathematical Expression Recognition
  • Notes: Thesis (Doctorate)
  • Description: Online handwritten mathematical expressions consist of sequences of strokes, usually collected through a touch screen device. Automatic recognition of online handwritten mathematical expressions requires solving three subproblems: symbol segmentation, symbol classification, and structural analysis (that is, the identification of spatial relations between symbols, such as subscript or superscript). A major issue in the recognition process is ambiguity at the symbol or relation level, which often leads to several likely interpretations of an expression. Some methods treat the recognition problem as a pipeline process, in which symbol segmentation and classification are followed by structural analysis. A major drawback of such methods is that they compute symbol-level interpretations without considering structural information, which is essential for resolving ambiguities. To cope with this drawback, more recent methods adapt string parsing techniques to drive the recognition process. As string grammars were originally designed to model linear arrangements of objects (as in text, where symbols are arranged only through left-to-right relations), non-linear arrangements of mathematical symbols (given by the multiple relation types of mathematics) are modeled as compositions of production rules for linear structures. Parsing an expression then involves searching for linear structures in the expression that are consistent with the structure of the production rules. This last step requires the introduction of constraints or assumptions, such as stroke input order or vertical and horizontal alignments, to linearize the expression components. These requirements not only limit the effectiveness of the methods, but also make them difficult to extend to new expression structures. In this thesis, we model the recognition problem as a graph parsing problem.
    The graph-based description of relations in the production rules allows direct modeling of non-linear mathematical structures. Our parsing algorithm determines recursive partitions of the input strokes that induce graphs matching the production rule graphs. To mitigate the computational cost, we constrain the possible partitions to graphs derived from sets of symbol and relation hypotheses, calculated using previously trained classifiers. A set of labels that indicate likely interpretations is associated with each symbol and relation hypothesis, and treatment of ambiguity at the symbol and relation levels is left to the parsing process. The parsing algorithm builds a forest in which each tree corresponds to an interpretation coherent with the grammar. We define a score function, optimized through training data, that associates a cost with each tree. We then select a tree with minimum cost as the result. Experimental evaluation shows that the proposed method is more accurate than several state-of-the-art methods. Even though graph parsing is a computationally expensive process, the use of symbol and relation hypotheses to constrain the search space effectively reduces complexity, allowing practical application of the process. Furthermore, since the proposed parsing algorithm does not make direct use of structural particularities of mathematical expressions, it has the potential to be adapted to other two-dimensional object recognition problems. As a secondary contribution of this thesis, we propose a framework to automate the process of building handwritten mathematical expression datasets. The framework has been implemented in a computer system and used to generate part of the samples used in the experimental part of this thesis.
  • DOI: 10.11606/T.45.2016.tde-25072016-164800
  • Publisher: Biblioteca Digital de Teses e Dissertações da USP; Universidade de São Paulo; Instituto de Matemática e Estatística
  • Creation/publication date: 2016-04-29
  • Format: Adobe PDF
  • Language: English
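
The abstract's central idea, that symbol labels should be chosen together with relation labels instead of in a separate earlier step, can be illustrated with a small sketch. This is not the thesis's graph parsing algorithm: it replaces parsing with brute-force enumeration over two symbol hypotheses and one relation hypothesis. The identifiers `s1` and `s2`, all cost values, the `allowed` rule, and the function `best_interpretation` are hypothetical and chosen only for illustration.

```python
from itertools import product

# Hypothetical symbol hypotheses: stroke group -> {candidate label: cost}.
# Taken alone, "z" is the cheapest label for s2.
symbol_hyps = {
    "s1": {"x": 2, "X": 6},
    "s2": {"z": 2, "2": 3},
}

# Hypothetical relation hypotheses: symbol pair -> {candidate relation: cost}.
relation_hyps = {
    ("s1", "s2"): {"superscript": 1, "horizontal": 7},
}


def allowed(left_label, relation, right_label):
    """Toy grammar check (assumption): only digits may appear as superscripts."""
    if relation == "superscript":
        return right_label.isdigit()
    return True


def best_interpretation(symbol_hyps, relation_hyps):
    """Score every grammar-consistent labeling and return the cheapest one."""
    sym_ids = sorted(symbol_hyps)
    pairs = sorted(relation_hyps)
    best = None
    for sym_choice in product(*(symbol_hyps[s].items() for s in sym_ids)):
        labels = {s: lab for s, (lab, _) in zip(sym_ids, sym_choice)}
        sym_cost = sum(cost for _, cost in sym_choice)
        for rel_choice in product(*(relation_hyps[p].items() for p in pairs)):
            rels = {p: rel for p, (rel, _) in zip(pairs, rel_choice)}
            # Discard interpretations that the toy grammar rejects.
            if not all(allowed(labels[a], rels[(a, b)], labels[b]) for a, b in pairs):
                continue
            total = sym_cost + sum(cost for _, cost in rel_choice)
            if best is None or total < best[0]:
                best = (total, labels, rels)
    return best


print(best_interpretation(symbol_hyps, relation_hyps))
```

With these made-up costs, a pipeline that fixed symbol labels first would pick `z` for `s2`, while the joint search prefers `2` because the superscript relation is much cheaper than the horizontal one. Exhaustive enumeration grows exponentially with the number of hypotheses, which is why the abstract stresses constraining the search space.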
