End-to-end system for extracting and interpreting textual information of interest from identity documents images.

Gutiérrez Menéndez, José Carlos

Biblioteca Digital de Teses e Dissertações da USP; Universidade de São Paulo; Escola Politécnica 2019-05-14

Online access. The library also holds print copies.

  • Title:
    End-to-end system for extracting and interpreting textual information of interest from identity documents images.
  • Author: Gutiérrez Menéndez, José Carlos
  • Advisor: Bressan, Graça
  • Subjects: Documents; Identification; Identification Documents; Text Recognition; Named Entity Recognition and Classification
  • Notes: Dissertation (Master's)
  • Local notes: Electrical Engineering Program
  • Description: Identity documents (IDs) are one of the primary sources of information about a citizen. Extracting the data contained in ID cards is central to many applications in the administrative and service sectors. This research therefore proposes an automated system able to extract and interpret the textual information in identity document images. The proposed system makes it possible to automate registration or verification processes that require acquiring information about a citizen from his or her identity documents. It is considered end-to-end because it covers every stage of extracting the information of interest from ID images. Unlike template-based systems, it uses a semantic attribution algorithm that classifies the information in IDs and assigns meaning to it according to its semantics. This research is the first comprehensive description of a complete information extraction system for IDs, covering every step from image processing to named entity recognition. To evaluate the system's performance, different metrics were proposed based on its internal functions. The final evaluation shows satisfactory results, indicating that the end-to-end system can extract and interpret textual information from identity document images without prior knowledge of their layouts.
  • DOI: 10.11606/D.3.2019.tde-19112019-100543
  • Publisher: Biblioteca Digital de Teses e Dissertações da USP; Universidade de São Paulo; Escola Politécnica
  • Creation/publication date: 2019-05-14
  • Format: Adobe PDF
  • Language: English
