
Solving University entrance assessment using information retrieval

Silveira, Igor Cataneo

Biblioteca Digital de Teses e Dissertações da USP; Universidade de São Paulo; Instituto de Matemática e Estatística 2018-07-05

Online access. The library also holds print copies.

  • Title:
    Solving University entrance assessment using information retrieval
  • Author: Silveira, Igor Cataneo
  • Advisor: Mauá, Denis Deratani
  • Subjects: Enem; Multiple-Choice Question Answering; Information Retrieval
  • Notes: Dissertation (Master's)
  • Description: Answering questions posed in natural language is a key task in Artificial Intelligence. However, producing a successful Question Answering (QA) system is challenging, since it requires text understanding, information retrieval, information extraction and text production. This task is made even harder by the difficulties in collecting reliable datasets and in evaluating techniques, two pivotal points for machine learning approaches. This has led many researchers to focus on Multiple-Choice Question Answering (MCQA), a special case of QA where systems must select the correct answer from a small set of alternatives. One particularly interesting type of MCQA is solving Standardized Tests, such as Foreign Language Proficiency exams, Elementary School Science exams and University Entrance exams. These exams provide easy-to-evaluate, challenging multiple-choice questions of varying difficulty about large, but limited, domains. The Exame Nacional do Ensino Médio (ENEM) is a High School level exam taken every year by students all over Brazil. It is widely used by Brazilian universities as an entrance exam and is the world's second-largest university entrance examination by number of registered candidates. The exam consists of writing an essay and solving a multiple-choice test comprising questions on four major topics: Humanities, Language, Science and Mathematics. Questions within each major topic are not segmented by standard school disciplines (e.g. Geography, Biology, etc.) and often require interdisciplinary reasoning. Moreover, the previous editions of the exam and their solutions are freely available online, making it a suitable benchmark for MCQA. In this work we automate solving the ENEM, focusing, for simplicity, on purely textual questions that do not require mathematical thinking. We formulate the problem of answering multiple-choice questions as finding the candidate-answer most similar to the statement.
We investigate two approaches for measuring the textual similarity of candidate-answer and statement. The first approach addresses this as a Text Information Retrieval (IR) problem, that is, as a problem of finding in a database the document most relevant to a query. Our queries are made of the statement plus a candidate-answer, and we use three different corpora as the database: the first comprises plain-text articles extracted from a dump of the Portuguese-language Wikipedia; the second contains only the text given in the question's header; and the third is composed of pairs of question and correct answer extracted from ENEM assessments. The second approach is based on Word Embedding (WE), a method for learning vector representations of words such that semantically similar words have close vectors. WE is used in two ways: to augment the IR queries by adding words related to those in the query according to the WE model, and to create vector representations for the statement and candidate-answers. Using these vector representations we answer questions either directly, by selecting the candidate-answer that maximizes the cosine similarity to the statement, or indirectly, by extracting features from the representations and then feeding them into a classifier that decides which alternative is the answer. Along with the two approaches mentioned, we investigate how to enhance them using WordNet, a structured lexical database where words are connected according to relations such as synonymy and hypernymy. Finally, we combine different configurations of the two approaches and their WordNet variations by creating an ensemble of algorithms found by a greedy search. This ensemble chooses an answer by majority vote of its components. The first approach achieved an average of 24% accuracy using the headers, 25% using the pairs database and 26.9% using Wikipedia. The second approach achieved 26.6% using WE indirectly and 28% directly. The ensemble achieved 29.3% accuracy.
These results, slightly above random guessing (20%), suggest that these techniques can capture some of the necessary skills to solve standardized tests. However, more sophisticated techniques that perform text understanding and common sense reasoning might be required to achieve human-level performance.
  • DOI: 10.11606/D.45.2018.tde-04112018-225438
  • Publisher: Biblioteca Digital de Teses e Dissertações da USP; Universidade de São Paulo; Instituto de Matemática e Estatística
  • Creation/publication date: 2018-07-05
  • Format: Adobe PDF
  • Language: English
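
The description mentions answering "directly" by picking the candidate-answer with the highest cosine similarity to the statement, and combining several solvers by majority vote. The sketch below illustrates those two ideas only; it is not the dissertation's implementation. The toy word vectors, the averaging rule used to embed a text, and all function names are assumptions made for this example.

```python
import math
from collections import Counter

# Hypothetical 3-dimensional word vectors, invented for illustration only.
TOY_VECTORS = {
    "water": [0.9, 0.1, 0.0],
    "river": [0.8, 0.2, 0.1],
    "rain": [0.7, 0.3, 0.0],
    "law": [0.0, 0.9, 0.2],
    "vote": [0.1, 0.8, 0.3],
}


def embed(text, vectors=TOY_VECTORS):
    """Average the vectors of the known words (an assumed composition rule)."""
    dim = len(next(iter(vectors.values())))
    known = [vectors[w] for w in text.lower().split() if w in vectors]
    if not known:
        return [0.0] * dim
    return [sum(column) / len(known) for column in zip(*known)]


def cosine(u, v):
    """Cosine similarity; defined as 0.0 when either vector is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def answer_by_similarity(statement, candidates):
    """Return the index of the candidate most similar to the statement."""
    target = embed(statement)
    scores = [cosine(target, embed(c)) for c in candidates]
    return max(range(len(candidates)), key=scores.__getitem__)


def majority_vote(predictions):
    """Return the answer index chosen by the most solvers."""
    return Counter(predictions).most_common(1)[0][0]


choice = answer_by_similarity("rain river", ["law vote", "water", "vote"])
ensemble_choice = majority_vote([choice, 1, 2])
```

Here `choice` is the index picked by a single similarity-based solver, and `ensemble_choice` is the index that most of three hypothetical solvers agree on. A real system would use trained embeddings and a principled tie-breaking rule, neither of which the description specifies.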
