
On the use of control- and data-flow in fault localization

Ribeiro, Henrique Lemos

Biblioteca Digital de Teses e Dissertações da USP; Universidade de São Paulo; Escola de Artes, Ciências e Humanidades 2016-08-19

Online access. The library also holds printed copies.

  • Title:
    On the use of control- and data-flow in fault localization
  • Author: Ribeiro, Henrique Lemos
  • Advisor: Chaim, Marcos Lordello
  • Subjects: Software Engineering; Control-Flow; Data-Flow; Fault Localization
  • Notes: Dissertation (Master's)
  • Description: Testing and debugging are key tasks during the development cycle. However, they are among the most expensive activities in the development process. To improve the productivity of developers during debugging, various fault localization techniques have been proposed, of which Spectrum-based Fault Localization (SFL), also called Coverage-based Fault Localization (CBFL), is one of the most promising. SFL techniques pinpoint program elements (e.g., statements, branches, and definition-use associations) and sort them by their suspiciousness. Heuristics are used to rank the most suspicious program elements, which are then mapped into lines to be inspected by developers. Although data-flow spectra (definition-use associations) have been shown to perform better than control-flow spectra (statements and branches) at locating the bug site, the high overhead of collecting data-flow spectra has prevented their use on industry-level code. A data-flow coverage tool was recently implemented that presents, on average, 38% run-time overhead for large programs. Such a fairly modest overhead motivates the study of SFL techniques using data-flow information in programs similar to those developed in industry. To achieve this goal, we implemented Jaguar (JAva coveraGe faUlt locAlization Ranking), a tool that employs control-flow and data-flow coverage in SFL techniques. The effectiveness and efficiency of both coverages are compared using 173 faulty versions with sizes varying from 10 to 96 KLOC. Ten known SFL heuristics are used to rank the most suspicious lines. The results show that the behavior of the heuristics is similar for both control- and data-flow coverage: Kulczynski2 and Mccon perform better when a small number of lines is investigated (from 5 to 30 lines), while Ochiai performs better when more lines are inspected (30 to 100 lines).
The comparison between control- and data-flow coverage shows that data-flow locates more defects in the range of 10 to 50 inspected lines, being up to 22% more effective. Moreover, in the range of 20 to 100 lines, data-flow ranks the bug better than control-flow with statistical significance. However, data-flow is still more expensive than control-flow: it takes from 23% to 245% longer to obtain the most suspicious lines; on average, data-flow is 129% more costly. Therefore, our results suggest that data-flow is more effective in locating faults because it tracks more relationships during program execution. On the other hand, SFL techniques supported by data-flow coverage need to be improved for practical use in industrial settings.
  • DOI: 10.11606/D.100.2016.tde-18102016-092654
  • Publisher: Biblioteca Digital de Teses e Dissertações da USP; Universidade de São Paulo; Escola de Artes, Ciências e Humanidades
  • Creation/publication date: 2016-08-19
  • Format: Adobe PDF
  • Language: English
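
The description above says that SFL heuristics rank program elements by suspiciousness and names Ochiai, Kulczynski2, and Mccon. The following is a minimal sketch of that ranking step, using the formulas as they are commonly stated in the SFL literature; it is not taken from Jaguar, and the thesis's exact definitions may differ. The function names, the `(cef, cnf, cep)` tuple layout, and the sample counts are assumptions made for illustration. Here `cef` is the number of failing tests that execute an element, `cnf` the number of failing tests that do not, and `cep` the number of passing tests that execute it.

```python
import math


def ochiai(cef, cnf, cep):
    """Ochiai: cef / sqrt((cef + cnf) * (cef + cep))."""
    denom = math.sqrt((cef + cnf) * (cef + cep))
    return cef / denom if denom else 0.0


def kulczynski2(cef, cnf, cep):
    """Kulczynski2: 0.5 * (cef / (cef + cnf) + cef / (cef + cep))."""
    if cef == 0:
        # Both terms are zero (or undefined), so treat the element as not suspicious.
        return 0.0
    return 0.5 * (cef / (cef + cnf) + cef / (cef + cep))


def mccon(cef, cnf, cep):
    """Mccon: (cef^2 - cnf * cep) / ((cef + cnf) * (cef + cep))."""
    denom = (cef + cnf) * (cef + cep)
    return (cef * cef - cnf * cep) / denom if denom else 0.0


def rank(spectra, heuristic):
    """Return element names sorted from most to least suspicious.

    spectra maps an element name to its (cef, cnf, cep) counts.
    """
    return sorted(spectra, key=lambda e: heuristic(*spectra[e]), reverse=True)


# Hypothetical spectra for three lines, with 2 failing and 4 passing tests.
sample = {
    "line 10": (2, 0, 1),
    "line 20": (1, 1, 3),
    "line 30": (0, 2, 4),
}

for h in (ochiai, kulczynski2, mccon):
    print(h.__name__, rank(sample, h))
```

The same functions apply whether the elements are statements or branches (control-flow) or definition-use associations (data-flow); only the contents of the spectra change.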
