On the generation, structure, and semantics of grammar patterns in source code identifiers

Newman, Christian D. ; AlSuhaibani, Reem S. ; Decker, Michael J. ; Peruma, Anthony ; Kaushik, Dishant ; Mkaouer, Mohamed Wiem ; Hill, Emily

The Journal of systems and software, 2020-12, Vol.170, p.110740, Article 110740 [Peer-reviewed journal]

Elsevier Inc

  • Title:
    On the generation, structure, and semantics of grammar patterns in source code identifiers
  • Author: Newman, Christian D. ; AlSuhaibani, Reem S. ; Decker, Michael J. ; Peruma, Anthony ; Kaushik, Dishant ; Mkaouer, Mohamed Wiem ; Hill, Emily
  • Subjects: Identifier naming ; Part-of-speech tagging ; Program comprehension ; Software maintenance ; Source code analysis
  • Is part of: The Journal of systems and software, 2020-12, Vol.170, p.110740, Article 110740
  • Description: Identifiers make up a majority of the text in code. They are one of the most basic mediums through which developers describe the code they create and understand the code that others create. Therefore, understanding the patterns latent in identifier naming practices and how accurately we are able to automatically model these patterns is vital if researchers are to support developers and automated analysis approaches in comprehending and creating identifiers correctly and optimally. This paper investigates identifiers by studying sequences of part-of-speech annotations, referred to as grammar patterns. This work advances our understanding of these patterns and our ability to model them by (1) establishing common naming patterns in different types of identifiers, such as class and attribute names; (2) analyzing how different patterns influence comprehension; and (3) studying the accuracy of state-of-the-art techniques for part-of-speech annotations, which are vital in automatically modeling identifier naming patterns, in order to establish their limits and paths toward improvement. To do this, we manually annotate a dataset of 1,335 identifiers from 20 open-source systems and use this dataset to study naming patterns, semantics, and tagger accuracy.
    • Establish common, and meaningful, naming structures found in identifiers.
    • Determine the effectiveness of state-of-the-art part of speech tagging on identifiers.
    • Highlight the multi-functional use of nouns as both adjective and noun in source code.
    • Discuss informal naming conventions, such as using plurals for list identifiers.
    • Explore infrequent, but important, naming structures found in identifiers.
  • Publisher: Elsevier Inc
  • Language: English