Bloom Filter Trie -- A Data Structure for Pan-Genome Storage

Holley, Guillaume ; Wittler, Roland ; Stoye, Jens

Algorithms in Bioinformatics, 2015, Vol.9289, p.217-230 [Peer-reviewed journal]

Germany: Springer Berlin / Heidelberg

Full text available

  • Title:
    Bloom Filter Trie -- A Data Structure for Pan-Genome Storage
  • Author: Holley, Guillaume ; Wittler, Roland ; Stoye, Jens
  • Subjects: Algorithms & data structures ; Bloom Filter ; Discrete mathematics ; Hash Function ; Life sciences: general issues ; Real Dataset ; Simulated Dataset ; Suffix Tree
  • Is part of: Algorithms in Bioinformatics, 2015, Vol.9289, p.217-230
  • Description: High-throughput sequencing technologies have become fast and cheap in recent years. As a result, large-scale projects have started to sequence tens to several thousands of genomes per species, producing a high number of sequences sampled from each genome. Such a highly redundant collection of very similar sequences is called a pan-genome. It can be transformed into a set of sequences “colored” by the genomes to which they belong. A colored de Bruijn graph (C-DBG) extracts from the sequences all colored k-mers, strings of length k, and stores them in vertices. In this paper, we present an alignment-free, reference-free and incremental data structure for storing a pan-genome as a C-DBG: the Bloom Filter Trie. The data structure makes it possible to store and compress a set of colored k-mers, and also to traverse the graph efficiently. Experimental results show better performance compared to another state-of-the-art data structure.
  • Related titles: Lecture Notes in Computer Science
  • Publisher: Germany: Springer Berlin / Heidelberg
  • Language: English
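
The notion of colored k-mers in the description can be illustrated with a minimal sketch. It is not the Bloom Filter Trie itself: a plain Python dictionary stands in for the compressed structure, and the function names, genome names and sequences are hypothetical, chosen only for illustration. Each k-mer is mapped to the set of genomes (its colors) in which it occurs, and graph traversal is approximated by looking up k-mers that overlap by k-1 characters.

```python
from collections import defaultdict


def colored_kmers(genomes, k):
    """Map each k-mer to the set of genome names (colors) that contain it."""
    colors = defaultdict(set)
    for name, sequence in genomes.items():
        # Slide a window of length k over the sequence.
        for i in range(len(sequence) - k + 1):
            colors[sequence[i:i + k]].add(name)
    return dict(colors)


def successors(colors, kmer, alphabet="ACGT"):
    """Return stored k-mers whose first k-1 characters equal the last k-1 of `kmer`."""
    suffix = kmer[1:]
    return [suffix + c for c in alphabet if suffix + c in colors]


# Hypothetical toy input: two short "genomes".
genomes = {"genome_A": "ACGTAC", "genome_B": "CGTAGG"}
index = colored_kmers(genomes, k=3)
```

Here `index["CGT"]` holds both genome names because the k-mer is shared, while `index["TAC"]` holds only `genome_A`. A real C-DBG implementation would typically also handle reverse complements and use a far more compact representation; this sketch ignores both.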