
Bloom Filter Trie -- A Data Structure for Pan-Genome Storage

Holley, Guillaume ; Wittler, Roland ; Stoye, Jens

Algorithms in Bioinformatics, 2015, Vol.9289, p.217-230 [Peer-reviewed journal]



Title: Bloom Filter Trie -- A Data Structure for Pan-Genome Storage
Author: Holley, Guillaume ; Wittler, Roland ; Stoye, Jens
Subjects: Algorithms & data structures ; Bloom Filter ; Discrete mathematics ; Hash Function ; Life sciences: general issues ; Real Dataset ; Simulated Dataset ; Suffix Tree
Is part of: Algorithms in Bioinformatics, 2015, Vol.9289, p.217-230
Description: High-throughput sequencing technologies have become fast and cheap in recent years. As a result, large-scale projects have started to sequence tens to several thousands of genomes per species, producing a high number of sequences sampled from each genome. Such a highly redundant collection of very similar sequences is called a pan-genome. It can be transformed into a set of sequences “colored” by the genomes to which they belong. A colored de Bruijn graph (C-DBG) extracts from the sequences all colored k-mers, strings of length k, and stores them in vertices. In this paper, we present an alignment-free, reference-free and incremental data structure for storing a pan-genome as a C-DBG: the Bloom Filter Trie. The data structure allows storing and compressing a set of colored k-mers, and also efficiently traversing the graph. Experimental results show better performance compared to another state-of-the-art data structure.
Related titles: Lecture Notes in Computer Science
Publisher: Germany: Springer Berlin / Heidelberg
Language: English
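
Note on the description: the idea of colored k-mers can be illustrated with a minimal sketch. It uses a plain Python dictionary, not the Bloom Filter Trie from the paper, and the genome names, sequences, and function names are hypothetical, chosen only for illustration. Each k-mer maps to the set of genomes (its colors) in which it occurs, and graph traversal follows k-mers that overlap by k-1 characters.

```python
from collections import defaultdict


def colored_kmers(genomes, k):
    """Map each k-mer to the set of genome names (colors) that contain it."""
    colors = defaultdict(set)
    for name, sequence in genomes.items():
        # Slide a window of length k over the sequence.
        for i in range(len(sequence) - k + 1):
            colors[sequence[i:i + k]].add(name)
    return dict(colors)


def successors(kmer, colors):
    """Return the stored k-mers whose first k-1 characters match the last k-1 of `kmer`."""
    suffix = kmer[1:]
    return [suffix + base for base in "ACGT" if suffix + base in colors]


# Hypothetical toy input: two short "genomes".
genomes = {"genome_A": "ACGTAC", "genome_B": "CGTACC"}
index = colored_kmers(genomes, k=3)
```

Here `index["CGT"]` holds both genome names, while `index["ACG"]` holds only `genome_A`. A dictionary like this stores every k-mer uncompressed; the paper's contribution is a structure that stores and compresses the same information.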
