Statistical power for cluster analysis
Dalmaijer, Edwin S ; Nord, Camilla L ; Astle, Duncan E
BMC bioinformatics, 2022-05, Vol.23 (1), p.205-205, Article 205
[Peer-reviewed journal]
England: BioMed Central Ltd
Title:
Statistical power for cluster analysis
Authors:
Dalmaijer, Edwin S; Nord, Camilla L; Astle, Duncan E
Subjects:
Algorithms; Centroids; Cluster Analysis; Clustering; Computational biology; Covariance; Datasets; Dimensionality reduction; Humans; Hypotheses; Latent class analysis; Latent profile analysis; Medical research; Methods; Mixtures; Modelling; Multivariate analysis; Normal Distribution; Ovaries; Polycystic ovary syndrome; Reduction; Sample Size; Separation; Simulation; Software; Statistical methods; Statistical power; Statistics; Subgroups
Is part of:
BMC bioinformatics, 2022-05, Vol.23 (1), p.205-205, Article 205
Description:
Cluster algorithms are gaining in popularity in biomedical research due to their compelling ability to identify discrete subgroups in data, and their increasing accessibility in mainstream software. While guidelines exist for algorithm selection and outcome evaluation, there are no firmly established ways of computing a priori statistical power for cluster analysis. Here, we estimated power and classification accuracy for common analysis pipelines through simulation. We systematically varied subgroup size, number, separation (effect size), and covariance structure. We then subjected generated datasets to dimensionality reduction approaches (none, multi-dimensional scaling, or uniform manifold approximation and projection) and cluster algorithms (k-means, agglomerative hierarchical clustering with Ward or average linkage and Euclidean or cosine distance, HDBSCAN). Finally, we directly compared the statistical power of discrete (k-means), "fuzzy" (c-means), and finite mixture modelling approaches (which include latent class analysis and latent profile analysis). We found that clustering outcomes were driven by large effect sizes or the accumulation of many smaller effects across features, and were mostly unaffected by differences in covariance structure. Sufficient statistical power was achieved with relatively small samples (N = 20 per subgroup), provided cluster separation is large (Δ = 4). Finally, we demonstrated that fuzzy clustering can provide a more parsimonious and powerful alternative for identifying separable multivariate normal distributions, particularly those with slightly lower centroid separation (Δ = 3). Traditional intuitions about statistical power only partially apply to cluster analysis: increasing the number of participants above a sufficient sample size did not improve power, but effect size was crucial. 
Notably, for the popular dimensionality reduction and clustering algorithms tested here, power was only satisfactory for relatively large effect sizes (clear separation between subgroups). Fuzzy clustering provided higher power in multivariate normal distributions. Overall, we recommend that researchers (1) only apply cluster analysis when large subgroup separation is expected, (2) aim for sample sizes of N = 20 to N = 30 per expected subgroup, (3) use multi-dimensional scaling to improve cluster separation, and (4) use fuzzy clustering or mixture modelling approaches that are more powerful and more parsimonious with partially overlapping multivariate normal distributions.
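The simulation idea described above can be illustrated with a minimal sketch. This is not the authors' pipeline: it assumes two spherical, unit-variance normal subgroups whose centroids lie Δ apart, a hand-written two-cluster k-means, and a hypothetical decision rule that counts a simulation as a "detection" when the mean silhouette score reaches 0.5. The function names, the threshold, and the default parameters are all illustrative assumptions.

```python
import math
import random


def simulate_two_groups(n_per_group, delta, n_features, rng):
    """Draw two unit-variance Gaussian subgroups whose centroids are `delta` apart."""
    # Spread the separation evenly over features so the Euclidean
    # distance between the two centroids equals delta.
    shift = delta / math.sqrt(n_features)
    points, labels = [], []
    for label in (0, 1):
        for _ in range(n_per_group):
            points.append([rng.gauss(label * shift, 1.0) for _ in range(n_features)])
            labels.append(label)
    return points, labels


def dist(a, b):
    """Euclidean distance between two points."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def kmeans2(points, rng, max_iter=100):
    """Plain Lloyd's algorithm for k = 2, initialised from two random points."""
    centroids = [list(p) for p in rng.sample(points, 2)]
    assign = None
    for _ in range(max_iter):
        new_assign = [0 if dist(p, centroids[0]) <= dist(p, centroids[1]) else 1
                      for p in points]
        if new_assign == assign:  # converged: assignments stopped changing
            break
        assign = new_assign
        for k in (0, 1):
            members = [p for p, a in zip(points, assign) if a == k]
            if members:
                centroids[k] = [sum(col) / len(members) for col in zip(*members)]
    return assign


def silhouette(points, assign):
    """Mean silhouette coefficient for a two-cluster assignment."""
    scores = []
    for i, p in enumerate(points):
        own = [dist(p, q) for j, q in enumerate(points)
               if j != i and assign[j] == assign[i]]
        other = [dist(p, q) for j, q in enumerate(points) if assign[j] != assign[i]]
        if not own or not other:  # singleton or empty cluster: score 0 by convention
            scores.append(0.0)
            continue
        a, b = sum(own) / len(own), sum(other) / len(other)
        scores.append((b - a) / max(a, b))
    return sum(scores) / len(scores)


def estimate_power(n_per_group, delta, n_features=2, n_sims=200,
                   threshold=0.5, seed=1):
    """Return (power, mean accuracy) over repeated simulations.

    Power is the share of simulations whose silhouette score reaches
    `threshold` (an assumed criterion); accuracy is agreement with the
    true labels, allowing for swapped cluster labels.
    """
    rng = random.Random(seed)
    hits, accuracy_sum = 0, 0.0
    for _ in range(n_sims):
        points, truth = simulate_two_groups(n_per_group, delta, n_features, rng)
        assign = kmeans2(points, rng)
        if silhouette(points, assign) >= threshold:
            hits += 1
        agree = sum(a == t for a, t in zip(assign, truth)) / len(truth)
        accuracy_sum += max(agree, 1.0 - agree)
    return hits / n_sims, accuracy_sum / n_sims


if __name__ == "__main__":
    for delta in (1.0, 2.0, 3.0, 4.0):
        power, accuracy = estimate_power(n_per_group=20, delta=delta)
        print(f"delta={delta}: power={power:.2f}, accuracy={accuracy:.2f}")
```

Varying `n_per_group` and `delta` in this sketch gives a rough feel for how sample size and subgroup separation each affect detection, within the limits of the simplifying assumptions stated above.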
Publisher:
England: BioMed Central Ltd
Language:
English